a/s/m Study Manual for Exam C/Exam 4: Construction and Evaluation of Actuarial Models [17 ed.]



Table of contents:
I Severity, Frequency, and Aggregate Loss
1 Basic Probability
1.1 Functions and moments
1.2 Percentiles
1.3 Conditional probability and expectation
1.4 Moment and probability generating functions
1.5 The empirical distribution
Exercises
Solutions
2 Parametric Distributions
2.1 Scaling
2.2 Transformations
2.3 Common parametric distributions
2.3.1 Uniform
2.3.2 Beta
2.3.3 Exponential
2.3.4 Weibull
2.3.5 Gamma
2.3.6 Pareto
2.3.7 Single-parameter Pareto
2.3.8 Lognormal
2.4 The linear exponential family
2.5 Limiting distributions
Exercises
Solutions
3 Variance
3.1 Additivity
3.2 Normal approximation
3.3 Bernoulli shortcut
Exercises
Solutions
4 Mixtures and Splices
4.1 Mixtures
4.1.1 Discrete mixtures
4.1.2 Continuous mixtures
4.1.3 Frailty models
4.2 Conditional Variance
4.3 Splices
Exercises
Solutions
5 Policy Limits
Exercises
Solutions
6 Deductibles
6.1 Ordinary and franchise deductibles
6.2 Payment per loss with deductible
6.3 Payment per payment with deductible
Exercises
Solutions
7 Loss Elimination Ratio
Exercises
Solutions
8 Risk Measures and Tail Weight
8.1 Coherent risk measures
8.2 Value-at-Risk (VaR)
8.3 Tail-Value-at-Risk (TVaR)
8.4 Tail Weight
8.5 Extreme value distributions
Exercises
Solutions
9 Other Topics in Severity Coverage Modifications
Exercises
Solutions
10 Bonuses
Exercises
Solutions
11 Discrete Distributions
11.1 The (a,b,0) class
11.2 The (a,b,1) class
Exercises
Solutions
12 Poisson/Gamma
Exercises
Solutions
13 Frequency—Exposure & Coverage Modifications
13.1 Exposure modifications
13.2 Coverage modifications
Exercises
Solutions
14 Aggregate Loss Models: Compound Variance
14.1 Introduction
14.2 Compound variance
Exercises
Solutions
15 Aggregate Loss Models: Approximating Distribution
Exercises
Solutions
16 Aggregate Losses: Severity Modifications
Exercises
Solutions
17 Aggregate Loss Models: The Recursive Formula
Exercises
Solutions
18 Aggregate Losses—Aggregate Deductible
Exercises
Solutions
19 Aggregate Losses: Miscellaneous Topics
19.1 Exact Calculation of Aggregate Loss Distribution
19.1.1 Normal distribution
19.1.2 Exponential and gamma distributions
19.1.3 Compound Poisson models
19.2 Discretizing
19.2.1 Method of rounding
19.2.2 Method of local moment matching
Exercises
Solutions
20 Supplementary Questions: Severity, Frequency, and Aggregate Loss
Solutions
II Empirical Models
21 Review of Mathematical Statistics
21.1 Estimator quality
21.1.1 Bias
21.1.2 Consistency
21.1.3 Variance and mean square error
21.2 Hypothesis testing
21.3 Confidence intervals
Exercises
Solutions
22 The Empirical Distribution for Complete Data
22.1 Individual data
22.2 Grouped data
Exercises
Solutions
23 Variance of Empirical Estimators with Complete Data
23.1 Individual data
23.2 Grouped data
Exercises
Solutions
24 Kaplan-Meier and Nelson-Åalen Estimators
24.1 Kaplan-Meier Product Limit Estimator
24.2 Nelson-Åalen Estimator
Exercises
Solutions
25 Estimation of Related Quantities
25.1 Moments
25.1.1 Complete individual data
25.1.2 Grouped data
25.1.3 Incomplete data
25.2 Range probabilities
25.3 Deductibles and limits
25.4 Inflation
Exercises
Solutions
26 Variance of Kaplan-Meier and Nelson-Åalen Estimators
Exercises
Solutions
27 Kernel Smoothing
27.1 Density and distribution
27.1.1 Uniform kernel
27.1.2 Triangular kernel
27.1.3 Other symmetric kernels
27.1.4 Kernels using two-parameter distributions
27.2 Moments of kernel-smoothed distributions
Exercises
Solutions
28 Mortality Table Construction
28.1 Individual data based methods
28.1.1 Variance of estimators
28.2 Interval-based methods
Exercises
Solutions
29 Supplementary Questions: Empirical Models
Solutions
III Parametric Models
30 Method of Moments
30.1 Introductory remarks
30.2 The method of moments for various distributions
30.2.1 Exponential
30.2.2 Gamma
30.2.3 Pareto
30.2.4 Lognormal
30.2.5 Uniform
30.2.6 Other distributions
30.3 Fitting other moments, and incomplete data
Exercises
Solutions
31 Percentile Matching
31.1 Smoothed empirical percentile
31.2 Percentile matching for various distributions
31.2.1 Exponential
31.2.2 Weibull
31.2.3 Lognormal
31.2.4 Other distributions
31.3 Percentile matching with incomplete data
31.4 Matching a percentile and a moment
Exercises
Solutions
32 Maximum Likelihood Estimators
32.1 Defining the likelihood
32.1.1 Individual data
32.1.2 Grouped data
32.1.3 Censoring
32.1.4 Truncation
32.1.5 Combination of censoring and truncation
Exercises
Solutions
33 Maximum Likelihood Estimators—Special Techniques
33.1 Cases for which the Maximum Likelihood Estimator equals the Method of Moments Estimator
33.1.1 Exponential distribution
33.2 Parametrization and Shifting
33.2.1 Parametrization
33.2.2 Shifting
33.3 Transformations
33.3.1 Lognormal distribution
33.3.2 Inverse exponential distribution
33.3.3 Weibull distribution
33.4 Special distributions
33.4.1 Uniform distribution
33.4.2 Pareto distribution
33.4.3 Beta distribution
33.5 Bernoulli technique
33.6 Estimating qx
Exercises
Solutions
34 Variance Of Maximum Likelihood Estimators
34.1 Information matrix
34.1.1 Calculating variance using the information matrix
34.1.2 Asymptotic variance of MLE for common distributions
34.1.3 True information and observed information
34.2 The delta method
34.3 Confidence Intervals
34.3.1 Normal Confidence Intervals
34.3.2 Non-Normal Confidence Intervals
34.4 Variance of Exact Exposure Estimate of q̂_j
Exercises
Solutions
35 Fitting Discrete Distributions
35.1 Poisson distribution
35.2 Negative binomial
35.3 Binomial
35.4 Fitting (a,b,1) class distributions
35.5 Adjusting for exposure
35.6 Choosing between distributions in the (a,b,0) class
Exercises
Solutions
36 Hypothesis Tests: Graphic Comparison
36.1 D(x) plots
36.2 p-p plots
Exercises
Solutions
37 Hypothesis Tests: Kolmogorov-Smirnov
37.1 Individual data
37.2 Grouped data
Exercises
Solutions
38 Hypothesis Tests: Anderson-Darling
Exercises
Solutions
39 Hypothesis Tests: Chi-square
39.1 Introduction
39.2 Definition of chi-square statistic
39.3 Degrees of freedom
39.4 Other requirements for the chi-square test
39.5 Data from several periods
Exercises
Solutions
40 Likelihood Ratio Test and Algorithm, Schwarz Bayesian Criterion
40.1 Likelihood Ratio Test and Algorithm
40.2 Schwarz Bayesian Criterion
Exercises
Solutions
41 Supplementary Questions: Parametric Models
Solutions
IV Credibility
42 Limited Fluctuation Credibility: Poisson Frequency
Exercises
Solutions
43 Limited Fluctuation Credibility: Non-Poisson Frequency
Exercises
Solutions
44 Limited Fluctuation Credibility: Partial Credibility
Exercises
Solutions
45 Bayesian Methods—Discrete Prior
Exercises
Solutions
46 Bayesian Methods—Continuous Prior
46.1 Calculating posterior and predictive distributions
46.2 Recognizing the posterior distribution
46.3 Loss functions
46.4 Interval estimation
46.5 The linear exponential family and conjugate priors
Exercises
Solutions
47 Bayesian Credibility: Poisson/Gamma
Exercises
Solutions
48 Bayesian Credibility: Normal/Normal
Exercises
Solutions
49 Bayesian Credibility: Bernoulli/Beta
49.1 Bernoulli/beta
49.2 Negative binomial/beta
Exercises
Solutions
50 Bayesian Credibility: Exponential/Inverse Gamma
Exercises
Solutions
51 Bühlmann Credibility: Basics
Exercises
Solutions
52 Bühlmann Credibility: Discrete Prior
Exercises
Solutions
53 Bühlmann Credibility: Continuous Prior
Exercises
Solutions
54 Bühlmann-Straub Credibility
54.1 Bühlmann-Straub model: Varying exposure
54.2 Hewitt model: Generalized variance of observations
Exercises
Solutions
55 Exact Credibility
Exercises
Solutions
56 Bühlmann As Least Squares Estimate of Bayes
56.1 Regression
56.2 Graphic questions
56.3 Cov(Xi,Xj)
Exercises
Solutions
57 Empirical Bayes Non-Parametric Methods
57.1 Uniform exposures
57.2 Non-uniform exposures
57.2.1 No manual premium
57.2.2 Manual premium
Exercises
Solutions
58 Empirical Bayes Semi-Parametric Methods
58.1 Poisson model
58.2 Non-Poisson models
58.3 Which Bühlmann method should be used?
Exercises
Solutions
59 Supplementary Questions: Credibility
Solutions
V Simulation
60 Simulation—Inversion Method
Exercises
Solutions
61 Simulation—Special Techniques
61.1 Mixtures
61.2 Multiple decrements
61.3 Simulating (a,b,0) distributions
61.4 Normal random variables: the polar method
Exercises
Solutions
62 Number of Data Values to Generate
Exercises
Solutions
63 Simulation—Applications
63.1 Actuarial applications
63.2 Statistical analysis
63.3 Risk measures
Exercises
Solutions
64 Bootstrap Approximation
Exercises
Solutions
65 Supplementary Questions: Simulation
Solutions
VI Practice Exams
1 Practice Exam 1
2 Practice Exam 2
3 Practice Exam 3
4 Practice Exam 4
5 Practice Exam 5
6 Practice Exam 6
7 Practice Exam 7
8 Practice Exam 8
9 Practice Exam 9
10 Practice Exam 10
11 Practice Exam 11
12 Practice Exam 12
13 Practice Exam 13
Appendices
A Solutions to the Practice Exams
Solutions for Practice Exam 1
Solutions for Practice Exam 2
Solutions for Practice Exam 3
Solutions for Practice Exam 4
Solutions for Practice Exam 5
Solutions for Practice Exam 6
Solutions for Practice Exam 7
Solutions for Practice Exam 8
Solutions for Practice Exam 9
Solutions for Practice Exam 10
Solutions for Practice Exam 11
Solutions for Practice Exam 12
Solutions for Practice Exam 13
B Solutions to Old Exams
B.1 Solutions to CAS Exam 3, Spring 2005
B.2 Solutions to SOA Exam M, Spring 2005
B.3 Solutions to CAS Exam 3, Fall 2005
B.4 Solutions to SOA Exam M, Fall 2005
B.5 Solutions to Exam C/4, Fall 2005
B.6 Solutions to CAS Exam 3, Spring 2006
B.7 Solutions to CAS Exam 3, Fall 2006
B.8 Solutions to SOA Exam M, Fall 2006
B.9 Solutions to Exam C/4, Fall 2006
B.10 Solutions to Exam C/4, Spring 2007
C Cross Reference from Loss Models
D Exam Question Index

Study Manual for Exam C/Exam 4: Construction and Evaluation of Actuarial Models

Seventeenth Edition

by Abraham Weishaus, Ph.D., F.S.A., CFA, M.A.A.A.

Note: NO RETURN IF OPENED


TO OUR READERS: Please check A.S.M.’s web site at www.studymanuals.com for errata and updates. If you have any comments or reports of errata, please e-mail us at [email protected].

©Copyright 2014 by Actuarial Study Materials (A.S.M.), PO Box 69, Greenland, NH 03840. All rights reserved. Reproduction in whole or in part without express written permission from the publisher is strictly prohibited.


Preface

Exam C/4 is the catch-all exam containing all mathematical material that doesn't fit easily into one of the other exams. You will study models for property/casualty insurance: models for size of loss, number of losses, and aggregate losses. Along the way, you'll learn about risk measures: possible measures for how much surplus a company should hold, based on the risk characteristics of its business. Then you will switch gears and study statistics. You will learn how to estimate mortality rates, and distribution functions for loss sizes and counts, and how to evaluate the quality of the estimates. After that, you will study credibility: adjusting estimates based on experience. Finally, you will learn the basics of stochastic simulation, a valuable tool for actuarial modeling.

Prerequisites for most of the material are few beyond knowing probability (and calculus, of course). Occasionally we will refer to mortality rates as q_x (something you learn about in Exam MLC/LC), and we'll even mention double decrement models in Lesson 28, but overall, Exam MLC/LC plays very little role. Regression is helpful at one point for one topic within Bühlmann credibility, but questions on that particular topic are rare. The CAS website provides some guidance on the relationship between exams, and considers Exam P/1 the only prerequisite to this exam.

This manual

The exercises in this manual

I've provided lots of my own exercises, as well as relevant exercises from pre-2000 exams, which are not that easy to get. Though the style of exam questions has changed a little, these are still very useful practice exercises which cover the same material—don't dismiss them as obsolete! CAS 4B had 1-point, 2-point, and 3-point questions. Current exam questions are approximately as difficult as the 2-point questions.

All questions in this manual from exams given in 2000 and later, with solutions, are also available on the web from the SOA. When the 2000 syllabus was established in 1999, sample exams 3 and 4 were created, consisting partially of questions from older exams and partially of new questions, not all multiple choice. These sample exams were not real exams, and some questions were inappropriate or defective. These exams are no longer posted on the web. I have included appropriate questions, labeled "1999 C3 Sample" or "1999 C4 Sample". These refer to these 1999 sample exams, not to the 306 sample questions currently posted on the web, which are discussed later in this introduction.

Questions from old exams are marked xxx:yy, where xxx is the time the exam was given, with S for spring and F for fall followed by a 2-digit year, and yy is the question number; for example, S01:3 denotes question 3 of the Spring 2001 exam. Sometimes xxx is preceded with SOA or CAS to indicate the sponsoring organization. From about 1986 to 2000, SOA exams had 3-digit numbers (like 160) and CAS exams were a number and a letter (like 4B). From 2000 to Spring 2003, Exam 3 was jointly sponsored, so I do not indicate "SOA" or "CAS" for Exam 3 questions from that period.

There was a period in the 1990's when the SOA, while allowing use of its old exam questions, did not want people to reveal which exam they came from. As a result, I sometimes cannot identify the source exam for questions from this period. In such a case, I mark the question aaa-bb-cc:yy, where aaa-bb-cc is the study note number and yy is the question number. Generally aaa is the exam number (like 160), and cc is the 2-digit year the study note was published.

No exercises in this manual are taken from the Fall 2005, Fall 2006, or Spring 2007 exams, which you may use as a dress rehearsal. However, Appendix B has the solutions to all of these exams. While in most cases these are the same as the official solutions, in a couple of cases I use the shortcuts which you learn in this manual. That appendix also provides solutions to relevant questions from pre-2007 CAS Exam 3's.


Other Useful Features of This Manual

The SOA site has a set of 306 sample questions and solutions.[1] Almost all of these questions are from released exams that are readily available; nevertheless many students prefer to use this list since non-syllabus material has been removed. Appendix D has a complete cross reference between these questions and the exams they come from, as well as the page in this manual having either the question or the solution.

This manual has an index. Whenever you remember some topic in this manual but can't remember where you saw it, check the index. If it isn't in the index but you're sure it's in the manual and an index listing would be appropriate, contact the author.

Tables

Download the tables you will be given on the exam. They will often be needed for the examples and the exercises; I take the information in these tables for granted. If you see something in the text like "That distribution is a Pareto and therefore the variance is . . . ", you should know that I am getting this from the tables; you are not expected to know this by heart. So please download the tables.

Go to www.soa.org. Click on "EDUCATION", "EXAMS AND REQUIREMENTS", "ASA", and the Exam C box. Download the syllabus, which is the first bullet under "Syllabus and Study Materials". At the bottom of the syllabus under "Other Resources", click on "Tables for Exam C". The direct address of the tables at this writing (October 2014) is http://www.soa.org/files/pdf/edu-2009-fall-exam-c-table.pdf

The tables include distribution tables and the following statistical tables: the normal distribution function and chi-square critical values. The distribution tables are an abbreviated version of the Loss Models appendix. Whenever I refer to the tables from the Loss Models appendix in this manual, the abbreviated version will be sufficient.

At this writing, the tables (on the second page) specify rules for using the normal distribution table that is supplied: do not interpolate in the table; simply use the nearest value. If you are looking for Φ(0.0244), use Φ(0.02). If you are given the cumulative probability Φ(x) = 0.8860 and need x, use 1.21, the nearest x available. The examples, exercises, and quizzes in this manual use this rounding method. On real exams, they will try to avoid ambiguous situations, so borderline situations won't occur, but my interpretation of the rules (used for problems in this manual) is that if the third decimal place is 5, round up the absolute value. So I round 0.125 to 0.13 and −0.125 to −0.13.
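To make this lookup convention concrete, here is a minimal sketch in Python (my own illustration; the function name and the use of the decimal module are assumptions, not part of the manual or the exam rules):

```python
# Sketch of the rounding convention described above (not from the manual):
# round a normal-table argument to the nearest 0.01, with a third decimal
# of 5 rounded up in absolute value, so 0.125 -> 0.13 and -0.125 -> -0.13.
from decimal import Decimal, ROUND_HALF_UP

def table_argument(x: float) -> float:
    """Return the two-decimal argument used to look up Phi(x) in the table."""
    # ROUND_HALF_UP rounds ties away from zero, which matches
    # "round up the absolute value" for a third decimal of 5.
    return float(Decimal(str(x)).quantize(Decimal("0.01"),
                                          rounding=ROUND_HALF_UP))

assert table_argument(0.0244) == 0.02    # use Phi(0.02), the nearest value
assert table_argument(0.125) == 0.13
assert table_argument(-0.125) == -0.13
```

Going through the decimal module avoids binary floating-point surprises; for instance, Python's built-in round(0.125, 2) returns 0.12 because it rounds ties to even.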

New for this Edition

The 17th edition features Practice Exam 7, a new practice exam. I tried to make this exam less computation-intensive and more conceptual. Students tell me that current exams are more conceptual, giving you unfamiliar contexts in which to apply the principles of this course.

Flashcards

Many students find flashcards a useful tool for learning key formulas and concepts. ASM flashcards, available from the same distributors that sell this manual, contain the formulas and concepts from this manual in a convenient deck of cards. The cards have cross-references, usually by page, to the manual.

¹Actually less than 306. Some questions were mistakenly put on the original list and deleted, and some questions pertaining to topics dropped from the syllabus in Fall 2009 were deleted.


Notes About the Exam

Released Exams

You may wonder what the relative weights are for each of the subjects on the exam. Table 1 lists the number of questions on each of the Exam C/4 topics that are still on the current syllabus.

Table 1: The number of questions in released Exams C/4 on each exam subject*

                                       May  Nov  May  Nov  Nov  Nov  Nov  May  Nov  Nov  May
Syllabus Topic              Lessons   2000 2000 2001 2001 2002 2003 2004 2005 2005 2006 2007   Weight
Severity, Frequency,
  Aggregate Loss              1–19      1    2    1    2    1    1    0    0    0    0    5    15–20%
Empirical Estimation         21–28      4    2    2    4    4    5    4    6    5    9    7    20–25%
Parametric Fitting           30–35      6    6    6    4    7    8   11    6    8    7    6    20–25%**
Testing Fit                  36–40      2    0    3    2    2    2    3    4    4    1    3
Limited Fluctuation
  Credibility                42–44      1    1    0    1    1    2    1    1    1    1    0    25–30%**
Bayesian Credibility         45–50      4    5    5    4    3    5    3    3    3    5    3
Bühlmann Credibility         51–56      2    2    4    5    5    3    4    5    3    3    5
Empirical Bayes              57–58      2    3    1    1    1    1    2    2    3    2    2
Simulation                   60–64      1    1    0    1    1    1    1    3    3    4    3     5–10%
Total                                  23   22   22   24   25   28   29   30   30   32   34

*For the purpose of this table, F03:13 was classified as a probability-Lesson 1 question and F03:30 was classified as a parametric fit question, but neither question was based on the syllabus material when the exam was given.
**The 20–25% weight applies to Parametric Fitting and Testing Fit combined; the 25–30% weight applies to the four credibility topics combined.

To fully appreciate this table, you should be aware of all the syllabus changes since Spring 2000:
• From May 2000 through November 2004, the syllabus included regression and time series, but did not include the severity, frequency, aggregate loss topic. Nevertheless, you see that a small number of questions were asked on that topic, which was on Course 3 at the time. They used to ask some non-syllabus general knowledge questions on the exam. You were expected to answer these based on your knowledge of probability or Course 3 material. You were expected to take exams in order. The parametric part of the course included Cox models. The exam was a 4-hour, 40-question exam.
• Starting with the May 2005 exam, regression and time series were removed. Cubic splines were added. The exam was a 4-hour, 35-question exam. This continued up to the November 2006 exam.
• Starting with the May 2007 exam, cubic splines were removed and the severity, frequency, aggregate loss topic was added, including ruin theory. Also, risk measures and some finance topics were added. However, the one risk measure question on this exam is on a topic no longer on the syllabus. They apparently have been asking one risk measure question per exam since then. The exam was a 4-hour, 40-question exam. This continued up to the May 2009 exam.
• Starting with the November 2009 exam, with the start of CBT, the exam was reduced to 3.5 hours and 35 questions. The following topics were removed: Cox model, ruin theory, finance topics. They switched to the third edition of Loss Models and added fitting (a, b, 1) distributions. Also, they replaced the Hardy Risk Measures study note with the coverage in Loss Models, which is less.


• Starting with the October 2013 exam, minor changes were made to the syllabus. A short passage on extreme value distributions was added, material on mortality table construction was expanded, and special simulation methods were added.

So this table can only be used as a general idea of question distribution. The syllabus is different from what it was when these exams were given. The final column, taken from the syllabus, may be the most reliable guide. A more detailed listing of exam questions and which lesson they correspond to can be found in Appendix D. Please use that list to identify questions no longer on the syllabus. Note that many questions require knowing more than one topic; I classify these based on the latest topic (based on the order in this manual) you need. Thus a question requiring knowledge of Bayesian and Bühlmann credibility would be classified as a Bühlmann credibility problem.

Guessing penalty

There is no guessing penalty on this exam. So fill in every answer—you may be lucky! Leave yourself a couple of seconds to do this. If you have a calculator that can generate random numbers, and some time, you can use a formal method for generating answers; see Example 60D on page 1183. Otherwise, filling in a B for every question you don't know the answer to is just as good.

Calculators

A wide variety of calculators are permitted: the TI-30X (or TI-30Xa, or TI-30X II, battery or solar, or TI-30XS or TI-30XB MultiView), the BA-35 (battery powered or solar), and the BA II Plus (or BA II Plus Professional Edition). You may bring several calculators into the exam. The MultiView calculator is considered the best one, due to its data tables which allow fast statistical calculations. The data table is a very restricted spreadsheet. Despite its limitations, it is useful. I've provided several examples of using the data table of the MultiView calculator to speed up calculations. Another feature of the MultiView is storage of previous calculations. They can be recalled and edited. Other features which may be of use, although I do not use them in the calculator tips provided, are the K constant and the table feature, which allows calculation of a function at selected values or at values in an arithmetic progression. Financial calculations do not occur on this exam; interest is almost never considered. You will not miss the lack of financial functions on the MultiView.

Changes to Syllabus

There have been no changes to the syllabus since October 2013.

Study Schedule

Different students will have different speeds and different constraints, so it's hard to create a study schedule useful for everybody. However, I offer a sample 13-week study schedule, Table 2, as a guide. The last column lists rarely tested materials, so you can skip those if you are behind in your schedule. Italicized sections in this column are, in my opinion, extremely unlikely exam topics.


Table 2: Thirteen Week Study Schedule for Exam C/4

Week  Subject                                                    Lessons   Rarely Tested
  1   Probability basics                                           1–5     1.4, 2.2, 4.1.3
  2   Risk measures and severity                                   6–10    8.4
  3   Frequency and aggregate loss                                11–14    11.2, 13.1
  4   Aggregate loss (continued) and statistics                   15–21    17, 19.2
  5   Empirical estimators                                        22–25    23
  6   Variance of KM, NA estimators, kernel smoothing,
      mortality table construction                                26–28
  7   Method of moments & percentile matching                     30–31
  8   Maximum likelihood                                          32–33
  9   Maximum likelihood (continued) and hypothesis testing       34–40    34.1.3, 35.4, 38
 10   Limited fluctuation and discrete Bayesian credibility       42–45    43
 11   Continuous Bayesian credibility                             46–50    48
 12   Bühlmann credibility                                        51–54    54.2
 13   Bühlmann credibility (continued) and simulation             55–64    56, 57.2.2

Errata

Please report any errors you find. Reports may be sent to the publisher ([email protected]) or directly to me ([email protected]). When reporting errata, please indicate which manual and which edition you are referring to! This manual is the 17th edition of the Exam C/4 manual. An errata list will be posted at errata.aceyourexams.net

Acknowledgements

I wish to thank the Society of Actuaries and the Casualty Actuarial Society for permission to use their old exam questions. These questions are the backbone of this manual. I wish to thank Donald Knuth, the creator of TeX, Leslie Lamport, the creator of LaTeX, and the many package writers and maintainers, for providing a typesetting system which allows such beautiful typesetting of mathematics and figures. I hope you agree, after looking at mathematical material apparently typed with Word (e.g., the Dean study note) that there's no comparison in appearance. I wish to thank the many readers who have sent in errata, or who have reported them anonymously at the Actuarial Discussion Forum. A partial list of students who sent in errata for the previous editions is: Kyle Allen, Casey Anderson, Carter Angell, Jason Ard, Opoku Archampong, April Ayres, Madhuri Bajaj, George Barnidge, Austin Barrington, Brian Basiaga, Michael Baznik, Michael Beck, Aaron Beaudoin, Marc Beaudoin, Aaron Beharelle, Yann Bernard, Shiri Bernstein, Elie Bochner, Karl Boettcher, Batya Bogopulsky, Kirsten Boyd, Andrew Brady, Kelsey Bridges, Ken Burton, Eric Buzby, Anna Buzueva, Emily Byrnes, Joshua Carlsen, Todd Carpino, Michael Castellano, Christi Cavalieri, Aaron Chase, Steve Cheung, Jonathan Choi, Julie Cholet, Albert Chua, Bryn Clarke, Darren Costello, Laura Cremerius, Jessica Culhane, Marco Dattilo, Gordon Davis, William Derech, Connie Di Pierro, Feng Dong, Ryan Dood, Jacob Efron, Jason Elleman, Sean Fakete, Amarya Feinberg, Sterling Felsted, Drew Fendler, Nick Fiechter, Gail Flamenbaum, Matthew Flanagan, Erin Flickinger, John Fries, Cory Fujimoto, Brad Fuxa, Meihua
Gao, Yoram Gilboa, Sean Gingery, Shikha Goel, Lindsey Gohn, Aaron Hendrickson-Gracie, Joseph Gracyalny, Karen Grote, Brian Gugat, Zhoujie Guo, Amanda Hamala, Aaron Hendrickson-Gracie, Thomas Haggerty, Josh Harwood, David Hibbard, Jay Hines, Jennifer Ho, Martin Ho, Dean Guo, ennis Huang, Jonathon Huber, Wallace Hui, Professor Natalia Humphrey, Kenneth Hung, John Hutson, Andrew Ie, Anthony Ippolito, Matthew Iseler, Naqi Jaffery, Dennis Jerry, Merrick Johnson, Nathan Johnson, Jason Jurgill, Michael Kalina, Ethan Kang, Allen Katz, Patrick Kavanagh, Ben Kester, Anand Khare, Cory Kientoff, Geo Kini, Carol Ko, Bradley Koenen, Emily Kozlowski, Boris Krant, Reuvain Krasner, Stephanie Krob, Brian Kum, Takehiro Kumazawa, Brian Lake, Eric Lam, Gary Larson, Shivanie Latchman, Olivier Le Courtois, Charles Lee, Jacob Lee, Seung Lee, York Lee, Justin Lengermann, Theodore Leonard, Aw Yong Chor Leong, David Levy, Luyao Li, Tony Litterer, William Logan, Allison Louie, Sheryn Low, Yitzy Lowy, Grant Luloff, Grover MacTaggart, Sohini Mahapatra, Matthew Malkus, Kandice Marcacci, Grant Martin, Jason Mastrogiacomo, Patrick McCormack, Jacob McDougle, Maria Melguizo, Albert Miao, Jeremy Mills, Andy Moriarty, Daniel Moskala, Greg Moyer, Michael Nasti, Don Neville, Joseph Ng, Raymond Ng, Ryan Nolan, Stephen Nyamapfumba, Adam Okun, Saravuth Olunsiri, Kevin Owens, Gino Pagano, Christopher Palmer, Kong Foo Pang, Kamila Paszek, Tanya Pazitny, Jonathan Peters, James Pilarski, Amanda Popham, Forrest Preston, Andrew Rallis, Claudio Rebelo, Denise Reed, Jeremiah Reinkoester, Adam Rich, Christopher Roberts, Vanessa Robinson, Andrew Roggy, Maria Rutkowski, Heather Samuelson, Megan Scott, Colin Scheriff, Eric Schumann, Simon Schurr, Andy Shapiro, David Sidney, Phil Silverman, Carl Simon, Rajesh Singh, Betty Siu, Stephen Smith, Ian Spafford, Mark Spinozz, Erica Stead, Sebastian Strohmayr, Alison Stroop, David Stulman, Jonathan Szerszen, Susan Szpakowski, Jenny Tam, Todd Tauzer, Amy Thompson, Geoff Tims, David Tong, Mayer Toplan, Dustin Turner, Linh Van, Greg Vesper, Lei Wang, Joan Wei, Philip Welford, Caleb Wetherell, Patrick Wiese, Mendy Wenger, Adam Williams, Garrett Williams, Wilson Wong, Jeff Wood, Thomas Woodard, Serina Wu, Ziyan Xie, Bo Xu, Hoe Yan, Jason Yeung, Xue Ying, Rodrigo Zafra, Aaron Zeigler, Jenny Zhang, Moshe Zucker. I thank Professor Homer White for help with Example 34D.


Part I

Severity, Frequency, and Aggregate Loss


This part is labeled "Severity, Frequency, and Aggregate Loss". Let's define these terms and one other term.
• Severity is the (average) size of a loss. If the average auto liability loss size is 27,000, then 27,000 is the severity.
• Frequency is the (average) number of claims per time period, usually per year. If a group of 100 policyholders submits an average of 5 claims per year, frequency is 0.05.
• Aggregate loss is the total losses paid per time period, usually per year. If a claim of 200 and a claim of 500 are submitted in a year, aggregate losses are 700.
• Pure premium is the (expected) aggregate loss per policyholder per time period, usually per year. If on the average 0.05 claims are submitted per year and each claim averages 10,000, and frequency and severity are independent, then pure premium is (0.05)(10,000) = 500.
In the above definitions, the words "average" and "expected" are in parentheses. The above terms are not that precise; sometimes they refer to a random variable and sometimes they refer to the expected value. It is OK to say pure premium is 500 (as we said above), but it is also OK to speak about the variance of pure premium. You will have to figure out the precise meaning from the context.
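To make the random-variable reading of these terms concrete, here is a minimal Python sketch (my own illustration, not from the manual) that simulates a year of aggregate losses for a block of policies. The Bernoulli claim indicator and exponential severity are assumptions chosen only for the demonstration.

```python
import random

random.seed(42)
FREQUENCY = 0.05        # expected claims per policyholder per year
MEAN_SEVERITY = 10_000  # expected size of one claim

def aggregate_loss_one_year(n_policies):
    """Simulate one year of aggregate losses for n_policies policyholders."""
    total = 0.0
    for _ in range(n_policies):
        # For a small frequency, a Bernoulli claim indicator is a fair stand-in
        # for a claim-count distribution in this toy example.
        if random.random() < FREQUENCY:
            total += random.expovariate(1 / MEAN_SEVERITY)  # exponential severity
    return total

years = [aggregate_loss_one_year(100) for _ in range(10_000)]
print("average aggregate loss:", sum(years) / len(years))       # about 100 x 0.05 x 10,000
print("pure premium estimate:", sum(years) / len(years) / 100)  # about 500
```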


Lesson 1

Basic Probability

Reading: Loss Models Fourth Edition 3.1–3.3

Before you start this lesson . . .

Have you read the preface? You may be familiar with textbooks with long prefaces which assume that you're already familiar with all the material. These prefaces contain reflections of the author on the course, stories about how the book came to be, acknowledgements to the author's family for having patience while the author was ignoring them and working on the book, etc. The preface to this manual is nothing like that! The preface is short and has information you need immediately to use this manual. It will answer questions such as:
• How the hell am I supposed to know all the moments of the gamma distribution? (First example in this lesson)
• The author's solution to his exercise looks wrong! Is there an errata list?
• The author's solution to his exercise looks wrong, and there's nothing in the errata list! What do I do now?
• The author's solution to his exercise looks right. Which of my friends should I thank for that?
• I remember reading someplace in the manual about a Fisher information matrix, but can't remember where it is. What do I do now?
The preface also has some information which you don't need immediately, but will be of interest eventually. For example, "What is the distribution of exam questions by topic?" So please, take 5 minutes of your valuable time to read the preface.

Loss Models begins with a review of probability. This lesson is a very brief summary of probability. If you aren’t familiar with probability already, and find this summary inadequate, you can read chapters 2 and 3 in Loss Models. If that isn’t enough, you’ll have to study a probability textbook. (Did you ever take Exam P/Exam 1?)

1.1 Functions and moments

The cumulative distribution function of a random variable X, usually just called the distribution function, is the probability F(x) = Pr(X ≤ x). It defines X, and is right-continuous, meaning lim_{h→0} F(x + h) = F(x) for h positive.

[Figure 1.1: Relationships between probability functions. The schematic shows how to move between F, S, f, H, and h: S = 1 − F and F = 1 − S; f = dF/dx and F = ∫ f; H = −ln S and S = exp(−H); h = dH/dx and H = ∫ h. Also: f(x) = −dS(x)/dx and h(x) = f(x)/S(x).]

Some random variables are discrete (there are isolated points x_i at which Pr(X = x_i) is nonzero) and some are continuous (meaning F(x) is continuous, and differentiable except at a countable number of points). Some are mixed—they are continuous except at a countable number of points. Some probability functions are:
• S(x) is the survival function, the complement of F(x), the probability of surviving longer than x: Pr(X > x).
• For a continuous random variable, f(x) is the probability density function: f(x) = dF(x)/dx.
• For a discrete random variable, p(x) is the probability mass function: p(x) = Pr(X = x). Often, f(x) satisfies the same relations for continuous variables as p(x) does for discrete variables.
• h(x) is the hazard rate function: h(x) = f(x)/S(x) = −d ln S(x)/dx. In International Actuarial Notation, µ_x is used for this. h(x) is like a conditional density function, the conditional density given survival to time x.
• H(x) is the cumulative hazard rate function:
H(x) = ∫_{−∞}^{x} h(t) dt = −ln S(x)

A schematic relating these probability functions is shown in Figure 1.1. Why do we bother differentiating F to obtain f ? Because the density is needed for calculating moments. Moments of a random variable measure its center and dispersion. The expected value of X is defined by

Z E[X] 

∞ −∞

x f ( x ) dx

and more generally the expected value of a function of a random variable is defined by E[g ( X ) ]  C/4 Study Manual—17th edition Copyright ©2014 ASM

Z

∞ −∞

g ( x ) f ( x ) dx

1.1. FUNCTIONS AND MOMENTS

5

For discrete variables, the integrals are replaced with sums. The n th raw moment of X is defined as µ0n  E[X n ]. µ  µ01 is the mean. The n th central moment of X (n , 1) is defined as µ n  E[ ( X − µ ) n ].1 Usually n is a positive integer, but it need not be. Expectation is linear, so the central moments can be calculated from the raw moments by binomial expansion. In the binomial expansion, the last two terms always merge, so we have µ2  µ02 − µ2 µ3 

µ4 

µ03 µ04





3µ02 µ 4µ03 µ

+ 2µ +

instead of µ02 − 2µ01 µ + µ2

3

instead of

6µ02 µ2

− 3µ

4

instead of

µ03 µ04





3µ02 µ 4µ03 µ

+

+

3µ01 µ2 6µ02 µ2

(1.1) −µ −

3

4µ01 µ3

(1.2) +µ

4

Special functions of moments are: • The variance is Var ( X )  µ2 , and is denoted by σ 2 . • The standard deviation σ is the positive square root of the variance. • The skewness is γ1  µ3 /σ 3 . • The kurtosis is γ2  µ4 /σ 4 . • The coefficient of variation is σ/µ. Skewness measures how weighted a distribution is. A distribution with more weight on higher numbers has positive skewness and a distribution with more weight on lower numbers has negative skewness. A normal distribution has skewness of 0. Kurtosis measures how flat a distribution is. A distribution with more values further away from the mean has higher kurtosis. A normal distribution has skewness of 3. Skewness, kurtosis, and coefficient of variation are dimensionless. This means that if a random variable is multiplied by a positive constant, these three quantities are unchanged. We will discuss important things you should know about variance in Lesson 3. For the meantime, let’s repeat formula (1.1) using different notation, since it’s so important: Var ( X )  E[X 2 ] − E[X]2 Many times this is the best way to calculate variance. For two random variables X and Y: • The covariance is defined by Cov ( X, Y )  E ( X − µ X )( Y − µY ) .

f

g

• The correlation coefficient is defined by ρ XY  Cov ( X, Y ) / ( σX σY ) . As with the variance, another formula for covariance is Cov ( X, Y )  E[XY] − E[X] E[Y] For independent random variables, Cov ( X, Y )  0. Also, the covariance of a variable with itself is its variance: Cov ( X, X )  Var ( X ) . When there are two random variables, one can extract single variable distributions by summing (discrete) or integrating (continuous) over the other. These single-variable distributions are known as marginal distributions. 1This µ n has no connection to µ x , the force of mortality, part of International Actuarial Notation used in the study of life contingencies. C/4 Study Manual—17th edition Copyright ©2014 ASM

1. BASIC PROBABILITY

6

A 100p th percentile is a number π p such that F ( π p ) ≥ p and F ( π−p ) ≤ p. If F is strictly increasing, it is the unique point at which F ( π p )  p. A median is a 50th percentile. We’ll say more about percentiles later in this lesson. A mode is x such that f ( x ) (or Pr ( X  x ) for a discrete distribution) is maximized. The moment generating function is MX ( t )  E[e tX ] and the probability generating function is PX ( t )  X E[t ]. One useful thing proved with moment generating functions is that a sum of n independent exponential random variables with mean θ is a gamma random variable with parameters α  n and θ. We will discuss generating functions later in this lesson. Example 1A For the gamma distribution, as defined in the Loss Models Appendix:2 1. Calculate the coefficient of variation. 2. Calculate the skewness. 3. Calculate the limit of the kurtosis as α → ∞. 4. If X has a gamma distribution with α  5 and θ  0.1, calculate E[e X ]. Answer:

1. The appendix indicates that E[X k ]  ( α + k − 1)( α + k − 2) · · · ( α ) θ k . Hence E[X 2 ]  ( α + 1) αθ2

E[X]2  α2 θ 2 σ2  αθ2 So the coefficient of variation is



αθ αθ



1 √ α

.

2. Note that all terms in the numerator and denominator have a factor of θ 3 , which cancels and therefore may be ignored. So without loss of generality we will set θ  1. The numerator of the skewness fraction is E[X 3 ] − 3 E[X 2 ]µ + 2µ3  ( α + 2)( α + 1) α − 3 ( α + 1) α 2 + 2α3  α3 + 3α 2 + 2α − 3α 3 − 3α 2 + 2α3

 2α The denominator is α 3/2 , so the skewness is

2 √ α

. This goes to 0 as α goes to ∞.

3. Once again, θ may be ignored since θ 4 appears in both numerator and denominator. Setting θ  1, the variance is α and the denominator of the kurtosis fraction is σ4  α2 . The numerator is E[X 4 ] − 4 E[X 3 ]µ + 6 E[X 2 ]µ2 − 3µ4  ( α + 3)( α + 2)( α + 1)( α ) − 4 ( α + 2)( α + 1) α 2 + 6 ( α + 1) α3 − 3α 4 We only need the highest degree non-zero term, since this will dominate as α → ∞. The coefficient of α 4 (1 − 4 + 6 − 3) is zero, as is the coefficient of α3 (6 − 12 + 6), leaving α2 , whose coefficient is 11 − 8  3. The denominator is α2 , so the kurtosis goes to 3 as α → ∞.

4. This is the moment generating function of the gamma distribution evaluated at 1, M (1) , which you can look up in the appendix: M (1)  1 − θ (1)



 −α

 0.9−5

2Have you downloaded the tables from the SOA website yet? If not, please download them now, so that you will understand the solution to this example. See page xiv for instructions on where to find them. C/4 Study Manual—17th edition Copyright ©2014 ASM

1.2. PERCENTILES

7

However, we’ll carry out the calculation directly to illustrate expected values. X

E[e ] 



Z 0

Z  



Z0 ∞ 0

10  9 

e x f ( x ) dx ex x 4 e −10x dx Γ (5) 0.15 105 4 −9x x e dx Γ (5)

!5 Z

10 9

∞ 0

95 4 −9x x e dx Γ (5)

!5

because the final integral is the integral of a gamma density with α  5 and θ  integrate to 1.

1 9,

which must



Example 1B For an auto liability coverage, claim size follows a two-parameter Pareto distribution with parameters θ  10,000 and α. Median claim size is 5000. Determine the probability of a claim being greater than 25,000. Answer: By definition of median, F (5000)  0.5. But F ( x )  1 − 1−

10,000 10,000 + 5000



θ α θ+x ,

so we have

 0.5

α ln 23  ln 0.5

α  1.7096

The probability of a claim being greater than 25,000 is 1 − F (25,000) . 10,000 1 − F (25,000)  10,000 + 25,000

1.2

! 1.7096

 0.1175



Percentiles

Percentiles will play many roles in this course: 1. They will be used to fit curves using percentile matching (Lesson 31). 2. They play a role in the p–p plot (lesson 36); in fact, both p’s in p–p probably stand for percentile. 3. The inversion method of simulation selects random percentiles of a distribution as random numbers (Lesson 60). 4. The Value-at-Risk risk measure (Lesson 8) is a glorified percentile. Many times, instead of saying “the 100p th percentile” (and having to remember to multiply by 100), we prefer to say “the p th quantile” , which means the same thing. Percentiles are essentially an inverse function. Roughly speaking, if F ( x ) is the cumulative distribution for X, a q th quantile is a number x such that F ( x )  q, or in other words it is F −1 ( q ) . Here’s the precise definition again: C/4 Study Manual—17th edition Copyright ©2014 ASM

1. BASIC PROBABILITY

8

A 100p th percentile of a random variable X is a number π p satisfying these two properties: 1. Pr ( X ≤ π p ) ≥ p

2. Pr ( X < π p ) ≤ p

If the cumulative distribution function F is continuous and strictly increasing, it is the unique point at which Pr ( X ≤ π p )  p. In other words, the 100p th percentile is the x for which the graph of the cumulative distribution function F ( x ) equals or crosses the vertical level p. Common continuous distributions have a cumulative distribution function that is strictly increasing except when equal to 0 or 1. For these functions, the quantile (other than the 0 and 100 percentiles) is the inverse function, which is one-to-one. On the other hand, for a discrete distribution, or any distribution with point masses, the inverse may not be defined or well-defined. At points where the inverse is not defined, a single number will be a q th quantile for many q’s; at points where the inverse is not well-defined, many numbers will qualify as the q th quantile. Consider the following example: Example 1C A random variable X has the following distribution: F ( x )  0.2x Pr ( X  2)  0.35

0≤x≤1

Pr ( X  3)  0.35 Pr ( X  4)  0.10 Pr ( X  x )  0

otherwise

Calculate the 15th , 50th , and 90th percentiles of X. Answer: A graph of the distribution function makes it easier to understand what is going on. On a graph, the inverse function consists of starting on the y-axis, going to the right until you hit the function, then going straight down. A graph of F ( x ) is shown in Figure 1.2. The 15th percentile has a unique well-defined inverse, since F ( x ) is continuous in the area where it is equal to 0.15. The inverse is 0.75; F (0.75)  0.15. F (1)  0.2 and F (2)  0.55; there is no x such that F ( x )  0.5. However, the arrow from 0.5 hits a wall at x  2, so 2 is the unique 50th percentile. We can verify that 2 is the unique 50th percentile according to the definition given above: Pr ( X < 2) is no greater than 0.5 (it is 0.2), and Pr ( X ≤ 2) is at least 0.5 (it is 0.55). 2 is also every percentile from the 20th to the 55th . The arrow from 0.9 doesn’t hit a wall; it hits a horizontal line going from 3 to 4. There is no unique 90th percentile; every number from 3 to 4 is a 90th percentile.  For some purposes, it is desirable to have a smoothed percentile. One method of smoothing percentiles will be discussed in Lesson 31.

1.3

Conditional probability and expectation

The probability of event A given B, assuming Pr ( B ) , 0, is Pr ( A | B )  C/4 Study Manual—17th edition Copyright ©2014 ASM

Pr ( A ∩ B ) Pr ( B )

1.3. CONDITIONAL PROBABILITY AND EXPECTATION

9

F (x ) 1

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

0

2

1

3

4

x

Figure 1.2: Plot of F ( x ) in example 1C, illustrating 15th , 50th , and 90th percentiles.

where Pr ( A ∩ B ) is the probability of both A and B occurring. A corresponding definition for continuous distributions uses the density function f instead of Pr: fX ( x | y ) 

f ( x, y ) f ( y)

where f ( y )  f ( x, y ) dx , 0. Two important theorems are Bayes’ Theorem and the Law of Total Probability: Theorem 1.1 (Bayes’ Theorem)

R

Pr ( A | B ) 

Pr ( B | A ) Pr ( A ) Pr ( B )

Correspondingly for continuous distributions fX ( x | y ) 

fY ( y | x ) fX ( x ) fY ( y )

Theorem 1.2 (Law of Total Probability) If B i is a set of exhaustive (in other words, Pr (∪i B i )  1) and mutually exclusive (in other words Pr ( B i ∩ B j )  0 for i , j) events, then for any event A, Pr ( A ) 

X i

Pr ( A ∩ B i ) 

X i

Pr ( B i ) Pr ( A | B i )

Correspondingly for continuous distributions, Pr ( A ) 

Z

Pr ( A | x ) f ( x ) dx

Expected values can be factored through conditions too: Conditional Mean Formula EX [X]  EY EX [X | Y]

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

(1.3)

1. BASIC PROBABILITY

10

This formula is one of the double expectation formulas.3 More generally for any function g EX [g ( X ) ]  EY EX [g ( X ) | Y]

f

g

Here are examples of this important theorem. Example 1D There are two types of actuarial students, bright and not-so-bright. For each exam, the probability that a bright student passes it is 80%, and the probability that a not-so-bright student passes it is 40%. All students start with Exam 1 and take the exams in sequence, and drop out as soon as they fail one exam. An equal number of bright and not-so-bright students take Exam 1. Determine the probability that a randomly selected student taking Exam 3 will pass. Answer: A common wrong answer to this question is 0.5 (0.8) + 0.5 (0.4)  0.6. This is an incorrect application of the Law of Total Probability. The probability that a student taking Exam 3 is bright is more than 0.5, because of the elimination of the earlier exams. A correct way to calculate the probability is to first calculate the probability that a student is taking Exam 3 given the two types of students. Let I1 be the event of being bright initially (before taking Exam 1) and I2 the event of not being bright initially. Let E be the event of taking Exam 3. Then by Bayes Theorem and the Law of Total Probability, Pr (E | I1 ) Pr ( I1 ) Pr (E ) Pr (E )  Pr (E | I1 ) Pr ( I1 ) + Pr (E | I2 ) Pr ( I2 )

Pr ( I1 | E ) 

Now, the probability that one takes Exam 3 if bright is the probability of passing the first two exams, or 0.82  0.64. If not-so-bright, the probability is 0.42  0.16. So we have Pr (E )  0.64 (0.5) + 0.16 (0.5)  0.4 (0.64)(0.5) Pr ( I1 | E )   0.8 0.4 and Pr ( I2 | E )  1 − 0.8  0.2 (or you could go through the above derivation with I2 instead of I1 ). Now we’re ready to apply the Law of Total Probability to the conditional distributions given E to answer the question. Let P be the event of passing Exam 3. Then Pr ( P | E )  Pr ( P | I1 &E ) Pr ( I1 | E ) + Pr ( P | I2 &E ) Pr ( I2 | E )  (0.8)(0.8) + (0.4)(0.2)  0.72



Now let’s do a continuous example. Example 1E Claim sizes follow an exponential distribution with mean θ. The parameter θ varies by insured. Over all insureds, θ has a distribution with the following density function: f (θ) 

1 θ2
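Example 1D is also easy to verify by simulation. The sketch below is my own illustration (the seed and population size are arbitrary); it tracks which simulated students reach Exam 3 and how many of those pass:

```python
import random

random.seed(7)
pass_prob = {"bright": 0.8, "dim": 0.4}
reached, passed = 0, 0
for _ in range(1_000_000):
    kind = random.choice(["bright", "dim"])          # equal numbers take Exam 1
    p = pass_prob[kind]
    if random.random() < p and random.random() < p:  # passed Exams 1 and 2
        reached += 1
        if random.random() < p:                      # result on Exam 3
            passed += 1
print(passed / reached)  # about 0.72, not the naive 0.60
```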

1≤θ 0.5)  e dθ θ2 1 This integral is calculated using the substitution u  −0.5/θ. If you need a refresher on how to carry out the substitution, see the sidebar. ∞

Z 1

e

−0.5/θ



1 dθ  2e −0.5/θ 2 θ 1 !

 2 1 − e −0.5  2 (1 − 0.606531)  0.786939





Incidentally, the distribution of θ is a single-parameter Pareto distribution with parameters θ  1 (a different θ) and α  1. Recognizing distributions is helpful, since then you can look up the moments in the table if you need them, rather than calculating them.  The unconditional distribution of this example is a continuous mixture. Section 4.1 will discuss mixtures.

1.4

Moment and probability generating functions

Usually, a generating function for a usually infinite sequence {a0 , a1 , a2 , . . .} is f ( z ) of the form f (z ) 

∞ X n0

C/4 Study Manual—17th edition Copyright ©2014 ASM

an z n

1. BASIC PROBABILITY

12

The idea of a generating function is that if you differentiate this function n times, divide by n!, and evaluate at 0, you will recover a n : f ( n ) (0)  an n! where f ( n ) indicates the n th derivative. With this in mind, let’s discuss moment and probability generating functions. The moment generating function (MGF) is defined by MX ( t )  E[e tX ] It has the property that M ( n ) (0) , the n th derivative evaluated at 0, is the n th raw moment. Unlike for other generating functions, the n th derivative is already the n th moment and is not divided by n! to get the n th moment. Another useful property of the moment generating function is, if X is the sum of independent random variables, its moment generating function is the product of the moment generating functions of those variables. It’s usually difficult to add random variables, but if you multiply their moment generating functions and recognize the result, that tells you the distribution of the sum of the random variables. The probability generating function (PGF) is defined by P ( z )  E[z X ]  M (ln z ) The textbook uses it for discrete distributions, and the tables you get at the exam list it (using the notation P ( z ) ) for those distributions. It lives up to its name: the n th derivative at 0, divided by n!, is the probability that the random variable equals n, or P ( n ) (0) pn  n! where P ( n ) denotes the n th derivative. Another useful property of the pgf is that the n th derivative of the pgf evaluated at 1 is the n th factorial moment. The n th factorial moment is µ ( n )  E[X ( X − 1) · · · ( X − n + 1) ]. Examples of factorial moments calculated with the pgf are P 0 (1)  E[X] P 00 (1)  E[X ( X − 1) ]

P 000 (1)  E[X ( X − 1)( X − 2) ] and in general, using f ( n ) for the n th derivative of f , and the textbook’s notation µ ( n ) for the n th factorial moment P ( n ) (1)  µ ( n ) (1.4) With some algebra you can derive the central or raw moments from the factorial moments. For example, since µ (2)  E[X ( X − 1) ]  E[X 2 ] − E[X], it follows that µ02  µ (2) + µ.4 Like the moment generating function, if X is the sum of independent random variables, its probability generating function is a product of the probability generating functions of those variables. Example 1F Calculate the third raw moment of a negative binomial distribution with parameters r and β.

4The textbook only mentions that P 0 (1)  E[X] and P 00 (1)  E[X ( X − 1) ], but not the generalization to higher derivatives of the pgf. The textbook mentions factorial moments only in the appendix that has the distribution tables. C/4 Study Manual—17th edition Copyright ©2014 ASM

1.5. THE EMPIRICAL DISTRIBUTION

13

Answer: The tables give us the mean, variance, and pgf, and that is where the next line’s expansion of the pgf is coming from. P ( z )  1 − β ( z − 1)



 −r

P 000 ( z )  (−β ) 3 (−r ) − ( r + 1)





P 000 (1)  β3 r ( r + 1)( r + 2)

Also,

− ( r + 2)



1 − β ( z − 1)

 −( r+3)

P 000 (1)  E[X ( X − 1)( X − 2) ]  E[X 3 ] − 3 E[X 2 ] + 2 E[X]

and E[X]  rβ while E[X 2 ]  Var ( X ) + E[X]2  rβ (1 + β ) + r 2 β2 , so

E[X 3 ]  r ( r + 1)( r + 2) β3 + 3 ( rβ + rβ 2 + r 2 β 2 ) − 2rβ

1.5



The empirical distribution

There are many continuous and discrete probability distributions in the tables you get at the exam. However, every time you have a sample, you can create a probability distribution based on it. Given a sample x1 , . . . , x n , the empirical distribution is the probability distribution assigning a probability of n1 to each item in the sample. It is a discrete distribution.

Example 1G You are given the sample 1, 1, 2, 3, 5. Calculate: 1. The empirical mean. 2. The empirical variance. 3. The empirical skewness. 4. The empirical 80th percentile.

5. The empirical probability generating function. Answer: The empirical distribution assigns a probability of 1/5 to each point, so we have x Pr ( X  x )

1 0.4

2 0.2

3 0.2

5 0.2

1. The mean is 0.4 (1) + 0.2 (2) + 0.2 (3) + 0.2 (5)  2.4 . 2. The variance is σ 2  0.4 (1 − 2.4) 2 + 0.2 (2 − 2.4) 2 + 0.2 (3 − 2.4) 2 + 0.2 (5 − 2.4) 2  2.24 Alternatively, you could calculate the second raw moment and subtract the square of the mean: µ02  0.4 (12 ) + 0.2 (22 ) + 0.2 (32 ) + 0.2 (52 )  8 σ 2  8 − 2.42  2.24.

C/4 Study Manual—17th edition Copyright ©2014 ASM

1. BASIC PROBABILITY

14

3. The raw third moment is µ03  0.4 (13 ) + 0.2 (23 ) + 0.2 (33 ) + 0.2 (53 )  32.4 The coefficient of skewness is γ1 

32.4 − 3 (8)(2.4) + 2 (2.43 )  0.730196 2.243/2

4. Any number x such that Pr ( X < x ) ≤ 0.8 and Pr ( X ≤ x ) ≥ 0.8 is an 80th percentile. This is true for 3 ≤ x ≤ 5. In fact, the graph of the distribution is horizontal between 3 and 5. So the set of 80th percentiles is {x : 3 ≤ x ≤ 5} . 5. P ( z )  E[z x ]  0.4z + 0.2z 2 + 0.2z 3 + 0.2z 5



Exercises 1.1.

The random variable X has a uniform distribution on [0, 1].

Let h ( x ) be its hazard rate function. Calculate h (0.75) . 1.2.

For a random variable X you are given that

(i) The mean is 4. (ii) The variance is 2. (iii) The raw third moment is 3. Determine the skewness of X. 1.3.

A random variable X has a gamma distribution with parameters α  2, θ  100.

Determine the kurtosis of X. 1.4.

[4B-S93:34] (1 point) Claim severity has the following distribution: Claim Size

Probability

100 200 300 400 500

0.05 0.20 0.50 0.20 0.05

Determine the distribution’s skewness. (A) −0.25 (B) 0 (E) Cannot be determined

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.15

(D) 0.35

Exercises continue on the next page . . .

EXERCISES FOR LESSON 1

15

Table 1.1: Summary of Probability Concepts

• Median is 50th percentile; n th quartile is 25n th percentile.

Probability Functions F ( x )  Pr ( X ≤ x )

S (x )  1 − F (x ) dF ( x ) f (x )  dx H ( x )  − ln S ( x )

• Mode is x which maximizes f ( x ) .

h (x ) 



• MX( n ) (0)  E[X n ], where M ( n ) is the n th derivative

f (x ) dH ( x )  dx S (x )

Expected value

E[g ( X ) ] 



n th raw moment µ0n  E[X n ]

−∞

g ( x ) f ( x ) dx

n th

µ n  E[ ( X − µ ) ]

Variance

σ2  E[ ( X − µ ) 2 ]  E[X 2 ] − µ2

Skewness

γ1 

central moment

Kurtosis Moment generating function Probability generating function

 Pr ( X  n )

• Bayes’ Theorem: Pr ( B | A ) Pr ( A ) Pr ( B ) fY ( y | x ) fX ( x ) fX ( x | y )  fY ( y )

Pr ( A | B ) 

n

µ3 µ03 − 3µ02 µ + 2µ3  σ3 σ3 µ04 − 4µ03 µ

+ 6µ02 µ2 − 3µ4 µ4  σ4 σ4 tX MX ( t )  E[e ] γ2 

• Law of Total Probability: If B i is a set of exhaustive (in other words, Pr (∪i B i )  1) and mutually exclusive (in other words Pr ( B i ∩ B j )  0 for i , j) events, then for any event A, Pr ( A ) 

X

Pr ( A∩B i ) 

i

P ( z )  E[z X ]

• Standard deviation (σ) is positive square root of variance • Coefficient of variation is σ/µ. •

n!

• PX( n ) (1) is the n th factorial moment of X.

Functions of random variables

Z

PX( n ) (0)

100p th

percentile π is any point satisfying ≤ p and F ( π ) ≥ p. If F is continuous, it is the unique point satisfying F ( π )  p. F ( π− )

C/4 Study Manual—17th edition Copyright ©2014 ASM

X i

Pr ( B i ) Pr ( A | B i )

Correspondingly for continuous distributions, Pr ( A ) 

Z

Pr ( A | x ) f ( x ) dx

• Conditional Expectation Formula: EX [X]  EY EX [X | Y]

f

g

(1.3)

Exercises continue on the next page . . .

1. BASIC PROBABILITY

16

1.5. [4B-F98:27] (2 points) Determine the skewness of a gamma distribution with a coefficient of variation of 1. Hint: The skewness of a distribution is defined to be the third central moment divided by the cube of the standard deviation. (A) 0

(B) 1

(C) 2

(D) 4

(E) 6

1.6. You are given the following joint distribution of two random variables X and Y:

( x, y )

Pr ( X, Y )  ( x, y )

(0,0) (0,1) (0,2)

0.45 0.10 0.05





( x, y )

Pr ( X, Y )  ( x, y )

(1,0) (1,1) (2,1)

0.20 0.15 0.05





Calculate the correlation of X and Y. 1.7. You are given the following joint distribution of two random variables X and Y:

( x, y )

Pr ( X, Y )  ( x, y )

(1,1) (2,1)

0.32 0.24





( x, y )

Pr ( X, Y )  ( x, y )

(1,2) (2,3)

0.22 0.10





( x, y )

Pr ( X, Y )  ( x, y )

(1,3)

0.12





Calculate the third central moment of X. 1.8. [4B-S95:28] (2 points) You are given the following: •

For any random variable X with finite first three moments, the skewness of the distribution of X is denoted Sk ( X ) .



X and Y are independent, identically distributed random variables with mean  0 and finite second and third moments. Which of the following statements must be true?

1.

2 Sk ( X )  Sk (2X )

2.

− Sk ( Y )  Sk (−Y )

3.

| Sk ( X ) | ≥ | Sk ( X + Y ) |

(A) 2 (B) 3 (E) None of A, B, C, or D

(C) 1,2

(D) 2,3

1.9. [4B-S97:21] (2 points) You are given the following: •

Both the mean and the coefficient of variation of a particular distribution are 2.



The third moment of this distribution about the origin is 136. Determine the skewness of this distribution.

Hint: The skewness of a distribution is defined to be the third central moment divided by the cube of the standard deviation. (A) 1/4

(B) 1/2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1

(D) 4

(E) 17

Exercises continue on the next page . . .

EXERCISES FOR LESSON 1

17

1.10. [4-S01:3] You are given the following times of first claim for five randomly selected auto insurance policies observed from time t  0: 1

2

3

4

5

Calculate the kurtosis of this sample. (A) 0.0

(B) 0.5

(C) 1.7

(D) 3.4

(E) 6.8

[4B-S97:24] (2 points) The random variable X has the density function

1.11.

f (x ) 

4x , 0 < x < ∞. (1 + x 2 ) 3

Determine the mode of X. (A) (B) (C) (D) (E) 1.12.

0 Greater than 0, but less than 0.25 At least 0.25, but less than 0.50 At least 0.50, but less than 0.75 At least 0.75 [3-F01:37] For watches produced by a certain manufacturer:

(i) Lifetimes follow a single-parameter Pareto distribution with α > 1 and θ  4. (ii) The expected lifetime of a watch is 8 years. Calculate the probability that the lifetime of a watch is at least 6 years. (A) 0.44 1.13.

(B) 0.50

(C) 0.56

(D) 0.61

(E) 0.67

[4B-F99:29] (2 points) You are given the following:



A is a random variable with mean 5 and coefficient of variation 1.



B is a random variable with mean 5 and coefficient of variation 1.



C is a random variable with mean 20 and coefficient of variation 1/2.



A, B, and C are independent.



X  A + B.



Y  A + C.

Determine the correlation coefficient between X and Y. √ √ (A) −2/ 10 (B) −1/ 10 (C) 0

C/4 Study Manual—17th edition Copyright ©2014 ASM

√ (D) 1/ 10

√ (E) 2/ 10

Exercises continue on the next page . . .

1. BASIC PROBABILITY

18

1.14.

[CAS3-F03:17] Losses have an Inverse Exponential distribution. The mode is 10,000.

Calculate the median. (A) (B) (C) (D) (E) 1.15.

Less than 10,000 At least 10,000, but less than 15,000 At least 15,000, but less than 20,000 At least 20,000, but less than 25,000 At least 25,000 [CAS3-F03:19] For a loss distribution where x ≥ 2, you are given:

(i) The hazard rate function: h ( x )  z 2 /2x, for x ≥ 2 (ii) A value of the distribution function: F (5)  0.84 Calculate z. (A) 2 1.16.

(B) 3

(C) 4

(D) 5

(E) 6

A Pareto distribution has parameters α  4 and θ  1.

Determine its skewness. (A) (B) (C) (D) (E)

Less than 7.0 At least 7.0, but less than 7.5 At least 7.5, but less than 8.0 At least 8.0, but less than 8.5 At least 8.5

1.17. [CAS3-S04:28] A pizza delivery company has purchased an automobile liability policy for its delivery drivers from the same insurance company for the past five years. The number of claims filed by the pizza delivery company as the result of at-fault accidents caused by its drivers is shown below: Year 2002 2001 2000 1999 1998

Claims 4 1 3 2 15

Calculate the skewness of the empirical distribution of the number of claims per year. (A) (B) (C) (D) (E)

Less than 0.50 At least 0.50, but less than 0.75 At least 0.75, but less than 1.00 At least 1.00, but less than 1.25 At least 1.25

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 1

19

1.18. [CAS3-F04:28] A large retailer of personal computers issues a warranty contract with each computer that it sells. The warranty covers any cost to repair or replace a defective computer within the first 30 days of purchase. 40% of all claims are easily resolved with minor technical help and do not involve any cost to replace or repair. If a claim involves some cost to replace or repair, the claim size is distributed as a Weibull with parameters τ  1/2 and θ  30. Which of the following statements are true? 1. 2. 3.

The expected cost of a claim is $60. The survival function at $60 is 0.243. The hazard rate at $60 is 0.012.

(A) 1 only.

(B) 2 only.

(C) 3 only.

(D) 1 and 2 only.

(E) 2 and 3 only.

You are given for the random variable X:

1.19.

(i) E[X]  3 (ii) Var ( X )  100 (iii) E[X 3 ]  30 Calculate the skewness of X. (A) (B) (C) (D) (E)

Less than −1 At least −1, but less than −0.5 At least −0.5, but less than 0 At least 0, but less than 0.5 At least 0.5 The variable X follows a normal distribution with mean 15 and variance 100.

1.20.

Calculate the fourth central moment of X. You are given the following:

1.21. •

X is a random variable with probability density function f (x ) 



E[X]  7500.



E[X 2 ] 75,000,000.



m is the median of X.

α β

!

β x

! α+1

x ≥ β, α > 0, β > 0

Determine the value of f ( m ) . (A) (B) (C) (D) (E)

Less than 0.00020 At least 0.00020, but less than 0.00025 At least 0.00025, but less than 0.00030 At least 0.00030, but less than 0.00035 At least 0.00035

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

1. BASIC PROBABILITY

20

[4-F00:32] You are given the following for a sample of five observations from a bivariate distribution:

1.22. (i)

x

y

1 2 4 5 6

4 2 3 6 4

x¯  3.6, y¯  3.8.

(ii)

A is the covariance of the empirical distribution Fe as defined by these five observations. B is the maximum possible covariance of an empirical distribution with identical marginal distributions to Fe . Determine B − A. (A) 0.9

(B) 1.0

(C) 1.1

(D) 1.2

(E) 1.3

1.23. [CAS3-F04:24] A pharmaceutical company must decide how many experiments to run in order to maximize its profits. •

The company will receive a grant of $1 million if one or more of its experiments is successful.



Each experiment costs $2,900.



Each experiment has a 2% probability of success, independent of the other experiments.



All experiments run simultaneously.



Fixed expenses are $500,000.



Ignore investment income. The company performs the number of experiments that maximizes its expected profit. Determine the company’s expected profit before it starts the experiments.

(A) 77,818 1.24. 800.

(B) 77,829

(C) 77,840

(D) 77,851

(E) 77,862

Claim size for an insurance coverage follows a lognormal distribution with mean 1000 and median

Determine the probability that a claim will be greater than 1200. 1.25. Claim sizes for Kevin follow an exponential distribution with mean 6. Claim sizes for Kira follow an exponential distribution with mean 12. Claim sizes for Kevin and Kira are independent. Kevin and Kira submit one claim apiece. Calculate the probability that the sum of the two claims is greater than 20. Additional released exam questions: CAS3-S06:25, CAS3-F06:25

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 1

21

Solutions 1.1. For a uniform distribution, F ( x )  x for 0 ≤ x ≤ 1. We then calculate S, f , and then h  f /S: S (x )  1 − F (x )  1 − x dF f (x )  1 dx f (x ) 1 h (x )   S (x ) 1 − x 1 h (0.75)   4 1 − 0.75

1.2. We will use formula (1.2) for the numerator, µ3 . µ4

µ02  2 + 42  18

µ3  3 − 3 (18)(4) + 2 (43 )  −85 √ σ3  2 2 −85 γ1  √  −30.052 2 2 1.3. Using the Loss Models tables, we have E[X k ]  θ k ( α + k − 1) · · · α. So µ  200

σ2  60000 − 2002  20000

µ02  60000

µ03  24 · 106

µ04  120 · 108

µ4  108 120 − 4 (24)(2) + 6 (6)(22 ) − 3 (24 )  24 · 108



γ2 



24 · 108  6 4 · 108

1.4. The distribution is symmetric, so its skewness is 0 . (B) If this is not obvious to you, calculate the mean, which is 300. Then note that µ3  0.05 (100 − 300) 3 + 0.20 (200 − 300) 3 + 0.20 (400 − 300) 3 + 0.05 (500 − 300) 3  0

so the coefficient of skewness, which is µ3 divided by σ3 , is 0. 1.5. σ2  αθ2

µ  αθ σ 1 1  √ ⇒α1 µ α 3 θ (1)(2)(3) − 3θ 3 (1)(2) + 2θ 3 (13 ) γ1    3/2 (1)( θ2 ) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

6−6+2  2 1

(C)

1. BASIC PROBABILITY

22

1.6. For the covariance, use Cov ( X, Y )  E[XY] − E[X] E[Y]. E[X]  (0.45 + 0.10 + 0.05)(0) + (0.20 + 0.15)(1) + 0.05 (2)  0.45 E[Y]  (0.45 + 0.20)(0) + (0.10 + 0.15 + 0.05)(1) + 0.05 (2)  0.4 E[XY]  (0.45 + 0.20 + 0.10 + 0.05)(0) + (0.15)(1) + 0.05 (2)  0.25 Cov ( X, Y )  0.25 − (0.45)(0.4)  0.07 Calculate the raw second moments, and then the variances, of X and Y. E[X 2 ]  (0.20 + 0.15)(12 ) + 0.05 (22 )  0.55 E[Y 2 ]  (0.10 + 0.15 + 0.05)(12 ) + 0.05 (22 )  0.5 0.07 ρ XY  √  0.2036 (0.3475)(0.34)

Var ( X )  0.55 − 0.452  0.3475 Var ( Y )  0.5 − 0.42  0.34

1.7. Ignore Y and use the marginal distribution of X. From the given data, p1  0.32 + 0.22 + 0.12  0.66 and p2  0.34. So µ  E[X]  0.66 (1) + 0.34 (2)  1.34 and E[ ( X − µ ) 3 ]  0.66 (1 − 1.34) 3 + 0.34 (2 − 1.34) 3  0.071808 . 1.8. Skewness is dimensionless; doubling a random variable has no effect on skewness, since it multiplies the numerator and denominator by 23  8. So 1 is false. Negating a random variable negates the numerator, without affecting the denominator since σ is always positive, so 2 is true. One would expect statement 3 to be true, since as more identical random variables get added together, the distribution becomes more and more normal (which has skewness 0). To demonstrate statement 3: in 3 3 2  23/2 σX . In the numerator, the denominator, Var ( X + Y )  Var ( X ) + Var ( Y )  2σX , so σX+Y E[ ( X + Y ) 3 ]  E[X 3 ] + 3 E[X 2 ] E[Y] + 3 E[X] E[Y 2 ] + E[Y 3 ]  2 E[X 3 ] where the last equality results from the fact that E[X]  E[Y]  0. So Sk ( X + Y ) 

2 E[X 3 ] Sk ( X )  √ 3 23/2 σX 2

making 3 true. (D) 1.9. From the coefficient of variation, we have σ 2 µ σ4 σ2  16 E[X 2 ]  16 + µ2  20 σ3  64 γ1 

C/4 Study Manual—17th edition Copyright ©2014 ASM

136 − 3 (20)(2) + 2 (8) 1  64 2

(B)

EXERCISE SOLUTIONS FOR LESSON 1

1.10.

The variance is σ2 

23

(1 − 3) 2 + (2 − 3) 2 + (4 − 3) 2 + (5 − 3) 2 5

2

The fourth central moment is µ4 

(1 − 3) 4 + (2 − 3) 4 + (4 − 3) 4 + (5 − 3) 4 5

Kurtosis is γ2  1.11.

µ4 6.8  2  1.7 2 σ4

 6.8

(C)

This is a Burr distribution with γ  2, α  2, θ  1. According to the tables, the mode is γ−1 θ αγ + 1

! 1/γ

1  5

! 1/2  0.4472

(C)

If you wish to do the exercise directly: differentiate. We don’t need the denominator of the derivative, since all we want to do is set the entire expression equal to zero. numerator f 0 ( x )  (1 + x 2 ) 3 (4) − (4x )(3)(1 + x 2 ) 2 (2x )





 (1 + x 2 ) 2 (4) (1 + x 2 − 6x 2 )



1 − 5x 2  0 √ x  0.2  0.4472 1.12.



(C)

For a single-parameter Pareto, E[X]  αθ/ ( α − 1) , so

!

α (4) 8 α−1

α2 Then S (6)  (4/6) 2  4/9 . (A) 1.13.

Using the means and coefficients of variations, we have σA  σ B  5

Var ( A )  Var ( B )  25

σC  10

E[A2 ]  E[B 2 ]  52 + 25  50

Var ( C )  100

Also Cov ( A + B ) , ( A + C )  Cov ( A, A ) + Cov ( A, C ) + Cov ( B, A ) + Cov ( B, C )  Var ( A ) + 0 + 0 + 0  25





because A, B, and C are independent. Therefore, √ √ σA+B  25 + 25  50 √ √ σA+C  25 + 100  125 ρ√

C/4 Study Manual—17th edition Copyright ©2014 ASM

25

(50)(125)



25 1 √  √ 25 10 10

(D)

1. BASIC PROBABILITY

24

1.14. For the inverse exponential distribution, the mode is e −θ/x  0.5. Then

θ 2,

so θ  20,000. The median is x such that

−20,000  ln 0.5  − ln 2 x 20,000 x  28,854 ln 2

 R

1.15.

The survival function is S (5)  exp −

5 2

(E)



h ( u ) du , so

1 − 0.84  exp −

5

Z 2

z2 du 2u

!

5

z2 ln u + 2 2

0.16  exp *−

,

-

z2  exp − (ln 5 − ln 2) 2  exp 2  5

z2 ln 2/5 2

!

!

! z 2 /2

z 2 ln 0.16  2 2 ln 0.4 z 2 (A) z could also be −2, but that was not one of the five answer choices. Fortunately for the CAS, they didn’t give range choices with a range including −2 and not 2, since they’d have to accept two answers then. 1.16.

Using the tables, we have

6θ 3 1 (3)(2)(1) 2θ 2 1 E[X 2 ]   (3)(2) 3 1 E[X]  3 1 1 2 Var ( X )  −  3 9 9 E[X 3 ] 

γ1 

1−3

1 1 3 3 + 2 3/2 9

2

1 3 3

 7.0711

(B)

1.17. For the empirical distribution, each observation is treated as having probability culate empirical first, second, and third raw moments: 4 + 1 + 3 + 2 + 15 5 5 16 + 1 + 9 + 4 + 225 µ02   51 5 µ

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 n

 15 , so we cal-

EXERCISE SOLUTIONS FOR LESSON 1

25

64 + 1 + 27 + 8 + 3375  695 5 µ2  51 − 52  26 µ03 

Then the skewness is γ1  1.18.

695 − 3 (51)(5) + 2 (53 )  1.3577 261.5

(E)

For the Weibull, the expected value is (using the tables) E[X]  30Γ (1 + 2)  30 (2)  60

However, there is a 40% chance of not paying a claim, so the expected cost of a claim is less than 60 (in fact, it is 0.6 (60) ) making 1 false. The survival function is Pr ( X > 0) Pr ( X > 60 | X > 0) . The first factor is 60%. The second factor is the survival function of a Weibull at 60, which is e − ( x/θ )  e − (60/30) τ

0.5



 e−

2

 0.243

so the survival function of the cost is 0.6 (0.243) , making 2 false. By the choices given, you already know the answer has to be (C), since that is the only choice that has 1 and 2 false, but let’s discuss why. The hazard rate function of the Weibull is the quotient of the density over the survival, or f ( x ) τ ( x/θ ) τ e − ( x/θ ) /x h (x )   S (x ) e − ( x/θ ) τ τ ( x/θ ) τ (1/2) x −1/2   x 301/2 0.5  0.011785 h (60)  √ (60)(30) τ

The 60% probability of a cost affects f ( x ) and S ( x ) equally. Both functions are multiplied by 0.60, since there is a 0.40 probability of 0. Thus the hazard rate we are interested in involves multiplying and dividing the Weibull hazard rate by 0.6. These two operations cancel out, leaving us with 0.011785 as the hazard rate. Another way to see this is to look at the alternative formula for h ( x ) : h (x )  −

d ln S ( x ) dx

S ( x ) is 0.6 of the Weibull survival function. When you log it, ln S ( x ) is the log of Weibull survival plus ln 0.6. When you differentiate with respect to x, the constant ln 0.6 drops out. 1.19. 30 − 3 (100 + 32 )(3) + 2 (33 ) 897 γ1  −  −0.897 (B) 1.5 1000 100 1.20. The kurtosis of a normal distribution is 3. By definition, the kurtosis is the fourth central moment divided by the variance squared. So E[X 4 ]  3 (1002 )  30,000 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

1. BASIC PROBABILITY

26

1.21. You could recognize that this is a single parameter Pareto and look it up in the tables. It isn’t too hard to integrate either.

!

Z

!

Z

!

Z

α α+1 β F (x )  β α α+1 β β

E[X] 

α α+1 β β

E[X 2 ] 

x

β du 1− α+1 x u



αβ dx  xα α−1

β

β ∞ β



αβ2 dx  x α−1 α − 2

So we have αβ  7500 α−1 αβ 2  75,000,000 α−2

(*)

Dividing the second expression by the square of the first,

( α − 1) 2 4  ( α − 2) α 3 3 ( α − 1) 2  4α ( α − 2) 3α 2 − 6α + 3  4α 2 − 8α α 2 − 2α − 3  0 α  3, −1

and we reject −1 since α > 0. Then from (*), 32 β  7500, so β  23 (7500)  5000. The median is determined from F ( m )  0.5, or 5000 m

!3

 0.5

5000 m  √3  6299.61 0.5

f (m ) 

3 5000

!

5000 6299.61

!4  0.0002381

(B)

1.22. The empirical distribution assigns a probability of n1 to each of n observations. The covariance is the sum of the products of the observations minus the products of the means times the probabilities, which are all n1  51 . To maximize the covariance, the y’s should be in the same order as the x’s. The sum of the products is (1)(4) + (2)(2) + (4)(3) + (5)(6) + (6)(4)  74. If the y’s were ordered in increasing order, the sum would be (1)(2) + (2)(3) + (4)(4) + (5)(4) + (6)(6)  80. We must subtract x¯ y¯ ¯ Then from each and divide by 5, but since we’re just comparing A and B, we won’t bother subtracting x¯ y. 0.2 (80 − 74)  1.2 . (D) 1.23.

The probability of success for n experiments is 1 − 0.98n , so the profit, ignoring fixed expenses, is 1,000,000 (1 − 0.98n ) − 2900n

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 1

27

Differentiating this and setting it equal to 0: −106 (0.98n )(ln 0.98) − 2900  0 0.98n  n

−2900 106 ln 0.98 ln 10−2900 6 ln 0.98 ln 0.98

 96.0815

Thus either 96 or 97 experiments are needed. Plugging those numbers into the original expression g ( n )  1,000,000 (1 − 0.98n ) − 2900n gets g (96)  577,818.4 and g (97)  577,794.0, so 96 is best, and the expected profit is 577,818.4 − 500,000  77,818.4 . (A) An alternative to calculus that is more appropriate for this discrete problem is to note that as n increases, at first expected profit goes up and then it goes down. Let X n be the expected profit with n experiments. Then X n  106 (1 − 0.98n ) − 2900n − 500,000

and the incremental profit generated by experiment #n is

X n − X n−1  106 0.98n−1 − 0.98n − 2900.





We want this difference to be greater than 0, which occurs when 106 0.98n−1 − 0.98n > 2900





0.98n−1 (0.02) > 0.0029 0.0029 0.98n−1 >  0.145 0.02 ( n − 1) ln 0.98 > ln 0.145 ln 0.145 −1.93102   95.582 n−1 < ln 0.98 −0.02020

On the last line, the inequality got reversed because we divided by ln 0.98, a negative number. We conclude that the n th experiment increases profit only when n < 96.582, or n ≤ 96, the same conclusion as above. 1.24.

Since the median is 800, e µ  800 and µ  ln 800. Since the mean is 1000, e µ+σ 800e σ e

2 /2

σ 2 /2

 1000  1.25

2

σ  0.4463 σ  0.6680 The probability of a claim greater than 1200 is ln 1200 − ln 800 1 − F (1200)  1 − Φ  1 − Φ (0.61)  0.2709 0.6680

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

2 /2

 1000. Therefore:

1. BASIC PROBABILITY

28

1.25. By the Law of Total Probability, the probability that the sum of two claims is greater than 20 is the integral over all x of the probability that the sum is greater than 20 given that Kira’s claim is x times the density of Kira’s distribution. (The problem can also be done integrating over Kevin’s distribution.) If X is Kira’s claim and Y is Kevin’s claim, then Pr ( X + Y > 20) 



Z 0

Z 

0



Pr ( X + Y > 20 | x ) f ( x ) dx Pr ( X + Y > 20 | x )

1 −x/12 e dx 12

Now, Pr ( X +Y > 20)  1 if X > 20 since Y can’t be negative. If X < 20, then Y > 20− X, and the probability of that under an exponential distribution with mean 6 is Pr ( Y > 20 − X )  e − (20−x )/6 So we split the integral up into x ≥ 20 and 0 ≤ x ≤ 20. Pr ( X + Y > 20) 

Z

∞ 20

1 −x/12 e dx + 12

20

Z 0

e − (20−x )/6

1 −x/12 e dx 12

The first integral, the probability of an exponential random variable with mean 12 being greater than 20, is 1 − FX (20)  e −20/12 . The second integral is 1 12

20

Z 0

e

− (20−x ) /6 −x/12

e

1 dx  12

20

Z 0

e − (40−x )/12 dx

20 1 (12) e −(40−x )/12 12 0  e −20/12 − e −40/12 

So the final answer is Pr ( X + Y > 20)  e −20/12 + e −20/12 − e −40/12  0.188876 + 0.188876 − 0.035674  0.34208

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 2

Parametric Distributions Reading: Loss Models Fourth Edition 4, 5.1, 5.2.1–5.2.3, 5.3—5.4 A parametric distribution is one that is defined by a fixed number of parameters. Examples of parametric distributions are the exponential distribution (parameter θ) and the Pareto distribution (parameters α, θ). Any distribution listed in the Loss Models appendix is parametric. The alternative to a parametric distribution is a data-dependent distribution. A data-dependent distribution is one where the specification requires at least as many “parameters” as the number of data points in the sample used to create it; the bigger the sample, the more “parameters”. Examples of data-dependent distributions are: 1. The empirical distribution based on a sample of size n, as defined in Section 1.5. 2. A kernel-smoothed distribution, as defined in Lesson 27. It is traditional to use parametric distributions for claim counts (frequency) and loss size (severity). Parametric distributions have many advantages. One of the advantages of parametric distributions which makes them so useful for severity is that they handle inflation easily.

2.1

Scaling

A parametric distribution is a member of a scale family if any positive multiple of the random variable has the same form. In other words, the distribution function of cX, for c a positive constant, is of the same form as the distribution function of X, but with different values for the parameters. Sometimes the distribution can be parametrized in such a way that only one parameter of cX has a value different from the parameters of X. If the distribution is parametrized in this fashion, so that the only parameter of cX having a different value from X is θ, and the value of θ for cX is c times the value of θ for X, then θ is called a scale parameter. All of the continuous distributions in the tables (Appendix A) are scale families. The parametrizations given in the tables are often different from those you would find in other sources, such as your probability textbook. They are parametrized so that θ is the scale parameter. Thus when you are given that a random variable has any distribution in the appendix and you are given the parameters, it is easy to determine the distribution of a multiple of the random variable. The only distributions not parametrized with a scale parameter are the lognormal and the inverse Gaussian. Even though the inverse Gaussian has θ as a parameter, it is not a scale parameter. The parametrization for the lognormal given in the tables is the traditional one. If you need to scale a lognormal, proceed as follows: if X is lognormal with parameters ( µ, σ ) , then cX is lognormal with parameters ( µ + ln c, σ ) . To scale a random variable not in the tables, you’d reason as follows. Let Y  cX, c > 0. Then



FY ( y )  Pr ( Y ≤ y )  Pr ( cX ≤ y )  Pr X ≤

y y  FX c c

!

One use of scaling is in handling inflation. In fact, handling inflation is the only topic in this lesson that is commonly tested directly. If loss sizes are inflated by 100r%, the inflated loss variable Y will be (1 + r ) X, where X is the pre-inflation loss variable. For a scale family with a scale parameter, you just multiply θ by (1 + r ) to obtain the new distribution. C/4 Study Manual—17th edition Copyright ©2014 ASM

29

2. PARAMETRIC DISTRIBUTIONS

30

Example 2A Claim sizes expressed in dollars follow a two-parameter Pareto distribution with parameters α  5 and θ  90. A euro is worth $1.50. Calculate the probability that a claim will be for 20 euros or less. Answer: If claim sizes in dollars are X, then claim sizes in euros are Y  X/1.5. The resulting euro-based random variable Y for claim size will be Pareto with α  5, θ  90/1.5  60. The probability that a claim will be no more than 20 euros is 60 Pr ( Y ≤ 20)  FY (20)  1 − 60 + 20

!5  0.7627



Example 2B Claim sizes in 2010 follow a lognormal distribution with parameters µ  4.5 and σ  2. Claim sizes grow at 6% uniform inflation during 2011 and 2012. Calculate f (1000) , the probability density function at 1000, of the claim size distribution in 2012. Answer: If X is the claim size random variable in 2010, then Y  1.062 X is the revised variable in 2012. The revised lognormal distribution of Y has parameters µ  4.5 + 2 ln 1.06 and σ  2. The probability density function at 1000 is 1

2

2

√ e − (ln 1000−µ) /2σ σ (1000) 2π 2 2 1  √ e −[ln 1000− (4.5+2 ln 1.06) ] /2 (2 ) (2)(1000) 2π

fY (1000) 

 (0.000199471)(0.518814)  0.0001035



Example 2C Claim sizes expressed in dollars follow a lognormal distribution with parameters µ  3 and σ  2. A euro is worth $1.50. Calculate the probability that a claim will be for 100 euros or less. Answer: If claim sizes in dollars are X, then claim sizes in euros are Y  X/1.5. As discussed above, the distribution of claim sizes in euros is lognormal with parameters µ  3 − ln 1.5 and σ  2. Then ln 100 − 3 + ln 1.5 FY ( y )  Φ  Φ (1.01)  0.8438 2

!



Example 2D Claim sizes X initially follow a distribution with distribution function: FX ( x )  1 −



x 1+x

x>0

Claim sizes are inflated by 50% uniformly. Calculate the probability that a claim will be for 60 or less after inflation. Answer: Let Y be the increased claim size. Then Y  1.5X, so Pr ( Y ≤ 60)  Pr ( X ≤ 60/1.5)  FX (40) . √ 40 FX (40)  1 −  0.8457 41

C/4 Study Manual—17th edition Copyright ©2014 ASM



2.2. TRANSFORMATIONS

2.2

31

Transformations

Students report that there have been questions on transformations of random variables on recent exams. However, you only need to know the simplest case, how to transform a single random variable using a monotonic function. If Y  g ( X ) , with g ( x ) a one-to-one monotonically increasing function, then FY ( y )  Pr ( Y ≤ y )  Pr X ≤ g −1 ( y )  FX g −1 ( y )





and differentiating, fY ( y )  f X g −1 ( y )







(2.1)



(2.2)

 dg −1 ( y ) dy

If g ( x ) is one-to-one monotonically decreasing, then FY ( y )  Pr ( Y ≤ y )  Pr X ≥ g −1 ( y )  SX g −1 ( y )





and differentiating, fY ( y )  − f X g −1 ( y )





 dg −1 ( y ) dy

Putting both cases (monotonically increasing and monotonically decreasing) together:

 dg −1 ( y ) dy

fY ( y )  f X g −1 ( y )



(2.3)

Example 2E X follows a two-parameter Pareto distribution with parameters α and θ. You are given Y  ln



X +1 θ



Determine the distribution of Y. Answer: y  ln



x +1 θ



x θ x  θ ( e y − 1)

ey − 1 

FY ( y )  F X θ ( e y − 1 )





θ 1− θ + θ ( e y − 1) θ 1− θe y





 1 − e −α y So Y’s distribution is exponential with parameter θ  1/α. We see in this example that an exponential can be obtained by transforming a Pareto. There are a few specific transformations that are used to create distributions: C/4 Study Manual—17th edition Copyright ©2014 ASM



2. PARAMETRIC DISTRIBUTIONS

32

1. If the transformation Y  X τ is applied to a random variable X, with τ a positive real number, then the distribution of Y is called transformed. Thus when we talk about transforming a distribution we may be talking about any transformation, but if we talk about a transformed Pareto, say, then we are talking specifically about raising the random variable to a positive power. 2. If the transformation Y  X −1 is applied to a random variable X, then the distribution of Y is prefaced with the word inverse. Some examples you will find in the tables are inverse exponential, inverse Weibull, and inverse Pareto. 3. If the transformation Y  X τ is applied to a random variable X, with τ a negative real number, then the distribution of Y is called inverse transformed. 4. If the transformation Y  e X is applied to a random variable X, we name Y with the name of X preceded with “log”. The lognormal distribution is an example. As an example, let’s develop the distribution and density functions of an inverse exponential. Start with an exponential with parameter θ: F ( x )  1 − e −x/θ f (x ) 

e −x/θ θ

and let y  1/x. Notice that this is a one-to-one monotonically decreasing transformation, so when transforming the density function, we will multiply by the negative of the derivative. Then FY ( y )  Pr ( Y ≤ y )  Pr ( X ≥ 1/y )  SX (1/y )  e −1/( yθ )

dx e −1/( yθ )  θ y2 dy

f y ( y )  f x (1/y )

However, θ is no longer a scale parameter after this transformation. Therefore, the tables in the appendix use the reciprocal of θ as the parameter and call it θ: FY ( y )  e −θ/y fy ( y) 

θe −θ/y y2

As a result of the change in parametrization, the negative moments of the inverse exponential, as listed in the tables, are different from the corresponding positive moments of the exponential. Even though Y  X −1 , the formula for E[Y −1 ] is different from the one for E[X] because the θ’s are not the same. To preserve the scale parameters,1 the transformation should be done after the random variable is divided by its scale parameter. In other words 1. Set Y/θ  ( X/θ ) τ for a transformed random variable. 2. Set Y/θ  ( X/θ ) −1 for an inverse random variable. 3. Set Y/θ  ( X/θ ) −τ for an inverse transformed random variable. 4. Set Y/θ  e X/θ for a logged random variable.

1This method was shown to me by Ken Burton C/4 Study Manual—17th edition Copyright ©2014 ASM

2.3. COMMON PARAMETRIC DISTRIBUTIONS

33

Table 2.1: Summary of Scaling and Transformation Concepts

• If a distribution has a scale parameter θ and X has that distribution with parameter θ, then cX has the same distribution with parameter cθ. • All continuous distributions in the exam tables has scale parameter θ except for lognormal and inverse Gaussian. • If X is lognormal with parameters µ and σ, then cX is lognormal with parameters µ + ln c and σ. • If Y  g ( X ) and g is monotonically increasing, then FY ( y )  Pr ( Y ≤ y )  Pr X ≤ g −1 ( y )  FX g −1 ( y )









(2.1)



(2.2)

• If Y  g ( X ) and g is monotonically decreasing, then FY ( y )  Pr ( Y ≤ y )  Pr X ≥ g −1 ( y )  SX g −1 ( y )







• If Y  g ( X ) and g is monotonically increasing or decreasing, then

 dg −1 ( y ) dy

fY ( y )  f X g −1 ( y )



(2.3)

Let’s redo the inverse exponential example this way. Y X  θ θ

! −1

FY ( y )  Pr ( Y ≤ y )  y Y ≤  Pr θ θ ! X θ  Pr ≥ θ y θ2  Pr X ≥ y  e −θ

!

2 /yθ

 e −θ/y

2.3

Common parametric distributions

The tables provide a lot of information about the distributions, but if you don’t recognize the distribution, you won’t know to use the table. Therefore, it is a good idea to be familiar with the common distributions. You should familiarize yourself with the form of each distribution, but not necessarily the constants. The constant is forced so that the density function will integrate to 1. If you know which distribution you are dealing with, you can figure out the constant. To emphasize this point, in the following discussion, we will use the letter c for constants rather than spelling out what the constants are. You are not trying to recognize the constant; you are trying to recognize the form. C/4 Study Manual—17th edition Copyright ©2014 ASM

2. PARAMETRIC DISTRIBUTIONS

34

The gamma function

The gamma function Γ ( x ) is a generalization to real numbers of the factorial function, defined by ∞

Z Γ(x )  For positive integers n,

0

u x−1 e −u du

Γ ( n )  ( n − 1) !

The most important relationship for Γ ( x ) that you should know is Γ ( x + 1)  xΓ ( x ) for any real number x. Example 2F Evaluate

Γ (8.5) . Γ (6.5)

Answer: Γ (8.5) Γ (8.5)  Γ (6.5) Γ (7.5)

!

Γ (7.5)  (7.5)(6.5)  48.75 Γ (6.5)

!



We will mention the means and variances or second moments of the distributions. You need not memorize any of these. The tables give you the raw moments. You can calculate the variance as E[X 2 ] − E[X]2 . However, for frequently used distributions, you may want to memorize the mean and variance to save yourself some time when working out questions. We will graph the distributions. You are not responsible for graphs, but they may help you understand the distributions. The tables occasionally use the gamma function Γ ( x ) in the formulas for the moments. You should have a basic knowledge of the gamma function; if you are not familiar with this function, see the sidebar. The tables also use the incomplete gamma and beta functions, and define them, but you can get by without knowing them.

2.3.1

Uniform

A uniform distribution has a constant density on [d, u]: 1 u−d 0       x−d F ( x; d, u )     u −d    1  f ( x; d, u ) 

d≤x≤u

x≤d

d≤x≤u

x≥u

You recognize a uniform distribution both by its finite support and by the lack of an x in the density function. C/4 Study Manual—17th edition Copyright ©2014 ASM

2.3. COMMON PARAMETRIC DISTRIBUTIONS

35

Its moments are d+u 2 (u − d )2 Var ( X )  12 E[X] 

Its mean, median, and midrange are equal. The best way to calculate the second moment is to add up the variance and the square of the mean. However, some students prefer to use the following easy-to-derive formula: Z u 1 u 2 + ud + d 2 u3 − d3 2 E[X ]   (2.4) x 2 dx  u−d d 3(u − d ) 3 If d  0, then the formula reduces to u 2 /3. The uniform distribution is not directly in the tables, so I recommend you memorize the formulas for mean and variance. However, if d  0, then the uniform distribution is a special case of a beta distribution with θ  u, a  1, b  1.

2.3.2

Beta

The probability density function of a beta distribution with θ  1 has the form f ( x; a, b )  cx a−1 (1 − x ) b−1

0≤x≤1

The parameters a and b must be positive. They may equal 1, in which case the corresponding factor is missing from the density function. Thus if a  b  1, the beta distribution is a uniform distribution. You recognize a beta distribution both by its finite support—it’s the only common distribution with finite support—and by factors with x and 1 − x raised to powers and no other use of x in the density function. If θ is arbitrary, then the form of the probability density function is f ( x; a, b, θ )  cx a−1 ( θ − x ) b−1

0≤x≤θ

The distribution function can be evaluated if a or b is an integer. The moments are E[X]  Var ( X ) 

θa a+b

θ 2 ab ( a + b ) 2 ( a + b + 1)

The mode is θ ( a − 1) / ( a + b − 2) when a and b are both greater than 1, but you are not responsible for this fact. Figure 2.1 graphs four beta distributions with θ  1 all having mean 2/3. You can see how the distribution becomes more peaked and normal looking as a and b increase.

2.3.3

Exponential

The probability density function of an exponential distribution has the form f ( x; θ )  ce −x/θ θ must be positive. C/4 Study Manual—17th edition Copyright ©2014 ASM

x≥0

2. PARAMETRIC DISTRIBUTIONS

36

y 4.5

a a a a

4 3.5

= 1, b = 0.5 = 2, b = 1 = 6, b = 3 = 18, b = 9

3 2.5 2 1.5 1 0.5 0

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

x

Figure 2.1: Probability density function of four beta distributions with θ  1 and mean 2/3

You recognize an exponential distribution when the density function has e raised to a multiple of x, and no other use of x. The distribution function is easily evaluated. The moments are: E[X]  θ Var ( X )  θ 2 Figure 2.2 graphs three exponential distributions. The higher the parameter, the more weight placed on higher numbers.

2.3.4

Weibull

A Weibull distribution is a transformed exponential distribution. If Y is exponential with mean µ, then X  Y 1/τ is Weibull with parameters θ  µ1/τ and τ. An exponential is a special case of a Weibull with τ  1. The form of the density function is f ( x; τ, θ )  cx τ−1 e − ( x/θ )

τ

x≥0

Both parameters must be positive. You recognize a Weibull distribution when the density function has e raised to a multiple of a power of x, and in addition has a corresponding power of x, one lower than the power in the exponential, as a factor. The distribution function is easily evaluated, but the moments require evaluating the gamma function, which usually requires numerical techniques. The moments are E[X]  θΓ (1 + 1/τ ) E[X 2 ]  θ 2 Γ (1 + 2/τ ) Figure 2.3 graphs three Weibull distributions with mean 50. The distribution has a non-zero mode when τ > 1. Notice that the distribution with τ  0.5 puts a lot of weight on small numbers. To make up C/4 Study Manual—17th edition Copyright ©2014 ASM

2.3. COMMON PARAMETRIC DISTRIBUTIONS

37

y 0.04 θ = 25 θ = 50 θ = 100

0.035 0.03 0.025 0.02 0.015 0.01 0.005 0

0

10

20

30

40

50

60

70

80

90

100

x

Figure 2.2: Probability density function of three exponential distributions

y 0.04 τ = 0.5, θ = 25 τ = 1, θ = 50 p τ = 2, θ = 100/ π

0.035 0.03 0.025 0.02 0.015 0.01 0.005 0

0

10

20

30

40

50

60

70

80

90

100

x

Figure 2.3: Probability density function of three Weibull distributions with mean 50

for this, it will also have to put higher weight than the other two distributions on very large numbers, so although it’s not shown, its graph will cross the other two graphs for high x

2.3.5

Gamma

The form of the density function of a gamma distribution is f ( x; α, θ )  cx α−1 e −x/θ

x≥0

Both parameters must be positive. When α is an integer, a gamma random variable with parameters α and θ is the sum of α independent exponential random variables with parameter θ. In particular, when α  1, the gamma random variable is exponential. The gamma distribution is called an Erlang distribution when α is an integer. We’ll discuss this more in Subsection 19.1.2. You recognize a gamma distribution when the density function has e raised to a multiple of x, and in C/4 Study Manual—17th edition Copyright ©2014 ASM

2. PARAMETRIC DISTRIBUTIONS

38

y 0.06 0.055 0.05 0.045 0.04 α = 0.5, θ = 100 α = 5, θ = 10 α = 50, θ = 1

0.035 0.03 0.025 0.02 0.015 0.01 0.005 0

0

10

20

30

40

50

60

70

80

90

100

x

Figure 2.4: Probability density function of three gamma distributions with mean 50

addition has x raised to a power. Contrast this with a Weibull, where e is raised to a multiple of a power of x. The distribution function may be evaluated if α is an integer; otherwise numerical techniques are needed. However, the moments are easily evaluated: E[X]  αθ Var ( X )  αθ2 Figure 2.4 graphs three gamma distributions with mean 50. As α goes to infinity, the graph’s peak narrows and the distribution converges to a normal distribution. The gamma distribution is one of the few for which the moment generating function has a closed form. In particular, the moment generating function of an exponential has a closed form. The only other distributions in the tables with closed form moment generating functions are the normal distribution (not actually in the tables, but the formula for the lognormal moments is the MGF of a normal) and the inverse Gaussian.

2.3.6

Pareto

When we say “Pareto”, we mean a two-parameter Pareto. On recent exams, they write out “twoparameter” to make it clear, but on older exams, you will often find the word “Pareto” with no qualifier. It always refers to a two-parameter Pareto, not a single-parameter Pareto. The form of the density function of a two-parameter Pareto is f (x )  Both parameters must be positive. C/4 Study Manual—17th edition Copyright ©2014 ASM

c ( θ + x ) α+1

x≥0

2.3. COMMON PARAMETRIC DISTRIBUTIONS

39

y 0.04 α = 0.5, θ = 5 α = 2, θ = 50 α = 5, θ = 200

0.035 0.03 0.025 0.02 0.015 0.01 0.005 0

0

10

20

30

40

50

60

70

80

90

100

x

Figure 2.5: Probability density function of three Pareto distributions

You recognize a Pareto when the density function has a denominator with x plus a constant raised to a power. The distribution function is easily evaluated. The moments are E[X]  E[X 2 ] 

θ α−1

α>1

2θ 2 ( α − 1)( α − 2)

α>2

When α does not satisfy these conditions, the corresponding moments don’t exist. A shortcut formula for the variance of a Pareto is Var ( X )  E[X]2

α α−2

!

Figure 2.5 graphs three Pareto distributions, one with α < 1 and the other two with mean 50. Although the one with α  0.5 puts higher weight on small numbers than the other two, its mean is infinite; it puts higher weight on large numbers than the other two, and its graph eventually crosses the other two as x → ∞.

2.3.7

Single-parameter Pareto

The form of the density function of a single-parameter Pareto is f (x ) 

c x α+1

x≥θ

α must be positive. θ is not considered a parameter since it must be selected in advance, based on what you want the range to be. You recognize a single-parameter Pareto by its having support not starting at 0, and by the density function having a denominator with x raised to a power. A beta distribution may also have x raised to a negative power, but it would have finite support. C/4 Study Manual—17th edition Copyright ©2014 ASM

2. PARAMETRIC DISTRIBUTIONS

40

A single-parameter Pareto X is a two-parameter Pareto Y shifted by θ: X  Y + θ. Thus it has the same variance, and the mean is θ greater than the mean of a two-parameter Pareto with the same parameters. αθ α−1 αθ2 2 E[X ]  α−2 E[X] 

2.3.8

α>1 α>2

Lognormal

The form of the density function of a lognormal distribution is f (x ) 

ce − (ln x−µ ) x

2 /2σ 2

x>0

σ must be nonnegative. You recognize a lognormal by the ln x in the exponent. If Y is normal, then X  e Y is lognormal with the same parameters µ and σ. Thus, to calculate the distribution function, use ! ln x − µ FX ( x )  FY (ln x )  Φ σ where Φ ( x ) is the standard normal distribution function, for which you are given tables. The moments of a lognormal are E[X]  e µ+0.5σ E[X 2 ]  e 2µ+2σ

2

2

More generally, E[X k ]  E[e kY ]  MY ( k ) , where MY ( k ) is the moment generating function of the corresponding normal distribution. Figure 2.6 graphs three lognormals with mean 50. The mode is exp ( µ − σ2 ) , as stated in the tables. For µ  2, the mode is off the graph. As σ gets lower, the distribution flattens out. Table 2.2 is a summary of the forms of probability density functions for common distributions.

2.4

The linear exponential family

The following material, based on Loss Models 5.4 which is on the syllabus, is background for something we’ll learn later in credibility. However, I doubt anything in this section will be tested on directly, so you may skip it. A set of parametric distributions is in the linear exponential family if it can be parametrized with a parameter θ in such a way that in its density function, the only interaction between θ and x is in the exponent of e, which is x times a function of θ. In other words, its density function f ( x; θ ) can be expressed as p ( x ) e r (θ) x f ( x; θ )  q (θ) The set may have other parameters. q ( θ ) is the normalizing constant which makes the integral of f equal to 1. Examples of the linear exponential family are: C/4 Study Manual—17th edition Copyright ©2014 ASM

2.4. THE LINEAR EXPONENTIAL FAMILY

41

y 0.06 0.055 0.05 0.045 0.04 µ = 2, σ = 1.9555 µ = 3, σ = 1.3506 µ = 3.5, σ = 0.9078

0.035 0.03 0.025 0.02 0.015 0.01 0.005 0

0

10

20

30

40

50

60

70

80

90

Figure 2.6: Probability density function of three lognormal distributions with mean 50

Table 2.2: Forms of probability density functions for common distributions

Distribution

C/4 Study Manual—17th edition Copyright ©2014 ASM

Probability density function

Uniform

c

d≤x≤u

Beta

cx a−1 ( θ − x ) b−1

0≤x≤θ

Exponential

ce −x/θ

x≥0

Weibull

cx τ−1 e −x

Gamma

cx α−1 e −x/θ

Pareto

c ( x + θ ) α+1

Single-parameter Pareto

c x α+1

Lognormal

ce − (ln x−µ ) x

τ /θ τ

x≥0 x≥0 x≥0 x≥θ

2 /2σ 2

x>0

100

x

2. PARAMETRIC DISTRIBUTIONS

42

Gamma distribution The pdf is f ( x; µ, σ ) 

x α−1 e −x/θ Γ(α) θα

Let r ( θ )  −1/θ, p ( x )  x α−1 , and q ( θ )  Γ ( α ) θ α . Normal distribution The pdf is 2

f ( x; θ ) 

e − ( x−µ) /2σ √ σ 2π

2

Let θ  µ. The denominator of the pdf does not have x or θ so it can go into q ( θ ) or into p ( x ) . The exponent can be expanded into −

xθ θ2 x2 + − 2σ2 σ2 2σ2

and only the second summand involves both x and θ, and x appears to the first power. Thus we can set 2 2 2 2 √ p ( x )  e −x /2σ , r ( θ )  θ/σ2 , and q ( θ )  e θ /2σ σ 2π. Discrete distributions are in the linear exponential family if we can express the probability function in the linear exponential form. Poisson distribution For a Poisson distribution, the probability function is f ( x; λ )  e −λ

λx e x ln λ  e −λ x! x!

We can let θ  λ, and then p ( x )  1/x!, r ( θ )  ln θ, and q ( θ )  e θ . The textbook develops the following formulas for the mean and variance of a distribution from the linear exponential family: ln q ( θ ) q0 ( θ ) E[X]  µ ( θ )  0  r (θ) q (θ) r0 (θ) 0 µ (θ) Var ( X )  v ( θ )  0 r (θ)



Thus, in the above examples: Gamma distribution d ln q α  dθ θ 1 dr  2 dθ θ α/θ E[X]   αθ 1/θ 2 α Var ( X )   αθ2 1/θ 2 C/4 Study Manual—17th edition Copyright ©2014 ASM

0

2.5. LIMITING DISTRIBUTIONS

43

Normal distribution



θ 2θ  2 2 2σ σ 1 0 r (θ)  2 σ θ/σ2 E[X]  θ 1/σ 2 1  σ2 Var ( X )  1/σ 2

ln q ( θ )

0



Poisson distribution



ln q ( θ )

0

1

1 θ 1 E[X]  θ 1/θ 1 Var ( X )  θ 1/θ r0 (θ) 

2.5

Limiting distributions

The following material is based on Loss Models 5.3.3. I don’t think it has ever appeared on the exam and doubt it ever will. In some cases, as the parameters of a distribution go to infinity, the distribution converges to another distribution. To demonstrate this, we will usually have to use the identity



lim 1 +

α→∞

r α



 er

Equivalently, if c is a constant (not dependent on α), then



lim 1 +

α→∞

r α

 α+c

 er

since we can set α0  α + c, and r/ ( α0 − c ) → r/α0 as α0 → ∞. As a simple example (not in the textbook) of a limiting distribution, consider a gamma distribution with a fixed mean µ, and let α → ∞. Then θ  µ/α. The moment generating function is M ( t )  (1 − θt ) −α  

1 1−

 µt α α

and as α → ∞, the denominator goes to e −µt , so M ( t ) → e µt , which is the moment generating function of the constant µ. So as α → ∞, the limiting distribution of a gamma is a distribution equal to the mean with probability 1. As another example, let’s carry out textbook exercise 5.21, which asks you to demonstrate that the limiting distribution of a Pareto with θ/α constant as α → ∞ is an exponential. Let k  θ/α. The density C/4 Study Manual—17th edition Copyright ©2014 ASM

2. PARAMETRIC DISTRIBUTIONS

44

Table 2.3: Summary of Parametric Distribution Concepts

• If X is a member of a scale family with scale parameter θ with value s, then cX is in the same family and has the same parameter values as X except that the scale parameter θ has value cs. • All distributions in the tables are scale families with scale parameter θ except for lognormal and inverse Gaussian. • If X is lognormal with parameters µ and σ, then cX is lognormal with parameters µ + ln c and σ. • If X is normal with parameters µ and σ2 , then e X is lognormal with parameters µ and σ. • See Table 2.2 to learn the forms of commonly occurring distributions. Useful facts are Uniform on [d, u]

Uniform on [0, u] Gamma

d+u 2 (u − d )2 Var ( X )  12 u2 2 E[X ]  3 Var ( X )  αθ2 E[X] 

• If Y is single-parameter Pareto with parameters α and θ, then Y − θ is two-parameter Pareto with the same parameters. • X is in the linear exponential family if its probability density function can be expressed as f ( x; θ ) 

p ( x ) e r (θ) x q (θ)

function of a Pareto is αθ α ( θ + x ) α+1 α ( αk ) α  ( αk + x ) α+1 kα  ( k + x/α ) α+1 1    α+1 k 1 + ( x/k ) /α

f ( x; α, θ ) 

and the limit as α → ∞ is (1/k ) e −x/k . That is the density function of an exponential with mean k. Notice that as α → ∞, the mean of the Pareto converges to k.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 2

45

Exercises 2.1. For a commercial fire coverage •

In 2009, loss sizes follow a two-parameter Pareto distribution with parameters α  4 and θ.



In 2010, there is uniform inflation at rate r.



The 65th percentile of loss size in 2010 equals the mean loss size in 2009. Determine r.

2.2. [4B-S90:37] (2 points) Liability claim severity follows a Pareto distribution with a mean of 25,000 and parameter α  3. If inflation increases all claims by 20%, the probability of a claim exceeding 100,000 increases by what amount? (A) (B) (C) (D) (E)

Less than 0.02 At least 0.02, but less than 0.03 At least 0.03, but less than 0.04 At least 0.04, but less than 0.05 At least 0.05 [4B-F97:26] (3 points) You are given the following:

2.3. •

In 1996, losses follow a lognormal distribution with parameters µ and σ.



In 1997, losses follow a lognormal distribution with parameters µ + ln k and σ, where k is greater than 1.



In 1996, 100p% of the losses exceed the mean of the losses in 1997. Determine σ. Note: z p is the 100p th percentile of a normal distribution with mean 0 and variance 1.

(A)

2 ln k

(B)

−z p ±

(C)

zp ±

q

q

z 2p − 2 ln k

r (D)

−z p ±

r (E)

zp ±

z 2p − 2 ln k

q

q

z 2p − 2 ln k

z 2p − 2 ln k

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

2. PARAMETRIC DISTRIBUTIONS

46

2.4. [4B-S94:16] (1 point) You are given the following: •

Losses in 1993 follow the density function f ( x )  3x −4 ,

x ≥ 1,

where x  losses in millions of dollars. •

Inflation of 10% impacts all claims uniformly from 1993 to 1994. Determine the probability that losses in 1994 exceed 2.2 million.

(A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20 [4B-F95:6] (2 points) You are given the following:

2.5. •

In 1994, losses follow a Pareto distribution with parameters θ  500 and α  1.5.



Inflation of 5% impacts all losses uniformly from 1994 to 1995. What is the median of the portion of the 1995 loss distribution above 200?

(A) (B) (C) (D) (E)

Less than 600 At least 600, but less than 620 At least 620, but less than 640 At least 640, but less than 660 At least 660

2.6. [CAS3-S04:34] Claim severities are modeled using a continuous distribution and inflation impacts claims uniformly at an annual rate of i. Which of the following are true statements regarding the distribution of claim severities after the effect of inflation? 1. An Exponential distribution will have scale parameter (1 + i ) θ 2. A 2-parameter Pareto distribution will have scale parameters (1 + i ) α and (1 + i ) θ. 3. A Paralogistic distribution will have scale parameter θ/ (1 + i ) (A) 1 only

(B) 3 only

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1 and 2 only

(D) 2 and 3 only

(E) 1, 2, and 3

Exercises continue on the next page . . .

EXERCISES FOR LESSON 2

47

2.7. [4B-S99:17] You are given the following: •

In 1998, claim sizes follow a Pareto distribution with parameters θ (unknown) and α  2.



Inflation of 6% affects all claims uniformly from 1998 to 1999.



r is the ratio of the proportion of claims that exceed d in 1999 to the proportion of claims that exceed d in 1998. Determine the limit of r as d goes to infinity.

(A) (B) (C) (D) (E)

Less than 1.05 At least 1.05, but less than 1.10 At least 1.10, but less than 1.15 At least 1.15, but less than 1.20 At least 1.20

2.8. [4B-F94:28] (2 points) You are given the following: •

In 1993, the claim amounts for a certain line of business were normally distributed with mean µ  1000 and variance σ 2  10,000;

  1 x−µ 2 1 exp − f (x )  √ 2 σ σ 2π •

! − ∞ < x < ∞,

µ  1000, σ  100.

Inflation of 5% impacted all claims uniformly from 1993 to 1994. What is the distribution for claim amounts in 1994?

(A) (B) (C) (D) (E)

No longer a normal distribution Normal with µ  1000 and σ  102.5. Normal with µ  1000 and σ  105.0. Normal with µ  1050 and σ  102.5. Normal with µ  1050 and σ  105.0.

2.9. [4B-S93:11] (1 point) You are given the following: (i)

The underlying distribution for 1992 losses is given by a lognormal distribution with parameters µ  17.953 and σ  1.6028. (ii) Inflation of 10% impacts all claims uniformly the next year. What is the underlying loss distribution after one year of inflation? (A) (B) (C) (D) (E) 2.10.

Lognormal with µ0  19.748 and σ0  1.6028. Lognormal with µ0  18.048 and σ0  1.6028. Lognormal with µ0  17.953 and σ0  1.7631. Lognormal with µ0  17.953 and σ0  1.4571. No longer a lognormal distribution X follows an exponential distribution with mean 10.

Determine the mean of X 4 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

2. PARAMETRIC DISTRIBUTIONS

48

2.11. (i) (ii)

You are given X is exponential with mean 2. Y  X 1.5 .

Calculate E[Y 2 ]. 2.12.

X follows a gamma distribution with parameters α  2.5 and θ  10.

Y  1/X. Evaluate Var ( Y ) . Additional released exam questions: CAS3-F05:19,21, CAS3-S06:26,27

Solutions 2.1. The mean in 2009 is θ/3. By definition, the 65th percentile is the number π65 such that F ( π65 )  0.65, so F ( θ/3)  0.65 for the 2010 version of F. In 2010, F is two-parameter Pareto with inflated parameter θ0  (1 + r ) θ and α  4, so

!4

θ0 0 θ + ( θ/3) (1 + r ) θ (1 + r ) θ + θ/3 1+r 4/3 + r √4 r (1 − 0.35)

1−

 0.65 

√4 0.35



√4 0.35

4 √4 0.35 − 1 3 √ (4/3) 4 0.35 − 1 r  0.1107 √4 1 − 0.35 

2.2. Let X be the original variable, Z  1.2X. Since the mean is 25,000, the parameter θ is 25,000 ( α −1)  50,000. 50 Pr ( X > 100,000)  150

!3 

1 27

!3

60 27 Pr ( Z > 100,000)   160 512 27 1 −  0.0157 (A) 512 27 2.3. The key is to understand (iii). If (for example) 30% of losses exceed $10000, what percentage does not exceed $10000? (Answer: 70%) And what percentile of the distribution of losses is $10000? (Answer: 70th ). So statement (iii) is saying that the 100 (1 − p ) th percentile of losses in 1996 equals the mean of losses in 1997. Got it? 2 The mean of 1997 losses is exp ( µ + ln k + σ2 ) . The 100(1 − p)th percentile is exp ( µ − z p σ ) . So: µ − z p σ  µ + ln k + C/4 Study Manual—17th edition Copyright ©2014 ASM

σ2 2

EXERCISE SOLUTIONS FOR LESSON 2

49 σ2 + σz p + ln k  0 2

σ  −z p ±

q

z 2p − 2 ln k

(B)

Notice that p must be less than 0.5, by the following reasoning. In general, the median of a lognormal (e µ ) 2 is less than (or equal to, if σ  0) the mean (e µ+σ /2 ), so the median of losses in 1996 is no more than the mean of losses in 1996, which in turn is less than the mean of losses in 1997 since k > 1, so 100p must be less than 50. Since p is less than 0.5, it follows that z p will be negative, and σ is therefore positive, as it should be. 2.4. We recognize the 1993 distribution as a single-parameter Pareto with θ  1, α  3. The inflated 1.1 3  0.125 . (C) parameters are θ  1.1, α  3. 2.2

 1.5

525 2.5. Let X be the inflated variable, with θ  525, α  1.5. Pr ( X > 200)  525+200  0.6162. Let F be ∗ the original distribution function, F the distribution of X | X > 200. Then F (200)  1 − 0.6162  0.3838 and Pr (200 < X ≤ x ) F ( x ) − F (200) F ∗ ( x )  Pr ( X ≤ x | X > 200)   Pr ( X > 200) 1 − F (200)

So to calculate the median, we set F ∗ ( x )  0.5, which means

F ( x ) − F (200)  0.5 1 − F (200) F ( x ) − 0.3838  0.5 0.6162 F ( x )  0.5 (0.6162) + 0.3838  0.6919 We must find x such that F ( x )  0.6919.

! 1.5

525 525 + x 525 525 + x 525 − 525 (0.4562) 0.4562 x 1−

 0.6919  0.4562 x  625.87

(C)

2.6. All the distributions are parameterized so that θ is the scale parameter and is multiplied by 1 + i; no other parameters change, and you should never divide by 1 + i. Therefore only 1 is correct. (A) 2.7. This is:

1.06θ  2 1.06θ+d θ 2 θ+d



1.062 ( θ + d ) 2 → 1.062  1.1236 . (1.06θ + d ) 2

(C)

2.8. If X is normal, then aX + b is normal as well. In particular, 1.05X is normal. So the distribution of claims after 5% uniform inflation is normal. For any distribution, multiplying the distribution by a constant multiplies the mean and standard deviation by that same constant. Thus in this case, the new mean is 1050 and the new standard deviation is 105. (E) 2.9. Add ln 1.1 to µ: 17.953 + ln 1.1  18.048. σ does not change. (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

2. PARAMETRIC DISTRIBUTIONS

50

2.10.

The k th moment for an exponential is given in the tables: E[X k ]  k!θ k

for k  4 and the mean θ  10, this is 4! (104 )  240,000 . 2.11. While Y is Weibull, you don’t need to know that. It’s simpler to use Y 2  X 3 and look up the third moment of an exponential. E[X 3 ]  3!θ 3  6 (23 )  48 2.12. We calculate E[Y] and E[Y 2 ], or E[X −1 ] and E[X −2 ]. Note that the special formula in the tables for integral moments of a gamma, E[X k ]  θ k ( α + k − 1) · · · α only applies when k is a positive integer, so it cannot be used for the −1 and −2 moments. Instead, we must use the general formula for moments given in the tables, θk Γ(α + k ) E[X k ]  Γ(α) For k  −1, this is

E[X −1 ] 

since Γ ( α )  ( α − 1) Γ ( α − 1) . For k  −2,

θ −1 Γ ( α − 1) 1  Γ(α) θ ( α − 1)

E[X −2 ]  Therefore,

C/4 Study Manual—17th edition Copyright ©2014 ASM

θ2 ( α

1 − 1)( α − 2)

1 1 Var ( Y )  2 − 10 (1.5) 10 (1.5)(0.5)

!2  0.00888889

Lesson 3

Variance You learned everything in this lesson in your probability course. Nevertheless, many students miss a lot of these points. They are very important. There won’t necessarily be any exam questions testing you directly on this material (although there could be). Rather, this material is background needed for the rest of the course.

3.1

Additivity

Expected value is linear, meaning that f g E[aX + bY]  a E[X] + b E[Y], regardless of whether Xfand Y areg 2 independent or not. Thus E ( X+Y )  E[X 2 ]+2 E[XY]+E[Y 2 ], for example. This means that E ( X + Y ) 2 is not equal to E X 2 + E Y 2 , unless E[XY]  0.

f

g

f

g

2

Also, it is not true in general that E g ( X )  g E[X] . So E[X 2 ] , E[X] .

f

g







Since variance can be expressed in terms of expected value as Var ( X )  E X 2 − E[X]2 , this allows us to develop a formula for Var ( aX + bY ) . If you work it out, you get

f

g

Var ( aX + bY )  a 2 Var ( X ) + 2ab Cov ( X, Y ) + b 2 Var ( Y )

(3.1)

In particular, if Cov ( X, Y )  0 (which is true if X and Y are independent), then Var ( X + Y )  Var ( X ) + Var ( Y ) and generalizing to n independent variables, n n X X + * Xi  Var ( X i ) Var , i1 - i1

If all the X i ’s are independent and have identical distributions, and we set X  X i for all i, then Var *

n X

, i1

X i +  n Var ( X )

-

However, Var ( nX )  n 2 Var ( X ) , not n Var ( X ) . You must distinguish between these two situations, which are quite different. Think of the following example. The stock market goes up or down randomly each day. We will assume that each day’s change is independent of the previous day’s, and has the same distribution. Compare the variance of the following possibilities: 1. You put $1 in the market, and leave it there for 10 days. 2. You put $10 in the market, and leave it there for 1 day. In the first case, there are going to be potential ups and downs each day, and the variance of the change of your investment will be 10 times the variance of one day’s change because of this averaging. In the second case, however, you are magnifying a single day’s change by 10—there’s no dampening of the change by C/4 Study Manual—17th edition Copyright ©2014 ASM

51

3. VARIANCE

52

10 different independent random events, the change depends on a single random event. Therefore the variance is multiplied by 100. In the more general case where the variables are not independent, you need to know the covariance. This can be provided in a covariance matrix. If you have n random variables X1 , . . . , X n , this n × n matrix A has a i j  Cov ( X i , X j ) for i , j. For i  j, a ii  Var ( X i ) . This matrix is symmetric and positive semidefinite. However, the covariance of two random variables may be negative. Example 3A For a loss X on an insurance policy, let X1 be the loss amount and X2 the loss adjustment expenses, so that X  X1 + X2 . The covariance matrix for these random variables is 25 5

5 2

!

Calculate the variance in total cost of a loss including loss adjustment expenses. Answer: In formula (3.1), a  b  1, so 25 + 2 (5) + 2  37 .



A sample is a set of observations from n independent identically distributed random variables. The sample mean X¯ is the sum of the observations divided by n. The variance of the sample mean of X1 , . . . , X n , which are observations from the random variable X, is

Pn

Var ( X¯ )  Var

3.2

i1

n

Xi

! 

n Var ( X ) Var ( X )  n n2

(3.2)

Normal approximation

The Central Limit Theorem says that for any distribution with finite variance, the sample mean of a set of independent identically distributed random variables approaches a normal distribution. By the previous section, the mean of the sample mean of observations of X is E[X] and the variance is σ2 /n. These parameters uniquely determine the normal distribution that the sample mean converges to. A random variable Y with normal distribution with mean µ and variance σ2 can be expressed in terms of a standard normal random variable Z in the following way: Y  µ + σZ and you can look up the distribution of Z in a table of the standard normal distribution function that you get at the exam. The normal approximation consists of calculating percentiles of a random variable by assuming that it has a normal distribution. Let Φ ( x ) be the cumulative distribution function of the standard normal distribution. (The standard normal distribution has µ  0, σ  1. Φ is the symbol generally used for this distribution function.) Suppose we are given that X is a normal random variable with mean µ, variance σ2 ; we will write X ∼ n ( µ, σ2 ) to describe X. And suppose we want to calculate the 95th percentile of X; in other words, we want a number x such that Pr ( X ≤ x )  0.95. We would reason as follows: Pr ( X ≤ x )  0.95

x−µ X−µ Pr ≤  0.95 σ σ

!

x−µ Φ  0.95 σ x−µ  Φ−1 (0.95) σ x  µ + σΦ−1 (0.95)

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

3.2. NORMAL APPROXIMATION

53

Note that Φ−1 (0.95)  1.645 is a commonly used percentile of the normal distribution, and is listed at the bottom of the table you get at the exam. You should internalize the above reasoning so you don’t have to write it out each time. Namely, to calculate a percentile of a random variable being approximated normally, find the value of x such that Φ ( x ) is that percentile. Then scale x: multiply by the standard deviation, and then translate x: add the mean. This method will be used repeatedly throughout the course. Example 3B A big fire destroyed a building in which 100 of your insureds live. Each insured has a fire insurance policy. The losses on this policy follow a Pareto distribution with α  3, θ  2000. Even though all the insureds live in the same building, the losses are independent. You are now setting up a reserve for the cost of these losses. Using the normal approximation, calculate the size of the reserve you should put up if you want to have a 95% probability of having enough money in the reserve to pay all the claims. Answer: The mean of each loss is

2000 2

 1000 and the variance is

E X 2 − E[X]2 

f

g

2 (20002 ) − 10002  3,000,000 2

The mean of the sum is the sum of the means, or (100)(1000 √ )  100,000. The variance of the sum is the sum of the variances, or 100 (3,000,000)  3 × 108 . σ  3 × 108  17,320.51. For a standard normal distribution, the 95th percentile is 1.645. We scale this by 17,320.51 and translate it by 100,000: 100,000 + 17,320.51 (1.645)  128,492.24 .  The normal approximation is sometimes called the large sample estimate, since it is based on the Central Limit Theorem, which describes the behavior of the distribution of a sample as its size goes to infinity.

Continuity correction When a discrete distribution is approximated with the normal distribution, a continuity correction is required. If the discrete distribution can assume values a and b but cannot assume values in between a and b, and you want the probability of being strictly above a, you estimate it with the probability that the normal variable is greater than ( a + b ) /2. Use the complement of that if you want the probability of being less than or equal to a. The same goes for b; if you want the probability of being strictly less than b (notice that this is identical to the probability that the variable is less than or equal to a), use the probability that the normal variable is less than ( a + b ) /2. If you want the probability that the variable is greater than or equal to b (which is identical to the probability that the variable is strictly greater than a), use the probability that the normal random variable is greater than ( a + b ) /2. Example 3C The distribution of loss sizes is Size

Probability

1000 1500 2500

0.50 0.25 0.25

Calculate the probability that the average of 100 losses is less than 1550 using the normal approximation. Answer: The mean is 1500. The variance is the second moment minus the mean squared: σ2  0.5 (10002 ) + 0.25 (15002 + 25002 ) − 15002  375,000 C/4 Study Manual—17th edition Copyright ©2014 ASM

3. VARIANCE

54

The variance of the sample mean is the variance of the distribution divided by 100, or 3750. Losses are always multiples of 500. Adding up 100 of them and dividing by 100, the average is a multiple of 5. Therefore, to calculate the probability of the average being less than 1550, we calculate the probability that the normal variable is less than 1547.5, the midpoint between the possible values for the mean, 1545 and 1550. ! 1547.5 − 1500  0.7823 Pr ( X < 1547.5)  Φ √  3750

3.3

Bernoulli shortcut

A Bernoulli distribution is one where the random variable is either 0 or 1. It is 1 with probability q and 0 with probability 1 − q. Its mean is q, and its variance is q (1 − q ) . These are in your tables, but you’ll use these so often you shouldn’t have to look it up. However, any random variable which can only assume two values is a scaled and translated Bernoulli. If X is Bernoulli and Y can only assume the values a and b, with a having probability q, then Y  ( a − b ) X + b. This means that the variance of Y is ( a − b ) 2 Var ( X )  ( a − b ) 2 q (1 − q ) . Remember this! To repeat, for any random variable which assumes only two values, the variance is the squared difference between the two values, times the probabilities of the two values. This is faster than slogging through a calculation of E[X] and E[X 2 ]. OK, so quickly—if a random variable is equal to 20 with probability 0.7 and 120 with probability 0.3, what’s its variance? (Answer below1) This shortcut will be used repeatedly throughout the course. Example 3D For a one-year term life insurance policy of 1000: (i) (ii) (iii) (iv)

The premium is 30. The probability of death during the year is 0.02. The company has expenses of 2. If the insured survives to the end of the year, the company pays a dividend of 3.

Ignore interest. Calculate the variance in the amount of profit the company makes on this policy. Answer: There are only two possibilities—either the insured dies or he doesn’t—so we have a Bernoulli here. We can ignore premium and expenses, since these don’t vary, so they generate no variance. Either the company pays 1000 (probability 0.02) or it pays 3 (probability 0.98). The variance is therefore

(1000 − 3) 2 (0.02)(0.98)  19,482.5764 .



A random variable which can only assume one of two values is called a two point mixture. We will learn about mixtures in Section 4.1.

(Answer: (0.7)(0.3) 1002  2100)



1



C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 3

55

Exercises 3.1. [4B-S93:9] (1 point) If X and Y are independent random variables, which of the following statements are true? 1.

Var ( X + Y )  Var ( X ) + Var ( Y )

2.

Var ( X − Y )  Var ( X ) + Var ( Y )

Var ( aX + bY )  a 2 E[X 2 ] − a (E[X]) 2 + b 2 E[Y 2 ] − b (E[Y]) 2

3.

(A) 1

(B) 1,2

(C) 1,3

(D) 2,3

(E) 1,2,3

3.2. [4B-F95:28] (2 points) Two numbers are drawn independently from a uniform distribution on [0,1]. What is the variance of their product? (A) 1/144

(B) 3/144

(C) 4/144

(D) 7/144

(E) 9/144

3.3. [4B-F99:7] (2 points) A player in a game may select one of two fair, six-sided dice. Die A has faces marked with 1, 2, 3, 4, 5 and 6. Die B has faces marked with 1, 1, 1, 6, 6, and 6. If the player selects Die A, the payoff is equal to the result of one roll of Die A. If the player selects Die B, the payoff is equal to the mean of the results of n rolls of Die B. The player would like the variance of the payoff to be as small as possible. Determine the smallest value of n for which the player should select Die B. (A) 1

(B) 2

(C) 3

(D) 4

(E) 5

3.4. [151-82-92:4] A company sells group travel-accident life insurance with b payable in the event of a covered individual’s death in a travel accident. The gross premium for a group is set equal to the expected value plus the standard deviation of the group’s aggregate claims. The standard premium is based on the following assumptions: (i) All individual claims within the group are mutually independent; and (ii) b 2 q (1 − q )  2500, where q is the probability of death by travel accident for an individual.

In a certain group of 100 lives, the independence assumption fails because three specific individuals always travel together. If one dies in an accident, all three are assumed to die. Determine the difference between this group’s premium and the standard premium. (A) 0 3.5.

(B) 15

(C) 30

(D) 45

(E) 60

You are given the following information about the random variables X and Y:

(i) Var ( X )  9 (ii) Var ( Y )  4 (iii) Var (2X − Y )  22

Determine the correlation coefficient of X and Y.

(A) 0

(B) 0.25

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.50

(D) 0.75

(E) 1

Exercises continue on the next page . . .

3. VARIANCE

56

3.6. [151-82-93:9] (1 point) For a health insurance policy, trended claims will be equal to the product of the claims random variable X and a trend random variable Y. You are given: (i) (ii) (iii) (iv) (v)

E[X]  10 Var ( X )  100 E[Y]  1.20 Var ( Y )  0.01 X and Y are independent

Determine the variance of trended claims. (A) 144

(B) 145

(C) 146

(D) 147

(E) 148

3.7. X and Y are two independent exponentially distributed random variables. You are given that Var ( X )  25 and Var ( XY )  7500. Determine Var ( Y ) . (A) 25

(B) 50

(C) 100

(D) 200

(E) 300

Solutions 3.1. The first and second statements are true by formula (3.1). The third statement should have squares on the second a and second b, since Var ( aX )  E[ ( aX ) 2 ] − E[aX]2  a 2 E[X 2 ] − a 2 E[X]2

for example. (B)

and the second moment is 13 . So

1 2

3.2. The mean of the uniform distribution is

Var ( XY )  E X 2 Y 2 − E[X]2 E[Y]2

f

1  3 

g

!

1 1 − 3 4

!

!

1 4

!

1 7 1 −  9 16 144

(D)

3.3. The variance of Die A is 1* 7 . 1− 6 2



2



+ 2−

7 2

2



+ 3−

7 2

2



+ 4−

7 2

2



+ 5−

,

7 2

2



+ 6−

7 2

2

+/  35 12 -

Die B is Bernoulli, with only two possible values of 1 and 6 with probabilities 12 , so the variance of one toss is 52

1 2 2



25 4 .

The variance of the mean is the variance of one toss over n (equation (3.2)). So 25 35 < 4n 12 140n > 300 n > 300/140 > 2

The answer is 3 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 3

57

3.4. The number of fatal accidents for each life, N, has a Bernoulli distribution with mean q and variance q (1−q ) , so the variance in one life’s aggregate claims is the variance of bN. Var ( bN )  b 2 Var ( N )  b 2 q (1− q )  2500. For 100 independent lives, aggregate claims are 100bN, with variance 100 Var ( bN )  100 (2500) . For three lives always traveling together, aggregate claims are 3bN with variance 32 Var ( bN )  9 (2500) . If we add this to the variance of aggregate claims for the other 97 independent lives, the variance is 9 (2500) + 97 (2500)  106 (2500) . The expected value of aggregate claims, however, is no different from the expected value of the totally independent group’s aggregate claims. The difference in premiums is therefore

p

106 (2500) − 100 (2500)  14.7815

p

(B)

3.5. 22  Var (2X − Y )  4 (9) + 4 − 2 (2) Cov ( X, Y ) Cov ( X, Y )  4.5 4.5 ρ XY  √ √  0.75 9 4

(D)

3.6. E[XY]  (10)(1.20)  12 E ( XY ) 2  E[X 2 ]

f

g





E[Y 2 ]  102 + 100 1.202 + 0.01  290





Var ( XY )  290 − 122  146





(C)

3.7. For an exponential variable, the variance is the square of the mean. Let θ be the parameter for Y Var ( XY )  E[X 2 ] E[Y 2 ] − E[X]2 E[Y]2 7500  (25 + 25)(2θ 2 ) − 25θ 2  75θ 2

θ  10 Var ( Y )  θ 2  100

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

58

C/4 Study Manual—17th edition Copyright ©2014 ASM

3. VARIANCE

Lesson 4

Mixtures and Splices Reading: Loss Models Fourth Edition 5.2.4–5.2.6, and Loss Models Fourth Edition 18.2 or SN C-21-01 2.4 and SN C-24-05 Appendix B or Introduction to Credibility Theory 5.2

4.1 4.1.1

Mixtures Discrete mixtures

A (finite) mixture distribution is a random variable X whose distribution function can be expressed as a weighted average of n distribution functions of random variables X i , i  1, . . . , n. In other words, FX ( x ) 

n X

w i FX i ( x )

i1

with the weights w i ≥ 0 adding up to 1. Since the density function is the derivative of the distribution function, the density function is the same weighted average of the individual density functions: fX ( x ) 

n X

w i fXi ( x )

i1

If discrete variables are mixed, the probabilities of the mixture are the weighted averages of the component probabilities. For example, suppose X is a mixture of an exponential distribution with mean 100 and weight 60% and an exponential distribution with mean 200 and weight 40%. Then the probability that X ≤ 100 is Pr ( X ≤ 100)  0.6 (1 − e −100/100 ) + 0.4 (1 − e −100/200 )  0.6 (0.6321) + 0.4 (0.3935)  0.5367 A mixture is not the same as a sum of random variables! The distribution function for a sum of random variables—even when they are identically distributed—is usually difficult to calculate. It is important not to confuse the two. Let’s consider the situations where each would be appropriate. A sum of random variables is an appropriate model for a situation where several distinct events occur, and you are interested in the sum. Each event may have the same distribution, or may not. Examples of sums of random variables are: 1. The total of a random sample of n items is a sum of random variables. For a random sample, the items are independent and identically distributed. 2. Aggregate loss on a policy with multiple coverages is a sum of random variables, one for each coverage. If a homeowner’s policy has coverage for fire, windstorm, and theft, the aggregate loss for each of these three coverages could have its own distribution X i , and then the aggregate loss for the entire policy would be X1 + X2 + X3 . These distributions would be different, and may or may not be independent. C/4 Study Manual—17th edition Copyright ©2014 ASM

59

4. MIXTURES AND SPLICES

60

A mixture distribution is an appropriate model for a situation where a single event occurs. However, the single event may be of many different types, and the type is random. For example, let X be the cost of a dental claim. This is a single claim and X has a distribution function. However, this claim could be for preventative work (cleaning and scaling), basic services (fillings), or major services (crowns). Each type of work has a distribution X i . If 40% of the claims are for preventative work, 35% for basic services, and 25% for major services, then the distribution of X will be a weighted average of the distributions of the 3 X i ’s: FX ( x )  0.4FX1 ( x1 ) + 0.35FX2 ( x 2 ) + 0.25FX3 ( x3 ) . It is not true that X  0.4X1 + 0.35X2 + 0.25X3 . In fact, the type of work that occurred is random; it is not true that every claim is 40% preventative, 35% basic, and 25% major. If that were true, there would be less variance in claim size! Since a mixture is a single random variable, it can be used as a model even when there is no justification as given in the last paragraph, if it fits the data well. For calculating means, the mean of a mixture is the weighted average of the means of the components. Since the densities are weighted averages and the expected values are integrals of densities, the expected value of a mixture is the weighted average of the expected values of the components. This is true for any raw moment, not just the first moment. But this immediately implies that the variance of a mixture is not the weighted average of the variances of its components. You must compute the variance by computing the second moment and then subtracting the square of the mean. Example 4A Losses on an auto liability coverage follow a distribution that is a mixture of two Paretos. Each distribution in the mixture has equal weight. One distribution has parameters α  3 and θ  1000, and the other has parameters α  3 and θ  10,000. Calculate the variance of a loss. Answer: Let X be loss size. We have E[X]  0.5

10,000 1000 + 0.5  2750 2 2

!

!

E[X 2 ]  0.5 10002 + 0.5 10,0002  50,500,000









Var ( X )  E[X 2 ] − E[X]2  50,500,000 − 27502  42,937,500



Example 4B The severity distribution for losses on an auto collision coverage is as follows: 2000 F ( x )  1 − 0.7 2000 + x

!3

7500 − 0.3 7500 + x

!4

x≥0

Calculate the coefficient of variation of loss size. Answer: This distribution is a mixture of two Pareto distributions, the first with parameters α  3, θ  2000 and the second with α  4, θ  7500. We calculate the first two moments: 7500 2000 E[X]  0.7 + 0.3  1450 2 3

!

!

2 (2000) 2 2 (75002 ) E[X ]  0.7 + 0.3  8,425,000 (2)(1) (3)(2) 2

!

!

Var ( X )  E[X 2 ] − E[X]2  8,425,000 − 14502  6,322,500 It then follows that the coefficient of variation is √ 6,322,500 CV   1.7341 1450 C/4 Study Manual—17th edition Copyright ©2014 ASM



4.1. MIXTURES

61

Example 4C On an auto collision coverage, there are two classes of policyholders, A and B. 70% of drivers are in class A and 30% in class B. The means and variances of losses for the drivers are: Class

Mean

Variance

A B

300 800

30,000 50,000

A claim is submitted by a randomly selected driver. Calculate the variance of the size of the claim. Answer: This is a mixture situation—a single claim, with probabilities of being one type or another. Let X be claim size. E[X]  0.7 (300) + 0.3 (800)  450 E[X 2 ]  0.7 (30,000 + 3002 ) + 0.3 (50,000 + 8002 )  291,000 Var ( X )  291,000 − 4502  88,500

4.1.2



Continuous mixtures

So far we’ve discussed discrete mixtures. It is also possible for mixtures to be continuous. This means that the distribution function of the mixture is an integral of parametric distribution functions of random variables, and a parameter varies according to a distribution function. The latter distribution is called a mixing distribution. One example which we’ll discuss in detail in Lesson 12 is a random variable based on a Poisson distribution with parameter λ, where λ varies according to a gamma distribution with parameters α and θ. Here is another example. Example 4D The number of losses on a homeowner’s policy is binomially distributed with parameters m  5 and q. The parameter q varies by policyholder uniformly between 0 and 0.4. Calculate the probability of 2 or more losses for a policyholder. Answer: For a single policyholder, Pr ( X  0 | q )  (1 − q ) 5

Pr ( X  1 | q )  5q (1 − q ) 4 To calculate the probability for a randomly selected policyholder, we integrate over q using the uniform 1 density function, which here is 0.4 , as the weight. 1 Pr ( X  0)  0.4

0.4

Z 0

(1 − q ) 5 dq 0.4

1 (1 − q ) 6 0.4 6 0 1  (1 − 0.66 ) 2.4 5  12 (1 − 0.66 ) −

Pr ( X  1)  C/4 Study Manual—17th edition Copyright ©2014 ASM

5 0.4

0.4

Z 0

q (1 − q ) 4 dq

4. MIXTURES AND SPLICES

62

This is easier to integrate by substituting u  1 − q. 

5 0.4

1

Z

0.6

(1 − u ) u 4 du 1

u 5 u 6 − 5 6 0.6 ! ! 1 − 0.65 1 − 0.66  12.5 − 12.5 5 6

!

 12.5

 2.5 (1 − 0.65 ) −

Pr ( X  0) + Pr ( X  1)  2.5 (1 − 0.65 ) −

25 12 (1 20 12 (1

− 0.66 ) − 0.66 )

 2.3056 − 1.5889  0.7167

Therefore, the probability of 2 or more losses is the complement of 0.7167, which is 1 − 0.7167  0.2833 .

4.1.3

Frailty models

A special type of continuous mixture is a frailty model. These models can be used to model loss sizes or survival times. However, the following discusses frailty models only in the context of survival times. Suppose the hazard rate for each individual is h ( x | Λ)  Λa ( x ) , where a ( x ) is some continuous function and the multiplier Λ varies by individual. Thus the shape of the hazard rate function curve does not vary by individual. If you are given that A’s hazard rate is twice B’s at time 1, that implies Λ for A is twice Λ for B. That in turn implies that A’s hazard rate is twice B’s hazard R x rate at all times. Assume that h ( x )  0 for x < 0. Recall from page 4 that H ( x )  0 h ( t ) dt, and that the survival function can be expressed as S ( x )  e −H ( x ) Now let A ( x ) 

R

x 0

a ( t ) dt. Then H ( x | Λ) 

R

x 0

Λa ( t ) dt  ΛA ( x ) and

S ( x | Λ)  e −H ( x|Λ)  e −ΛA ( x )

(∗)

By definition, S ( x )  Pr ( X > x ) , so by the Law of Total Probability (page 9) S ( x )  Pr ( X > x ) 



Z 0

Pr ( X > x | λ ) f ( λ ) dλ  EΛ Pr ( X > x | Λ)  E S ( x | Λ)

f

g

Plugging in S ( x | Λ) from (∗), the unconditional or marginal survival rate S ( x ) is SX ( x )  EΛ S ( x | Λ)  EΛ e −ΛA ( x )  MΛ −A ( x )

f

g

f

g





f

g

(4.1)

where M ( x ) is the moment generating function. In a frailty model, typical choices for the conditional hazard rate given Λ are: • Constant hazard rate, or exponential. This can be arranged by setting a ( x )  1 (or a ( x )  k for any constant k). • Weibull, which can be arranged by setting a ( x )  γx γ−1 Typical choices for the distribution of Λ are gamma and inverse Gaussian, the only distributions (other than exponential, which is a special case of gamma) for which the distribution tables list the moment generating function. Frailty models rarely appear on exams. If they do appear, I would not expect it to be labeled as a “frailty model”, nor would I expect the specific notation (such as a ( x ) ) to be used. Instead, you would be given an appropriate hazard rate conditional on a parameter and a distribution for the parameter. C/4 Study Manual—17th edition Copyright ©2014 ASM

4.2. CONDITIONAL VARIANCE

63

Example 4E For a population following a frailty model, you are given (i) a ( x )  1 (ii) Λ has a gamma distribution with α  0.2 and θ  0.1. For a randomly selected individual from the population: 1. Calculate the probability of surviving to 70. 2. Calculate mean survival time and the variance of survival time. Answer:

1. Use of a ( x )  1 leads to an exponential model. We have x

Z A(x )  S ( x | Λ)  e

1dx  x

0 −ΛA ( x )

 e −Λx

The moment generating function for a gamma (which appears in the distribution tables) is M ( t )  (1 − √5 θt ) −α . In this example, MΛ ( t )  1/ 1 − 0.1t. Then S (70)  MΛ (−70) 1  √5 1 − 0.1 (−70) 1  √5  0.659754 8 2. By equation (4.1), the survival function is 1 S ( x )  M (−x )  1 + 0.1x

! 0.2

10  10 + x

! 0.2

which we recognize as a two-parameter Pareto with α  0.2, θ  10. For such a distribution, all the moments are infinite, so in particular the mean and variance are infinite.  If a gamma Λ is used in conjunction with a Weibull a ( x ) (instead of an exponential, as used in the previous example), then the model has a Burr (instead of a Pareto) distribution.

4.2

Conditional Variance

In Section 1.3, we discussed the conditional mean formula. A useful formula for conditional variance can be developed by calculating the second moment and subtracting the first moment squared using that formula: Var ( X )  E[X 2 ] − E[X]2

 E E[X 2 | I] − E E[X | I]

f

g

f

g2

 E E[X 2 | I] − E[X | I]2 + E E[X | I]2 − E E[X | I]

f

g

 f

 E Var ( X | I ) + Var E[X | I]

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g





g

f

g 2

4. MIXTURES AND SPLICES

64

We’ve derived the conditional variance formula. Conditional Variance Formula VarX ( X )  VarI EX [X | I] + EI VarX ( X | I )





f

g

(4.2)

This formula is also known as a double expectation formula. This is a very important equation which will be used repeatedly throughout the course. It is especially useful for calculating variance of mixtures. In a mixture, the condition I will indicate the component of the mixture. Let’s apply conditional variance to Example 4C. Example 4C (Repeated for convenience) On an auto collision coverage, there are two classes of policyholders, A and B. 70% of drivers are in class A and 30% in class B. The means and variances of losses for the drivers are: Class

Mean

Variance

A B

300 800

30,000 50,000

A claim is submitted by a randomly selected driver. Calculate the variance of the size of the claim. Answer: Let X be claim size. Let the indicator variable I be the class. It has a probability of 0.7 of being class A and 0.3 of being class B. I is Bernoulli, so we’ll apply the previous section’s shortcut. E Var ( X | I )  0.7 (30,000) + 0.3 (50,000)  36,000

f

g

Var E[X | I]  (0.7)(0.3)(800 − 300) 2  52,500





Var ( X )  36,000 + 52,500  88,500

Same answer as before.



Example 4F Claim sizes range between 0 and 1500. The probability that a claim is no greater than 500 is 0.8. Claim sizes are uniformly distributed on (0, 500] and on (500, 1500]. Calculate the coefficient of variation of claim sizes. Answer: Let X be claim size. We will condition on the claim size being greater than or less than 500. Let I be 0 if claim size is in (0, 500] and 1 if claim size is in (500, 1500]. The mean claim size given that it is no greater than 500 is 250, since the mean of a uniform is the midrange. Similarly, mean claim size given that it is greater than 500 is 1000. By the conditional mean formula f g E[X]  E E[X | I]  0.8 (250) + 0.2 (1000)  400

For a uniform distribution, the variance is the range squared divided by 12. So the variance of claim size given that it is no greater than 500 is 5002 /12 and the variance of claim size given that it is greater than 500 is 10002 /12. Therefore E[Var ( X | I ) ]  0.8 (5002 /12) + 0.2 (10002 /12)  33,333 31

The variance of the expected values can be calculated with the Bernoulli shortcut. Var (E[X | I])  (0.8)(0.2)(1000 − 250) 2  90,000 C/4 Study Manual—17th edition Copyright ©2014 ASM

4.2. CONDITIONAL VARIANCE We conclude that Var ( X )

q

65

33,333 13 + 90,000



123,333 31 , and the coefficient of variation is



123,333 13 400  0.8780 .

.

You could also calculate the variance as the second moment minus the first moment squared. The second moment given that claims are in (0, 500] is 5002 /3, and the second moment given that claims are in (500, 1500] is the variance of the uniform plus the mean squared, or 10002 /12 + 10002 . Therefore E[X 2 ]  0.8 (5002 /3) + 0.2 (10002 /12 + 10002 )  283,333 13 The variance is 283,333 13 − 4002  123,333 13 , the same as calculated above.



Let’s do an example where I is continuous.

Example 4G (Same situation as Example 4D.) The number of losses on a homeowner’s policy is binomially distributed with parameters m  5 and Q. Q varies by policyholder uniformly between 0 and 0.4. Calculate the variance of the number of losses for a randomly selected policyholder. Answer: The indicator variable here is Q. Using the Loss Models appendix tables, given Q, f the expected g number of losses is 5Q and the variance is 5Q (1 − Q ) . Then Var (5Q )  25 Var ( Q ) , and E 5Q (1 − Q )  5 E[Q] − 5 E Q 2 . For a random variable U having a uniform distribution on [0, 1], the mean is 12 , the 1 second moment is 31 , and the variance is 12 . Since Q  0.4U,

f

g

E[Q]  E[0.4U]  0.4

1 2

 0.2

E[Q 2 ]  E (0.4U ) 2  0.16

f

g

Var ( Q )  Var (0.4U ) 

1 3



0.16 4  12 300

16 300

So, letting N be the number of losses random variable, E Var ( N | Q )  5 E[Q] − 5 E[Q 2 ]  1 −

f

g

Var E[N | Q]  25





Var ( N ) 

?

1 4  300 3

80 11  300 15

!

11 1 16 +  15 3 15



Quiz 4-1 The random variable S is defined by S

50 X

Xi

i1

The variables X i follow a Poisson distribution with mean θ, and are independent given θ. The parameter θ is uniformly distributed on [1, 3]. Calculate the variance of S.

C/4 Study Manual—17th edition Copyright ©2014 ASM

4. MIXTURES AND SPLICES

66

4.3

Splices

Another way of creating distributions is by splicing them. This means using different probability distributions on different intervals in such a way that the total probability adds up to 1. For example, suppose larger loss sizes appear “Pareto”-ish, but smaller loss sizes are pretty uniform. You would like to build a model with a level density function for losses below 100, and then a declining density function starting at 100 which looks like a Pareto. Think a little bit about how you could do this. You are going to use a distribution with a constant density function below 100, and a distribution function which looks like a Pareto above 100. By “looking like a Pareto”, we mean it’ll be of the form f (x ) 

b 2 αθ α ( θ + x ) α+1

where b 2 is a constant that will make things work out right. You may have decided on what α and θ should be. Let’s say α  2 and θ  500. Then the spliced distribution you will use has the form

   b1 f (x )   b (2) 5002   2  (500 + x ) 3

x < 100 x > 100

How would you pick b 1 and b2 ? One thing is absolutely necessary: the total probability, the probability of being less than 100 plus the probability of being greater than 100, must add up to 1. For a spliced distribution, the sum of the probabilities of being in each splice must add up to 1. In our example, let’s say that one third of all losses are below 100. Then you would want F (100)  13 . You would set b 1  1/300 so that

R

100 0

b1 dx  31 . Without b 2 , the Pareto distribution would have

500 1 − F (100)  500 + 100

!2 

25 36

However, you want 1 − F (100)  1 − 31  32 . Therefore, you would scale the Pareto distribution down by 2/3 setting b 2  25/36  0.96. The spliced distribution has density function and distribution function

   1/300 f (x )   0.96 (2) 5002    (500 + x ) 3

x < 100 x > 100

 x/300    !2 F (x )   500    1 − 0.96 500 + x 

x < 100 x > 100

These are graphed in Figure 4.1. Notice that the density function is not continuous. Continuity of the density function is not a requirement of splicing. It may, however, be desirable. In the above example, suppose that we want the density function to be continuous. To allow this to happen, we will not specify the percentage of losses below 100, but will select it to make the density continuous. Then we would have to equate f (100) for the uniform distribution, which is b1 , to f (100) for the Pareto distribution, which is 2b 2 (5002 ) /6003 . The other condition on b1 and b2 was developed above: since 100b 1 is the probability of being below 100, we have ! 25 b2 100b 1  1 − 36 C/4 Study Manual—17th edition Copyright ©2014 ASM

4.3. SPLICES

67

0.004

1

0.0035

0.8

0.003 0.0025

0.6

0.002 0.4

0.0015 0.001

0.2

0.0005 0

0

25

50

75 100 125 150 175 200

0

(a) Density function

0

25

50

75 100 125 150 175 200

(b) Distribution function

Figure 4.1: Spliced distribution with 1/3 weight below 100

Substituting the density equality for b1 and solving, we get

2b 2 (5002 ) 25 100 1− b2 3 36 600

!

200 600

25 25 b2  1 − b2 36 36

!

25 36

!

!

!

!

4 b2  1 3

!

b2  1.08

It follows that b 1  2 (1.08)

5002  6003

 1/400 and F (100)  14 . The density and distribution functions are

1   x < 100    400 f (x )    1.08 (2) 5002    x > 100  (500 + x ) 3 x  x < 100    400  !  2 F (x )   500    1 − 1.08 x > 100 500 + x 

These are graphed in Figure 4.2. C/4 Study Manual—17th edition Copyright ©2014 ASM

4. MIXTURES AND SPLICES

68

0.004

1

0.0035

0.8

0.003 0.0025

0.6

0.002 0.4

0.0015 0.001

0.2

0.0005 0

0

25

50

0

75 100 125 150 175 200

(a) Density function

0

25

50

75 100 125 150 175 200

(b) Distribution function

Figure 4.2: Continuous spliced distribution

?

Quiz 4-2 The distribution of X is a spliced distribution. You are given: (i) Pr ( X ≤ 0)  0.

(ii) For 0 < x ≤ 100, the distribution of X is uniform.

(iii) F (100)  1/3.

(iv) For x > 100, the density function of X is f (x ) 

θ (θ + x )2

Determine θ. The textbook gives a formal definition of a spliced distribution as one whose density function is a weighted sum of density functions, with density function j having support (that is, nonzero) only on interval ( c j−1 , c j ) , with the intervals ( c j−1 , c j ) disjoint, and weights a j adding up to 1. Thus, in our first example with a uniform distribution below 100 and Pr ( X < 100)  31 , and a Pareto distribution above 100, the textbook would say that the interval (0, 100) had a uniform distribution on (0, 100) with density 1 f1 ( x )  100 and weight a1  13 , and a distribution defined by f2 ( x ) 

2 (5002 ) / (500 + x ) 3 25/36

x > 100

with weight a 2  32 on (100, ∞) . This means that every splice is a discrete mixture! It’s a mixture of functions defined on disjoint intervals. If the functions are familiar, you may be able to use the tables or your knowledge of the functions to evaluate moments. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 4

69

Example 4H X follows a spliced distribution with the following density and distribution functions: 1   x < 100    400  f (x )   1.08 (2) 5002    x > 100  (500 + x ) 3 x  x < 100    400  !2 F (x )    500    1 − 1.08 x > 100 500 + x  Calculate the mean of X. Answer: X can be considered a mixture of a uniform distribution on [0, 100] with weight 1/4 and a shifted Pareto distribution with weight 3/4. The shifted Pareto distribution is shifted by 100, so set y  x − 100.

Since S ( x )  1.08 500/ (500 + x ) divided by 3/4.



2

for x > 100, the conditional survival function is this survival function

4 500 S ( x | X > 100)  (1.08) 3 500 + x

!2

5  1.44 6

!2

600 600 + ( x − 100)

!2

600  600 + y

!2

which is a Pareto with θ  600, α  2. The mean of the shifted Pareto is 100 plus the mean of the unshifted Pareto, so   600 E[X]  0.25 (50) + 0.75 100 +  537.5  2−1 Notice that Example 4F is a splice of two uniform distributions.

Exercises Mixtures 4.1. The random variable X is a mixture of an exponential distribution with mean 5, with weight 23 , and an inverse exponential distribution with θ  5, with weight 13 . Let H ( x ) be the cumulative hazard function of X.

Calculate H (3) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

4. MIXTURES AND SPLICES

70

4.2. You are given the following information about a portfolio of insurance risks: •

There are three classes of risks: A, B, and C.



The number of risks in each class, and the mean and standard deviation of claim frequency for each class, are given in the following chart:

Class

Number of Risks

A B C

500 300 200

Claim Frequency Standard Mean Deviation 0.10 0.12 0.15

0.20 0.25 0.35

Determine the standard deviation of claim frequency for a risk randomly selected from the portfolio. (A) (B) (C) (D) (E)

Less than 0.240 At least 0.240, but less than 0.244 At least 0.244, but less than 0.248 At least 0.248, but less than 0.252 At least 0.252

Use the following information for questions 4.3 and 4.4: For a group of 1000 policyholders in three classes, you are given: Number of policyholders

Mean loss

Standard deviation of loss

500 300 200

10 20 30

12 30 60

4.3. The number of claims submitted by each policyholder is identically distributed for all policyholders. 1000 claims are submitted from this group. Using the normal approximation, calculate x such that there is a 95% probability that the sum of the claims is less than x. 4.4. Each policyholder submits one claim. Using the normal approximation, calculate x such that there is a 95% probability that the sum of the claims is less than x.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 4

71

4.5. You are given a portfolio of 100 risks in two classes, A and B, each having 50 risks. The losses of the risks in class A have a mean of 10 and a standard deviation of 5. For the entire portfolio, the mean loss is 20 and the standard deviation is 15. Calculate the standard deviation of losses for risks in class B. (A) (B) (C) (D) (E)

Less than 9 At least 9, but less than 13 At least 13, but less than 17 At least 17, but less than 21 At least 21

4.6. Losses for an insurance coverage follow a distribution which is a mixture of an exponential distribution with mean 10 with 75% weight and an exponential distribution with mean 100 with 25% weight. Calculate the probability that a loss is greater than 50. 4.7. Losses for an insurance coverage follow a distribution which is a mixture of an exponential distribution with mean 5 and an exponential distribution with mean θ. The mean loss size is 7.5. The variance of loss size is 75. Determine the coefficient of skewness of the loss distribution. 4.8.

For a liability coverage, you are given

(i) Losses for each insured follow an exponential distribution with mean γ. (ii) γ varies by insured. (iii) γ follows a single-parameter Pareto distribution with parameters α  1 and θ  1000. Calculate the probability that a loss will be less than 500. 4.9. [151-82-93:11] (2 points) A population is equally divided into two classes of drivers. The number of accidents per individual driver is Poisson for all drivers. For a driver selected at random from Class 1, the expected number of accidents is uniformly distributed over (0.2, 1.0) . For a driver selected at random from Class 2, the expected number of accidents is uniformly distributed over (0.4, 2.0) . For a driver selected at random from this population, determine the probability of zero accidents. (A) 0.41

(B) 0.42

(C) 0.43

(D) 0.44

(E) 0.45

Frailty models 4.10.

For a random variable X, you are given:

(i) h ( x | Λ)  2Λx (ii) Λ has an exponential distribution with mean 0.5. Calculate E X 2 | Λ  0.49 .

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

Exercises continue on the next page . . .

4. MIXTURES AND SPLICES

72

4.11. (i) (ii)

For a random variable X, you are given: h ( x | Λ)  Λ Λ follows an inverse Gaussian distribution with µ  2, θ  1.

Calculate F (0.5) . 4.12. (i) (ii)

Survival time X for a population follows a distribution having the following properties: h ( x | Λ)  Λx 2 Λ follows an exponential distribution with mean 0.05.

Calculate median survival time for the population. 4.13. Survival time X for a population of 100-year olds follows a Weibull distribution with the following hazard rate function: h ( x | Λ)  31 Λx −2/3 Λ varies over the population with a gamma distribution having parameters α  5, θ  0.5. Calculate the marginal expected future lifetime for the population. Conditional variance

4.14.

The size of a loss has mean 3λ and variance λ 2 . λ has the following density function: f ( x )  0.00125



4000 x

6

x ≥ 4000.

Calculate the variance of the loss. (A) (B) (C) (D) (E)

Less than 22,000,000 At least 22,000,000, but less than 27,000,000 At least 27,000,000, but less than 32,000,000 At least 32,000,000, but less than 37,000,000 At least 37,000,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 4

73

[4B-S95:14] (3 points) You are given the following:

4.15. •

For a given risk, the number of claims for a single exposure period will be 1, with probability 3/4; or 2, with probability 1/4.



If only one claim is incurred, the size of the claim will be 80, with probability 2/3; or 160, with probability 1/3.



If two claims are incurred, the size of each claim, independent of the other, will be 80, with probability 1/2; or 160 with probability 1/2. Determine the variance of the pure premium1 for this risk.

(A) (B) (C) (D) (E)

Less than 3600 At least 3600, but less than 4300 At least 4300, but less than 5000 At least 5000, but less than 5700 At least 5700 [4B-F98:8] (2 points) You are given the following:

4.16. •

A portfolio consists of 75 liability risks and 25 property risks.



The risks have identical claim count distributions.



Loss sizes for liability risks follow a Pareto distribution with parameters θ  300 and α  4.



Loss sizes for property risks follow a Pareto distribution with parameters θ  1,000 and α  3. Determine the variance of the claim size distribution for this portfolio for a single claim.

(A) (B) (C) (D) (E)

Less than 150,000 At least 150,000, but less than 225,000 At least 225,000, but less than 300,000 At least 300,000, but less than 375,000 At least 375,000

4.17. The number of claims on a policy has a Poisson distribution with mean P. P varies by policyholder. P is uniformly distributed on [1, 2]. Calculate the variance of the number of claims. (A) 3/2

(B) 19/12

(C) 5/3

(D) 7/4

(E) 23/12

4.18. Claim size is exponentially distributed with mean λ. λ varies by insured, and follows a Pareto distribution with parameters α  5 and θ. Variance of claim size is 9.75. Determine θ. (A) 4

(B) 5

(C) 6

(D) 7

(E) 8

1For the definition of pure premium, see page 2. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

4. MIXTURES AND SPLICES

74

[4B-F92:23] (2 points) You are given the following:

4.19. •

A portfolio of risks consists of 2 classes, A and B.



For an individual risk in either class, the number of claims follows a Poisson distribution.

Class A B Total Portfolio

Number of Exposures 500 500 1,000

Distribution of Claim Frequency Standard Mean Deviation 0.050 0.227 0.210 0.561

Determine the standard deviation of the claim frequency for the total portfolio. (A) (B) (C) (D) (E)

Less than 0.390 At least 0.390, but less than 0.410 At least 0.410, but less than 0.430 At least 0.430, but less than 0.450 At least 0.450

4.20. [1999 C3 Sample:10] An insurance company is negotiating to settle a liability claim. If a settlement is not reached, the claim will be decided in the courts 3 years from now. You are given: •

There is a 50% probability that the courts will require the insurance company to make a payment. The amount of the payment, if there is one, has a lognormal distribution with mean 10 and standard deviation 20.



In either case, if the claim is not settled now, the insurance company will have to pay 5 in legal expenses, which will be paid when the claim is decided, 3 years from now.



The most that the insurance company is willing to pay to settle the claim is the expected present value of the claim and legal expenses plus 0.02 times the variance of the present value.



Present values are calculated using i  0.04. Calculate the insurance company’s maximum settlement value for this claim.

(A) 8.89 4.21. (i) (ii) (iii) (iv) (v)

(B) 9.93

(C) 12.45

(D) 12.89

(E) 13.53

[151-83-94:6] (2 points) For number of claims N and aggregate claims S, you are given: Pr ( N  i )  13 , i  0, 1, 2; E[S | N  1]  3; E[S | N  2]  9; Var ( S | N  1)  9; and Var ( S | N  2)  18.

Determine Var ( S ) . (A) 19

(B) 21

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 23

(D) 25

(E) 27

Exercises continue on the next page . . .

EXERCISES FOR LESSON 4

75

Splices 4.22. Loss sizes follow a spliced distribution. The probability density function of the spliced distribution below 500 is the same as the probability density function of an exponential distribution with parameter θ  250. The probability density function of the spliced distribution above 500 is a multiple, a, of the probability density function of a Weibull distribution with parameters τ  2, θ  400. Determine a. 4.23. Loss sizes follow a spliced distribution. Losses below 200 are uniformly distributed over (0, 200]. The probability density function of the spliced distribution above 200 is a multiple of the probability density function of an exponential distribution with parameter θ  400. The probability density function is continuous at 200. Calculate the probability that a loss will be below 200. 4.24. Loss sizes follow a spliced distribution. The probability density function of this distribution below 200 is a multiple a of the probability density function of an exponential distribution with θ  300. The probability density function above 200 is the same as for an exponential distribution with θ  400. Let X be loss size. Calculate Pr ( X < 100) . 4.25. Loss sizes follow a spliced distribution. The probability density function of the spliced distribution below 100 is the same as that of a lognormal distribution with parameters µ  3, σ  2. The probability density function of the spliced distribution above 100 is a times the probability density function of a twoparameter Pareto distribution with parameters α  2, θ  300. Calculate the probability that a loss will be greater than 200. 4.26. Loss sizes follow a spliced distribution. The probability density function of this distribution below 500 is a multiple of the probability density function of an exponential with θ  250. The probability density function of the spliced distribution above 500 is a multiple of the probability density function for a single-parameter Pareto distribution with α  3, θ  500. Half of the losses are below 500. Calculate the expected value of a loss. 4.27.

The random variable X has the following spliced distribution: x   0 ≤ x ≤ 100   160 F (x )    1 − 0.375e −( x−100)/200 x > 100 

Calculate Var ( X ) . Additional released exam questions: SOA M-S05:34, SOA M-F05:35, CAS3-F06:18,19,20, SOA M-F06:39, C-S07:3

C/4 Study Manual—17th edition Copyright ©2014 ASM

4. MIXTURES AND SPLICES

76

Solutions 4.1. We first write down the survival function: S ( x )  32 ( e −x/5 ) + 31 (1 − e −5/x ) and then calculate its logarithm at 3: H (3)  − ln S (3)  − ln



2 −3/5 ) 3 (e

+ 31 (1 − e −5/3 )



 − ln 0.6362  0.4522 4.2. This is a mixture distribution. The first moment is

E[X]  0.5 (0.10) + 0.3 (0.12) + 0.2 (0.15)  0.116 The second moment is E[X 2 ]  0.5 (0.102 + 0.202 ) + 0.3 (0.122 + 0.252 ) + 0.2 (0.152 + 0.352 )  0.07707 √ The variance is 0.07707 − 0.1162  0.063614. The standard deviation is 0.063614  0.252218 . (E)

4.3. Since there are 1000 random claims “from the group”, this is a sum of 1000 mixture random variables. The random variable for a single claim is a mixture with mean 0.5 (10) + 0.3 (20) + 0.2 (30)  17 and second moment

0.5 (244) + 0.3 (1300) + 0.2 (4500)  1412 172

so the variance is 1412 −  1123. The mean and variance are multiplied by 1000 for 1000 claims, and the normal approximation requires x  1000 (17) + 1.645 1000 (1123)  18,743.23

p

4.4. Now we have a sum, not a mixture. The mean of the 1000 claims is the same as before, although technically it’s calculated as E[X]  500 (10) + 300 (20) + 200 (30)  17,000 The variance is the sum of the variances, or Var ( X )  500 (122 ) + 300 (302 ) + 200 (602 )  1,062,000 The normal approximation requires x  17,000 + 1.645 1,062,000  18,695.23

p

The variance is lower than in the previous exercise (where it is 1123 per claim, or 1,123,000 for 1000 claims) because the uncertainty on who is submitting the claims has been removed.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 4

77

4.5. Let µ B be the mean loss for class B, and let σB2 be the variance of loss for class B. Since the classes are equal in size, the raw moments of loss for the entire portfolio are equally weighted raw moments of the losses for each class. The first moment of losses for the entire portfolio is 20  21 (10 + µ B ) , so µ B  30. The second moment of losses for the entire portfolio is 1 2 1 625  2

202 + 152 



102 + 52 + 302 + σB2



1025 + σB2





from which it follows that σB  15 . (C) 4.6. The probability that a loss is greater than 50 is the weighted average of the probability that each component of the mixture is greater than 50. Let X be a loss. If X1 is exponential with mean 10 and X2 is exponential with mean 100, then Pr ( X1 > 50)  e −50/10  0.006738 Pr ( X2 > 50)  e −50/100  0.606531 Pr ( X > 50)  0.75 (0.006738) + 0.25 (0.606531)  0.1567 4.7. Let w be the weight of the exponential with mean θ. The second moment is 75 + 7.52  131.25. From equating the first moments, 5 (1 − w ) + θw  7.5

5 − 5w + θw  7.5

w ( θ − 5)  2.5

(*)

From equating the second moments, 50 (1 − w ) + 2wθ 2  131.25 −50w + 2wθ 2  81.25 2w ( θ 2 − 25)  81.25

Dividing the first moment equation into the second moment equation eliminates w: 81.25 2.5 81.25 θ − 5  11.25 2.5 (2)

2 ( θ + 5) 

Plugging into (*), w (6.25)  2.5 2.5 w  0.4 6.25 To calculate skewness, we only need E[X 3 ], since we already know the first and second moments and the variance. For an exponential, E[X 3 ]  6θ 3 , so E[X 3 ]  6 0.6 (53 ) + 0.4 (11.253 )  3867.1875



C/4 Study Manual—17th edition Copyright ©2014 ASM



4. MIXTURES AND SPLICES

78

E[X 3 ] − 3 E[X 2 ]µ + 2µ3 σ3 3867.1875 − 3 (131.25)(7.5) + 2 (7.53 )  753/2 3867.1875 − 2953.125 + 843.75  2.70633  753/2

γ1 

4.8. The conditional probability of a loss less than 500 given γ is F (500 | γ )  1 − e −500/γ . We integrate this over γ, using the density function of a single-parameter Pareto. Pr ( X < 500) 



Z



1000

1 − e −500/γ

 1000 γ2



We evaluate the integral. 1000 ∞ − 2e −500/γ 1000 γ 1000  −2 + + 2e −500/1000 1000  −1 + 2e −1/2  0.2131

Pr ( X < 500)  −

4.9. The probability of 0 accidents is e −λ . We integrate this over the uniform distribution for each class: 1 Class 1: 0.8

Z

1

1 Class 2: 1.6

Z

0.2 2 0.4

e −λ dλ 

e −0.2 − e −1  0.5636 0.8

e −λ dλ 

e −0.4 − e −2  0.3344 1.6

The average is 21 (0.5636 + 0.3344)  0.4490 . (E)

2

2

4.10. Since a ( x )  2x here, A ( x )  x 2 and S ( x | Λ  0.49)  e −0.49x  e − (0.7x ) , so this is a Weibull with θ  1/0.7 and τ  2. According to the tables, the second moment is 1  2.0408 0.72

θ 2 Γ (1 + 2/τ ) 

4.11.

Since a ( x )  1 here, A(x )  x S (0.5)  MΛ (−0.5)

*θ  exp .. *1 − µ , ,

r

*1  exp . *1 − 2

r

1−

+ 2µ2 (−0.5) +// θ --

1−

2 ( 22 ) (−0.5) ++/ 1

-, , √  −0.61803  0.53900  exp 0.5 (1 − 5)  e The cumulative distribution function is F (0.5)  1 − S (0.5)  0.46100 . C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 4

4.12.

79

H ( x | Λ)  Λx 3 /3, so S ( x | Λ)  e −Λx

3 /3

S ( x )  EΛ [e −Λx

3 /3

−x 3

 MΛ

]

!

3 1 1   1 + 0.05 ( x 3 /3) 1 + x 3 /60 √3 The unconditional distribution of X is Burr with α  1, γ  3, θ  60. Later on (Lesson 8), we’ll learn that the tables list percentiles for the distributions under VaRp (this is the 100p th percentile), so we can look up VaR0.5 ( X ) . Otherwise, calculate the median m from first principles: S (m ) 1 1 + m 3 /60 m3 60 m

4.13.

 0.5  0.5 1 √3  60  3.9149

This is a frailty model with a ( x )  31 x −2/3 , or a Weibull model, and x

Z A(x ) 

0

1 −2/3 dt 3t

 x 1/3

By equation (4.1), the marginal survival function is S ( x )  EΛ S ( x | Λ)  EΛ e −Λx

f

g

 MΛ −x 1/3



 1 + θx 1/3



1/3

g



 −α

 1 + 0.5x 1/3



f

 −5

1  1 + (0.125x ) 1/3

!5

This is the survival function for a Burr distribution with θ  8, α  5, γ  1/3. Based on the distribution tables, the expected value is θΓ (1 + 1/γ ) Γ ( α − 1/γ ) Γ(α) 8Γ (1 + 3) Γ (5 − 3)  Γ (5) 8 (3!1!)   2 4!

E[X] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

4. MIXTURES AND SPLICES

80

4.14.

We recognize the distribution as a single-parameter Pareto with θ  4000 and α  5. The mean, 2

) )  5000 and the second moment E[λ 2 ]  (5)(4000  80,000,000 . Therefore, the variance of the E[λ]  (5)(4000 4 3 3 80,000,000 5,000,000 2 distribution, Var ( λ )  − 5000  . We will need this information for the formula: 3 3

Var ( X )  E Var ( X | λ ) + Var E[X | λ]

f

g





 E[λ 2 ] + Var (3λ ) 80,000,000 5,000,000  + (9)  41,666,667 3 3

4.15.

(E)

Let PP be the pure premium. Var ( PP )  Var E[PP | N] + E Var ( PP | N )





f

g

We will use the Bernoulli shortcut discussed in Section 3.3 on page 54. E[PP | N] is a random variable. If N  1, it assumes the value of 23 (80) + 13 (160)  320 3 . When N  2, each claim has expected value 12 (80) + 21 (160)  120, so PP, which is the sum of 2 claims, has expected value 2 (120)  240. The variance of this random variable is the product of the probabilities of N  1 and N  2 times the square of the difference between 320 3 and 240: 3 Var E[PP | N]  4





1 4

!

!

320 − 240 3

2

 3333 31

Var ( PP | N ) is a random variable. When N  1, the conditional variance is the product of the probabilities of 80 and 160 times the square of the difference between 160 and 80: 2 Var ( PP | N  1)  3

12,800 1 (160 − 80) 2  3 9

!

!

When N  2, the conditional variance of each loss is the product of the probabilities of 80 and 160 times the square of the difference between 160 and 80. Since there are two losses, this is multiplied by 2. 1 Var ( PP | N  2)  2 2

!

1 (160 − 80) 2  3200 2

!

The expected value of the Var ( PP | N ) is then E Var ( PP | N ) 

f

Putting it all together

4.16.

g

3 12,800 1 + (3200)  1866 23 4 9 4

!

Var ( PP )  3333 13 + 1866 23  5200

(D)

Let I be the indicator of whether the risk is liability or property. Var ( X )  Var E[X | I] + E Var ( X | I )





f

300  100 4−1 2 · 3002 Var ( X | liability)  − 1002  20,000 3·2 E[X | liability] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

EXERCISE SOLUTIONS FOR LESSON 4

81

1,000  500 3−1 2 · 1,0002 Var ( X | property)  − 5002  750,000 2·1 E[X | property] 

By the Bernoulli shortcut (Section 3.3, page 54), since the two expected values are 400 apart and have probabilities 34 and 41 ,     Var E[X | I]  34 14 (500 − 100) 2  30,000 Also,

f

So the answer is

g

+

1 4 (750,000)

 202,500

Var ( X )  30,000 + 202,500  232,500

Var ( N )  Var E[N | P] + E Var ( N | P )  Var ( P ) + E[P] 



4.17.

3 4 (20,000)

E Var ( X | I ) 



f

g

1 12

(C) +

3 2



19 12

. (B)

4.18. 9.75  E Var ( X | λ ) + Var E[X | λ]

f

g





 E λ 2 + Var ( λ )

f

g

2θ 2 13 2 2θ 2 θ 2 + −  θ 12 12 16 48

!



θ 2  36 θ 6 4.19.

(C)

Let I  the random variable indicating the class. Var ( N )  Var E[N | I] + E Var ( N | I )





f

g

E[N | I] is a random variable which assumes the value 0.050 half the time and 0.210 half the time. The probabilities of 0.050 and 0.210 are each 0.50, and the difference of the two values is 0.210 − 0.050  0.160, so by the Bernoulli shortcut, the variance of E[N | I] is (0.50)(0.50)(0.1602 )  0.0064. Similarly, Var ( N | I ) is a random variable which assumes two values, 0.2272 and 0.5612 , each one half the time. The expected value of this random variable is 21 (0.2272 + 0.5612 )  0.183125. Putting it all together Var ( N )  0.0064 + 0.183125  0.189525 √ σN  0.189525  0.4353 (D) This exercise can also be worked out by calculating first and second moments. E[N]  0.5 (0.05) + 0.5 (0.21)  0.13 E[N 2 ]  0.5 (0.052 + 0.2272 ) + 0.5 (0.212 + 0.5612 )  0.206425 Var ( N )  0.206425 − 0.132  0.189525 √ (D) σN  0.189525  0.4353

C/4 Study Manual—17th edition Copyright ©2014 ASM

4. MIXTURES AND SPLICES

82

4.20. The expected present value of the claim is 0.5 (10/1.043 ) , and the present value of legal fees is 5/1.043 , for a total of 10/1.043  8.89. We will compute the variance using the conditional variance formula. The legal expenses are not random and have no variance, so we’ll ignore them. Let I be the indicator variable for whether a payment is required, and X the settlement value. Var ( X )  Var E[X | I] + E Var ( X | I )





f

g

The expected value of the claim is 0 with probability 50% and 10/1.043 with probability 50%. Thus the expected value can only have one of two values. It is a Bernoulli random variable. The Bernoulli shortcut says that its variance is !2   10  19.7579 Var E[X | I]  (0.5)(0.5) 1.043 The variance of the claim is 0 with probability 50% and (20/1.043 ) 2 with probability 50%. The expected value of the variance is therefore

!2

20 +  158.0629 E Var ( X | I )  (0.5) *0 + 1.043

f

g

-

,

Therefore, Var ( X )  19.7579 + 158.0629  177.8208. The answer is 8.89 + 0.02 (177.8208)  12.4464

(C)

4.21. Naturally, if N  0, both mean and variance are zero. Using the conditional variance formula, 0 + 9 + 18 9 3 0+3+9  4 3 0 + 9 + 81  30  3  30 − 42  14

E Var ( S | N ) 

f

g

E E[S | N]

f

g

E E[S | N]2

g

Var E[S | N]



f



Var ( S )  9 + 14  23

(C)

4.22. The probability of a loss below 500 is 1 − e −500/250  1 − e −2 . Therefore, the probability of a loss above 500 is e −2 . Equating this to the Weibull at 500: e −2  ae − (500/400)

2

a  e 25/16−2  e −7/16  0.645649 4.23. Let p be the probability, and a the multiple of the exponential distribution. Then F (200)  p, and that plus Pr ( X > 200) must equal 1, so p + ae −200/400  1 Since the density of the uniform distribution is

p 200

and equals the exponential at 200,

p ae −200/400  200 400 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 4

83

0.004

1

0.0035

0.8

0.003 0.0025

0.6

0.002 0.4

0.0015 0.001

0.2

0.0005 0

0

0

50 100 150 200 250 300 350 400 (a) Density function

0

50 100 150 200 250 300 350 400 (b) Distribution function

Figure 4.3: Spliced distribution in exercise 4.24.

a  2pe 1/2 Substituting for a in the other expression: p + 2p  1 p

1 3

Since S (200)  e −200/400  e −1/2 , it follows that a 1 − e −200/300  1 − e −1/2 . But then



4.24.

a and



1 − e −1/2 0.393469   0.808638 1 − e −2/3 0.486583

Pr ( X < 100)  0.808638 1 − e −100/300  0.229224





The density and distribution functions are shown in Figure 4.3. Note that this is a discontinuous density function. Splicing two exponentials does not produce a single exponential. 4.25.

We need to calculate a. First we calculate F (100) . ln 100 − 3  Φ (0.80)  0.7881 Φ 2

!

For the given Pareto, S (100) 

3 2 4

 0.5625. Therefore, a must be (1 − 0.7881) /0.5625  0.3767. Then

300 Pr ( X > 200)  0.3767 300 + 200

!2

 (0.3767)(0.36)  0.1356

4.26. For the exponential, F (500)  1 − e −500/250  1 − e −2 , and we are given that F (500)  0.5, so the 0.5 constant for the exponential density is 1−e −2 . The single-parameter Pareto has its entire support over 500— in fact, S (500)  1 for that distribution—so the constant for the Pareto to make S (500)  0.5 is 0.5. Let’s treat this as an equally weighted mixture of a truncated exponential divided by 1 − e −2 and a single-parameter Pareto. The mean of the single-parameter Pareto is αθ/ ( α − 1)  750. The mean of the C/4 Study Manual—17th edition Copyright ©2014 ASM

4. MIXTURES AND SPLICES

84

exponential can be calculated by integration. 500

Z 0

500

xe −x/250 dx  −xe −x/250 250 0  −500e

−2

+

500

Z 0

e −x/250 dx

+ 250 1 − e −2  250 − 750e −2





and dividing by 1 − e −2 , we get (250 − 750e −2 ) / (1 − e −2 )  171.7412. Thus the expected value of a loss is 0.5 (750 + 171.7412)  460.8706 . 4.27. X is a mixture with 5/8 weight on a uniform on [0, 100] and a 3/8 weight on a shifted exponential. The uniform has mean 50 and variance 1002 /12 and the exponential has θ  200 and shift 100, so its mean is 300 and its variance is 2002 . By conditional variance and Bernoulli shortcut, Var ( X )  E[1002 /12, 2002 ] + Var (50, 300) 5  8

!

1002 3 5 + (2002 ) + 12 8 8

!

!

!

3 (300 − 50) 2  30,169.27 8

!

Quiz Solutions 4-1. The X i ’s are independent given θ but are not independent, so we can’t just add up the individual variances. Instead, we’ll use conditional variance, with θ being the condition. E[S | θ]  50 E[X i | θ]  50θ

Var ( S | θ )  50 Var ( X i | θ )  50θ Using conditional variance, Var ( S )  E[Var ( S | θ ) ] + Var (E[S | θ])  E θ (50θ ) + Varθ (50θ )

 50 E[θ] + 2500 Var ( θ )  50 (2) + 2500 (22 /12)  933 31 4-2. Equate 1 − F (100) with the integral of f ( x ) for x > 100. ∞

θ dx 2 100 ( θ + x ) ∞ θ θ 2 −  3 θ + x 100 θ + 100 3θ  200 + 2θ

1 − F (100) 

Z

θ  200

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 5

Policy Limits Reading: Loss Models Fourth Edition 3.1, 8.1, 8.4 Many insurance coverages limit the amount paid per loss. A policy limit is the maximum amount that insurance will pay for a single loss. To model insurance payments, define the limited loss variable X ∧ u in terms of the loss X as follows:

 X X∧u u

X θ; otherwise, E[X ∧ u]  u, since Pr ( X ≥ u )  1. The formula for the limited expected value for a single-parameter Pareto with α  1 and d > θ is missing from the tables. It is easily derived: d

Z E[X ∧ d] 

0 θ

Z 

0

S ( x ) dx 1 dx +

d

Z θ

d

!

θ dx x

 θ + θ (ln x )  θ 1 + ln θ

d θ

!

For exponentials and two-parameter Paretos, the formulas aren’t so good for moments higher than the first since they require evaluating incomplete gammas or betas. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 5

87

Example 5B Loss amounts follow a Pareto distribution with α  0.25, θ  100. An insurance coverage on the losses has a policy limit of 60. Calculate the expected insurance payment per loss. Answer: The tables say θ θ * .1 − E[X ∧ x]  α−1 x+θ

+/ -

,

so

100 100 * .1 − E[X ∧ 60]  −0.75 160

,

?

! α−1

! −0.75

+/  56.35 -



Quiz 5-1 Loss amounts follow an exponential distribution with θ  100. An insurance coverage on the losses has a policy limit of 80. Calculate the expected insurance payment per loss.

Note on inflation Many of the exercises combine policy limits with inflation. Suppose X is the original variable and Y is the inflated variable: Y  (1 + r ) X. Then (1 + r ) can be factored out of E[Y ∧ u] as follows: u E[Y ∧ u]  E (1 + r ) X ∧ u  (1 + r ) E X ∧ 1+r

f

g





(5.7)

If X is from a scale distribution, you may instead modify the distribution (usually by multiplying the scale factor by 1 + r) and then work with the modified distribution.

Exercises [4B-F97:8 and CAS3-F04:26] (2 points) You are given the following:

5.1. •

A sample of 2,000 claims contains 1,700 that are no greater than $6,000, 30 that are greater than $6,000 but no greater than $7,000, and 270 that are greater than $7,000.



The total amount of the 30 claims that are greater than $6,000 but no greater than $7,000 is $200,000.



The empirical limited expected value function for this sample evaluated at $6,000 is $1,810. Determine the empirical limited expected value function for this sample evaluated at $7,000.

(A) (B) (C) (D) (E)

Less than $1,910 At least $1,910, but less than $1,930 At least $1,930, but less than $1,950 At least $1,950, but less than $1,970 At least $1,970

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

5. POLICY LIMITS

88

Table 5.1: (Limited) Expected Value Formulas

All formulas assume Pr ( X < 0)  0. ∞

Z E[X] 

0

Z E[X ∧ u] 

0

Z  E[X k ]  k

E[ ( X ∧ u ) ]  

0

Z 0

Z

u

u

∞ u

Z0 u 0

S ( x ) dx

(5.2)

x f ( x ) dx + u 1 − F ( u )



(5.4)



S ( x ) dx

(5.6)

kx k−1 S ( x ) dx

(5.1)

x k f ( x ) dx + u k 1 − F ( u )



kx k−1 S ( x ) dx



(5.3) (5.5)

If Y  (1 + r ) X, then



E[Y ∧ u]  (1 + r ) E X ∧

u 1+r



(5.7)

5.2. [4B-S91:27] (3 points) The Pareto distribution with parameters θ  12,500 and α  2 appears to be a good fit to 1985 policy year liability claims. What is the estimated claim severity for a policy issued in 1992 with a 200,000 limit of liability? Assume that inflation has been a constant 10% per year. (A) (B) (C) (D) (E)

Less than 22,000 At least 22,000, but less than 23,000 At least 23,000, but less than 24,000 At least 24,000, but less than 25,000 At least 25,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 5

89

5.3. [4B-F93:5] (3 points) You are given the following: •

The underlying distribution for 1993 losses is given by f ( x )  e −x ,

x > 0,

where losses are expressed in millions of dollars. •

Inflation of 5% impacts all claims uniformly from 1993 to 1994.



Under a basic limits policy, individual losses are capped at $1.0 million in each year. What is the inflation rate from 1993 to 1994 on the capped losses?

(A) (B) (C) (D) (E)

Less than 1.5% At least 1.5%, but less than 2.5% At least 2.5%, but less than 3.5% At least 3.5%, but less than 4.5% At least 4.5%

5.4. Losses in 2008 follow a two parameter Pareto distribution with parameters α  1 and θ  1000. Insurance pays losses up to a maximum of 100,000. Annual inflation of 5% increases loss sizes uniformly in 2009 and 2010. Determine the ratio of the average payment per loss in 2010 to the average payment per loss in 2008. 5.5. [4B-F94:8] (3 points) You are given the following: •

In 1993, an insurance company’s underlying loss distribution for an individual claim amount is lognormal with parameters µ  10.0 and σ 2  5.0.



From 1993 to 1994, an inflation rate of 10% impacts all claims uniformly.



In 1994, the insurance company purchases excess-of-loss reinsurance that caps the insurer’s loss at 2,000,000 for any individual claim.

Determine the insurer’s 1994 expected net claim amount for a single claim after application of the 2,000,000 reinsurance cap. (A) (B) (C) (D) (E)

Less than 150,000 At least 150,000, but less than 175,000 At least 175,000, but less than 200,000 At least 200,000, but less than 225,000 At least 225,000

5.6. An insurance company’s underlying loss distribution for individual claim amounts is singleparameter Pareto with α  2.5, θ  1000. Calculate the expected payment per loss for an insurance coverage with a policy limit of 5000. 5.7. An insurance company’s underlying loss distribution for individual claim amounts is singleparameter Pareto with α  1, θ  1000. Calculate the expected payment per loss for an insurance coverage with a policy limit of 3000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

5. POLICY LIMITS

90

5.8. [3-F01:28] The unlimited severity distribution for claim amounts under an auto liability insurance policy is given by the cumulative distribution: F ( x )  1 − 0.8e −0.02x − 0.2e −0.001x ,

x≥0

The insurance policy pays amounts up to a limit of 1000 per claim. Calculate the expected payment under this policy for one claim. (A) 57

(B) 108

(C) 166

(D) 205

(E) 240

5.9. The claim size distribution for an insurance coverage is modeled as a mixture of a two-parameter Pareto with parameters α  2, θ  1000 with weight 12 and a two-parameter Pareto with parameters α  1, θ  2000 with weight 21 . Calculate the limited expected value at 3000 of the claim sizes.

5.10. For an insurance coverage, claim sizes follow a distribution which is a mixture of a uniform distribution on [0, 10] with weight 0.5 and a uniform distribution on [5, 13] with weight 0.5. The limited expected value at a of claim sizes is 6.11875. Determine a. [4B-S93:12] (3 points) You are given the following:

5.11. (i)

The underlying distribution for 1992 losses is given by f ( x )  e −x , x > 0, where losses are expressed in millions of dollars. (ii) Inflation of 10% impacts all claims uniformly from 1992 to 1993. (iii) The policy limit is 1.0 (million). Determine the inflation rate from 1992 to 1993 on payments on the losses. (A) (B) (C) (D) (E) 5.12. •

Less than 2% At least 2%, but less than 3% At least 3%, but less than 4% At least 4%, but less than 5% At least 5% For an insurance coverage, you are given

Before inflation, E[X ∧ x]  500 −

1,000,000 x

for x > 2000.



Claims are subject to a policy limit 10,000.



Inflation of 25% uniformly affects all losses. Calculate the ratio of expected payment size after inflation to expected payment size before inflation.

5.13.

Losses follow a uniform distribution on (0, 5000]. An insurance coverage has a policy limit of 3000.

Calculate the variance of the payment per loss on the coverage.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 5

91

5.14. [3-F00:31] For an industry-wide study of patients admitted to hospitals for treatment of cardiovascular illness in 1998, you are given: (i)

(ii)

Duration In Days

Number of Patients Remaining Hospitalized

0 5 10 15 20 25 30 35 40

4,386,000 1,461,554 486,739 161,805 53,488 17,384 5,349 1,337 0

Discharges from the hospital are uniformly distributed between durations shown in the table.

Calculate the mean residual time remaining hospitalized, in days, for a patient who has been hospitalized for 21 days. (A) 4.4

(B) 4.9

(C) 5.3

(D) 5.8

(E) 6.3

Additional released exam questions: CAS3-F05:21

Solutions 5.1. Let X be the empirical loss variable. How do E[X ∧ 6000] and E[X ∧ 7000] differ?

1.

For claims below 6000, the full amount of the claim enters both, so they don’t differ.

2.

For claims above 7000, E[X ∧ 6000] includes 6000 and E[X ∧ 7000] includes 7000.

3.

For claims between 6000 and 7000, E[X ∧ 6000] includes 6000, whereas E[X ∧ 7000] includes the full amount of the claim.

Since the limited-at-6000 average of 2000 claims, E[X ∧ 6000], is 1810, the sum of the claims is 2000 (1810) . To this, we add the differences in categories 2 and 3. The difference in category 2 is 1000 times the number of claims in this category, or 1000 (270)  270,000. The difference in category 3 is that E[X ∧ 6000] includes 30 (6000)  180,000 for these 30 claims, whereas E[X ∧ 7000] includes the full amount of 200,000. The difference is 200,000 − 180,000  20,000. The answer is therefore E[X ∧ 7000] 

2000 (1810) + 270,000 + 20,000  1955 2000

(D)

5.2. Let X be the original variable, and let Z  1.17 X be the inflated variable. For Z, θ  12,500 (1.1) 7  24,358.96.

! 24,358.96 + * . /  21,714.28 E[Z ∧ 200,000]  (24,358.96) 1 − 224,358.96 , -

C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

5. POLICY LIMITS

92

5.3. E[X1993 ∧ 1]  1 1 − e −1  0.6321



In 1994, θ  1.05



E[X1994 ∧ 1]  1.05 1 − e −1/1.05  0.6449





E[X1994 ∧ 1] − 1  0.0202 E[X1993 ∧ 1]

(B)

5.4. Let X be the original loss variable. Let Y  1.052 X, be the inflated loss random variable. The parameters of Y are α  1 and θ  1000 (1.052 )  1102.50. 1000 Average payment per loss for X is E[X ∧ 100,000]  −1000 ln  4615.12 101,000

!

1102.50 Average payment per loss for Y is E[Y ∧ 100,000]  −1102.50 ln  4981.71 101,102.50

!

The ratio is 4981.71/4615.12  1.07943 . 5.5. To scale a lognormal random variable by r, add ln r to µ and do not change σ. (See Table 2.3.) So √ in this exercise, parameters of the lognormal after inflation are µ  10 + ln 1.1  10.0953 and σ  5. Let X be the inflated variable. We are being asked to calculate E[X ∧ 2,000,000]. Using the tables,

  ln 2,000,000 − µ − σ2 σ2 E[X ∧ 2,000,000]  exp µ + Φ + 2,000,000 1 − F (2,000,000) 2 σ !

!

Let’s compute the three components of this formula. exp µ +

σ2 5  exp 10 + ln 1.1 +  295,171 2 2

!





ln 2,000,000 − 10 − ln 1.1 − 5  Φ (−0.26)  0.3974 Φ √ 5 ! ln 2,000,000 − 10 − ln 1.1 F (2,000,000)  Φ  Φ (1.97)  0.9756 √ 5

!

The answer is E[X ∧ 2,000,000]  295,171 (0.3974) + 2,000,000 (1 − 0.9756)  166,101 5.6. E[X ∧ 5000] 

(B)

2.5 (1000) 10002.5 −  1607.04 1.5 (1.5) 50001.5

5.7. The formula in the tables cannot be used for α  1, so we integrate the survival function. The survival function must be integrated from 0, even though the support of the random variable starts at 1000. E[X ∧ 3000] 

1000

Z 0

Z 

0

1000

S ( x ) dx + 1 dx +

Z

Z

3000

1000 3000

1000

S ( x ) dx

1000 dx x

 1000 + 1000 (ln 3000 − ln 1000)  2098.61

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 5

93

5.8. Limited expected value is additive, so the LEV of a mixture is the weighted average of the LEV’s of the components. You can also integrate if you wish. The distribution is an 80/20 mixture of exponentials with θ  50 and 1000. The limited expected values are   50 1 − e −1000/50  50 and

1000 1 − e −1000/1000  632.12.





The answer is 0.8 (50) + 0.2 (632.12)  166.42 . (C)

5.9. Use the formulas for the limited expected value of a two-parameter Pareto given in the Loss Models appendix. The limited expected value of the mixture is the weighted average of the two limited expected values of the components. E[X ∧ 3000]  0.5 1000 (1 − 1000/4000) + 0.5 −2000 ln (2000/5000)









 375 − 1000 ln 0.4  1291.29

5.10. This can be done using the definition of E[X ∧ a] or equation (5.6). First we will use the definition. In the following, we will assume a is between 5 and 10. If the result is not in this range, we would have to adjust the integrals. 2 E[X ∧ a]   

1 10

a

Z 0

x dx +

1 10 a (10

− a) +

1 8

a

Z 5

x dx + 81 a (13 − a )

2 1 a2 25 13 a2 10 (10a − a ) + 16 − 16 + 8 a − 8 1 1 25 −a 2 ( 20 + 16 ) + 21 8 a − 16  2 (6.11875) a2 20

+

We will multiply through by 80 to clear the denominators.

9a 2 − 210a + 1104  0

210 ± 2102 − 4 (9)(1104) a  8 , 15 13 18

p

The second solution 15 13 is rejected because it isn’t in the range [5, 10]. The limited expected value at 15 13 is the full expected value which is 7, not 6.11875. If using equation (5.6), you must be careful to integrate S ( x ) from zero, not from 5, even though the second distribution in the mixture has support starting at 5. Thus you get a 5 a 1 1 1 (10 − x ) dx + 1dx + (13 − x ) dx  6.11875 20 0 2 0 16 5   1  1  100 − (10 − a ) 2 + 2.5 + (13 − 5) 2 − (13 − a ) 2  6.11875 40 32   1 1 (20a − a 2 ) + 2.5 + −105 + 26a − a 2  6.11875 40 32

Z

Z

Z

Gathering terms −

9 2 21 25 a + a−  6.11875 160 16 32

Multiplying by 160 which is the same quadratic as above. C/4 Study Manual—17th edition Copyright ©2014 ASM

−9a 2 + 210a − 1104  0

5. POLICY LIMITS

94

5.11.

The original θ  1 and the inflated θ0  1.1. Then E[X ∧ 1]  1 (1 − e −1/1 )  0.6321

E[X 0 ∧ 1]  1.1 (1 − e −1/1.1 )  0.6568 0.6568 (C) − 1  0.0391 0.6321 5.12.

Before inflation, expected claim size is E[X ∧ 10,000]  500 −

1,000,000  400 10,000

After inflation, letting the loss variable be X 0,



E[X 0 ∧ 10,000]  1.25 E X ∧

1,000,000 10,000  1.25 500 −  468.75 1.25 8000







The ratio is 468.75/400  1.171875 . 5.13. This could be worked out from first principles, but let’s instead work it out as a mixture. Let X be the random variable for loss and Y the payment random variable. Then Y is a mixture of a uniform random variable on (0, 3000] with weight 0.6 and the constant 3000 with weight 0.4. The expected value of Y is E[Y]  0.6 (1500) + 0.4 (3000)  2100 The second moment of a uniform distribution on (0, u] is u 2 /3. The second moment of Y is 30002 E[Y ]  0.6 + 0.4 (30002 )  5,400,000 3 2

The variance of Y is

!

Var ( Y )  5,400,000 − 21002  990,000

5.14. The total number of patients hospitalized 21 days or longer is obtained by linear interpolation between 21 and 25: 0.8 (53,488) + 0.2 (17,384)  46,267.2 That will be the denominator. The numerator is the number of days past day 21 hospitalized times the number of patients hospitalized for that period. Within each interval of durations, the average patient released during that interval is hospitalized for half the period. So 46,267.2 − 17,384  28,883.2 patients are hospitalized for 2 days after day 21, 17,384 − 5,349  12,035 for 4 + 2.5  6.5 days, 5,349 − 1,337  4,012 for 11.5 days, and 1,337 for 16.5 days. Add it up: 28,883.2 (2) + 12,035 (6.5) + 4,012 (11.5) + 1,337 (16.5)  204,192.4 The mean residual time is 204,192.4/46,267.2  4.41333 . (A)

Quiz Solutions 5-1. The tables say so

C/4 Study Manual—17th edition Copyright ©2014 ASM

E[X ∧ x]  θ (1 − e −x/θ ) E[X ∧ 80]  100 (1 − e −80/100 )  55.07

Lesson 6

Deductibles Reading: Loss Models Fourth Edition 3.1, 8.1–8.2

6.1

Ordinary and franchise deductibles

Insurance coverages often have provisions to not pay small claims. A policy with an ordinary deductible of d is one that pays the greater of 0 and X − d for a loss of X. For example, a policy with an ordinary deductible of 500 pays nothing if a loss is 500 or less, and pays 200 for a loss of 700. A policy with a franchise deductible of d is one that pays nothing if the loss is no greater than d, and pays the full amount of the loss if it is greater than d. For example, a policy with a franchise deductible of 500 pays nothing if a loss is 500 or less, and pays 700 for a loss of 700. Assume that a deductible is an ordinary deductible unless stated otherwise. If a policy has a deductible, and the probability of a loss below the deductible is not zero, then not every loss is paid. Thus we must distinguish between payment per loss and payment per payment. The expected payment per loss is less than the expected payment per payment, since payments of 0 are averaged into the former but not into the latter.

6.2

Payment per loss with deductible

Let X be the random variable for loss size. The random variable for the payment per loss with a deductible d is Y L  ( X − d )+ . The symbol ( X − d )+ means the positive part of X − d: in other words, max (0, X − d ) . It is usually easy enough to calculate probabilities for Y L by reasoning, but let’s write out the distribution function of Y L . FY L ( x )  Pr ( Y L ≤ x )  Pr ( X − d ≤ x )  Pr ( X ≤ x + d )  FX ( x + d )

(6.1)

The expected value of ( X − d )+ can be obtained from the definition of expected value: E[ ( X − d )+ ] 



Z d

( x − d ) f ( x ) dx

(6.2)

which roughly means that you take the probability of a loss being x and multiply it by x − d, and sum up over all x. Higher moments can be calculated using powers of x − d in the integral. An alternative formula for the first moment, derived by integration by parts, is often easier to use: E[ ( X − d )+ ] 



Z d

S ( x ) dx

Example 6A Loss amounts have a distribution whose density function is f (x )  C/4 Study Manual—17th edition Copyright ©2014 ASM

4 (100 − x ) 3 1004 95

0 < x ≤ 100

(6.3)

6. DEDUCTIBLES

96

An insurance coverage for these losses has an ordinary deductible of 20. Calculate the expected insurance payment per loss. Answer: Using the definition of E[ ( X − 20)+ ], E[ ( X − 20)+ ] 

20

Let u  100 − x. E[ ( X − 20)+ ] 

100

Z

80

Z 0

4 ( x − 20)(100 − x ) 3 dx 1004

4 (80 − u ) u 3 du 1004

4 u5 4  20u − 5 1004 

! 80 0 ! 5

4 80 20 (804 ) − 5 1004

 6.5536

Alternatively, using equation (6.3), x

Z F (x )  E[ ( X − 20)+ ] 

0

Z

100

20

− 

(100 − x ) 4 4 (100 − x ) 3 dx  1 − 1004 1004 (100 − x ) 4

!

1004

dx

100

(100 − x ) 5 5 (1004 ) 20

805  6.5536 5 (1004 )



The expected payment for a franchise deductible is E[ ( X − d )+ ] + dS ( d ) Example 6B Loss amounts have a discrete distribution with the following probabilities: Loss Amount

Probability

100 500 1000 2000

0.4 0.3 0.2 0.1

An insurance coverage for these losses has a franchise deductible of 500. Calculate the expected insurance payment per loss. Answer: The coverage pays 1000 if the loss is 1000 and 2000 if the loss is 2000; otherwise it pays nothing since the loss is below the deductible. The expected payment is therefore 0.2 (1000) + 0.1 (2000)  400 .  The random variable ( X − d )+ is said to be shifted by d and censored. Censored means that you have some, but incomplete, information about certain losses. In this case, you are aware of losses below d, but don’t know the amounts of such losses. C/4 Study Manual—17th edition Copyright ©2014 ASM

6.3. PAYMENT PER PAYMENT WITH DEDUCTIBLE

97

If you combine a policy with ordinary deductible d and a policy with policy limit d, the combination covers every loss entirely. In other words: E[X]  E[X ∧ d] + E[ ( X − d )+ ]

(6.4)

Thus for distributions in the tables, you can evaluate the expected payment per loss with a deductible by the formula E[ ( X − d )+ ]  E[X] − E[X ∧ d] Example 6C Losses follow a two-parameter Pareto distribution with α  2, θ  2000. Calculate the expected payment per loss on a coverage with ordinary deductible 500. Answer: For a Pareto with α > 1, E[X] − E[X ∧ d]  In our case,

6.3

θ θ α−1 θ+d

! α−1

2000  1600 E[X] − E[X ∧ 500]  2000 2500

!



Payment per payment with deductible

The random variable for payment per payment on an insurance with an ordinary deductible is the payment per loss random variable conditioned on X > d, or Y P  ( X − d )+ | X > d. Let’s write out the distribution function of Y P . This random variable is conditioned on X > d so it is not defined for x ≤ d. For x > d, FY P ( x )  Pr ( Y P ≤ x )

 Pr ( X − d ≤ x | X > d )

 Pr ( X ≤ x + d | X > d ) Pr ( d < X ≤ x + d )  Pr ( X > d ) FX ( x + d ) − FX ( d )  1 − FX ( d )

(6.5)

Notice the need to subtract FX ( d ) in the numerator, .because of the   joint condition X > d and X ≤ x + d. A common error made by students is to use FX ( x + d ) 1 − FX ( d ) and to forget to subtract FX ( d ) . The survival function doesn’t have the extra term: SY P ( x ) 

SX ( x + d ) SX ( d )

(6.6)

because the joint condition X > d and X > x + d reduces to X > x + d. For this reason, working with survival functions is often easier. Example 6D Losses follow a single-parameter Pareto with α  2, θ  400. Y P is the payment per payment random variable for a coverage with a deductible of 1000. Calculate Pr ( Y P ≤ 600) . C/4 Study Manual—17th edition Copyright ©2014 ASM

6. DEDUCTIBLES

98

Answer: Let X be the loss random variable. We’ll use formula (6.5). 400 FX (1000)  1 − 1000

!2

 0.84

!2

400 FX (1600)  1 −  0.9375 1600 0.9375 − 0.84  0.609375 FY P (600)  1 − 0.84



The expected value of Y P is E[ ( X − d )+ ]/S ( d ) . It is called the mean excess loss and is denoted by e X ( d ) . In life contingencies, it is called mean residual life or the complete life expectancy and is denoted by e˚d ; the symbol without a circle on the e has a somewhat different meaning. Based on the definition, a formula for e ( d ) is e (d ) 

E[ ( X − d )+ ] S (d )

(6.7)

Adapting formulas (6.2) and (6.3), the formulas for e ( d ) are

R e (d ) 

R e (d ) 

∞ (x d ∞ d

− d ) f ( x ) dx S (d )

or

S ( x ) dx

S (d )

Higher moments of Y P can be calculated by raising x − d to powers in the first integral. Combining (6.4) and (6.7), we get E[X]  E[X ∧ d] + e ( d ) 1 − F ( d )





(6.8)

The random variable Y P is said to be shifted by d and truncated. Truncated means you have absolutely no information about losses in a certain range. Therefore, all your information is conditional—you know the existence of a loss only when it is above d, otherwise you know nothing. Example 6E (Same data as Example 6A) Loss amounts have a distribution whose density function is f (x ) 

4 (100 − x ) 3 1004

0 < x ≤ 100

An insurance coverage for these losses has an ordinary deductible of 20. Calculate the expected insurance payment per payment. Answer: Above, we calculated the payment per loss as 6.5536. We also derived S ( x )  (100 − x ) 4 /1004 . So S (20)  0.84 and the expected payment per payment is 6.5536/0.84  16 .  For a franchise deductible, the payment made is d higher than the corresponding payment under an ordinary deductible, so the expected payment per payment is e ( d ) + d. Example 6F (Same data as Example 6B) Loss amounts have a discrete distribution with the following probabilities: C/4 Study Manual—17th edition Copyright ©2014 ASM

6.3. PAYMENT PER PAYMENT WITH DEDUCTIBLE

99

Loss Amount

Probability

100 500 1000 2000

0.4 0.3 0.2 0.1

An insurance coverage for these losses has a franchise deductible of 500. Calculate the expected insurance payment per payment. Answer: First of all, E[ ( X −500)+ ]  0.2 (1000−500) +0.1 (2000−500)  250. Then e (500)  250/0.3  833 13 , which would be the payment per payment under an ordinary deductible. The payment per payment under a franchise deductible is 500 higher, or 1333 13 .

?



Quiz 6-1 Losses follow a distribution that is a mixture of two exponentials, with weight 0.75 on an exponential with mean 1000 and weight 0.25 on an exponential with mean 2000. An insurance coverage has an ordinary deductible of 500. Calculate the expected payment per payment on this coverage.

Special cases Exponential An exponential distribution has no memory. This means that Y P has the same distribution as X: it is exponential with mean θ. Therefore e (d )  θ (6.9) Example 6G Losses under an insurance coverage are exponentially distributed with mean 1000. Calculate the expected payment per payment for an insurance coverage with franchise deductible 200. Answer: e (200)  1000, so the answer is e ( d ) + d  1200 .



Uniform If X has a uniform distribution on (0, θ], then ( X − d )+ | X > d has a uniform distribution on (0, θ − d].

Example 6H Losses under an insurance coverage follow a uniform distribution on (0, 1000]. Calculate the expected payment per payment for an insurance coverage with ordinary deductible 200. Answer: Payment per payment is uniformly distributed on (0, 800], so the answer is 400 .



The same shortcut applies to any beta distribution with a  1. If X is beta with parameters a  1, b, and θ, then ( X − d )+ | X > d is beta with parameters a  1, b, θ  d. This generalization would help us with Example 6E. In that example, a  1, b  4, and θ  100. After the deductible, the revised θ parameter for the payment distribution is θ − d  80, and the expected payment per payment is ( θ − d ) a/ ( a + b )  80/5  16 . C/4 Study Manual—17th edition Copyright ©2014 ASM

6. DEDUCTIBLES

100

Pareto If X has a 2-parameter Pareto distribution with parameters α and θ, then ( X − d )+ | X > d has a Pareto distribution with parameters α and θ + d. This means that we can easily calculate the moments and percentiles of ( X − d )+ | X > d. In particular, the mean is θ+d α−1

e (d ) 

α>1

(6.10)

You can easily write down a formula for the second moment and calculate the variance as well. Example 6I An insurance coverage has an ordinary deductible of 20. The loss distribution follows a twoparameter Pareto with α  3 and θ  100. Calculate 1. The average payment per payment. 2. The variance of positive payment amounts. 3. The probability that a nonzero payment is greater than 10. Answer: The distribution of nonzero payments is two-parameter Pareto with α  3 and θ  100+20  120. Therefore, with Y  ( X − 20)+ | X > 20 1.

eY (20) 

120  60 3−1

2. E[Y 2 ] 

2 (1202 )  14,400 (2)(1)

Var ( Y )  14,400 − 602  10,800 3. 120 Pr ( Y > 10)  120 + 10

!3  0.78653



If X has a single-parameter Pareto distribution with parameters α and θ, and d ≥ θ, then ( X − d )+ | X > d has a 2-parameter Pareto distribution with parameters α and d, so for α > 1, e (d ) 

d α−1

d≥θ

(6.11)

If d < θ, then ( X − d )+ | X > d is a shifted single-parameter Pareto distribution. It is the original distribution shifted by −d. The mean excess loss is e (d ) 

α (θ − d ) + d α−1

Example 6J Calculate the payment per payment for an insurance coverage with an ordinary deductible of 5 if the loss distribution is 1. exponential with mean 10 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 6

101

2. Pareto with parameters α  3, θ  20 3. Single-parameter Pareto with parameters α  2, θ  1. Answer:

1. e (5)  E[X]  10 .

2. e (5)  (20 + 5) /2  12.5 . 3. e (5)  5/1  5 .



Terminology note Exam questions are usually pretty clear about whether you are to calculate payment per payment or payment per loss. However, if you encounter a question asking for “expected payment”, think about it this way: The expected value of a random variable X is the average total of the X’s divided by the number of X’s. Thus the “expected payment” is the total of the payments divided by the number of payments. Another type of question asks you for “expected annual payment”. This is neither payment per loss nor payment per payment. “Annual” means “per year”! We will learn more about this in Lesson 16.

Exercises Mean excess loss 6.1.

[151-83-94:1] (1 point) For a random loss X: Pr ( X  3)  Pr ( X  12)  0.5 and E ( X − d )+  3

f

g

Determine d. (A) 4.5 6.2.

(B) 5.0

(C) 5.5

(D) 6.0

(E) 6.5

[4B-F99:19] (2 points) You are given the following:

(i)

Claim sizes for Risk A follow a two-parameter Pareto distribution with parameters θ  10,000 and α  2. √ (ii) Claim sizes for Risk B follow a Burr distribution with parameters θ  20,000, α  2, and γ  2. (iii) r is the ratio of the proportion of Risk A’s claims (in number) that exceed d to the proportion of Risk B’s claims (in number) that exceed d. Determine the limit of r as d goes to infinity. (A) 0

(B) 1

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2

(D) 4

(E) ∞

Exercises continue on the next page . . .

6. DEDUCTIBLES

102

Table 6.1: Summary of Deductible Formulas

if Y L  ( X − d )+

FY L ( x )  F X ( x + d ) E[ ( X − d )+ ] 

Z

E[ ( X − d )+ ] 

Z



d



d

( x − d ) f ( x ) dx

(6.2)

S ( x ) dx

(6.3)

FX ( x + d ) − FX ( d ) 1 − FX ( d ) SX ( x + d ) SY P ( x )  SX ( d ) E[X] − E[X ∧ d] e (d )  S (d ) FY P ( x ) 

R

∞ (x d

R

∞ d

e (d )  e (d ) 

(6.1)

if Y P  ( X − d )+ | X > d

(6.5)

if Y P  ( X − d )+ | X > d

(6.6) (version of 6.7)

− d ) f ( x ) dx S (d )

S ( x ) dx

S (d )

E[X]  E[X ∧ d] + e ( d ) 1 − F ( d )



e (d )  θ θ−d e (d )  2 θ−d e (d )  1+b θ+d e (d )  α−1

 

d

α−1

for exponential

(6.9)

d 0, the limited expected value of Z at d equals (1 + r ) times the limited expected value of X at d.

(A) 2 (B) 3 (C) 2,3 (E) The correct answer is not given by (A) , (B) , (C) , or (D) .

(D) 1,2,3

6.5. [4B-S95:21] (3 points) Losses follow a Pareto distribution with parameters θ and α > 1. Determine the ratio of the mean excess loss function at x  2θ to the mean excess loss function at x  θ. (A) 1/2 (B) 1 (C) 3/2 (E) Cannot be determined from the given information

(D) 2

6.6. [4B-F98:6] (2 points) Claim sizes follow a Pareto distribution with parameters α  0.5 and θ  10,000. Determine the mean excess loss at 10,000. (A) 5,000

(B) 10,000

(C) 20,000

(D) 40,000

(E) ∞

6.7. [4B-F94:16] (1 point) A random sample of auto glass claims has yielded the following five observed claim amounts: 100,

125,

200,

250,

300.

What is the value of the empirical mean excess loss function at x  150? (A) 75

(B) 100

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 200

(D) 225

(E) 250

Exercises continue on the next page . . .

6. DEDUCTIBLES

104

Use the following information for questions 6.8 and 6.9: The following random sample has been observed: 2.0,

10.3,

4.8,

16.4,

21.6,

3.7,

21.4,

34.4

The underlying distribution function is assumed to be the following: F ( x )  1 − e −x/10 ,

x ≥ 0.

6.8. [4B-S93:24] (2 points) Calculate the value of the mean excess loss function e ( x ) for x  8. (A) (B) (C) (D) (E)

Less than 7.00 At least 7.00, but less than 9.00 At least 9.00, but less than 11.00 At least 11.00, but less than 13.00 At least 13.00 [4B-S93:25] (2 points) Calculate the value of the empirical mean excess loss function e n ( x ) , for x  8.

6.9. (A) (B) (C) (D) (E) 6.10. •

Less than 7.00 At least 7.00, but less than 9.00 At least 9.00, but less than 11.00 At least 11.00, but less than 13.00 At least 13.00 [4B-S99:10] (2 points) You are given the following:

One hundred claims greater than 3,000 have been recorded as follows: Interval Number of Claims ( 3,000, 5,000] ( 5,000, 10,000] (10,000, 25,000] (25,000, ∞)

6 29 39 26



Claims of 3,000 or less have not been recorded.



Claim sizes follow a Pareto distribution with parameters α  2 and θ  25,000. Determine the expected claim size for claims in the interval (25,000, ∞) .

(A) 12,500 6.11.

(B) 25,000

(C) 50,000

(D) 75,000

(E) 100,000

[4B-S98:3] (3 points) The random variables X and Y have joint density function y

f ( x, y )  e −2x− 2 ,

0 < x < ∞,

0 < y < ∞.

Determine the mean excess loss function for the marginal distribution of X evaluated at X  4. (A) 1/4

(B) 1/2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1

(D) 2

(E) 4

Exercises continue on the next page . . .

EXERCISES FOR LESSON 6

6.12.

105

[4B-F96:22] (2 points) The random variable X has the density function f (x ) 

1 −x/θ , θe

0 < x < ∞, θ > 0.

Determine e ( θ ) , the mean excess loss function evaluated at θ. (A) 1

(B) θ

(C) 1/θ

(D) θ/e

(E) e/θ

Use the following information for questions 6.13 through 6.15: You are given the following: •

The random variable X follows a two-parameter Pareto distribution with parameters θ  100 and α  2.



The mean excess loss function, e X ( k ) is defined to be E[X − k | X ≥ k].

6.13.

[4B-F99:25] (2 points) Determine the range of e X ( k ) over its domain of [0, ∞) .

(A) [0, 100] 6.14.

(B) [0, ∞]

(C) 100

(E) ∞

[4B-F99:26] (1 point) Y  1.10X

Determine the range of the function (A) (1, 1.10] 6.15.

(D) [100, ∞)

(B) (1, ∞)

eY ( k ) eX ( k )

over its domain [0, ∞) .

(C) 1.10

(D) [1.10, ∞)

(E) ∞

(D) [100, ∞)

(E) [150, ∞)

[4B-F99:27] (2 points) Z  min ( X, 500)

Determine the range of e Z ( k ) over its domain of [0, 500]. (A) [0, 150] 6.16.

(B) [0, ∞)

(C) [100, 150]

[3-F01:35] The random variable for a loss, X, has the following characteristics: F (x ) 0.0 0.2 0.6 1.0

x

0 100 200 1000

E(X ∧ x ) 0 91 153 331

Calculate the mean excess loss for a deductible of 100. (A) 250 6.17.

(B) 300

(C) 350

(D) 400

(E) 450

For an insurance coverage, you are given:

(i) The policy limit is 10,000. (ii) The expected value of a loss before considering the policy limit is 9,000. (iii) The probability that a loss is at least 10,000 is 0.1. (iv) The mean excess loss at 10,000, if the policy limit is ignored, is 20,000. Determine the average payment per loss for losses less than 10,000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

6. DEDUCTIBLES

106

[4B-S98:25] (2 points) You are given the following:

6.18. •

100 observed claims occurring in 1995 for a group of risks have been recorded and are grouped as follows: Interval Number of Claims ( 0, 250) [ 250, 300) [ 300, 350) [ 350, 400) [ 400, 450) [ 450, 500) [ 500, 600) [ 600, 700) [ 700, 800) [ 800, 900) [ 900, 1000) [1000, ∞ )



36 6 3 5 5 0 5 5 6 1 3 25

Inflation of 10% per year affects all claims uniformly from 1995 to 1998.

Using the above information, determine a range for the expected proportion of claims for this group of risks that will be greater than 500 in 1998. (A) (B) (C) (D) (E)

Between 35% and 40% Between 40% and 45% Between 45% and 50% Between 50% and 55% Between 55% and 60%

Deductibles 6.19.

Losses follow a uniform distribution on [0, 50,000]. There is a deductible of 1,000 per loss.

Determine the average payment per loss. 6.20. A policy covers losses subject to a franchise deductible of 500. Losses follow an exponential distribution with mean 1000. Determine the average payment per loss. 6.21. Losses follow a Pareto distribution with α  3.5, θ  5000. A policy covers losses subject to a 500 franchise deductible. Determine the average payment per loss.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 6 [4B-S99:7] (2 points) You are given the following:

6.22. •

107

Losses follow a distribution (prior to the application of any deductible) with cumulative distribution function and limited expected values as follows: Loss Size (x) F (x ) E[X ∧ x] 10,000 15,000 22,500 32,500 ∞

0.60 0.70 0.80 0.90 1.00

6,000 7,700 9,500 11,000 20,000



There is a deductible of 10,000 per loss and no policy limit.



The insurer makes a payment on a loss only if the loss exceeds the deductible.

The deductible is raised so that half the number of losses exceed the new deductible compared to the old deductible of 10,000. Determine the percentage change in the expected size of a nonzero payment made by the insurer. (A) (B) (C) (D) (E)

Less than −37.5% At least −37.5%, but less than −12.5% At least −12.5%, but less than 12.5% At least 12.5%, but less than 37.5% At least 37.5% [4B-S94:24] (3 points) You are given the following:

6.23. •

X is a random variable for 1993 losses, having the density function f ( x )  0.1e −0.1x , x > 0.



Inflation of 10% impacts all losses uniformly from 1993 to 1994.



For 1994, a deductible, d, is applied to all losses.



P is a random variable representing payments of losses truncated and shifted by the deductible amount. Determine the value of the cumulative distribution function at P  5, FP (5) , in 1994.

(A) (B) (C) (D) (E)

1 − e −0.1 ((5+d )/1.1)



e −0.1 (5/1.1) − e −0.1 ((5+d )/1.1)

.

0 At least 0.25 but less than 0.35 At least 0.35 but less than 0.45

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 − e −0.1 (5/1.1)



Exercises continue on the next page . . .

6. DEDUCTIBLES

108

Use the following information for questions 6.24 and 6.25: You are given the following: •

Losses follow a distribution (prior to the application of any deductible) with cumulative distribution function and limited expected values as follows: Loss Size(x) F (x ) E[X ∧ x] 10,000 15,000 22,500 ∞

0.60 0.70 0.80 1.00



There is a deductible of 15,000 per loss and no policy limit.



The insurer makes a nonzero payment p.

6.24. (A) (B) (C) (D) (E)

6,000 7,700 9,500 20,000

[4B-F98:12] (2 points) Determine the expected value of p. Less than 15,000 At least 15,000, but less than 30,000 At least 30,000, but less than 45,000 At least 45,000, but less than 60,000 At least 60,000

6.25. [4B-F98:13] (2 points) After several years of inflation, all losses have increased in size by 50%, but the deductible has remained the same. Determine the expected value of p. (A) (B) (C) (D) (E)

Less than 15,000 At least 15,000, but less than 30,000 At least 30,000, but less than 45,000 At least 45,000, but less than 60,000 At least 60,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 6 [4B-S94:21] (2 points) You are given the following:

6.26. •

109

For 1993 the amount of a single claim has the following distribution: Amount

Probability

$1000 2000 3000 4000 5000 6000

1/6 1/6 1/6 1/6 1/6 1/6



An insurer pays all losses after applying a $1500 deductible to each loss.



Inflation of 5% impacts all claims uniformly from 1993 to 1994.

Assuming no change in the deductible, what is the inflationary impact on losses paid by the insurer in 1994 as compared to the losses the insurer paid in 1993? (A) (B) (C) (D) (E)

Less than 5.5% At least 5.5%, but less than 6.5% At least 6.5%, but less than 7.5% At least 7.5%, but less than 8.5% At least 8.5% [4B-F94:17] (2 points) You are given the following:

6.27. •

Losses follow a Weibull distribution with parameters θ  20 and τ  1.0.



The insurance coverage has an ordinary deductible of 10.

If the insurer makes a payment, what is the probability that an insurer’s payment is less than or equal to 25? (A) (B) (C) (D) (E)

Less than 0.65 At least 0.65, but less than 0.70 At least 0.70, but less than 0.75 At least 0.75, but less than 0.80 At least 0.80

6.28. Losses follow a lognormal distribution with parameters µ  5, σ  2. Losses are subject to a 1000 franchise deductible. 10% inflation affects the losses. Calculate the revised franchise deductible so that the expected aggregate cost of claims after inflation with the deductible is the same as it was before inflation with the 1000 franchise deductible. 6.29. Losses follow a Pareto distribution with α  3, θ  5000. Insurance pays the amount of the loss minus a deductible, but not less than zero. The deductible is 100, minus 25% of the excess of the loss over 100, but not less than zero. Calculate the expected payment per payment. 6.30. X is a random variable representing loss sizes. You are given that E[X ∧ d]  100 1 − e −d/100 . Loss sizes are affected by 10% inflation.





Determine the average payment per loss under a policy with a 500 ordinary deductible after inflation. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

6. DEDUCTIBLES

110

6.31. [CAS3-S04:21] Auto liability losses for a group of insureds (Group R) follow a Pareto distribution with α  2 and θ  2,000. Losses from a second group (Group S) follow a Pareto distribution with α  2 and θ  3,000. Group R has an ordinary deductible of 500, while Group S has a franchise deductible of 200. Calculate the amount that the expected cost per payment for Group S exceeds that for Group R. (A) (B) (C) (D) (E)

Less than 350 At least 350, but less than 650 At least 650, but less than 950 At least 950, but less than 1,250 At least 1,250

6.32. [CAS3-S04:29] Claim sizes this year are described by a 2-parameter Pareto distribution with parameters θ  1,500 and α  4. What is the expected claim size per loss next year after 20% inflation and the introduction of a $100 deductible? (A) (B) (C) (D) (E)

Less than $490 At least $490, but less than $500 At least $500, but less than $510 At least $510, but less than $520 At least $520 For an automobile collision coverage, you are given

6.33. •

Loss sizes, before application of any deductible or limit, follow a distribution which is a mixture of a two-parameter Pareto distribution with parameters α  2, θ  1000 and a two-parameter Pareto distribution with parameters α  3, θ  2000.



For coverage with an ordinary deductible of 500, the average amount paid per claim for each claim above the deductible is 1471.63.



For coverage with an ordinary deductible of 600, the average amount paid per claim for each claim above the deductible is x. Determine x.

6.34. [CAS3-F04:29] High-Roller Insurance Company insures the cost of injuries to the employees of ACME Dynamic Manufacturing, Inc. •

30% of injuries are “Fatal” and the rest are “Permanent Total” (PT). There are no other injury types.



Fatal injuries follow a loglogistic distribution with θ  400 and γ  2.



PT injuries follow a loglogistic distribution with θ  600 and γ  2.



There is a $750 deductible per injury. Calculate the probability that an injury will result in a claim to High-Roller.

(A) (B) (C) (D) (E)

Less than 30% At least 30%, but less than 35% At least 35%, but less than 40% At least 40%, but less than 45% At least 45%

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 6

111

6.35. [151-82-93:6] (2 points) A company has 50 employees whose dental expenses are mutually independent. For each employee, the company reimburses 100% of dental expenses in excess of a $100 deductible. The dental expense for each employee is distributed as follows: Expense $

0 50 200 500 1,000

Probability 0.20 0.30 0.30 0.10 0.10

Determine, by normal approximation, the 95th percentile of the cost to the company. (A) $8,000

(B) $9,000

(C) $10,000

(D) $11,000

(E) $12,000

6.36. [CAS3-F04:25] Let X be the random variable representing aggregate losses for an insured. X follows a gamma distribution with mean of $1 million and coefficient of variation 1. An insurance policy pays for aggregate losses that exceed twice the expected value of X. Calculate the expected loss for the policy. (A) (B) (C) (D) (E)

Less than $100,000 At least $100,000, but less than $200,000 At least $200,000, but less than $300,000 At least $300,000, but less than $400,000 At least $400,000 [4B-S92:23] (2 points) You are given the following information:

6.37. •

A large risk has a lognormal claim size distribution with parameters µ  8.443 and σ  1.239.



The insurance agent for the risk settles all claims under 5000. (Claims of 5000 or more are settled by the insurer, not the agent.) Determine the expected value of a claim settled by the insurance agent.

(A) (B) (C) (D) (E)

Less than 500 At least 500, but less than 1000 At least 1000, but less than 1500 At least 1500, but less than 2000 At least 2000

Additional released exam questions: CAS3-S05:4,35, SOA M-S05:9,32, CAS3-F05:20, SOA M-F05:14,26, CAS3-S06:39

Solutions 6.1. From the five choices, we see that d > f3, but you g can also deduce this as follows: If d < 3, then since X ≥ 3, it follows that ( X − d )+  X − d and E ( X − d )+  E[X − d]  E[X] − d, but E[X]  0.5 (3 + 12)  7.5,

so E[X] − d > 4.5, contradicting E ( X − d )+  3.

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

6. DEDUCTIBLES

112

It is also clear that d < 12, or else E ( X − d )+  0 since ( X − d )+  0 for both possible values of X. From d > 3 if follows that when X  3, ( X − d )+  0, so

f

g

E ( X − d )+  0.5 (0) + 0.5 (12 − d )

f

g

Setting this equal to 3, we obtain d  6 . (D) 6.2. The probability that a claim for Risk A exceeds d is θ S (d )  θ+d



10,000  10,000 + d

!2

The probability that a claim for Risk B exceeds d is 1 S (d )  1 + ( d/θ ) γ The ratio is



1  1 + d 2 /20,000

(20,000 + d 2 ) 2 (10,000 + d )

!2

20,000  20,000 + d 2

!2

!2

As d goes to infinity, the d 2 term will dominate, so the ratio will go to ∞. (E) 6.3.

See equation (6.8). (C)

6.4. Both the standard deviation and the mean are multiplied by 1 + r, so the coefficient of variation doesn’t change, and 1 is false . 2 is true , but 3 is false ; the limited expected value of Z at d (1 + r ) is 1 + r times the limited expected value of X at d. (A) 6.5. From formula (6.10), the mean excess loss at x  θ is ( θ + θ ) / ( α − 1) and the mean excess loss at x  2θ is ( θ + 2θ ) / ( α − 1) . The ratio of the latter to the former is 3θ/2θ  3/2 . (C) Why was this simple problem worth 3 points? 6.6. When α < 1, the expected value of a Pareto, and therefore the mean excess loss, is infinite . (E) 6.7. The empirical distribution assigns a probability of 1/5 to each of the five observations. Since the mean excess loss is calculated at 150, observations less than or equal to 150 are ignored. The excess loss over 150 is 50 for 200, 100 for 250, and 150 for 300. 50 + 100 + 150  100 3

(B)

6.8. The distribution of X is exponential, so the mean excess loss is the mean, 10 . (C) 6.9. The empirical mean excess loss is the mean excess loss based on the empirical distribution. The empirical distribution assigns probability 1/n to each observation. The empirical mean excess loss at x may be computed by summing up all observations greater than x, dividing by the number of such observations, and then subtracting x. Thus e n (8) 

1 (10.3 + 16.4 + 21.6 + 21.4 + 34.4) − 8  12.82 5

(D)

The subscript of n is a common way to indicate an empirical function based on n observations, as we’ll learn in Lesson 22.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 6

113

6.10. You are being asked for total loss size given that a loss is greater than 25,000. The mean excess loss e (25,000) is the excess of a loss over 25,000 given that it is greater than 25,000. So you are being asked for the mean excess loss at 25,000 plus 25,000 (the total loss size, not just the excess). For a Pareto, e ( d )  ( θ + d ) / ( α − 1) , so 25,000 + e (25,000)  25,000 + 6.11.

The marginal distribution of X is

25,000 + 25,000 θ + 25,000  25,000 +  75,000 α−1 1

R

∞ −2x− y 2 e 0

y

(D)

∞   2e −2x , which is exponential

dy  ( e −2x ) −2e − 2



0

with θ  1/2. The mean excess loss is the mean, or 1/2 . (B)

6.12. This is an exponential distribution with mean θ. By equation (6.9), e ( d )  θ for any d, in particular for d  θ. (B) 6.13.

By equation (6.10), the mean excess loss is an increasing linear function for a Pareto: eX ( k ) 

θ+k  100 + k α−1

which is 100 at k  0 and goes to infinity as k goes to infinity. (D) 6.14. By equation (6.10), e X ( k )  100 + k. The inflated variable Y  1.1X is a two-parameter Pareto with parameters θ  1.1 (100)  110 and α  2, as we discussed in Section 2.1. Therefore eY ( k )  110 + k. The quotient (110 + k ) / (100 + k ) is monotonic, equal to 1.1 when k  0 and decreasing asymptotically to 1 as k → ∞. (A) 6.15. This is hard! I wonder whether the exam setters expected you to really work this out, or to select the only choice of the five that could work. Notice that when k  500, e Z ( k )  0, and e Z ( k ) is bounded (it certainly cannot go above 500), so (A) is the only choice of the five that can possibly be correct. Proving it is correct, though, requires work. We are trying to find the maximum for eZ ( k )  and

E[X ∧ 500] − E[X ∧ k] 1 − FX ( k )

E[X ∧ k]  100 − (100 + k ) E[X ∧ 500]  100 − eZ ( k )   The maximum occurs at k  6.16.

−2/3 −2 (1/600)

1002 600 2 − 100 600

1002 100+k 1002 (100+k ) 2 −k 2 2

1002 1002  100 − 2 100 + k (100 + k )

 100 + k −

(100 + k ) 2 600

500 + k+ 600 3 6

 32 (300)  200, and e Z (200)  100 + 200 −

We use equation (6.8). Since F (1000)  1, E[X ∧ 1000]  E[X].

E[X]  E[X ∧ 100] + e (100) 1 − F (100)



331  91 + e (100)(1 − 0.2) 240 e (100)   300 (B) 0.8 C/4 Study Manual—17th edition Copyright ©2014 ASM



3002 600

 150.

6. DEDUCTIBLES

114

6.17.

By equation (6.8), E[X]  E[X ∧ 10,000] + e (10,000) Pr ( X > 10,000)

9,000  E[X ∧ 10,000] + 20,000 (0.1)

E[X ∧ 10,000]  9,000 − 2,000  7,000

However, by the Law of Total Probability, the limited expected value at 10,000 can be decomposed into the portion for losses below 10,000 and the portion for losses 10,000 and higher: E[X ∧ 10,000]  Pr ( X < 10,000) E[X ∧ 10,000 | X < 10,000] + Pr ( X ≥ 10,000) E[X ∧ 10,000 | X ≥ 10,000] 7,000  0.9 E[X | X < 10,000] + 0.1 (10,000)

since X ∧ 10,000  X for X < 10,000 and 10,000 for X > 10,000. Therefore, E[X | X < 10,000]  6.18.

500 1.13

7,000 − 1,000  6,666 23 0.9

 375.66. Somewhere between 36 + 6 + 3  45 and 36 + 6 + 3 + 5  50 claims are below 375.66,

out of a total of 100 claims, so between 50 and 55 percent are above . (D) 6.19. E[X]  25,000 E[X ∧ 1000] 

1 50,000

1000

Z 0

x dx +

(50,000 − 1000) 50,000

(1000)

 10 + 980  990 E[X] − E[X ∧ 1000]  25,000 − 990  24,010 6.20. Average payment per payment for an ordinary deductible is e ( x )  1000. For a franchise deductible, we add 500, making average payment per payment 1500. Average payment per loss is 1500e −500/1000  909.80 . 6.21. e (500)  5500 2.5  2200  average payment per payment for an ordinary deductible. For a franchise  deductible, average payment per payment is 2200 + 500  2700. The average payment per loss is 2700 1 − F (500)  2700



5000 3.5 5500

 1934.15 .

6.22. Under the old deductible, the F ( d )  F (10,000)  0.60 and Pr ( X > d )  1 − F (10,000)  0.40. You would like to select d 0 such that Pr ( X > d 0 )  12 (0.40)  0.20, or F ( d 0 )  0.80. From the table, we see that d 0  22,500. Under the old deductible of d  10,000, E[X]−E[X∧10,000]  20,000−6,000  35,000. Under the new de1−F (10,000) 0.4 ductible of d 0  22,500, it is

20,000−9,500 0.2

 52,500. This is an increase of 50.0% . (E)

6.23. X follows an exponential distribution with mean 10. Let X ∗ be the inflated variable. Then X ∗ follows an exponential disribution with mean 1.1 (10)  11. P is the conditional random variable X ∗ − d given that X ∗ > d. Since X ∗ is exponential and has no memory, the distribution of P is the same as the distribution of X ∗ —exponential with mean 11. Then FP (5)  FX (5)  1 − e −5/11  0.3653

(E)

6.24. The expected payment per loss is E[X] − E[X ∧ 15,000]  20,000 − 7,700  12,300. p is the expected payment per payment, so we divide by 1 − F (15,000)  0.3 and get 12,300/0.3  41,000 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 6

6.25.

115

Let X ∗  1.5X be the inflated variable. Then E[X ∗ ]  1.5 E[X]  1.5 (20,000)  30,000 E[X ∗ ∧ 15,000]  E[1.5X ∧ 15,000]  1.5 E[X ∧ 10,000]  1.5 (6,000)  9,000

1 − F ∗ (15,000)  1 − F (10,000)  1 − 0.60  0.40 E[X ∗ ] − E[X ∗ ∧ 15,000] 30,000 − 9,000   52,500 (D) 1 − F ∗ (15,000) 0.4

6.26. Let Y L be the payment per loss. Then E[Y L ]  16 (500 + 1500 + 2500 + 3500 + 4500)  2083 13 . After inflation, with Z  1.05X, the loss amounts are 1050, 2100, 3150, . . . , 6300, and if we let Z L be the payment per loss after a 1500 deductible, E[Z L ]  61 (600+1650+2700+3750+4800)  2250. The increase in expected payment per loss (the “impact”) is 2250 − 1  0.08 (D) 2083 13 6.27. A Weibull with τ  1 is an exponential with mean θ. An exponential has no memory, so the payment distribution is the same as the underlying loss distribution, which is exponential with mean 20. FY P (25)  FX (25)  1 − e −25/20  1 − e −1.25  0.713

(C)

6.28. The aggregate claim costs will be the same if the average payment per loss is the same, since the number of losses is not affected by inflation. If Y L is the average payment per loss variable, and X is the loss variable, then for a franchise deductible of 1000, E[Y L ]  E[X] − E[X ∧ 1000] + 1000 1 − F (1000)



e

5+22 /2

−e

5+22 /2

ln 1000 − 5 − 22 Φ 2



!

 e 7 − e 7 Φ (−1.05)  935.54 Notice that in the expression for E[X ∧ 1000] given in the Loss Models appendix is the same  the last term  as 1000 1 − F (1000) , and therefore cancels out. 0

We now equate the inflated variable Y L to 935.54. Let x be the new deductible. For the inflated variable, µ  5 + ln 1.1 and σ  2. Notice that e 7+ln 1.1  1.1e 7 . 935.54  1.1e 7 − 1.1e 7 Φ

ln x − 5 − ln 1.1 − 22 2

!

935.54 − 1.1e 7 ln x − ln 1.1 − 9 Φ   0.2245 2 −1.1e 7 ln x − ln 1.1 − 9  Φ−1 (0.2245)  −0.76 2 ln x  2 (−0.76) + 9 + ln 1.1  7.575

!

x  e 7.575  1949

6.29.

This is an example of a disappearing deductible. You must understand how much is paid for a loss:



If the loss is 100 or less, nothing is paid, since the deductible is 100.



If the loss is 500 or more, the entire loss is paid, since the deductible is 0. (100 − 0.25 (500 − 100)  0)

C/4 Study Manual—17th edition Copyright ©2014 ASM

6. DEDUCTIBLES

116



For losses x in between 100 and 500, the deductible is 100−0.25 ( x −100) , or 125−0.25x. This is a linear function equal to 100 at 100 and 0 at 500. Hence the amount paid is x − (125 − 0.25x )  1.25x − 125.

Thus the amount the company pays for a loss is: •

0% of the first 100,



125% of the next 400,



100% of the excess over 500.

In effect, the company pays 1.25 times the part of the loss between 100 and 500, plus the part of the loss above 500. In other words, if the loss variable is X and the payment per loss variable Y L , Y L  1.25 ( X ∧ 500 − X ∧ 100) + ( X − X ∧ 500)  X + 0.25 ( X ∧ 500) − 1.25 ( X ∧ 100) . Therefore E[Y L ]  E[X] + 0.25 E[X ∧ 500] − 1.25 E[X ∧ 100]

!2 !2 50 ++ 50 + + * * * * / / − 1.25 .2500 .1 − //  2500 + 0.25 .2500 .1 − 55 51 --, , , ,

 2500 + 108.47 − 121.35  2487.12 To get the payment per payment, we divide by S (100) :

!3

5000  0.942322 S (100)  5100 2487.12  2639.36 0.942322 6.30.

We use E[1.1X ∧ d]  1.1 E[X ∧ d/1.1]. 1.1 E[X] − 1.1 E X ∧

f

500 1.1

g

 110 − 110 1 − e −500/110





 110e −500/110  1.16769

You can also recognize the distribution as exponential and use the tables. 6.31. For Group R, we want e (500) . Using formula (6.10), this is 2000+500  2500. 1 For Group S, we want 200 + e (200)  200 + 3000+200  3400. The difference is 3400 − 2500  900 . (C) 1 6.32.

The new θ is 1.2 (1500)  1800. Using the tables,

!3

1800 * 1800 + .1 − /  89.8382 E[X ∧ 100]  3 1900

,

-

1800 E[X]   600 3 f g E ( X − 100)+  600 − 89.8382  510.1618

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

EXERCISE SOLUTIONS FOR LESSON 6

117

6.33. The weighted average of the payment per payment is not the payment per payment for the mixture! What is true is that the weighted average of the payment per loss is the payment per loss of the mixture, and the weighted average of the distribution function is the distribution of the mixture. We must compute the average payment per payment of the mixture as the quotient of the average payment per loss over the complement of the distribution function. We use the fact that for a Pareto, e ( d )  ( θ + d ) / ( α − 1) , where e ( x ) is average payment per payment with a deductible d. Therefore, average payment per loss with a deductible d is e (d ) 1 − F (d ) 





θ+d α−1

!

θ θ+d



Let w be the weight on the first distribution. We have that the payment per loss with d  500 for the mixture is 1000 + 500 w 2−1

!

1000 1000 + 500

and 1 − F ( d ) for the mixture is w

1000 1000 + 500

!2

!2

2000 + 500 + (1 − w ) 3−1

+ (1 − w )

2000 2000 + 500

!

2000 2000 + 500

!3 



4 9

!3

 666 23 − 640 w + 640





− 0.512 w + 0.512



The quotient of the first of these over the second is payment per payment for the mixture. Let’s equate this to 1471.63 and solve for w.



26 32 w + 640  1471.63



w



4 9

− 0.512 w + 0.512  −99.41678w + 753.47456





113.4746  0.9 126.0834

Let’s calculate the quotient with a deductible of 600. x 6.34.

1600

5 2 8 (0.9) 5 2 8 (0.9)

+ 1300 +

10 3 13 (0.1)

10 3 13 (0.1)

 1565.61

This is a mixture distribution, and we want Pr ( X > 750) . Probabilities are weighted probabilities

of the component distributions. For θ  400, Pr ( X1 > 750)  1 −

Pr ( X2 > 750)  1 − 6.35.

(750/600) 2 1+ (750/600) 2

(750/400) 2 1+ (750/400) 2

 0.221453. For θ  600,

 0.390244. Then 0.3 (0.221453) + 0.7 (0.390244)  0.339607 . (B)

The payments after the deductible have the following distribution: Payment Probability 0 100 400 900

0.50 0.30 0.10 0.10

Let Y be the payment for one employee. E[Y]  0.30 (100) + 0.10 (400) + 0.10 (900)  160 E[Y 2 ]  0.30 (1002 ) + 0.10 (4002 ) + 0.10 (9002 )  100,000 Var ( Y )  100,000 − 1602  74,400 √ The 95th percentile is 50 (160) + 1.645 50 (74,400)  11,173 . (D) C/4 Study Manual—17th edition Copyright ©2014 ASM

6. DEDUCTIBLES

118 √

6.36. For a gamma distribution, the coefficient of variation is αθ  √1α . Thus α  1 and θ  1,000,000. The distribution is exponential since α  1. For an exponential, e X ( x )  E[X] for any x, so e (2,000,000)  1,000,000. Then αθ

E ( X − 2,000,000)+  1,000,000 1 − F (2,000,000)  1,000,000e −2  135,335.28

f

g





(B)

6.37. We want the expected payment per payment. Notice that the payment per loss is x if x < 5000 but 0 if x ≥ 5000. What you have in your tables, E[X ∧ 5000], represents the expectation of a payment per loss of x if x < 5000 and 5000 if x ≥ 5000. The expectation we need is therefore  E[X ∧ 5000]  minus 5000 times the probability that the loss is greater than 5000, or E[X ∧ 5000] − 5000 1 − F (5000) , which is e

8.443+1.2392 /2

ln 5000 − 8.443 − 1.2392  10000Φ (−1.18)  1190. Φ 1.239

!

The expected payment per payment is 1190 divided by F (5000)  Φ quotient is 1190/0.5239  2271 . (E)

ln 5000−8.443 1.239

 Φ (0.06)  0.5239. This

Quiz Solutions 6-1. The expected value of losses, X, is 0.75 (1000) + 0.25 (2000)  1250. The limited expected value at 500 is a weighted average of E[X ∧ 500] for each coverage. E[X ∧ 500]  0.75 (1000)(1 − e −500/1000 ) + 0.25 (2000)(1 − e −500/2000 )  0.75 (393.47) + 0.25 (442.40)  405.70 The survival function at 500 is S (500)  0.75e −500/1000 + 0.25e −500/2000  0.649598 Therefore, the expected payment per payment is (1250 − 405.70) /0.649598  1299.72 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 7

Loss Elimination Ratio Reading: Loss Models Fourth Edition 8.3 The Loss Elimination Ratio is defined as the proportion of the expected loss which the insurer doesn’t pay as a result of an ordinary deductible. In other words, for an ordinary deductible of d, it is LER ( d ) 

E[X ∧ d] . E[X]

The textbook defines loss elimination ratio only for ordinary deductibles. Example 7A You are given the following information for an auto collision coverage: Ordinary deductible

Average payment per payment

Loss elimination ratio

0 1000

800 720

0 0.8

A new version of the coverage with a 1000 franchise deductible is introduced. Determine the average payment per loss for this coverage. Answer: We have two ways to calculate the average payment per loss with the ordinary deductible of 1000. One of them is that the average payment per loss without the deductible is given as 800, and the loss elimination ratio of the 1000 deductible is 0.8, so the average payment per loss with the 1000 ordinary deductible is 800 (1 − 0.8)  160. The other way is that the average payment per payment is 720 and the probability of a payment is 1 − F (1000) , so the average payment per loss is 720 1 − F (1000) . Equating these two: 160  720 1 − F (1000)





2 9

1 − F (1000) 

Then the average payment per loss for the franchise deductible is 160 plus the additional payment of 1000 whenever the loss is above the deductible, or 160 + 1000 1 − F (1000)  160 + 1000 (2/9)  382 92





Alternatively, the average payment per payment under the franchise deductible is 1000 more than it would be with an ordinary deductible, or 720+1000  1720. The average payment per loss is the average payment per payment times the probability that a claim will be above 1000, or 1 − F (1000)  29 , or 1720 (2/9) , which comes out 382 29 , the same as above.  The next example combines inflation with LER. Example 7B An insurance coverage has an ordinary deductible of 500. Losses follow a two-parameter Pareto distribution with α  3, θ  1000. Calculate the reduction in the loss elimination ratio after 10% inflation as compared to the original loss elimination ratio. C/4 Study Manual—17th edition Copyright ©2014 ASM

119

7. LOSS ELIMINATION RATIO

120

Answer: The loss elimination ratio is E[X ∧ 500]/ E[X]. Checking the tables, we see that the formula for the limited expected value is

! α−1 θ * +/ E[X ∧ d]  E[X] .1 − θ+d ,  2 so the loss elimination ratio in our case is 1 − θ/ ( θ + 500) . For the original variable, this is 1000 1500

1−

!2 

5 9

For the inflated variable with θ  1000 (1.1)  1100, this is 1100 1− 1600

!2 

135 256

The reduction is (5/9) − (135/256)  0.02821 .



For exponentials and Paretos, the formula for E[X ∧ d] includes E[X] as a factor, which cancels out when calculating LER, so LER ( d )  1 − e −d/θ LER ( d )  1 −

θ d+θ

for an exponential

! α−1

for a Pareto with α > 1

You can also write a decent formula for LER of a single-parameter Pareto: LER ( d )  1 −

( θ/d ) α−1 α

α > 1, d ≥ θ

but for a lognormal, the formula isn’t so good.

Exercises 7.1.

[4B-S92:25] (2 points) You are given the following information:

Deductible Expected size of claim with no deductible Probability of a loss exceeding deductible Mean excess loss of the deductible Determine the loss elimination ratio. (A) (B) (C) (D) (E)

250 2500 0.95 2375

Less than 0.035 At least 0.035, but less than 0.070 At least 0.070, but less than 0.105 At least 0.105, but less than 0.140 At least 0.140

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 7

121

7.2. [4B-F92:18] (2 points) You are given the following information: • Deductible d • Expected value limited to d, E[X ∧ d] • Probability of a loss exceeding deductible, 1 − F ( d ) • Mean excess loss of the deductible, e ( d ) Determine the loss elimination ratio. (A) (B) (C) (D) (E)

500 465 0.86 5250

Less than 0.035 At least 0.035, but less than 0.055 At least 0.055, but less than 0.075 At least 0.075, but less than 0.095 At least 0.095 [4B-F99:1] (2 points) You are given the following:

7.3. •

Losses follow a distribution (prior to the application of any deductible) with mean 2,000.



The loss elimination ratio (LER) at a deductible of 1,000 is 0.30.



60 percent of the losses (in number) are less than the deductible of 1,000. Determine the average size of a loss that is less than the deductible of 1,000.

(A) (B) (C) (D) (E) 7.4.

Less than 350 At least 350, but less than 550 At least 550, but less than 750 At least 750, but less than 950 At least 950 You are given:

(i) The average loss below the deductible is 500. (ii) 60% of the number of losses is below the deductible. (iii) The loss elimination ratio at the deductible is 31%. (iv) Mean loss is 2000. Determine the deductible. 7.5.

[4B-S94:10] (2 points) You are given the following:

The amount of a single claim has a Pareto distribution with parameters α  2 and θ  2000. Calculate the Loss Elimination Ratio (LER) for a $500 deductible. (A) (B) (C) (D) (E)

Less than 0.18 At least 0.18, but less than 0.23 At least 0.23, but less than 0.28 At least 0.28, but less than 0.33 At least 0.33

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

7. LOSS ELIMINATION RATIO

122

7.6. [4B-S93:28] (3 points) You are given the following: •

The underlying loss distribution function for a certain line of business in 1991 is: F ( x )  1 − x −5



x > 1.

From 1991 to 1992, 10% inflation impacts all claims uniformly. Determine the 1992 Loss Elimination Ratio for a deductible of 1.2.

(A) (B) (C) (D) (E)

Less than 0.850 At least 0.850, but less than 0.870 At least 0.870, but less than 0.890 At least 0.890, but less than 0.910 At least 0.910

Use the following information for questions 7.7 and 7.8: You are given the following: •

Losses follow a Pareto distribution with parameters θ  k and α  2, where k is a constant.



There is a deductible of 2k.

7.7. [4B-F96:13] (2 points) What is the loss elimination ratio (LER)? (A) 1/3

(B) 1/2

(C) 2/3

(D) 4/5

(E) 1

7.8. [4B-F96:14] (2 points) Over a period of time, inflation has uniformly affected all losses, causing them to double, but the deductible remains the same. Calculate the new loss elimination ratio (LER). (A) 1/6

(B) 1/3

(C) 2/5

(D) 1/2

(E) 2/3

7.9. Losses follow a single-parameter Pareto distribution with α  3, θ  500. Determine the deductible d needed to achieve a loss elimination ratio of 20%. Losses follow a single-parameter Pareto distribution with α  3, θ  500.

7.10.

Determine the deductible d needed to achieve a loss elimination ratio of 80%. [4B-F93:27] (3 points) You are given the following:

7.11. •

Losses for 1991 are uniformly distributed on [0, 10,000].



Inflation of 5% impacts all losses uniformly from 1991 to 1992 and from 1992 to 1993 (5% each year). Determine the 1993 Loss Elimination Ratio for a deductible of $500.

(A) (B) (C) (D) (E)

Less than 0.085 At least 0.085, but less than 0.090 At least 0.090, but less than 0.095 At least 0.095, but less than 0.100 At least 0.100

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 7

123

Use the following information for questions 7.12 and 7.13: You are given the following: •

Losses follow a distribution with density function f (x ) 

−x/1000 1 , 1000 e

0 < x < ∞.



There is a deductible of 500.



10 losses are expected to exceed the deductible each year.

7.12. [4B-S97:19] (3 points) Determine the amount to which the deductible would have to be raised to double the loss elimination ratio (LER). (A) (B) (C) (D) (E)

Less than 550 At least 550, but less than 850 At least 850, but less than 1150 At least 1150, but less than 1450 At least 1450

7.13. [4B-S97:20] (2 points) Determine the expected number of losses that would exceed the deductible each year if all loss amounts doubled, but the deductible remained at 500. (A) (B) (C) (D) (E)

Less than 10 At least 10, but less than 12 At least 12, but less than 14 At least 14, but less than 16 At least 16

Use the following information for questions 7.14 and 7.15: Losses follow a lognormal distribution with parameters µ  6.9078 and σ  1.5174. 7.14. [4B-S99:20] (2 points) Determine the ratio of the loss elimination ratio (LER) at 10,000 to the loss elimination ratio (LER) at 1,000. (A) (B) (C) (D) (E)

Less than 2 At least 2, but less than 4 At least 4, but less than 6 At least 6, but less than 8 At least 8

7.15. [4B-S99:21] (2 points) Determine the percentage increase in the number of losses that exceed 1,000 that would result if all losses increased in value by 10%. (A) (B) (C) (D) (E)

Less than 2% At least 2%, but less than 4% At least 4%, but less than 6% At least 6%, but less than 8% At least 8%

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

7. LOSS ELIMINATION RATIO

124

Use the following information for questions 7.16 and 7.17: You are given the following: •

Losses follow a lognormal distribution with parameters µ  7 and σ  2.



There is a deductible of 2,000.



10 losses are expected each year.



The number of losses and the individual loss amounts are independent.

7.16. [4B-S96:9 and 1999 C3 Sample:17] (2 points) Determine the loss elimination ratio (LER) for the deductible. (A) (B) (C) (D) (E)

Less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20, but less than 0.25 At least 0.25

7.17. [4B-S96:10 and 1999 C3 Sample:18] (2 points) Determine the expected number of annual losses that exceed the deductible if all loss amounts are increased uniformly by 20%, but the deductible remained the same. (A) (B) (C) (D) (E)

Less than 4.0 At least 4.0, but less than 5.0 At least 5.0, but less than 6.0 At least 6.0, but less than 7.0 At least 7.0 [4B-S95:6] (3 points) You are given the following:

7.18. •

For 1994, loss sizes follow a uniform distribution on [0, 2500].



In 1994, the insurer pays 100% of all losses.



Inflation of 3.0% impacts all losses uniformly from 1994 to 1995.



In 1995, a deductible of 100 is applied to all losses. Determine the Loss Elimination Ratio (L.E.R.) of the deductible of 100 on 1995 losses.

(A) (B) (C) (D) (E)

Less than 7.3% At least 7.3%, but less than 7.5% At least 7.5%, but less than 7.7% At least 7.7%, but less than 7.9% At least 7.9%

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 7

7.19.

125

Losses in 2008 follow a distribution with density function 1 x f (x )  1− 500,000 1,000,000





0 ≤ x ≤ 1,000,000.

Reinsurance pays the excess of each loss over 100,000. 10% inflation impacts all losses in 2009. Let LER2008 be the reinsurance loss elimination ratio in 2008, and LER2009 the reinsurance loss elimination ratio in 2009. Determine LER2008 − LER2009 . 7.20.

[SOA3-F03:29] The graph of the density function for losses is: 0.012 0.010 f (x )

0.008 0.006 0.004 0.002 0.000 0

80 Loss amount, x

120

Calculate the loss elimination ratio for an ordinary deductible of 20. (A) 0.20 7.21.

(B) 0.24

(C) 0.28

(D) 0.32

(E) 0.36

[SOA3-F03:34] You are given:

(i) Losses follow an exponential distribution with the same mean in all years. (ii) The loss elimination ratio this year is 70%. (iii) The ordinary deductible for the coming year is 4/3 of the current deductible. Compute the loss elimination ratio for the coming year. (A) 70%

(B) 75%

(C) 80%

(D) 85%

(E) 90%

7.22. [CAS3-S04:20] Losses have an exponential distribution with a mean of 1,000. There is a deductible of 500. The insurer wants to double the loss elimination ratio. Determine the new deductible that achieves this. (A) 219

(B) 693

(C) 1,046

(D) 1,193

(E) 1,546

7.23. [SOA3-F04:18] Losses in 2003 follow a two-parameter Pareto distribution with α  2 and θ  5. Losses in 2004 are uniformly 20% higher than in 2003. An insurance covers each loss subject to an ordinary deductible of 10. Calculate the Loss Elimination Ratio in 2004. (A) 5/9

(B) 5/8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2/3

(D) 3/4

(E) 4/5

Exercises continue on the next page . . .

7. LOSS ELIMINATION RATIO

126

Use the following information for questions 7.24 and 7.25: Losses have the following distribution: F ( x )  1 − 0.4e −x/20 − 0.6e −x/2000 Insurance coverage is subject to an ordinary deductible of 100. 7.24.

Calculate the average payment per payment.

7.25.

Losses increase uniformly by 20% inflation.

Calculate the Loss Elimination Ratio after inflation. 7.26. For an insurance coverage, losses (before application of any deductible) follow a 2-parameter Pareto with parameters α  3 and θ  5000. The coverage is subject to a deductible of 500. Calculate the deductible needed to double the loss elimination ratio. Additional released exam questions: CAS3-F05:33, SOA M-F05:28

Solutions 7.1.

We’ll use formula (6.8) to evaluate E[X ∧ 250]. E[X ∧ d] E[X] − e ( d ) 1 − F ( d ) (2375)(0.95) LER ( d )   1−  0.0975 E[X] E[X] 2500





(C)

7.2. We’ll use formula (6.8) to evaluate E[X]. E[X]  E[X ∧ d] + e ( d ) 1 − F ( d )  465 + 5250 (0.86)  4980



LER ( d ) 

7.3.



465  0.0934 . (D) 4980

E[X ∧ 1000]  0.3 (2000)  600. Let x be the answer. Then 600  0.6x + 0.4 (1000) , so x  333 13 . (A)

7.4. We are given that 0.31  LER  can decompose E[X ∧ d] into

E[X∧d] E[X]

and E[X]  2000, so E[X ∧ d]  620. On the other hand, we

E[X ∧ d]  E[X ∧ d | X < d] Pr ( X < d ) + E[X ∧ d | X ≥ d] Pr ( X ≥ d )  (Average loss < d ) Pr ( X < d ) + d Pr ( X ≥ d )  500 (0.6) + d (0.4)

from which it follows that 620  300 + 0.4d, so d  800 . 7.5.

E[X]  2000. For E[X ∧ 500] you can use the formula:



E[X ∧ 500]  2000 1 − LER  C/4 Study Manual—17th edition Copyright ©2014 ASM

2000  400 2500

400  0.2 2000



(B)

EXERCISE SOLUTIONS FOR LESSON 7

127

If you wanted to back it out of e X (500) , the calculations would go: 2000 + 500  2500 1 !2 2000 1 − F (500)   0.64 2500 e (500) 

E[X ∧ 500]  E[X] − e (500) 1 − F (500)  2000 − 2500 (0.64)  400.





7.6. Losses X follow a single parameter Pareto with θ  1, α  5. Let Z  1.1X. The new θ for Z is 1.1. LER is the quotient of E[Z ∧ 1.2] over E[Z], and using the tables: αθ α−1 αθ θα E[Z ∧ d]  − α − 1 ( α − 1) d α−1 E[Z] 

d≥θ

θα

LER ( d )  1 − 1−

( α−1) d α−1 αθ α−1 θ α−1

αd α−1

So with θ  1.1, α  5, LER (1.2)  1 −

1.14  0.8588 5 (1.24 )

(B)

Of course, you could also work out the integral for E[Z] and E[Z ∧ 1.2] using the survival function—and this was probably expected by the exam setters, since they gave 3 points for this problem. Then you would get, since the survival function is 1 below 1.1 (Pr ( X < 1.1)  0) ∞

Z E[Z] 

0

1.1

Z 

S ( x ) dx 

0

1 dx +

1.15  1.1 + 4 E[Z ∧ 1.2] 

1.2

Z 0

1.1

Z 

0

!

Z

0 ∞

S ( x ) dx +



Z

!5

1.1

S ( x ) dx

1.1 dx x

1.1

1  1.375 1.14

!

S ( x ) dx  1 dx +

1.1

Z

Z

1.1

Z

1.2 1.1

0

S ( x ) dx +

!5

Z

1.2 1.1

1.1 dx x

1 1.15 1  1.1 + −  1.180832 4 4 1.1 1.24 1.180832 LER   0.8588 1.375

!

C/4 Study Manual—17th edition Copyright ©2014 ASM



S ( x ) dx

7. LOSS ELIMINATION RATIO

128

7.7. Using the formulas in the tables: E[X]  k k 2k E[X ∧ 2k]  k 1 −  3k 3

!

LER 

2 3

(C)

7.8. Double the scale parameter, so the new θ  2k. θ  2k E[X]  2k E[X ∧ 2k]  2k 1 − LER 

1 2

2k k 4k

!

(D)

7.9. Let X be loss size. We calculate the expected value of the loss, E[X]. We then calculate the deductible d such that E[X ∧ d]  0.2 E[X]; this is the deductible which achieves the required loss elimination ratio E[X∧d] E[X]  0.2. For a single-parameter Pareto, E[X] 

αθ 3 (500)   750 α−1 2

The probability that a single-parameter Pareto’s loss is less than θ is 0. This is more or less the same as saying that every loss is greater than 500. (While it is possible for a loss to be less than 500 with probability 0, events with probability 0 do not affect expected values.) Therefore, if we set d < 500, E[X ∧ d]  E[min ( X, d ) ]  d, because the minimum of X and d is the constant d. (Every loss is greater than 500, and therefore certainly greater than d.) To achieve E[X ∧ d]  0.2 (750)  150, we set d  150 . If you tried using the formula for E[X ∧ d] in the tables to back out d, you would obtain a d less than θ  500. But the formula in the tables only works for d ≥ θ, so the d you would get this way would not be correct. 7.10.

As in the previous exercise, E[X] 

αθ 3 (500)   750 α−1 2

We can’t use the method of the last exercise, though, since 0.8 (750) > θ, so we use the formula from the tables: αθ θα − α − 1 ( α − 1) d α−1 5003  750 − 2d 2 5003 LER  1 −  0.8 (2d 2 )(750) 5003  0.2 1500d 2

E[X ∧ d] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 7

129 5002  0.2 3d 2 5002  0.6 d2 500 d√  645.50 0.6

7.11.

For the inflated variable X 0 in terms of the original variable X E[X 0]  1.052 E[X]  1.052 (5000)

We can evaluate E[X 0 ∧ 500] by conditioning on X 0 < 500. If X 0 is greater, E[X 0 ∧ 500]  500; if it’s less, by uniformity it is 250 on the average. So E[X 0 ∧ 500]  250 Pr ( X 0 ≤ 500) + 500 Pr ( X 0 ≥ 500)  The quotient is LERX0 (500)  7.12.

5,387,500 250 (500) + 500 (11,025 − 500)  10,000 (1.052 ) 10,000 (1.052 )

5,387,500 E[X 0 ∧ 500]   0.088646 0 E[X ] 50,000,000 (1.054 )

(B)

Losses are exponential. Let X be loss size. The formula for the loss elimination ratio is LER ( d ) 

For a deductible of 500,

E[X ∧ d]  E[X]

θ 1 − e −d/θ



θ

  1 − e −d/θ  1 − e −d/1000

LER (500)  1 − e −500/1000  1 − e −1/2  0.39347

We must compute d such that LER ( d )  2 (0.39347)  0.78694.

LER ( d )  1 − e −d/1000  2 (0.39347)  0.78694 e −d/1000  1 − 0.78694  0.21306

d  −1000 (ln 0.21306)  1546.2

(E)

Note: the question is asking for the amount to which the deductible is raised; this means, the amount of the new deductible. If the question had asked "by how much is the deductible raised", then the answer would be the difference between the new deductible and the old deductible. 7.13. For an exponential, Pr ( X > x )  e −x/θ . Since there are 10 expected losses above the deductible of 500, the total expected number of losses is 10/ Pr ( X > 500)  10/e −500/1000  10e 1/2 . When losses are doubled, θ is doubled and becomes 2000. The expected number of losses above the deductible of 500 is then 10e 1/2 Pr ( X > 500)  10e 1/2 e −500/2000  10e 1/4  12.84 . (C) 7.14. By definition, the loss elimination ratio is E[X ∧ d]/ E[X], so the ratio of loss elimination ratios is the ratio of E[X ∧ d]’s. This exercise is asking for 2

1.5174 Φ ln 10,000−6.9078−1.5174 + 10,000 1 − F (10,000) E[X ∧ 10,000] exp 6.9078 + 2 1.5174       2 2 ln 1000−6.9078−1.5174 E[X ∧ 1000] exp 6.9078 + 1.5174 Φ + 1000 1 − F ( 1000 ) 2 1.5174 2





Let’s evaluate the normal distribution functions we need. ln 10,000 − 6.9078 − 1.51742  Φ (0)  0.5 1.5174

!

Φ

C/4 Study Manual—17th edition Copyright ©2014 ASM





 (*)

7. LOSS ELIMINATION RATIO

130

ln 10,000 − 6.9078 1 − F (10,000)  1 − Φ  1 − Φ (1.52)  0.0643 1.5174

!

ln 1000 − 6.9078 − 1.51742  Φ (−1.52)  0.0643 1.5174

!

Φ

ln 1000 − 6.9078  1 − Φ (0)  0.5 1 − F (1000)  1 − Φ 1.5174

!

Plugging these into (*), and also plugging in exp 6.9078 +



1.51742 2



 3162.29, we have

E[X ∧ 10,000] 3162.29 (0.5) + 10,000 (0.0643) 2224.15    3.16 E[X ∧ 1000] 3162.29 (0.0643) + 1000 (0.5) 703.34 7.15.

(B)

Let X be the original loss variable and let Z  1.1X be the inflated loss variable. 1 − FX (1000)  0.5 from previous exercise

ln 1000 − 6.9078 − ln 1.1  1 − Φ (−0.06)  0.5239 1 − FZ (1000)  1 − Φ 1.5174 0.5239 (C) − 1  0.0478 0.5

!

7.16. E[X]  exp (7 + 42 )  8103.08 ln 2000 − 7 − 4 ln 2000 − 7 + * / E[X ∧ 2000]  e Φ + 2000 .1 − Φ 2 2

!

9

!

,   e Φ (−1.7) + 2000 1 − Φ (0.3) 9





-



 e 9 (0.0446) + 2000 (0.3821)  1125.60 1125.60  0.1389 (B) LER  8103.08 7.17. The expected number of annual losses that exceed the deductible is the product of the expected number of losses and the probability that a single loss exceeds the deductible. (For example, if the probability that a loss exceeds the deductible is 0.2 and there are 5 losses, the expected number of losses exceeding the deductible is 5 (0.2)  1.) In this question, the expected number of losses is 10 and the probability of exceeding the deductible of 2000 is 1 − F (2000) . F is the distribution function of the inflated variable. For a lognormal distribution, inflating a variable by 20% is achieved by adding ln 1.2 to µ and not changing σ, as discussed in Section 2.1, page 29. Therefore, the expected number of losses above the deductible is 10 1 − F (2000)  10 .1 − Φ





* ,

C/4 Study Manual—17th edition Copyright ©2014 ASM

  ln 2000 − 7 − ln 1.2 + /  10 1 − Φ (0.21)  4.2 2 !

-

(B)

EXERCISE SOLUTIONS FOR LESSON 7

7.18.

131

The inflated variable X is uniform on [0, 2575].

2575 2 To calculate E[X ∧ 100] for a uniform random variable, treat it as a mixture: probability that it is under 100 times the midpoint, plus probability that it is over 100 times 100. E[X] 

2475 100 (50) + (100)  98.058 E[X ∧ 100]  2575 2575

!

The loss elimination ratio is

!

98.058  0.0762 2575/2

LERX (100) 

(C)

7.19. Let’s substitute y  x/1,000,000. Then fY ( y )  2 (1 − y ) , 0 ≤ y ≤ 1, a beta distribution with a  1, a b  2. E[Y]  a+b  31 . (You can also calculate this from basic principles.) E[Y ∧ 0.1]  0.1

Z 0

0.1

Z 0

2y (1 − y ) dy + 0.1 1 − F (0.1)





0.1

2y (1 − y ) dy  y 2 − 23 y 3 0

 0.01 − 23 (0.001) 0.028 28   3 3000 F (0.1) 

0.1

Z 0

2 (1 − y ) dy

0.1

 2y − y 2  0.19 0 28 + 0.1 (1 − 0.19)  E[Y ∧ 0.1]  3000 271/3000 LER2008   0.271 1/3

271 3000

After inflation, with Z  1.1Y E[Z]  1.1 E[Y] 

1.1 3

E[Z ∧ 0.1]  1.1 E Y ∧

0.1 1.1

f

1/11

Z 0



g

11 30

 1.1

1/11

Z 0

2y (1 − y ) dy + 1.1

1 11

 

1−F

1 11



1/11 1 2 31  2y (1 − y ) dy  y 2 − 23 y 3 −  121 3 · 1331 3993 0   Z 1/11 1/11  2 − 1  21 1 F 11  2 (1 − y ) dy  2y − y 2 0 11 121 121 0     E[Z ∧ 0.1]  1.1

31 3993

+ 0.1 1 −

31 10 331 +  3630 121 3630 331/3630 331   11/30 1331

21 121

 LER2009

LERY (0.1)  LERX (100,000) , because multiplying a random variable by a constant and multiplying the 331 deductible by the same constant does not affect the LER. So the final answer is 0.271 − 1331  0.02231 . C/4 Study Manual—17th edition Copyright ©2014 ASM

7. LOSS ELIMINATION RATIO

132

7.20. f ( x )  0.01 for 0 ≤ x ≤ 80. The line from 80 to 120 has slope −0.01/40 and equals 0 at 120, so an equation for it is 0.01 (120 − x ) /40. If you have good intuition, you can calculate E[X] using E[X]  Pr ( X ≤ 80) E[X | X ≤ 80] + Pr ( X > 80) E[X | X > 80] X is uniform below 80, so the expected value of it given that it is below 80 is 40. Given that X > 80, X has a beta distribution with a  1, b  2, θ  40, shifted 80; recall that such a beta distribution is of the form c (40 − x ) , and the graph is a decreasing line, so it fits this description. Such a beta has mean 80 + 40/3. Pr ( X ≤ 80) is the area of the graph up to 80, or 80 (0.01)  0.8. So E[X]  0.8 (40) + 0.2 (80 + 40/3)  50 32 If you couldn’t do it that way, the straightforward way of calculating E[X] is the definition: 120

Z E[X] 

0

Z 

80

0

x f ( x ) dx

0.01x dx +

 (0.01)

802 2

+

120

Z

0.01 40

80

Z

120

80

x3 0.01 60x 2 −  32 + 40 3  32 +

0.01 40

!

120 − x (0.01) x dx 40

!

(120x − x 2 ) dx

! 120 80

60 (1202 − 802 ) −

1203 − 803 3

!

 32 + 18 32  50 23 The limited expected value at the deductible of 20 is (since X is uniform on [0, 20]) 10 times Pr ( X ≤ 20) plus 20 times Pr ( X > 20) , or E[X ∧ 20]  10 (0.2) + 20 (0.8)  18 So the LER is 18/50 32  0.355263 . (E)

7.21.

For an exponential, the loss elimination ratio is E[X ∧ d]  1 − e −d/θ E[X]

We have e −d/θ  0.3 d  −θ ln 0.3 Using 34 d as the deductible leads to an LER of 1 − e (3) ln 0.3  1 − 0.34/3  0.79917 4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

EXERCISE SOLUTIONS FOR LESSON 7

133

7.22. Yes, you guessed it, the answer is A!!! More seriously, the LER for an exponential is LER ( d )  1 − e −d/θ  1 − e −500/1000  0.393469 Doubling this, we have 1 − e −d/1000  0.786938 e −d/1000  0.213062 d  −1000 ln 0.213062  1546.18 7.23.

(E)

The new θ is 1.2 (5)  6. E[X] 

6 6 1

6  3.75 6 + 10 3.75 5 LER  (B)  6 8





E[X ∧ 10]  6 1 −

7.24. The calculations for (limited) expected value of the mixture are done by taking weighted averages of the corresponding amounts for the two exponentials. E[X]  0.4 (20) + 0.6 (2000)  1208 E[X ∧ 100]  0.4 20 (1 − e −100/20 ) + 0.6 2000 (1 − e −100/2000 )  66.4707









S (100)  0.4e −100/20 + 0.6e −100/2000  0.5734

So the expected payment per payment is E[X] − E[X ∧ 100] 1208 − 66.4707   1990.70 . S (100) 0.5734 7.25. Inflation of the mixture is performed by inflating each component separately; the first exponential’s parameter becomes 20 (1.2)  24 and the second exponential’s parameter becomes 2000 (1.2)  2400. If the new loss variable is X 0, then E[X 0]  0.4 (24) + 0.6 (2400)  1449.60 E[X 0 ∧ 100]  0.4 24 (1 − e −100/24 ) + 0.6 2400 (1 − e −100/2400 )  68.218



LERX0 (100)  7.26.







68.218  0.0471 1449.60

To double the LER, it suffices to double E[X ∧ 500], since the denominator E[X] doesn’t change. E[X ∧ 500] 

!2

5000 * 5000 + /  433.884 .1 − 2 5500

,

-

!2 5000 * +/  2 (433.884)  867.77 E[X ∧ x]  2500 .1 − 5000 + x , C/4 Study Manual—17th edition Copyright ©2014 ASM

7. LOSS ELIMINATION RATIO

134

5000  5000 + x

r 1−

867.77  0.808018 2500

x  1187.98 Alternatively, note that the loss elimination ratio is E[X ∧ x] θ 1− E[X] θ+x

! α−1

and calculate 5000 LER (500)  1 − 5500

!2

 0.173554

!2

5000 LER ( x )  1 −  2 (0.173554)  0.347108 5000 + x √ 5000  1 − 0.347101  0.808018 5000 + x x  1187.98

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 8

Risk Measures and Tail Weight Reading: Loss Models Fourth Edition 3.4–3.5, 5.3.4 Broadly speaking, a risk measure is a real-valued function of a random variable. We use the letter ρ for a risk measure; ρ ( X ) is the risk measure of X. You can probably think of several real-valued functions of random variables: • Moments. E[X], Var ( X ) , etc. • Percentiles. For example, the median is a real valued function of X. • Premium principles. For example, the premium may be set equal to the expected loss plus a constant times the standard deviation of a loss, or ρ ( X )  µ X + kσX . This is called the standard deviation principle. However, the risk measures we are interested in are measures for the solvency of a company. In the insurance context, they are high positive numbers indicating how high company reserves or surplus should be to give comfort to regulators and the public that they can cover losses in an adverse scenario. Among the functions listed above, high percentiles, or the premium principle ρ ( X )  µ X + cσX with a suitable c may qualify as such a risk measure.

8.1

Coherent risk measures

Let’s list four desirable properties of risk measures: 1. Translation invariance. Adding a positive1 constant to the random variable should add the same constant to the risk measure. Or: ρ (X + c )  ρ (X ) + c This is reasonable, since the amount of reserves or surplus needed for a fixed loss equals that loss, no more and no less. A company faced with having to pay the random amount X + c could break up into two companies, one with the obligation to pay X and another with the obligation to pay c. The second company would have a reserve of c and the first company would have a reserve equal to the appropriate risk measure for X. 2. Positive homogeneity. Multiplying the random variable by a positive constant should multiply the risk measure by the same constant: ρ ( cX )  cρ ( X ) This is reasonable, since expressing the random variable in a different currency (for example) should not affect the surplus or reserve needed. 3. Subadditivity. For any two random losses X and Y, the risk measure for X + Y should not be greater than the sum of the risk measures for X and Y separately: ρ ( X + Y ) ≤ ρ ( X ) + ρ (Y )

1I’m not sure why Loss Models says positive, since this holds for positive constants if and only if it holds for all constants. C/4 Study Manual—17th edition Copyright ©2014 ASM

135

8. RISK MEASURES AND TAIL WEIGHT

136

This is reasonable, since combining losses may result in diversification and reducing the total risk measure, but it should not be possible by breaking a risk into two sub-risks to reduce the total risk measure. 4. Monotonicity. For any two random losses X and Y, if X is always less than Y, or even if the probability that X is less than or equal to Y is 1, then the risk measure for X should be no greater than the risk measure for Y. ρ ( X ) ≤ ρ ( Y ) if Pr ( X ≤ Y )  1

This is reasonable, since the reserves or surplus needed to cover adverse scenarios of Y will be adequate to cover X as well with probability 1.

Risk measures satisfying all four of these properties are called coherent. Example 8A Which of the properties of coherence are satisfied by each of the following premium principles? 1. Equivalence principle: ρ ( X )  E[X] 2. Expected value principle: ρ ( X )  k E[X] 3. Variance principle: ρ ( X )  E[X] + k Var ( X ) 4. Standard deviation: ρ ( X )  µ X + kσX Answer: 1. The equivalence principle satisfies all four properties and is therefore coherent. By the properties of expected value, Translation invariance E[X + c]  E[X] + c Positive homogeneity E[cX]  c E[X] Subadditivity E[X + Y]  E[X] + E[Y] Monotonicity If Pr ( X ≤ Y )  1, then Pr ( X −Y ≤ 0)  1, so E[X −Y] ≤ 0, which implies E[X] ≤ E[Y].

2. The expected value principle fails translation invariance since

ρ ( X + c )  k E[X + c]  k E[X] + kc , ρ ( X ) + c  k E[X] + c However, the other three properties are satisfied: Positive homogeneity k E[cX]  ck E[X] Subadditivity k E[X + Y]  k E[X] + k E[Y] Monotonicity If Pr ( X ≤ Y )  1, then Pr ( X − Y ≤ 0)  1, so k E[X − Y] ≤ 0, which implies k E[X] ≤ k E[Y].

3. The variance principle only satisfies translation invariance.

Translation invariance ρ ( X + c )  E[X + c] + k Var ( X + c )  E[X] + c + k Var ( X )  ρ ( X ) + c Positive homogeneity ρ ( cX )  c E[X] + kc 2 Var ( X ) , cρ ( X )  c E[X] + kc Var ( X ) Subadditivity The variance principle fails subadditivity since Var ( X + Y )  Var ( X ) + Var ( Y ) + 2 Cov ( X, Y ) so if Cov ( X, Y ) > 0 the variance of the sum will be greater than the sum of the variances. C/4 Study Manual—17th edition Copyright ©2014 ASM

8.2. VALUE-AT-RISK (VAR)

137

Table 8.1: Coherence properties of four premium principles

Translation invariance Positive homogeneity Subadditivity Monotonicity

Equivalence ! ! ! !

Expected value # ! ! !

Standard deviation ! ! ! #

Variance ! # # #

Monotonicity The variance principle fails monotonicity. For example, let Y be a constant loss of 100 and let X be a loss with mean 99, variance 2/k, and maximum 99.9. We can arrange for X to have this mean and variance, regardless of how small k > 0 is, by setting X equal to some small (possibly negative) number x with probability p and equal to 99.9 with probability 1 − p, so as to make its variance large enough yet make its mean 99. X is always less than Y, yet has a higher risk measure. 4. The standard deviation principle satisfies all properties except monotonicity. Translation invariance ρ ( X + c )  µ X+c + kσX+c  µ X + c + kσX  ρ ( X ) + c. Positive homogeneity ρ ( cX )  cµ X + kcσX  cρ ( X ) . Subadditivity The correlation is always no greater than 1, so Var ( X + Y )  Var ( X ) + Var ( Y ) + 2 Corr ( X, Y ) Var ( X ) Var ( Y ) ≤

p

p

Var ( X ) + Var ( Y )

p

2

Monotonicity The standard deviation principle fails for the same reason as the variance principle fails: a variable X may always be lower than a constant Y and yet have an arbitrarily high standard deviation.  Table 8.1 summarizes these results.

8.2

Value-at-Risk (VaR)

As indicated above, any percentile (or quantile) is a risk measure. This risk measure has a fancy name: Value-at-Risk, or VaR. Definition 1 The Value-at-Risk at security level p for a random variable X, denoted VaRp ( X ) , is the 100p th percentile of X: −1 VaRp ( X )  π p  FX (p ) 2 In practice, p is selected to be close to 1: 95% or 99% or 99.5%. For simplicity, the textbook only deals with continuous X, for which percentiles are well-defined. The tables list VaR for almost any distribution it can be calculated in closed form for. The only distributions for which VaR is not listed are lognormal and normal. We will calculate VaR for these two distributions, and for educational purposes we’ll also calculate VaR for other distributions. VaR for normal and lognormal distribution Let z p be the 100p th percentile of a standard normal distribution. Then, if X is normal, VaRp ( X )  µ + z p σ. VaR reduces to a standard deviation principle. If X is lognormal, then VaRp ( X )  e µ+z p σ . C/4 Study Manual—17th edition Copyright ©2014 ASM

8. RISK MEASURES AND TAIL WEIGHT

138

Example 8B Losses have a lognormal distribution with mean 10 and variance 300. Calculate the VaR at security levels 95% and 99%. Answer: Let X be the loss random variable. We back out parameters µ and σ by matching moments. The second moment is Var ( X ) + E[X]2  300 + 102  400. 2

e µ+0.5σ  10 2

e 2µ+2σ  400 µ + 0.5σ 2  ln 10 2µ + 2σ 2  ln 400 Subtracting twice the first equation from the second equation, σ2  ln 400 − 2 ln 10  5.9915 − 2 (2.3026)  1.3863 √ σ  1.3863  1.1774 µ  ln 10 − 0.5σ2  2.3026 − 0.5 (1.3863)  1.6094 The 95th percentile is The 99th percentile is

VaR0.95  e µ+1.645σ  e 1.6094+ (1.645)(1.1774)  34.68 VaR0.99  e 1.6094+ (2.326)(1.1774)  77.36



VaR for exponential distribution For X exponential with mean θ, if F ( x )  p, then e −x/θ  1 − p

x  −θ ln (1 − p )

so VaRp ( X )  −θ ln (1 − p ) . VaR for Pareto distribution For X following a two-parameter Pareto distribution with parameters α and θ, if F ( x )  p, then



θ 1−p θ+x p θ  α1−p θ+x θ θ+x p α 1−p θ 1−



x so VaRp ( X ) 

 √ α 1−p √ . α

p α

p α

1−p



1−p

θ 1−



1−p

Example 8C Losses follow a Pareto distribution with mean 10 and variance 300. Calculate the VaR at security levels 95% and 99%. C/4 Study Manual—17th edition Copyright ©2014 ASM

8.2. VALUE-AT-RISK (VAR)

139

Answer: Let X be the loss random variable. We back out parameters α and θ by matching moments. θ  10 α−1 2θ 2 E[X 2 ]   400 ( α − 1)( α − 2) E[X] 

We divide the square of the first equation into the second.

2 ( α − 1) 4 α−2 2α − 2  4α − 8 α3

Plugging this into the equation for E[X], we get θ  20. The 95th percentile of the Pareto distribution is x such that S ( x )  0.05. θ S (x )  θ+x

!3

 0.05

!3

20  0.05 20 + x √3 20  0.05  0.368403 20 + x 20 20 + x   54.2884 0.368403 VaR0.95  x  34.29 Similarly, the 99th percentile is x such that S ( x )  0.01. 20 20 + x

!3

 0.01

√3 20  0.01  0.215443 20 + x 20 VaR0.99  x  − 20  72.83 0.215443

!

? ?



Quiz 8-1 Losses X follow a Pareto distribution with parameters α  2 and θ  1000. An insurance company pays Y  max (0, X − 2000) for these losses. Calculate VaR0.99 ( Y ) . Quiz 8-2 Losses X follow a paralogistic distribution with α  2 and θ  1000. Calculate VaR0.99 ( X ) . VaR is not coherent. It satisfies translation invariance, positive homogeneity, and monotonicity, but not subadditivity. To see that it does not satisfy subadditivity, assume that X and Y are two mutually exclusive losses. Each one has a 3% probability of occurring, and each loss size is 1000 if it occurs. Then the 95th percentiles of X and Y are 0, while the 95th percentile of X + Y is 1000 since the loss of 1000 has C/4 Study Manual—17th edition Copyright ©2014 ASM

8. RISK MEASURES AND TAIL WEIGHT

140

a 6% probability of occurring. While X and Y are not continuous random variables, the 95th percentile is well-defined for X, Y, and X + Y in this example. This example is relevant for the insurance industry, particularly the segment insuring against catastrophes. It would be absurd for an insurance company to hold no reserve or surplus for a catastrophe whose probability of occurring is less than the VaR threshold. Therefore, VaR is not considered a good risk measure for insurance.

8.3

Tail-Value-at-Risk (TVaR)

Definition 2 The tail-value-at-risk of a continuous random variable X at security level p, denoted TVaRp ( X ) , is the expectation of the variable given that it is above its 100p th percentile: TVaRp ( X )  E X | X > VaRp ( X )

f

g 2

We will not discuss TVaR for discrete random variables, although if the 100p th percentile is welldefined the above definition may be used. This measure is also called Conditional Tail Expectation (CTE), Tail Conditional Expectation (TCE), and Expected Shortfall (ES). On later life exams, as well as life insurance regulations and literature, it is called CTE; however, on the P/C side it is often called TVaR. TVaR can be calculated directly from the definition as

R TVaRp ( X ) 

∞ VaRp ( X )

x f ( x ) dx

1 − F VaRp ( X )





The numerator is called the partial expectation of X given that X is greater than VaRp ( X ) . The term “partial expectation” is not used in the textbook and you are not responsible for it, but it is a convenient description of the numerator and I will use it. −1 Recall that VaRp ( X )  FX ( p ) , so the above equation can be rewritten as

R TVaRp ( X ) 

∞ −1 ( p ) FX

x f ( x ) dx

1−p

(8.1)

If we substitute y  F ( x ) , then x  F −1 ( y )  VaR y ( X ) and dy  F0 ( x ) dx  f ( x ) dx. The lower limit of the   integral becomes F F −1 ( p )  p, and the upper limit becomes F (∞)  1, so we get2

R TVaRp ( X ) 

1 VaR y ( X ) dy p

1−p

(8.2)

So TVaR can be calculated by integrating percentiles. However, I do not find this equation useful for calculating TVaR, since for most distributions percentiles are difficult to integrate. The only distributions for which percentiles are easy to integrate are beta distributions in which either a or b is 1, and it’s easy enough to calculate partial expectations for those using the original formula. Example 8D X is a uniform distribution on [0, 100]. Calculate TVaR0.95 ( X ) . 2The textbook says that this equation is derived by integration by parts and substitution, but I do not see why integration by parts is needed. C/4 Study Manual—17th edition Copyright ©2014 ASM

8.3. TAIL-VALUE-AT-RISK (TVAR)

141

S (x )

0.05 0.05 VaR0.95 (X ) 0

0

 0.05e VaR0.95 (X ) x

VaR0.95 (X )

Figure 8.1: Illustration of TVaR  0.95 for a continuous loss distribution. The shaded area is (1 − 0.95) TVaR0.95 , and consists of (1 − 0.95) e VaR0.95 ( X ) plus (1 − 0.95) VaR0.95 ( X ) .

Answer: We will use equation (8.2). For X, the 100p th percentile is 100p. (This is rather obvious, but if you must do this with algebra, note that FX ( x )  0.01x and then solve for the x that makes FX ( x )  p.) So

R TVaR0.95 ( X )  

1 100y dy 0.95

1 − 0.95

1 (100y 2 /2)

0.95

0.05 50 − 45.125  97.5  0.05 However, this result is intuitively obvious; the conditional expectation of a uniform given that it is between 95 and 100 is the midpoint.  A more useful equation comes from noting that the difference between the mean excess loss at   VaRp ( X ) , e X VaRp ( X ) , and TVaRp ( X ) is that the former averages only the excess over VaRp ( X ) whereas the latter averages the entire X. Therefore TVaRp ( X )  VaRp ( X ) + e X VaRp ( X )



(8.3)



Figure 8.1 illustrates this equation. The area under the curve is the integral of S ( x ) , the total  expected  value. The shaded region at the right is the partial expectation above VaRp ( X ) , or (1 − p ) e X VaRp ( X ) . The shaded rectangle at the left is (1 − p ) VaRp ( X ) . Formula (8.3) is especially useful for distributions where e ( x ) has a simple formula, such as exponential and Pareto distributions. The tables list TVaR for any distribution for which this can be calculated, except for normal and lognormal distributions. Thus there is no need for you to calculate it for other distributions. However, for educational purposes, we’ll calculate TVaR for other distributions as well. TVaR for exponential distribution We derived above that VaRp ( X )  −θ ln (1 − p ) . Therefore TVaRp ( X )  −θ ln (1 − p ) + θ  θ 1 − ln (1 − p )



C/4 Study Manual—17th edition Copyright ©2014 ASM



8. RISK MEASURES AND TAIL WEIGHT

142

Example 8E X is exponentially distributed with mean 1000. Calculate TVaR0.95 ( X ) and TVaR0.99 ( X ) . Answer: Using the formula we just derived, TVaR0.95  1000 (1 − ln 0.05)  3996 TVaR0.99  1000 (1 − ln 0.01)  5605



TVaR for Pareto distribution We derived the following formula for VaRp ( X ) above. θ 1−



VaRp ( X ) 

p α

p α

1−p



1−p

For a Pareto, e ( x )  ( θ + x ) / ( α − 1) (equation (6.10) on page 100). Thus TVaRp ( X )  VaRp ( X ) +

θ + VaRp ( X )

!α − 1 θ α +  VaRp ( X ) α−1 α−1 

 E[X] .1 +

* ,

1−p + / p α 1−p

α 1−

p α



(8.4)

-

Example 8F Losses follow a two-parameter Pareto distribution with mean 10 and variance 300. Calculate the tail-value-at-risk for the losses at security levels 95% and 99%. Answer: In example 8C, we calculated θ  20 and α  3. Using the above formula,

  √3 0.05 3 1 − +/  61.433 * TVaR0.95 ( X )  10 .1 + √3 0.05 ,  √3 3 1 − 0.01 + * /  119.248 TVaR0.99 ( X )  10 .1 + √3 0.01 ,

?



-

Quiz 8-3 Losses X follow a single-parameter Pareto with θ  10 and α  3. Calculate TVaR0.65 ( X ) . For random variables X following other distributions for which the tables give E[X∧x], we can translate equation (8.3) into f g E[X] − E X ∧ VaRp ( X ) (8.5) TVaRp ( X )  VaRp ( X ) + 1−p TVaR for lognormal distribution For a lognormal distribution with parameters µ and σ, the tables have

  ln x − µ − σ 2 + x 1 − F (x ) σ !

E[X ∧ x]  E[X]Φ C/4 Study Manual—17th edition Copyright ©2014 ASM

8.3. TAIL-VALUE-AT-RISK (TVAR)

143

Since F VaRp ( X )  p,





E X ∧ VaRp ( X )  E[X]Φ

f

g

ln VaRp ( X ) − µ − σ2 σ

!

+ VaRp ( X )(1 − p )

In equation (8.5), this expression is divided by 1 − p and subtracted. The last summand of this expression, VaRp ( X )(1 − p ) , when divided by 1 − p, cancels against the first summand of equation (8.5), VaRp ( X ) . Also, 1 − Φ ( x )  Φ (−x ) . So the formula for TVaR for a lognormal reduces to 1 − Φ ln VaRp ( X ) − µ − σ2



TVaRp ( X )  E[X] .

*

.  σ

1−p

,

+/ -

  ln exp ( µ + z p σ ) − µ − σ2 + * *. 1 − Φ . / +/ .. // σ , . //  E[X] . 1−p .. // . / , ! Φ(σ − zp

 E[X]

1−p

(8.6)

where, as usual, z p  Φ−1 ( p ) is the 100p th percentile of a standard normal distribution. Since TVaR is not listed in the tables for the lognormal distribution, you’ll have to decide whether to memorize equation (8.6) or to be prepared to derive it as needed. Example 8G Losses have a lognormal distribution with mean 10 and variance 300. Calculate the Tail-Value-at-Risk for these losses at the 95% and 99% security levels. Answer: In example 8B, we backed out µ  1.6094 and σ  1.1774. Therefore TVaR0.95 ( X )  10

Φ (−0.47) (10)(0.3192) Φ (1.174 − 1.645)  10   63.84 0.05 0.05 0.05

!

!

(10)(0.1251) Φ (−1.15) Φ (1.174 − 2.326)  10   125.1 TVaR0.99 ( X )  10 0.01 0.01 0.01 !

!



TVaR for normal distribution The tables do not include the normal distribution. For a normal random variable X with mean µ and variance σ2 , X  µ + σZ where Z is standard normal, so   x − µ x − µ E[X | X > x]  E µ + σZ | Z >  µ+σE Z | Z > σ σ so if we calculate the TVaR for a standard normal random variable, we can transform it to get the TVaR for any normal random variable. We will calculate TVaR using the definition, equation (8.1). In the following derivation, VaRp ( Z ) is the 100p th percentile of a standard normal distribution, not of X. The numerator is 1

(1 − p ) TVaRp ( Z )  √



Z

∞ VaRp ( Z )

xe −x

2 /2

dx

∞ 1  −x 2 /2   √ −e 2π VaRp (Z )



C/4 Study Manual—17th edition Copyright ©2014 ASM

e − VaRp ( Z ) √ 2π

2 /2

8. RISK MEASURES AND TAIL WEIGHT

144

so TVaRp ( Z ) is the final expression divided by 1 − p. Also, VaRp ( Z )  Φ−1 ( p )  z p . So for X following a general normal distribution, 2

σ e −z p /2 √ 1 − p 2π φ (zp ) µ+σ 1−p

TVaRp ( X )  µ +

(8.7)

This is a standard deviation principle.3 Since TVaR is not listed in the tables for the normal distribution, you’ll have to decide whether to memorize equation (8.7) or to be prepared to derive it as needed. Example 8H Losses have a normal distribution with mean 10 and variance 300 Calculate the Tail-Value-at-Risk at the 95% and 99% security levels for the loss distribution. Answer: Using formula (8.7), TVaR0.95 ( X )  10 +  10 + TVaR0.99 ( X )  10 +  10 +

√ 300 exp (−1.6452 /2) √ 0.05 2π 17.3205 (0.1031)  45.73 0.05 √ 300 exp (−2.3262 /2) √ 0.01 2π 17.3205 (0.02665)  56.16 0.01



Note the following properties of TVaR: 1. TVaR is coherent. 2. TVaR0 ( X )  E[X]. 3. TVaRp ( X ) ≥ VaRp ( X ) , with equality holding only if VaRp ( X )  max ( X ) .

8.4

Tail Weight

Questions on the material in this section appear in pre-2007 CAS 3 exams. To my knowledge there haven’t been any questions on this material since 2007. Parametric distributions are often used to model loss size. Parametric distributions vary in the degree to which they allow for very large claims. Tail weight describes how much weight is placed on the tail of the distribution. The bigger the tail weight of the distribution, the more provision for high claims. The following quantitative measures of tail weight are available: 1. The more positive raw or central moments exist, the less the tail weight. For a gamma distribution, all positive raw moments exist, but for a Pareto, only the k th moment for k < α exists. Thus the lower the α of the Pareto, the fatter the tail. 3Notice that φ ( x ) is not the same as Φ ( x ) . The former is the probability density function of a standard normal distribution: 2

e −x /2 φ (x )  √ 2π The standard normal distribution function Φ ( x ) is the integral from −∞ to x of the density function. C/4 Study Manual—17th edition Copyright ©2014 ASM

8.4. TAIL WEIGHT

145

Gamma Pareto

0.1 0.01 0.001 0.0001 10−5 10−6 10−7 10−8

0

25

50

75

100

125

150

175

200

Figure 8.2: Comparison of densities for Pareto and gamma with equal means and variances

2. To compare two distributions, the limits of the ratios of the survival functions, or equivalently the ratios of the density functions, can be examined as x → ∞. A ratio going to infinity implies the function in the numerator has heavier tail weight. For example, comparing an exponential with parameter θ to a gamma distribution with parameters α and θα (so that they have the same mean), if we divide the density of the exponential by the density of the gamma and ignore constants, the quotient is e −x/θ  x 1−α e − (1−α ) x/θ x α−1 e −αx/θ If α > 1, the e ( α−1) x/θ will go to infinity faster than a power of x goes to zero (use L’Hospital if necessary), so the exponential distribution has the fatter tail, and the opposite holds if α < 1. The textbook demonstrates that a Pareto’s density function is higher than a gamma’s density function as x goes to ∞, Its example is a Pareto with parameters α  3 and θ  10 compared to a gamma with parameters α  1/3 and θ  15. Both of these have the same mean (5) and variance (75). To make the difference sharper, I’m providing my own graph where I use a logarithmic scale (instead of the book’s linear scale). See Figure 8.2 to see the difference. The gamma swoops down in a straight line, just like an exponential (for high x, a gamma behaves like an exponential), while the Pareto’s descent slows down. We will soon see that the lower the α for a gamma distribution, the heavier the tail weight, yet even with α  13 for the gamma, its tail weight is less than that of a Pareto. 3. An increasing hazard rate function means a lighter tail and a decreasing one means a heavier tail. An exponential distribution has a constant hazard rate function, so it has a medium tail weight. The textbook shows that for a gamma distribution, limx→∞ h ( x )  θ1 , the same as the hazard rate of an exponential. If α > 1, the hazard rate increases; if α < 1, the hazard rate decreases. For a two-parameter Pareto distribution, h (x )  which decreases as x → ∞. C/4 Study Manual—17th edition Copyright ©2014 ASM

f (x ) αθ α  S (x ) ( x + θ ) α+1

!

x+θ θ

!α 

α x+θ

8. RISK MEASURES AND TAIL WEIGHT

146

For a Weibull distribution, the hazard rate is h ( x )  versa, so the higher the τ, the lighter the tail.

τx τ−1 θτ ,

so if τ > 1, then this increases and vice

So far, we only have three classifications: light tail (increasing hazard rate function), medium tail (constant hazard rate function) and heavy tail (decreasing hazard rate function). The textbook then says that to compare the tail weights of two distributions, one should compare the rates of increase of the hazard rate functions. The higher the rate of increase, the lighter the tail. This implies that you should differentiate the hazard rate function. If you do this for a two-parameter Pareto, you find h0 ( x )  −

α (x + θ)2

Larger α will make h 0 ( x ) lower, which by the above logic implies a heavier tail. Yet all of the other measures imply that the higher the α, the lighter the tail. In fact, holding the mean fixed, the limiting distribution as α → ∞ is an exponential. In private correspondence, Klugman agreed that this measure of tail weight—rate of increase of hazard rate function—does not work for comparing Paretos. I think the correct measure of which distribution has a heavier tail weight would be the difference in hazard rate functions as x → ∞. If you compare two Pareto’s, one with parameter α 1 and the other with parameter α2 > α1 , then α1 α2 − α1 , the ratio of mean excess losses is

( θ + x ) / ( α 1 − 1) α 2 − 1  >1 ( θ + x ) / ( α 2 − 1) α 1 − 1 implying that the Pareto with the lower α has the higher tail weight. Using this measure of tail weight, higher tail weight is equivalent to higher TVaR. A decreasing hazard rate function implies an increasing mean excess loss and an increasing hazard rate function implies a decreasing mean excess loss, but the converse isn’t necessarily true—an increasing mean excess loss doesn’t necessarily imply a decreasing hazard rate function. Also, lim e ( d )  lim

d→∞

d→∞

1 h (d )

Since for a gamma distribution limd→∞ h ( d )  θ1 , limd→∞ e ( d )  θ, but e (0)  αθ, so the mean excess loss increases if and only if α is less than 1. The textbook has a discussion of the equilibrium distribution Fe ( x ) , which is defined by the probability density function S (x ) fe (x )  E[X] C/4 Study Manual—17th edition Copyright ©2014 ASM

8.5. EXTREME VALUE DISTRIBUTIONS

147

which is a legitimate density function since R

x 0

S ( u ) du E[X]

R

∞ 0

S ( x ) dx  E[X]. The distribution function is Fe ( x ) 

and the expected value of the equilibrium random variable X e is E[X e ] 

E[X 2 ] 2 E[X]

Using this distribution, it proves that if e ( x ) ≥ e (0) for all x, which would be true for a decreasing hazard rate, then the coefficient of variation is at least 1, and if e ( x ) ≤ e (0) for all x, then the coefficient of variation is at most 1. I doubt the equilibrium distribution will be tested on. Example 8I X follows an exponential distribution with mean θ. Determine the distribution function of the equilibrium distribution for X, or X e . Answer: S ( x )  e −x/θ . So

e −x/θ θ which is the density function of an exponential with mean θ. So X e has the same distribution as X. fe (x ) 

Example 8J X follows a two-parameter Pareto distribution with parameters α and θ. Determine the distribution function of the equilibrium distribution for X, or X e . Answer: S ( x )  θ/ ( θ + x )





. So

 fe (x ) 

θ/ ( θ + x )



θ/ ( α − 1)



( α − 1) θ α−1 (θ + x )α

which is the density function of a Pareto distribution with parameters α − 1 and θ.

8.5



Extreme value distributions

The following material was added to the syllabus for October 2013. It is purely descriptive (no calculations), so exam coverage of it is bound to be light. We will mention two heavy-tailed distributions used for risk management. The first distribution arises as the limit of the maximum of a set of observations as the size of the set goes to infinity. The Fisher-Tippett Theorem states that this limit has only one of three possible distributions. Of the three, we will mention only one of them, the Fréchet distribution. This is the same as the inverse Weibull distribution. The second distribution arises as the limit of the excess loss random variable as the truncation point (the deductible) goes to infinity. The Balkema-de Haan-Pickands Theorem states that this limit has only one of three possible distributions. We will mention only two of them. You could guess what these two are. We’ve learned that the excess loss of an exponential is exponential and the excess loss of a Pareto is Pareto, so these two are certainly limiting distributions. And in fact, an exponential distribution is the limiting distribution for excess loss for lighter-tailed distributions, while the Pareto distribution is a limiting distribution for excess loss for other distributions. Thus a Pareto distribution may be a good model for high-deductible insurances. (The exponential distribution is not heavy-tailed, so we didn’t count it when we said that we’ll mention two heavy-tailed distributions.) Extreme value theory calls the set of limiting distributions for mean excess loss “generalized Pareto“, but this is not the same distribution as the generalized Pareto in the Loss Models appendix. C/4 Study Manual—17th edition Copyright ©2014 ASM

8. RISK MEASURES AND TAIL WEIGHT

148

Table 8.2: Summary of Risk Meaaures

Coherent Risk Measures 1. Translation independent: ρ ( X + c )  ρ ( X ) + c 2. Positive homogeneous: ρ ( cX )  cρ ( X ) 3. Subadditive: ρ ( X + Y ) ≤ ρ ( X ) + ρ ( Y )

4. Monotonic: ρ ( X ) ≤ ρ ( Y ) if Pr ( X ≤ Y )  1

−1 Value-at-Risk: VaRp ( X )  π p  FX (p )

Tail-Value-at-Risk: TVaRp ( X )  E X | X > VaRp ( X )

f

R



x f ( x ) dx

1 − F VaRp ( X )



R 

∞ VaRp ( X )

g



1 VaR y ( X ) dy p

1−p

(8.2)

 VaRp ( X ) + e X VaRp ( X )



(8.3)



E[X] − E X ∧ VaRp ( X )

f

 VaRp ( X ) +

g

Distribution

VaRp ( X )

TVaRp ( X )

Exponential

−θ ln (1 − p )

θ 1 − ln (1 − p )

θ 1−



Pareto Normal Lognormal

p α

p α

1−p

1−p

µ + zp σ e

µ+σz p

(8.5)

1−p





α 1−



E[X] .1 +

* ,



1−p + / p α 1−p



p α

-

µ+σ

E[X]

φ (zp ) 1−p

Φ(σ − zp )

!

1−p

The tables you get at the exam have exponential and Pareto VaR’s and TVaR’s listed, but not normal or lognormal ones.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 8

149

Table 8.3: Summary of Tail Weight and Extreme Value Distributions

Tail Weight Measures 1. The more positive moments, the lower the tail weight. 2. Ratios of survival or density function: if limit at infinity is greater than 1, numerator has higher tail weight. 3. Increasing hazard rate function implies lighter tail. 4. Increasing mean excess loss implies heavier tail. Equilibrium distribution is defined by f e ( x )  S ( x ) / E[X]. Its mean is E[X e ]  E[X 2 ]/2 E[X]. Extreme Value Distributions A limiting distribution function for the maximum of a sample is the inverse Weibull (or Fréchet) distribution. A limiting distribution function for excess loss as deductible goes to infinity is the Pareto distribution. (Exponential for light-tailed distributions.)

Exercises Risk measures 8.1. Consider the exponential premium principle: ρ (X ) 

ln E[e αX ] , α

α>0

Which of the four coherence properties does the exponential premium principle satisfy? 8.2. Losses follow a Weibull distribution with τ  2, θ  500. Calculate the Value-at-Risk of losses at the 95% security level, VaR0.95 ( X ) . (A) (B) (C) (D) (E)

Less than 875 At least 875, but less than 900 At least 900, but less than 925 At least 925, but less than 950 At least 950

8.3. Losses follow an inverse exponential distribution with θ  1000. Determine the Value at Risk at 99%. 8.4. Losses follow a single-parameter Pareto distribution. Let X be the random variable for losses. You are given: (i) VaR0.95 ( X )  11,052 (ii) VaR0.99 ( X )  32,317 Determine VaR0.995 ( X ) . C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

8. RISK MEASURES AND TAIL WEIGHT

150

8.5. For an insurance company, losses follow a mixture of two Pareto distributions with equal weights. For the first Pareto distribution, α  1 and θ  1000. For the second Pareto distribution, α  2 and θ  1000. Calculate the Value at Risk at 99% for the mixture. 8.6.

X is a random variable for losses. X follows a beta distribution with θ  1000, a  2, b  1.

Calculate TVaR0.90 ( X ) . 8.7. For an insurance company, losses follow a lognormal distribution with parameters µ  5, σ  2. Calculate the Tail-Value-at-Risk at the 90% security level. 8.8.

Annual losses follow a Pareto distribution with mean 100, variance 20,000.

Calculate the Tail-Value-at-Risk at the 65% security level. 8.9. Annual losses follow a Pareto distribution with parameters α  2 and θ  100. The Tail-Value-atRisk at a certain security level is 1900. Determine the security level. 8.10. Annual losses follow a normal distribution with mean 100, variance 900. A company calculates its risk measure as the Tail-Value-at-Risk of losses at the 90% security level. It would calculate the same risk measure if it used Value-at-Risk at the p security level. Determine p. Use the following information for questions 8.11 and 8.12: Annual aggregate losses follow an exponential distribution with mean 1000. Let X be the random variable for annual aggregate losses. 8.11.

Calculate the difference between TVaR0.95 ( X ) and VaR0.95 ( X ) .

8.12.

Calculate the absolute difference between TVaR0.99 ( X ) and TVaR0.95 ( X ) .

8.13.

Losses X follow a normal distribution. You are given

(i) TVaR0.5 ( X )  67.55 (ii) TVaR0.8 ( X )  80.79 Determine TVaR0.9 ( X ) . Tail weight 8.14. Random variable X1 with distribution function F1 and probability density function f1 has a heavier tail than random variable X2 with distribution function F2 and probability density function f2 . Which of the following statements is/are true? (More than one may be true.) I. II.

X1 will tend to have fewer positive moments than X2 . The limiting ratio of the density functions, f1 / f2 , will go to infinity.

III.

The hazard rate of X1 will increase more rapidly than the hazard rate of X2 .

IV.

The mean residual life of X1 will increase more rapidly than the mean residual life of X2 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 8 [CAS3-F03:16] Which of the following are true based on the existence of moments test?

8.15. I.

151

The Loglogistic Distribution has a heavier tail than the Gamma Distribution.

II.

The Paralogistic Distribution has a heavier tail than the Lognormal Distribution.

III.

The Inverse Exponential has a heavier tail than the Exponential Distribution.

(A) I only

(B) I and II only

(C) I and III only

(D) II and III only

(E) I, II, and III

[CAS3-F04:27] You are given:

8.16. •

X has density f ( x ) , where f ( x )  500,000/x 3 , for x > 500 (single-parameter Pareto with α  2).



Y has density g ( y ) , where g ( y )  ye −y/500 /250,000 (gamma with α  2 and θ  500). Which of the following are true for sufficiently high x and y?

1. 2. 3.

X has an increasing mean residual life function. Y has an increasing hazard rate. X has a heavier tail than Y based on the hazard rate test.

(A) 1 only.

(B) 2 only.

(C) 3 only.

(D) 2 and 3 only.

(E) All of 1, 2, and 3.

8.17. A catastrophe reinsurance policy has a high deductible. You are modeling payments per loss for this policy. Based on extreme value theory, which of the following probability density functions may be appropriate for this model? (A)

f ( x )  5x 4 e −x

(B)

f (x ) 

(C) (D) (E)

5 / (5×105 )

5 × 105 e −10 x6 5 × 105 f (x )  (100 + x ) 6 500x 4 f (x )  (100 + x ) 6 1005 e −100/x f (x )  24x 6

5 /x 5

Solutions 8.1. •

Translation invariance. ρ (X + c ) 



ln E[e α ( X+c ) ] ln e αc + ln E[e αX ]   c + ρ (X ) α α

!

Positive homogeneity. ln E[e cαX ] α A simple counterexample for α  1 is if X only assumes the values 0 and 1 with probabilities 0.5 and c  2. Then ρ (2X )  ln 0.5 (1 + e 2 ) , 2ρ ( X )  2 ln 0.5 (1 + e ) # ρ ( cX ) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

8. RISK MEASURES AND TAIL WEIGHT

152



Subadditivity. ln E[e α ( X+Y ) ] α If X and Y are independent, then ρ ( X + Y )  ρ ( X ) + ρ ( Y ) . However, if X and Y are not independent, there is no reason ρ ( X + Y ) ≤ ρ ( X ) + ρ ( Y ) . For example, if we use the counterexample for positive homogeneity for X and let Y  X, then ρ (2X )  ln 0.5 (1 + e 2 )  1.4338 > 2ρ ( X )  2 ln 0.5 (1 + e )  1.2402. # ρ (X + Y ) 



Monotonicity. If X ≤ Y always, then e αX ≤ e αY , from which it follows that E[e αX ] ≤ E[e αY ]. !

Only translation invariance and monotonicity are satisfied. 8.2.

You can look this up in the tables. VaR0.95 ( X )  θ − ln (1 − 0.95)



 1/τ

 500 (− ln 0.05) 1/2  865.41

8.3. We need the 99th percentile of the loss distribution. Let it be x. Then e −1000/x  0.99 1000  − ln 0.99 x 1000 x−  99,499 ln 0.99 8.4. The formula for VaR, from the tables, is VaRp ( X )  θ (1 − p ) −1/α Therefore, θ (0.05) −1/α  11,052 θ (0.01) −1/α  32,317 Dividing the second into the first, 5−1/α  0.341987 1 − ln 5  ln 0.341987 α ! 1 ln 5 − ln 2  ln 0.341987 α ln 2 2−1/α  0.341987ln 5/ ln 2  0.629954 It follows that the VaR at 99.5% is VaR0.995 ( X )  θ (0.005) −1/α θ (0.01) −1/α 2−1/α 32,317   51,301 0.629954



C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

EXERCISE SOLUTIONS FOR LESSON 8

153

8.5. We need the 99th percentile of the mixture. The survival function of the mixture is the weighted average of the survival functions of the components. The survival function is 0.01 when the cumulative distribution function is 0.99. Letting x be the 99th percentile, 1000 1000 + 0.5 S ( x )  0.5 1000 + x 1000 + x

!

!2

 0.01

For convenience, let y  1000/ (1000 + x ) . 0.5y 2 + 0.5y  0.01 y 2 + y − 0.02  0 √ −1 + 1 + 0.08 y  0.01961524 2 y must be positive, so we reject the negative solution to the quadratic. 1000  0.01961524 1000 + x 1000 − 1000  49,980.76 x 0.01961524 8.6. The density function for this beta is f ( x )  2x/10002 , 0 ≤ x ≤ 1000. First we calculate the 90th percentile. x

x 2u du  F (x )  2 1000 1000 0 √ x  1000 0.9 √ The partial expectation above x  1000 0.9 is

Z

(1 − p ) TVaR0.9 ( X )  

1000

Z x

!2

2u 2 du 10002 1000

2u 3 3 (10002 ) x

2000 1 − 0.93/2





 0.9

3

  97.4567

Dividing by 1 − p  0.1, we get 974.567 . The same result could be obtained using equation (8.1). The 100y th percentile is

!2

x p F (x )  1000 √ x  1000 p and integrating this,

  3/2 1 2000 1 − 0.9 √ 2 (1 − p ) TVaR0.9 ( X )  1000 y dy  1000 y 3/2  3 3 0.9 0.9 Z

as above. C/4 Study Manual—17th edition Copyright ©2014 ASM

1

8. RISK MEASURES AND TAIL WEIGHT

154

8.7. By formula (8.6), TVaR0.9 ( X )  e

µ+0.5σ 2

Φ ( σ − z 0.9 ) 0.1

!

2

 10e 5+0.5 (2) Φ (2 − 1.282)  10,966Φ (0.72)  8380

8.8. Let X be annual losses. If we wish to do this by first principles, our first task is to calculate the 65th percentile of X. The second moment is E[X 2 ]  Var ( X ) + E[X]2  20,000 + 1002  30,000. We back out the Pareto parameters. θ  100 α−1 2θ 2 E[X 2 ]   30,000 ( α − 1)( α − 2) E[X] 

Dividing the square of the first into the second, 2 ( α − 1) 3 α−2 3α − 6  2α − 2 α4

From the equation for E[X], θ  100 4−1 θ  300 The 65th percentile is x such that F ( x )  0.65 or S ( x )  0.35.

!4

300  0.35 S (x )  300 + x 300  0.351/4 300 + x 300 x − 300  90.0356 0.351/4 For a Pareto, the mean excess loss is e ( x )  ( θ + x ) / ( α − 1) . The TVaR is x + e ( x ) . Here we have TVaR0.65 ( X )  90.0356 +

300 + 90.0356  220.047 3

To do this using formula (8.4), we would back out α and then have

  √4 4 1 − 0.35 + * /  220.047 TVaR0.65 ( X )  100 .1 + √4 0.35 , C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 8

155

8.9. Let’s call the security level p. By equation (8.4), 2 (1 − 1 − p )

p

100 *1 +

,

+  1900 p 2 (1 − 1 − p )  18 p p

1−p

1−p

1− 1−p 9 1−p

p

p

p

1 − p  0.1

1 − p  0.12  0.01 p  0.99

8.10.

By equation (8.7),

φ ( z0.9 ) 0.1

TVaR0.9 ( X )  µ + σ

!

while VaRp ( X )  µ + σz p . So we can ignore µ and σ2 and set z p  φ ( z0.9 ) /0.1. 2

φ ( z 0.9 ) e −1.282 /2 0.4397    1.7540 √ 0.1 0.2507 0.1 2π z p  1.7540 p  Φ (1.75)  0.9599 8.11. TVaR0.95 ( X )  VaR0.95 ( X ) + eVaR0.95 ( X ) . But for an exponential, mean excess loss e ( x )  θ regardless of x. So TVaR0.95 ( X )  VaR0.95 ( X ) + θ and the difference is θ  1000 . 8.12. As we saw in the previous exercise, we can equivalently calculate the absolute difference between the VaR0.99 ( X ) and VaR0.95 ( X ) , since TVaRp ( X ) is always the p th quantile plus 1000. The p th quantile VaRp ( X ) is e − VaRp ( X )/1000  1 − p

VaRp ( X )  −1000 ln (1 − p )

VaR0.95 ( X )  −1000 ln 0.05  2995.73

VaR0.99 ( X )  −1000 ln 0.01  4605.17

The difference is 4605.17 − 2995.73  1609.44 8.13.

We calculate k p  φ ( z p ) / (1 − p ) for p  0.5, 0.8, 0.9. 2

k0.5 k0.8

φ ( z0.5 ) e −0 /2    0.7979 √ 0.5 0.5 2π 2 φ ( z0.8 ) e −0.842 /2    1.3994 √ 0.2 0.2 2π 2

k0.9 

C/4 Study Manual—17th edition Copyright ©2014 ASM

φ ( z0.9 ) e −1.282 /2   1.7540 √ 0.1 0.1 2π

8. RISK MEASURES AND TAIL WEIGHT

156

From equation (8.7), TVaRp ( X )  µ + k p σ. It follows that σ so

TVaR0.9 ( X ) − TVaR0.5 ( X ) TVaR0.8 ( X ) − TVaR0.5 ( X )  k0.9 − k0.5 k0.8 − k0.5

TVaR0.9 ( X )  TVaR0.5 ( X ) + TVaR0.8 ( X ) − TVaR0.5 ( X )



 67.55 + (80.79 − 67.55) 8.14.

 k0.9 − k0.5 ! k0.8 − k0.5

1.7540 − 0.7979  88.55 1.3994 − 0.7979

!

As discussed in the lesson, all of these statements are true except III.

8.15. I.

The loglogistic distribution only has some positive moments and the gamma has all, so I is true. !

II.

The paralogistic distribution only has some positive moments and the lognormal has all, so II is true. !

III.

The inverse exponential distribution has no k th moments for k ≥ 1 and the exponential has all, so III is true. !

(E) 8.16. 1.

For a single-parameter Pareto, the mean residual life is an increasing linear function, so 1 is true. !

2.

The hazard rate of a gamma is difficult to compute because of the Γ ( α ) in the denominator of S ( y ) . However, the textbook provides a clever proof that it increases when α > 1, so 2 is true. !

3.

The hazard rate of a single-parameter Pareto is easy to compute; use the formula h (x )  −

d ln S ( x ) dx

Then ln S ( x )  α (ln θ − ln x ) d ln S ( x ) α −  dx x It decreases. Thus 3 is true. !(E) 8.17. A Pareto distribution would be appropriate for the limit of the mean excess loss as the deductible goes to infinity. (C) is a Pareto density with α  5 and θ  10.

Quiz Solutions √ √ 8-1. By the Pareto formula that we developed, the 99th percentile of the Pareto is 1000 (1− 0.01) / 0.01  9000. After deducting 2000, the VaR is 7000 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

QUIZ SOLUTIONS FOR LESSON 8 8-2. From the tables,

157

VaRp ( X )  θ (1 − p ) −1/α − 1) 1/α



Substituting p  0.99, α  2, θ  1000,

VaR0.99 ( X )  1000 (0.01−1/2 − 1) 1/2  3000 Or from basic principles: We want the 99th percentile, so set F ( x )  0.99 and solve for x.

!2

1 1−  0.99 1 + ( x/1000) 2 √ 1  0.01  0.1 1 + ( x/1000) 2

!2

x 1+  10 1000 √ x  93 1000 x  3000 8-3. From the tables, TVaRp ( X )  Substituting p  0.65, α  3, and θ  10, TVaR0.65 ( X ) 

αθ (1 − p ) −1/α α−1

30 (0.35) −1/3  21.2847 2

Or from first principles: The 65th percentile is x such that 10 x

!3

 0.35

x  √3

10 0.35

 14.1898

For a single-parameter Pareto, e ( x )  x/ ( α − 1) , which is 14.1898/2  7.0949 here. So TVaR0.65 ( X )  14.1898 + 7.0949  21.2847 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

158

C/4 Study Manual—17th edition Copyright ©2014 ASM

8. RISK MEASURES AND TAIL WEIGHT

Lesson 9

Other Topics in Severity Coverage Modifications Reading: Loss Models Fourth Edition 8.5 A coverage may have both a policy limit and a deductible, and we then need to specify the order of the modifications. We distinguish the policy limit from the maximum covered loss: Policy limit is the maximum amount that the coverage will pay. In the presence of a deductible or other modifications, perform the other modifications, then the policy limit. If a coverage has a policy limit of 10,000 and an ordinary deductible of 500, then it pays 10,000 for a loss of 10,500 or higher, and it pays the loss minus 500 for losses between 500 and 10,500. Maximum covered loss is the stipulated amount considered in calculating the payment. Apply this limit first, and then the deductible. If a coverage has a maximum covered loss of 10,000 and an ordinary deductible of 500, then it pays 9,500 for a loss of 10,000 or higher, and it pays the loss minus 500 for losses between 500 and 10,000. The payment per loss random variable in the presence of a maximum covered loss of u and an ordinary deductible of d is Y L  X ∧ u − X ∧ d. The payment per payment is Y L | X > d. Coinsurance of α means that a portion, α, of each loss is reimbursed by insurance. For example, 80% coinsurance means that insurance will pay 80% of the loss.1 The expected payment per loss if there is α coinsurance, d deductible, and u maximum covered loss is E[Y L ]  α E ( X ∧ u ) − E [X ∧ d]





If there is inflation of r, you multiply X by (1 + r ) , then pull the (1 + r ) factor out to get

!   d + u * / −E X∧ E[Y ]  α (1 + r ) . E X ∧ 1+r 1+r , L

You need not memorize these formulas; rather, you should understand how they are derived. Here is an example combining several of the modifications listed above. Example 9A Losses for an insurance coverage have the following density function:

 1   5 1−  f (x )   0

1 10 x



0 ≤ x ≤ 10

otherwise

Insurance reimburses losses subject to an ordinary deductible of 2 and a maximum covered loss of 7. Inflation increases all loss sizes uniformly by 10%. Calculate the expected amount paid per payment after inflation. 1That is the definition as far as this course is concerned. Unfortunately, other courses use the same term for the complementary meaning: the proportion paid by the policyholder. C/4 Study Manual—17th edition Copyright ©2014 ASM

159

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

160

Answer: If we let X be the original loss variable and Y the inflated payment per loss variable, then Y  (1.1X ) ∧ 7 − (1.1X ) ∧ 2. The expected value of Y (which is the expected payment per loss, not per payment) is E[Y]  E (1.1X ) ∧ 7 − E (1.1X ) ∧ 2

f

g

 1.1 E X ∧

 f

7 1.1

f

g

f

−E X∧

2 1.1

g

g

2 The expected payment per payment is this expression divided by 1 − FY (0)  1 − FX 1.1 . We see that a formula for E[X ∧ d] would be useful. We will have to calculate FX ( x ) anyway in order to get FX (2) , so it is easiest to use formula (5.6).





x 1 1 F (x )  1 − u du 5 0 10   1 1 2 x  u− u 5 20 ! 0 x2 1 x−  5 20

Z

d

Z E[X ∧ d] 

0 d

Z 



0



1 − F ( u ) du



1 u2 1− u− du 5 20

!

d

u2 u 3 + 10 300 0 d2 d3 + d− 10 300  u−

We are now ready to evaluate E[Y] and 1 − FY (0) .

703 70 702 +  3.17305 − 2 11 10 · 11 300 · 113 f g 20 202 203 2 E X ∧ 1.1  − +  1.50764 11 10 · 112 300 · 113 !   1 20 202 2 FX 1.1  −  0.33058 5 11 20 · 112 E[Y] 1.1 (3.17305 − 1.50764)   2.7366 1 − FY ( 0 ) 1 − 0.33058

f

E X∧

?

7 1.1

g





Quiz 9-1 Losses follow an exponential distribution with mean 1000. An insurance coverage on the losses has an ordinary deductible of 500 and a policy limit of 2000. Calculate expected payment per payment for this coverage. Calculating the variance of payment per loss and payment per payment in the presence of an ordinary deductible is not so straightforward. We can temporarily ignore inflation and coinsurance, since each of these multiply the random variable by a factor, and we can adjust the variance by multiplying by that factor squared. So let’s calculate the variance of the payment per loss random variable Y L , defined by YL  X ∧ u∗ − X ∧ d∗ C/4 Study Manual—17th edition Copyright ©2014 ASM

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

161

where u ∗ and d ∗ are the maximum covered loss and the deductible respectively, adjusted for inflation rate r by dividing by 1 + r. We can calculate the second moment of Y L as follows:

(Y L ) 2  ( X ∧ u ∗ − X ∧ d ∗ ) 2

 ( X ∧ u ∗ ) 2 − 2 ( X ∧ u ∗ )( X ∧ d ∗ ) + ( X ∧ d ∗ ) 2

Now, we would prefer a formula starting with ( X ∧ u ∗ ) 2 − ( X ∧ d ∗ ) 2 , so we’ll subtract and add 2 ( X ∧ d ∗ ) 2 .

(Y L ) 2  ( X ∧ u ∗ ) 2 − ( X ∧ d ∗ ) 2 + 2 ( X ∧ d ∗ ) 2 − 2 ( X ∧ u ∗ )( X ∧ d ∗ )  ( X ∧ u ∗ ) 2 − ( X ∧ d ∗ ) 2 + 2 (| X{z ∧ d}∗ )( X ∧ d ∗ − X ∧ u ∗ ) ∗

We can replace the X ∧ d ∗ with the star below it with d ∗ . Because if X < d ∗ , then X ∧ d ∗ − X ∧ u ∗  0, since both X ∧ d ∗ and X ∧ u ∗ are X, so the factor doesn’t matter. And if X ≥ d ∗ , then X ∧ d ∗  d ∗ . Making this replacement and taking expectations on both sides, we get the final formula E ( Y L ) 2  E ( X ∧ u ∗ ) 2 − E ( X ∧ d ∗ ) 2 − 2d ∗ E[X ∧ u ∗ ] − E[X ∧ d ∗ ]

f

g

f

g

f

g





I doubt that you have to memorize this formula. Exam questions on variance are rare. It is hard to use this formula for most distributions, since calculating limited second moments such as E[ ( X ∧ d ) 2 ], usually requires evaluating incomplete gamma or beta distributions, which you cannot be expected to do on an exam. There have been some questions on exams to calculate the variance of payment per loss in the presence of a deductible. These questions involved simple loss distributions such as exponentials, and could be solved by alternative methods, such as: 1. Treat the payment per loss random variable as a mixture distribution, a mixture of the constant 0 (with weight equal to the probability of being below the deductible) and the excess loss random variable. For an exponential, the excess loss random variable is the same as the original random variable. Then calculate the second moment of the mixture. 2. Treat the payment per loss random variable as a compound distribution. The primary distribution is Bernoulli: either the loss is higher than the deductible or it isn’t. The secondary distribution is the excess loss random variable. Then use the compound variance formula, equation (14.2) on page 236. The concepts “primary distribution” and “secondary distribution”, as well as the compound variance formula, are discussed in Lesson 14. Example 9B The loss severity random variable follows an exponential distribution with mean 1000. A coverage for this loss has a deductible of 500. Calculate the variance of the payment per loss random variable. Answer: We’ll treat payment per loss as a mixture distribution. Let X be the loss random variable and Y L the payment per loss random variable. Also let p  Pr ( X > 500) . Then Y L is a mixture of the constant 0 with probability 1− p and an exponential random variable with mean 1000 with probability p. The second moment of an exponential is 2θ 2 , so E ( Y L ) 2  (1 − p )(02 ) + p (2)(10002 )  2 · 106 p

f

g

Here, p  e −1/2 and E[Y L ]  1000p. So Var ( Y L )  2 · 106 p − 106 p 2  2 · 106 e −1/2 − 106 e −1  845,182 . C/4 Study Manual—17th edition Copyright ©2014 ASM



9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

162

Example 9C Losses for an insurance coverage have the following density function:

 1   5 1− f (x )   0 

1 10 x



0 ≤ x ≤ 10

otherwise

Insurance reimburses losses subject to an ordinary deductible of 2 and a maximum covered loss of 7. Calculate the variance of the average payment, taking into account payments of zero on losses below the deductible. Answer: Let X be the loss variable and Y the payment variable. We will use the formula for E[X ∧ d] developed in the Example 9A. To calculate the variance of Y, we will use Var ( Y )  E[Y 2 ] − E[Y]2 . First we calculate E[Y]. 73 72 +  3.24333 10 300 22 23 E[X ∧ 2]  2 − +  1.62667 10 300 E[Y]  E[X ∧ 7] − E[X ∧ 2]  1.61667 E[X ∧ 7]  7 −

The payment size is 0 when X < 2, X − 2 for X between 2 and 7, and 5 when X > 7. So E[Y 2 ]  FX (2)(02 ) +

7

Z 2

  ( x − 2) 2 f X ( x ) dx + 1 − FX (7) (52 ) .

Let’s evaluate the integral. We will substitute y  x − 2 1 5

7

Z 2

( x − 2)

2



1 1 1 − x dx  10 5



5

Z

1  50

0

 5

Z 0

1−

1 ( y + 2) y 2 dy 10



1 (8 − y ) y dy  50 2

5

5

Z 0

(8y 2 − y 3 ) dy

1 8y 3 y 4  − 50 3 4 0   1 1000 625  −  3.54167 50 3 4

!

Now we evaluate FX (7) , using the formula for FX ( x ) developed in the preceding problem. 1 72 FX ( 7 )  7−  0.91 5 20

!

E[Y 2 ]  3.54167 + (1 − 0.91)(25)  5.79167 So Var ( Y )  5.79167 − 1.616672  3.1781

C/4 Study Manual—17th edition Copyright ©2014 ASM



EXERCISES FOR LESSON 9

163

Exercises 9.1. [4B-F92:3] (1 point) You are given the following: •

Based on observed data truncated from above at 10,000, the probability of a claim exceeding 3000 is 0.30.



Based on the underlying distribution of losses, the probability of a claim exceeding 10,000 is 0.02. Determine the probability that a claim exceeds 3000.

(A) (B) (C) (D) (E)

Less than 0.28 At least 0.28, but less than 0.30 At least 0.30, but less than 0.32 At least 0.32, but less than 0.34 At least 0.34

9.2. [4B-S93:33] (3 points) The distribution for claim severity follows a single-parameter Pareto distribution of the following form:    3 x −4 f (x )  x > 1000. 1000 1000 Determine the average size of a claim between 10,000 and 100,000, given that the claim is between 10,000 and 100,000. (A) (B) (C) (D) (E) 9.3.

Less than 18,000 At least 18,000, but less than 28,000 At least 28,000, but less than 38,000 At least 38,000, but less than 48,000 At least 48,000 [CAS3-F03:21] The cumulative loss distribution for a risk is F ( x )  1 − 106 / ( x + 103 ) 2 .

An insurance policy pays the loss subject to a deductible of 1000 and a maximum covered loss of 10,000. Calculate the percentage of expected aggregate losses that are paid. (A) 10%

(B) 12%

(C) 17%

(D) 34%

(E) 41%

9.4. Losses follow a lognormal distribution with µ  6.9078, σ  2.6283. A policy covers losses subject to a 1000 franchise deductible and a 100,000 policy limit. Determine the average payment per paid claim. 9.5. An insurer pays losses subject to an ordinary deductible of $1000 and a coinsurance factor of 80%. The coinsurance factor is applied before the deductible, so that nothing is paid for losses below $1250. You are given: (i) Losses follow a two-parameter Pareto distribution with α  2. (ii) Average payment per loss is $2500. Determine the average loss.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

164

9.6. An insurer pays losses subject to an ordinary deductible of $1000 and a coinsurance factor of 80%. The coinsurance factor is applied before the deductible, so that nothing is paid for losses below $1250. You are given: (i) Losses follow a two-parameter Pareto distribution with α  2. (ii) Average payment per paid claim is $2500. Determine the average loss. 9.7. [151-82-92:6] The probability density function of the loss, Y, is

   0.02 1− f ( y)   0 

y 100



0 < y < 100 otherwise

The amount paid, Z, is 80 percent of that portion of the loss that exceeds a deductible of 10. Determine E[Z] (A) 17

(B) 19

(C) 21

(D) 23

(E) 25

9.8. Losses follow a two parameter Pareto distribution with parameters α  0.5 and θ  2000. Insurance pays claims subject to a deductible of 500 and a maximum covered loss of 20,000, with 75% coinsurance. Determine the size of the average claim payment. You are given the following:

9.9.

(i) Losses follow a single-parameter Pareto distribution with parameters α  4, θ  5000. (ii) For each loss, insurance covers 80% of the amount of the loss up to 10,000 and 100% of the amount of the loss above that level. Calculate the expected payment per loss. [4B-F95:13] [4B-S98:9] (3 points) You are given the following:

9.10. •

Losses follow a uniform distribution on the interval from 0 to 50,000.



The maximum covered loss is 25,000.



There is a deductible of 5,000 per loss.



The insurer makes a nonzero payment P. Determine the expected value of P.

(A) (B) (C) (D) (E)

Less than 15,000 At least 15,000, but less than 17,000 At least 17,000, but less than 19,000 At least 19,000, but less than 21,000 At least 21,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 9

9.11.

165

You are given:

1.

The amount of a single claim X has a continuous distribution.

2.

Some values from the distribution are given in the following table: x

F (x )

E[X ∧ x]

100 500 1,000 10,000

75 200 300 800

0.6 0.7 0.9 1.0

Calculate the average payment per payment under a coverage with franchise deductible 100 and maximum covered loss of 1000. 9.12.

[3-S00:30] X is a random variable for a loss.

Losses in the year 2000 have a distribution such that: E[X ∧ d]  −0.025d 2 + 1.475d − 2.25,

d  10, 11, 12, . . . , 26

Losses are uniformly 10% higher in 2001. An insurance policy reimburses 100% of losses subject to a deductible of 11 up to a maximum reimbursement of 11. Calculate the ratio of expected reimbursements in 2001 over expected reimbursements in the year 2000. (A) 110.0%

(B) 110.5%

(C) 111.0%

(D) 111.5%

(E) 112.0%

9.13. [CAS3-F03:22] The severity distribution function of claims data for automobile property damage coverage for Le Behemoth Insurance Company is given by an exponential distribution F ( x ) . F ( x )  1 − exp

−x 5000

!

To improve profitability of this portfolio of policies, Le Behemoth institutes the following policy modifications: (i) It imposes a per-claim deductible of 500. (ii) It imposes a per-claim maximum covered loss of 25,000. Previously, there was no deductible and no maximum covered loss. Calculate the average savings per (old) claim if the new deductible and maximum covered loss had been in place. (A) 490

(B) 500

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 510

(D) 520

(E) 530

Exercises continue on the next page . . .

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

166

9.14. [SOA3-F04:7] Annual prescription drug costs are modeled by a two-parameter Pareto distribution with θ  2000 and α  2. A prescription drug plan pays annual drug costs for an insured member subject to the following provisions: (i) The insured pays 100% of costs up to the ordinary annual deductible of 250. (ii) The insured then pays 25% of the costs between 250 and 2250. (iii) The insured pays 100% of the costs above 2250 until the insured has paid 3600 in total. (iv) The insured then pays 5% of the remaining costs. Determine the expected annual plan payment. (A) 1120 9.15.

(B) 1140

(C) 1160

(D) 1180

(E) 1200

Annual losses follow a two-parameter Pareto distribution with θ  500 and α  2.

An insurance plan has the following provisions: (i) The insured pays 100% of the costs up to an ordinary annual deductible of 250. (ii) The insurance pays 80% of the costs between 250 and 2250. (iii) The insurance pays 95% of the costs above 2250. Calculate the Tail-Value-at-Risk for the annual payments of the insurance plan at the 90% security level. 9.16. [CAS3-S04:35] The XYZ Insurance Company sells property insurance policies with a deductible of $5,000, policy limit of $500,000, and a coinsurance factor of 80%. Let X i be the individual loss amount of the i th claim and Yi be the claims payment of the i th claim. Which of the following represents the relationship between X i and Yi ? (A)

Yi 

(B)

Yi 

(C)

Yi 

(D)

Yi 

(E)

Yi 

 0     0.80 ( X i − 5,000)    500,000   0     0.80 ( X i − 4,000)    500,000   0     0.80 ( X i − 5,000)    500,000   0     0.80 ( X i − 6,250)    500,000   0     0.80 ( X i − 5,000)    500,000 

C/4 Study Manual—17th edition Copyright ©2014 ASM

X i ≤ 5,000 5,000 < X i ≤ 625,000 X i > 625,000 X i ≤ 4,000 4,000 < X i ≤ 500,000 X i > 500,000 X i ≤ 5,000 5,000 < X i ≤ 630,000 X i > 630,000 X i ≤ 6,250 6,250 < X i ≤ 631,250 X i > 631,250 X i ≤ 5,000 5,000 < X i ≤ 505,000 X i > 505,000

Exercises continue on the next page . . .

EXERCISES FOR LESSON 9

167

[CAS3-F04:33] Losses for a line of insurance follow a Pareto distribution with θ  2,000 and α  2.

9.17.

An insurer sells policies that pay 100% of each loss up to $5,000. The next year the insurer changes the policy terms so that it will pay 80% of each loss after applying a $100 deductible. The $5,000 limit continues to apply to the original loss amount. That is, the insurer will pay 80% of the loss amount between $100 and $5,000. Inflation will be 4%. Calculate the decrease in the insurer’s expected payment per loss. (A) (B) (C) (D) (E)

Less than 23% At least 23%, but less than 24% At least 24%, but less than 25% At least 25%, but less than 26% At least 26% For an insurance coverage, you are given

9.18. •

Losses, before application of any deductible or limit, follow a Pareto distribution with α  2.



The coverage is subject to a franchise deductible of 500 and a maximum covered loss of 10,000.



Average payment per paid claim on this coverage is 2500. Determine the average loss size, before application of deductible and limit, for this insurance coverage.

Use the following information for questions 9.19 and 9.20: A jewelry store has obtained two separate policies that together provide full coverage. You are given: (i) (ii) (iii) (iv) (v)

The average ground-up loss is 11,100. Policy A has an ordinary deductible of 5,000 with no policy limit. Under policy A, the expected amount paid per loss is 6,500. Under policy A, the expected amount paid per payment is 10,000. Policy B has no deductible and a policy limit of 5,000.

9.19. [4-F00:18] Given that a loss has occurred, determine the probability that the payment under policy B is 5,000. (A) (B) (C) (D) (E)

Less than 0.3 At least 0.3, but less than 0.4 At least 0.4, but less than 0.5 At least 0.5, but less than 0.6 At least 0.6

9.20. [4-S00:6] Given that a loss less than or equal to 5,000 has occurred, what is the expected payment under policy B? (A) (B) (C) (D) (E)

Less than 2,500 At least 2,500, but less than 3,000 At least 3,000, but less than 3,500 At least 3,500, but less than 4,000 At least 4,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

168

9.21. Losses follow a uniform distribution on [0, 10,000]. Insurance has a deductible of 1000 and 80% coinsurance. The coinsurance is applied after the deductible, so that a positive payment is made on any loss above 1000. Calculate the variance of the amount paid per loss. 9.22. Losses follow a two-parameter Pareto distribution with parameters α  3 and θ  1000. An insurance coverage has an ordinary deductible of 1500. Calculate the variance of the payment per loss on the coverage. 9.23.

For losses X, you are given x E[X ∧ x]

800 300

1000 380

1250 440

4000 792

5000 828

6250 900

Inflation of 25% affects these losses. Calculate the expected payment per loss after inflation on a coverage with a 1000 ordinary deductible and a 5000 maximum covered loss. Additional released exam questions: CAS3-F06:30, SOA M-F06:6,20,29,31, C-S07:13

Solutions 9.1. Truncation means that claims are observed only if they’re in the untruncated range. In other words, the first bullet is saying Pr ( X > 3000 | X ≤ 10,000)  0.3 Combining the first two bullets,

Pr ( X ≤ 3000) 0.98 Pr ( X ≤ 3000)  0.686 0.7 

Pr ( X > 3000)  0.314

(C)

9.2. Note that we are being asked for the average size of the total claim, including the amount below 10,000, not just the amount between 10,000 and 100,000. The intended solution was probably the following: For a claim, the expected value of the amount of the claim between 10,000 and 100,000 (this is ignoring the condition that the claim is between 10,000 and 100,000) is

Z

100,000 10,000

However, E[X ∧ d]  100,000

Z 0

R

x f ( x ) dx −

d 0

x f ( x ) dx 

100,000

Z 0

x f ( x ) dx −

10,000

Z 0

x f ( x ) dx

x f ( x ) dx + dS ( d ) . Therefore 10,000

Z 0

x f ( x ) dx

 E[X ∧ 100,000] − 100,000S (100,000) − E[X ∧ 10,000] − 10,000S (10,000)



C/4 Study Manual—17th edition Copyright ©2014 ASM







EXERCISE SOLUTIONS FOR LESSON 9

169

That is before the condition that the claim is between 10,000 and 100,000. The probability of that condition is F (100,000) − F (10,000) . Thus the answer to the question is



E[X ∧ 100,000] − 100,000S (100,000) − E[X ∧ 10,000] − 10,000S (10,000)







F (100,000) − F (10,000)

We now proceed to calculate the needed limited expected values and distribution functions. The single-parameter Pareto has parameters α  3, θ  1000. We’ll use the tables in the appendix for the limited expected values. 10003 3 (1000) −  1500 − 5  1495 2 2 (10,0002 ) 10003 E[X ∧ 100,000]  1500 −  1500 − 0.05  1499.95 2 (100,0002 ) E[X ∧ 10,000] 

1000 F (10,000)  1 − 10,000

!3

1000 F (100,000)  1 − 100,000

 0.999

!3

 0.999999

So the answer is



E[X ∧ 100,000] − 100,000S (100,000) − E[X ∧ 10,000] − 10,000S (10,000)







F (100,000) − F (10,000) (1499.95 − 0.1) − (1495 − 10) 14.85    14,864.86 0.999999 − 0.999 0.000999

(A)

It turns out that the correct answer choice could be identified more easily.2 Roughly speaking, the average size of a claim over 10,000 is higher than the average size of a claim between 10,000 and 100,000, since the former brings higher claims into the average. However, the average size of a claim over 10,000 is 10,000 + e (10,000)  15,000. Even after adjusting for the smaller base (the average we need is divided by only the number of claims between 10,000 and 100,000, whereas e ( x ) is divided by the number of claims between 10,000 and ∞), the result is well below 18,000. So (A) must be correct. Let’s carry out this idea formally. The problem is asking for

R

100,000 10,000

x f ( x ) dx

F (100,000) − F (10,000)

(9.1)

where f ( x ) is the single-parameter Pareto’s density function. We can relate this quantity to the mean excess loss, which is easy to calculate. We know by formula (6.11) that e (10,000)  However, by definition,

R e (10,000)  2This idea was shown to me by Rick Lassow. C/4 Study Manual—17th edition Copyright ©2014 ASM

∞ (x 10,000

d 10,000   5000 α−1 3−1 − 10,000) f ( x ) dx

1 − F (10,000)

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

170

R 

x f ( x ) dx

∞ 10,000

x f ( x ) dx

1 − F (10,000)

R 

∞ 10,000

1 − F (10,000)

R − 10,000

∞ 10,000

f ( x ) dx

1 − F (10,000)

− 10,000

Let’s compare the integral in (9.1) to this integral.

R

100,000 10,000

x f ( x ) dx

R

F (100,000) − F (10,000)



R 

100,000 10,000

x f ( x ) dx !

100,000 10,000

x f ( x ) dx !

1 − F (10,000)

1 − F (10,000) F (100,000) − F (10,000)

1 − F (10,000)

1 − 0.999 0.999999 − 0.999

R

 1.001001

100,000 10,000

!

!

x f ( x ) dx

1 − F (10,000)

where we’ve used our calculations of F (10,000) and F (100,000) from above. The integral is no greater than an integral going to ∞ instead of to 100,000. But

R

∞ 10,000

x f ( x ) dx

1 − F (10,000)

 10,000 + e (10,000)  15,000

We’ve shown that the answer is no greater than 15,000 (1.001001)  15,015. Thus it is certainly less than 18,000, and the answer must therefore be (A). 9.3. This is a Pareto distribution with α  2, θ  1000. We must calculate

We’ll use the tables in the appendix.

E[X ∧ 10,000] − E[X ∧ 1000]

1000  500 2000   1000 10,000 E[X ∧ 10,000]  1000 1 −  11000 11 E[X]  1000



E[X ∧ 1000]  1000 1 −



10,000/11 − 500  0.4091 . (E) 1000 9.4. Did you notice that the deductible is a franchise deductible? The insurer pays the entire loss if it is above 1000. If the loss is 100,000, the insurer pays 100,000. If the loss is greater than 100,000, the payment is capped at 100,000 by the policy limit. Hence we are interested in E[X ∧ 100,000]. The answer is

E[X ∧ 100,000]  exp 6.9078 +

2.62832 ln 100,000 − 6.9078 − 2.62832 Φ + 2 2.6283

!

! ln 100,000 − 6.9078 + * / 100,000 .1 − Φ 2.6283 ,    31,627Φ (−0.88) + 100,000 1 − Φ (1.75)

 31,627 (0.1894) + 100,000 (1 − 0.9599)  10,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

EXERCISE SOLUTIONS FOR LESSON 9

171

2.62832 ln 1000 − 6.9078 − 2.62832 E[X ∧ 1000]  exp 6.9078 + Φ + 2 2.6283

!

1000 .1 − Φ

*

!

ln 1000 − 6.9078 + / 2.6283

!

-

,

 31,627Φ (−2.63) + 1000 1 − Φ (0)





 31,627 (0.0043) + 500  636

F (1000)  Φ (0)  0.5 The average payment per paid claim for an ordinary deductible would be Average per claim 

10,000 − 636  18,728 1 − F (1000)

For a franchise deductible, each paid claim gets 1000 more. The answer is 19,728 . 9.5. The average amount paid per loss is 80% of the average amount by which a loss exceeds 1250. Before multiplying by 80%, the expected payment per loss is E[X] − E[X ∧ 1250]

By equation (6.8), this is equal to

e (1250) 1 − F (1250)





Now multiply this by 0.8 and equate to 2500.

2500  0.8e (1250) 1 − F (1250)



(*)



For our two-parameter Pareto, by equation (6.10), e (1250)  and from the tables

θ + 1250  θ + 1250 α−1



θ θ  θ + 1250 θ + 1250 Substituting these expressions into (*), and dividing both sides by 0.8, 1 − F (1250) 

θ 3125  ( θ + 1250) θ + 1250

!2

!2

θ 2 − 3125θ − (3125)(1250)  0

3125 + 31252 + 4 (3,906,250) θ  4081.9555 2 θ E[X]   4081.9555 α−1

p

9.6. The average amount payment per paid claim is 80% of the mean excess loss at 1250. As we mentioned in the previous solution, for our Pareto, e (1250)  θ + 1250. 2500  0.8e (1250)  0.8 ( θ + 1250) 0.8θ θ  1875 θ E[X]   1875 α−1 C/4 Study Manual—17th edition Copyright ©2014 ASM

 1500

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

172

9.7. We can apply the 80% factor after calculating E ( Y − 10)+ . To calculate E ( Y − 10)+ , while equation (6.2) could be used, it’s easier to use equation (6.3). First we calculate F ( y ) .

f

y

Z F( y) 

0



0.02 1 −

g

f

g

u du 100



 2 y 0  y 2

u − 1− 100





1− 1−

100 y 2 S( y)  1 − 100 f g Z 100  y 2 E ( Y − 10)+  1− dy 100 10



100

−

 y  3 100 1− 3 100 10

100 (0.93 )  24.3  3

!

E[Z]  0.8 E ( Y − 10)+  (0.8)(24.3)  19.44

f

g

(B)

9.8. Average claim payment per loss is 0.75 E[X ∧ 20,000] − E[X ∧ 500] .





2000 * 2000 E[X ∧ 20,000]  1− −0.5 22,000

,

2000 * 2000 E[X ∧ 500]  1− −0.5 2500

,

! −0.5

! −0.5

+  9266.50 -

+  472.14 2000 0.5

 0.8944 to get claim payment We must divide average claim payment per loss by 1 − F (500)  2500 per claim. Average claim payment is 0.75 (9266.50 − 472.14) /0.8944  7374.30 .

9.9. Did you understand (ii) properly? Statement (ii) does not say that the insurance covers 100% of the loss for losses above 10,000. It says that insurance pays 100% of the amount of the loss above 10,000. For a loss of 11,000, the company would pay 80% of the first 10,000 and 100% of the next 1000, resulting in a payment of 9000, not 11,000. If the loss variable is X, the payment per loss variable X L is equal to X − 0.2 ( X ∧ 10,000) , since 20% of the loss below 10,000 is deducted from the loss. For a single-parameter Pareto, the formulas for E[X] and E[X ∧ d] are E[X] 

αθ α−1

E[X ∧ d]  E[X] − Therefore E[X] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

(4)(5000) 3

θα ( α − 1) d α−1



20,000 3

EXERCISE SOLUTIONS FOR LESSON 9

173

20,000 19,375 50004  − 3 3 3 3 (10,000) 0.2 ( 19,375 ) 20,000 −  5375 E[X L ]  3 3

E[X ∧ 10,000] 

9.10. The question asks for the expected value of the payment random variable P, or the expected payment per payment. That is E[X ∧ 25,000] − E[X ∧ 5,000] E[P]  1 − F (5,000) When X has a uniform distribution on [0, θ], the limited expected value at u can be calculated as a weighted average: the probability of X ≤ u times u/2 plus the probability of X > u time u: u u u E[X ∧ u]  + 1− (u ) θ 2 θ

!





Applying this formula, E[X ∧ 25,000]  0.5 (12,500) + 0.5 (25,000)  18,750 E[X ∧ 5,000]  0.1 (2,500) + 0.9 (5,000)  4,750

The denominator is 1 − F (5,000)  0.9. Therefore, the expected payment per payment is E[P] 

18,750 − 4,750  15555 59 1 − 0.1

(B)

9.11. The payment per loss is X ∧ 1000 − X ∧ 100 + 100 1 − F (100) . The payment per payment is the payment per loss divided by S (100) . So we compute





E[X ∧ 1000] − E[X ∧ 100] 300 − 75 + 100  + 100  662.5 S (100) 1 − 0.6

9.12.

In 2000, we need E[X ∧ 22] − E[X ∧ 11]. In 2001, we need

E[1.1X ∧ 22] − E[1.1X ∧ 11]  1.1 E[X ∧ 20] − E[X ∧ 10]





We therefore proceed to calculate the four limited expected values.

E[X ∧ 22]  −0.025 (222 ) + 1.475 (22) − 2.25  18.1

E[X ∧ 11]  −0.025 (112 ) + 1.475 (11) − 2.25  10.95

E[X ∧ 20]  −0.025 (202 ) + 1.475 (20) − 2.25  17.25

The ratio is

E[X ∧ 10]  −0.025 (102 ) + 1.475 (10) − 2.25  10

1.1 E[X ∧ 20] − E[X ∧ 10]





1.1 (17.25 − 10)  1.115385 (D) E[X ∧ 22] − E[X ∧ 11] 18.1 − 10.95 The calculations can be shortened a bit by leaving out −2.25 from each calculation, since it’s a constant. 9.13.



θ for this exponential is 5000.

E[X ∧ 25,000] − E[X ∧ 500]  5000 (1 − e −5 ) − (1 − e −0.1 )



 4490

5000 − 4490  510 C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)



9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

174

Plan Payment 2500 2000 1500 1000 500

250

2250

Loss Size

5100

Figure 9.1: Plan payments as a function of loss in exercise 9.14

9.14.

The four components (i)–(iv) of the formula have the following expected plan payments:

(i)

Insured pays 100% of the costs up to 250: 0 (The plan pays nothing for these losses.)

(ii)

Insured pays 25% of the costs between 250 and 2250: 0.75 E[X ∧ 2250] − E[X ∧ 250] .

(iii) (iv)





Insured pays 100% of the costs between 2250 and u, where u is the loss for which the payment is 3600: 0.  Insured pays 5% of the excess over u: 0.95 E[X] − E[X ∧ u]) .

Let’s determine u. u is clearly above 2250. For a loss of 2250, the insured pays 250 + 0.25 (2250 − 250)  750. The insured then pays 100% until the loss is 3600, so the insured’s payment for a loss u ≥ x ≥ 2250 is 750 + 100% ( x − 2250)  x − 1500, and the payment is 3600 when x − 1500  3600, or x  5100, so we conclude u  5100. A graph of plan payments as a function of loss is shown in Figure 9.1. Combining the four components of expected plan payments, we have 0.75 E[X ∧ 2250] − E[X ∧ 250] + 0.95 E[X] − E[X ∧ 5100]









Let’s calculate the needed limited and unlimited expected values. E[X]  2000

2000  222.22 2250   2000 E[X ∧ 2250]  2000 1 −  1058.82 4250   2000 E[X ∧ 5100]  2000 1 −  1436.62 7100





E[X ∧ 250]  2000 1 −

The insurance pays 0.75 E[X ∧ 2250] − E[X ∧ 250] + 0.95 E[X] − E[X ∧ 5100]









 0.75 (1058.82 − 222.22) + 0.95 (2000 − 1436.62)  1162.66

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

EXERCISE SOLUTIONS FOR LESSON 9

175

9.15. The 90th percentile of the Pareto for the ground-up losses may be read off the tables (VaR0.9 ( X ) ), or may be calculated; it is the x determined from:

!2

500  1 − 0.9  0.1 500 + x √ 500  0.1 500 + x 500 500 + x  √  1581.14 0.1 x  1081.14 We need the partial expectation of payments for losses above 1081.14. The partial expectation of the amount of a payment for the loss above 1081.14, given that the loss is greater than 1081.14 is 0.8 E[X ∧ 2250] − E[X ∧ 1081.14] + 0.95 E[X] − E[X ∧ 2250]









We compute the needed limited expected values. E[X]  500

500  341.89 1581.14   500 E[X ∧ 2250]  500 1 −  409.09 2750





E[X ∧ 1081.14]  500 1 −

so the partial expectation of the amount above 1081.14 is 0.8 (409.09 − 341.89) + 0.95 (500 − 409.09)  140.1245. Dividing by 1 − p  0.1, the conditional expectation of the payment given that a loss is greater than 1081.14 is 1401.245. In addition, the insurer pays 0.8 (1081.14 − 250)  664.91 on each loss above 1081.14. Therefore, the Tail-Value-at-Risk of annual payments is 1401.245 + 664.91  2066.16 . 9.16. The policy limit is the most that will be paid. To pay 500,000, claims are needed, so the answer is (C). 9.17.

500,000 0.8

+ 5000  630,000 of covered

In the first year, the payment per loss is



E[X ∧ 5000]  2000 1 −

2000  1428.57 7000



In the second year, the new θ is 1.04 (2000)  2080 and the payment per loss, considering 80% coinsurance, is 0.8 (E[X ∧ 5000] − E[X ∧ 100])  0.8 (2080)



2080 2080 −  1098.81 2180 7080



The ratio is 1098.81/1428.57  0.7692. The decrease is 1 − 0.7692  0.2308 . (B)

9.18. The payment per payment for a franchise deductible is the deductible (500) plus the payment per payment for an ordinary deductible, so we will subtract 500 and equate the payment per payment to 2000. E ( X ∧ 10,000) − E ( X ∧ 500)  2000 1 − F (500) !   10,000 500 ( θ + 500) 2 θ −  2000 10,000 + θ 500 + θ θ2

C/4 Study Manual—17th edition Copyright ©2014 ASM

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

176

(500 + θ )

2



10,000 500  2000θ − 10,000 + θ 500 + θ



 (500 + θ ) 10,000 (500 + θ ) − 500 (10,000 + θ )  2000θ (10,000 + θ ) 

(500 + θ )(10,000θ − 500θ )  2000θ (10,000 + θ ) 9500θ (500 + θ )  2000θ (10,000 + θ ) 95 (500 + θ )  20 (10,000 + θ ) 75θ  200,000 − 500 (95)  152,500 θ  2033 13

9.19. It’s strange that this question is easier than the similar one from the previous exam (the next question). Expected amount per payment is expected amount per loss divided by the probability of a payment, so from (ii) and (iii), the probability of a payment on policy A is 6,500 Pr ( X > 5000) 6,500 Pr ( X > 5000)   0.65 10,000 10,000 

(E)

X > 5000 and policy B paying exactly 5000 are equivalent. 9.20. From the previous question’s solution, we know that Pr ( X > 5000)  0.65. Since policy A and policy B cover the entire loss, their expected payments per loss must add up to 11,100. Since the expected payment per loss under policy A is 6,500, the expected payment per loss under policy B is 11,100 − 6,500  4,600. Let x be the payment per loss given that the loss is less than 5000. By double expectation, Pr ( X ≤ 5000) x + Pr ( X > 5000)(5000)  4600

Then

0.35x + 0.65 (5000)  4600 1350 x  3857 17 (D) 0.35 9.21. Let X be the loss, Y the payment, and Z the payment before coinsurance. Since the payment is Y  0.8Z, its variance is Var ( Y )  0.82 Var ( Z ) . The straightforward way of doing this problem is calculating first and second moments of Z. Since X is uniform on [0, 10,000], Z is a mixture of a variable which is the constant 0 with weight 0.1 and uniform on [0, 9000] with weight 0.9. The probability that a loss leads to a payment is 0.9, and the average payment given that a payment is made is the midpoint of the range or 4500. Therefore the mean is E[Z]  0.1 (0) + 0.9 (4500)  4050 The second moment of a uniform random variable on [0, x] is x 2 /3, so the second moment of the mixture is ! 90002  24,300,000 E[Z 2 ]  0.1 (02 ) + 0.9 3 Then Var ( Z )  24,300,000 − 40502  7,897,500 and Var ( Y )  0.82 (7,897,500)  5,054,400 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

QUIZ SOLUTIONS FOR LESSON 9

177

An alternative is to use the conditional variance formula (formula (4.2) on page 64). Let I be the indicator variable on whether the loss is greater than 1000 or not. Then Z | X < 1000  0, and Z | X > 1000 is a uniform random variable on (0, 9000) with mean 4500 and variance 90002 /12. So Var ( Z )  E Var ( Z | I ) + Var E[Z | I]

f

g

and Var ( Z | I ) is Bernoulli with value 0 or 90002 /12, so





90002  6,075,000 E Var ( Z | I )  0.1 (0) + 0.9 12

f

g

!

while E[Z | I] is either 0 or 4500, so using the Bernoulli variance formula,

Var E[Z | I]  45002 (0.9)(0.1)  1,822,500



The answer is therefore



0.82 (6,075,000 + 1,822,500)  5,054,400

9.22. We mentioned on page 100 that E[ ( X − d )+ | X > d] has a Pareto distribution with parameters α and θ + d. Here, that means the parameters are α  3 and θ  1000 + 1500  2500. Let X be the random variable for loss, Y L the random variable for payment per loss, and p  Pr ( X > 1500) the probability of a loss above the deductible. The expected payment per loss is a mixture of 0 with weight 1 − p and a Pareto with weight p. The Pareto for payment per payment, which we’ll call Y P , has parameters α  3 and θ  2500.  α The probability of a loss above the deductible is p  θ/ ( θ + d )  (1000/2500) 3  0.064. The L expected value of Y is ! 2500 L P E[Y ]  p E[Y ]  p  0.064 (1250)  80 3−1

The second moment of Y L is

f

L 2

E (Y ) The variance of Y L is

9.23.

g

f

P 2

 p E (Y )

g

2 (25002 )  0.064  400,000 (2)(1)

!

Var ( Y L )  400,000 − 802  393,600

If X 0 is the inflated loss, then X 0  1.25X, and the payment per loss is E[X 0 ∧ 5000] − E[X 0 ∧ 1000]  E[1.25X ∧ 5000] − E[1.25X ∧ 1000]

 1.25 E[X ∧ 4000] − 1.25 E[X ∧ 800]  1.25 (792 − 300)  615

Quiz Solutions 9-1. Let Y P be the payment per payment random variable. E[X ∧ 500]  1000 (1 − e −0.5 )  393.47

E[X ∧ 2500]  1000 (1 − e −2.5 )  917.92

1 − F (500)  e −0.5  0.606531 917.92 − 393.47 E[Y P ]   864.66 0.606531

C/4 Study Manual—17th edition Copyright ©2014 ASM

178

C/4 Study Manual—17th edition Copyright ©2014 ASM

9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS

Lesson 10

Bonuses Reading: Loss Models Fourth Edition 8.2–8.5 Exams in the early 2000’s used to have bonus questions. To my knowledge, they no longer appear on exams, but these questions give you some practice in manipulating limited expected losses. However, you may skip this lesson. A bonus question goes like this: An agent receives a bonus if his loss ratio is below a certain amount, or a hospital receives a bonus if it doesn’t submit too many claims. Usually the losses in these questions follow a two-parameter Pareto distribution with α  2, to make the calculations simple. As I mentioned earlier, the formulas for E[X ∧ d] are difficult to use for distributions other than exponential, Pareto, and lognormal. To work out these questions,  express the bonus in terms of the earned premium; it’ll always be something like max 0, c ( rP − X ) , where r is the loss ratio, P is earned premium, X is losses, and c is some constant. Then you can pull out crP and write it as crP − c min ( rP, X )  crP − c ( X ∧ rP ) . You then use the tables to calculate the expected value. Since the Pareto with α  2 is used so often for this type of problem, let’s write the formula for E[X ∧ d] down for this distribution: d E[X ∧ d]  θ d+θ

!

Example 10A Aggregate losses on an insurance coverage follow a Pareto distribution with parameters α  2, θ  800. Premiums for this coverage are 500. The loss ratio, R is the proportion of aggregate losses to premiums. If the loss ratio is less than 70%, the insurance company pays a dividend of 80% of premiums times the excess of 70% over the loss ratio. Calculate the expected dividend. Answer: If we let X be aggregate losses, then R  X/500 and the dividend is X (0.8)(500)(0.7 − R )  0.8 (500) 0.7 −  0.8 (350 − X ) 500 when this is above 0. But we have





max 0, 0.8 (350 − X )  0.8 max (0, 350 − X )





 0.8 max (350 − 350, 350 − X )  0.8 350 − min (350, X )



 280 − 0.8 ( X ∧ 350)



So we calculate E[X ∧ 350], using the formulas from the Loss Models appendix. θ θ * 1− E[X ∧ 350]  α−1 θ + 350

 ,

! α−1

800 800 + 350 (800)(350)   243.48 1150  800 1 −

C/4 Study Manual—17th edition Copyright ©2014 ASM

179



+ -

10. BONUSES

180

The expected dividend is then 280 − 0.8 (243.48)  85.22 . The dividend can be graphed as a function of aggregate losses by noting that 1. If aggregate losses are 0, the loss ratio is 0 and the dividend is (0.8)(0.7 − 0)(500)  280.

2. In order for the formula to develop a dividend of zero (before flooring at 0), the loss ratio must be 0.7, which means losses must be (0.7)(500)  350. 3. The dividend formula is linear in aggregate losses, so the graph is a straight line between these two points.

The graph is drawn below. Dividend 280

350

Aggregate losses

You may be able to write down the dividend formula, 280 − 0.8 ( X ∧ 350) , just by drawing the graph. If not, use the algebraic approach. 

Exercises 10.1. An insurance company purchases reinsurance. Reinsurance premiums are 800,000. The reinsurance company reimburses the insurance company for aggregate reinsured losses and also pays an experience refund to the insurance company if experience is good. Aggregate reinsured losses follow a Pareto distribution with parameters α  2, θ  500,000. The experience refund is equal to 80% of the excess of reinsurance premiums over aggregate reinsured losses, minus a charge of 200,000. The charge is applied after multiplying the excess by 80%. However, the experience refund is never less than 0. The net cost of the reinsurance contract to the insurance company is the excess of reinsurance premiums over reinsurance payment of losses and experience refunds. Determine the expected net cost to the insurance company of the reinsurance contract. 10.2. An agent gets a bonus based on product performance. His earned premium is 500,000. Losses follow an exponential distribution with mean 350,000. His bonus is equal to earned premium times a proportion of the excess of 70% over his loss ratio, but not less than zero. The loss ratio is defined as losses divided by earned premium. The proportion is set so that the expected bonus is 20,000. Determine the proportion.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 10

10.3.

181

[3-S00:25] An insurance agent will receive a bonus if his loss ratio is less than 70%. You are given:

(i) His loss ratio is calculated as incurred losses divided by earned premium on his block of business. (ii) The agent will receive a percentage of earned premium equal to 1/3 of the difference between 70% and his loss ratio. (iii) The agent receives no bonus if his loss ratio is greater than 70%. (iv) His earned premium is 500,000. (v) His incurred losses have the following distribution: F (x )  1 −

!3

600,000 , x + 600,000

x>0

Calculate the expected value of his bonus. (A) 16,700

(B) 31,500

(C) 48,300

(D) 50,000

(E) 56,600

10.4. [3-F00:27] Total hospital claims for a health plan were previously modeled by a two-parameter Pareto distribution with α  2 and θ  500. The health plan begins to provide financial incentives to physicians by paying a bonus of 50% of the amount by which total hospital claims are less than 500. No bonus is paid if total claims exceed 500. Total hospital claims for the health plan are now modeled by a new Pareto distribution with α  2 and θ  K. The expected claims plus the expected bonus under the revised model equals expected claims under the previous model. Calculate K. (A) 250

(B) 300

(C) 350

(D) 400

(E) 450

10.5. [3-F02:37] Insurance agent Hunt N. Quotum will receive no annual bonus if the ratio of incurred losses to earned premiums for his book of business is 60% or more for the year. If the ratio is less than 60%, Hunt’s bonus will be a percentage of his earned premium equal to 15% of the difference between his ratio and 60%. Hunt’s annual earned premium is 800,000. Incurred losses are distributed according to the Pareto distribution, with θ  500,000 and α  2. Calculate the expected value of Hunt’s bonus. (A) 13,000

(B) 17,000

(C) 24,000

(D) 29,000

(E) 35,000

10.6. [SOA3-F03:3] A health plan implements an incentive to physicians to control hospitalization under which the physicians will be paid a bonus B equal to c times the amount by which total hospital claims are under 400 (0 ≤ c ≤ 1).

The effect the incentive plan will have on underlying hospital claims is modeled by assuming that the new total hospital claims will follow a two-parameter Pareto distribution with α  2 and θ  300. E ( B )  100. Calculate c. (A) 0.44

(B) 0.48

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.52

(D) 0.56

(E) 0.60

Exercises continue on the next page . . .

10. BONUSES

182

10.7. To encourage maximum performance, your company has a profit sharing program. Your bonus is based on GAAP operating earnings of the company (expressed in millions) as follows: (i) If GAAP operating earnings are 90 or less, your bonus is 0. (ii) If GAAP operating earnings are P and P > 90, then your bonus is equal to your salary times (P − 90) /100. (iii) There is no maximum for the bonus. GAAP operating earnings follow a single-parameter Pareto distribution with parameters θ  80 and α  8. Your salary is 50,000. Calculate your expected bonus. 10.8. An insurance agent receives no bonus if her loss ratio is higher than 70%. Otherwise, she receives a bonus of 25% of her earned premium times the excess of 70% over her loss ratio, defined as losses divided by earned premium, but no more than 10% of her earned premium. Losses follow a two-parameter Pareto distribution with α  2 and θ  500,000. The agent’s earned premium is 600,000. Calculate the expected value of her bonus. Additional released exam questions: CAS3-F05:22

Solutions 10.1. It may help to draw a graph showing the reinsurance company’s payment as a function of aggregate losses. If aggregate losses are zero, the experience refund would be 0.8 (800,000−0) −200,000  440,000. In order for the experience refund formula to generate 0, 80% of the difference between premiums (800,000) and claims would have to be equal to the 200,000 charge, or 0.8 (800,000 − x )  200,000

640,000 − 0.8x  200,000 0.8x  440,000 x  550,000

The experience refund is a linear function, so we can draw a straight line between x  0 and x  550,000. Figure 10.1 plots the reinsurance company’s payment, namely x plus the experience refund, with the experience refund shaded. We will now derive this algebraically. If aggregate reinsured losses is X, then the experience refund is the maximum of 0 and 80% of the excess of 800,000 over X minus 200,000, or max 0, 0.8 (800,000 − X ) − 200,000





If we let R be the expected value of the experience refund, then R is R  E max (0, 0.8 (800,000 − X ) − 200,000)

f

 E max (0, 640,000 − 0.8X − 200,000)

f

 E max (0, 440,000 − 0.8X )

f

g

g

g

 E max (440,000 − 440,000, 440,000 − 0.8X )

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

EXERCISE SOLUTIONS FOR LESSON 10

183

Reinsurance payment in thousands 800

600 440 400

200

0

0

200

550 600

400

800

Losses in thousands

Figure 10.1: Reinsurance payment in exercise 10.1

 E 440,000 − min (440,000, 0.8X )

f

 440,000 − E[0.8X ∧ 440,000]

 440,000 − 0.8 E[X ∧ 550,000]

g

by linearity of expected value

and this looks like our graph. Using the Pareto formulas from the Loss Models appendix, we have E[X]  E[X ∧ 550,000] 

500,000 θ   500,000 α−1 1 θ θ * 1− α−1 θ+x

! α−1

, 

+ -

500,000 500,000 1− 1 500,000 + 550,000 ! 550,000  500,000  261,905 1,050,000 



R  440,000 − 0.8 (261,905)  230,476

The reinsurance company receives 800,000 and pays X plus the experience refund, so the expected net cost is 800,000 − E[X] − R  800,000 − 500,000 − 230,476  69,524 10.2. Let the loss variable be X. The base on which the agent is paid is max (350,000 − X, 0)  350,000 − X ∧ 350,000. 350,000 − E[X ∧ 350,000]  350,000 − 350,000 + 350,000e −1  128,757.80.

The proportion is then

C/4 Study Manual—17th edition Copyright ©2014 ASM

20,000  0.1553 128,757.80

10. BONUSES

184

10.3.

The distribution is Pareto with α  3, θ  600,000. The bonus is 1 1 max (0, 350,000 − X )  (350,000 − X ∧ 350,000) 3 3

We calculate E[X ∧ 350,000].

!2

600,000 * 600,000 + E[X ∧ 350,000]  1−  180,332.41 2 950,000

,

-

1 E[bonus]  (350,000 − 180,332.41)  56,555.86 3 10.4.

The bonus is

1 2

(E)

max (0, 500 − X )  250 − 12 X ∧ 500. Current expected claims are 500. We have

1 500K 500  K + 250 − E[X ∧ 500]  K + 250 − 2 500 + K 250K 250  K − 500 + K (500 + K )( K − 250)  250K 1 2

K 2 + 250K − 250 (500)  250K

K 2  250 (500)  125,000 K  353.55

10.5.

(C)

The bonus is 0.15 max (0, 480,000 − X )  0.15 (480,000 − X ∧ 480,000) . 480,000  244,898 E[X ∧ 480,000]  500,000 980,000

!

0.15 (480,000 − 244,898)  35,265 10.6.

(E)

The bonus is 400 − X ∧ 400. We have 100  E ( B )  c 400 − E[X ∧ 400]





! 400 + 1600c * / 100  c .400 − 300 700 7 , c 10.7.

7  0.4375 16

(A)

Let your bonus be X. X  500 max ( P − 90, 0)



 500 P − min ( P, 90)







E[X]  500 E[P] − E[P ∧ 90]





808 8 (80) 8 (80)  500 − +  2505.50 7 7 7 (907 )

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

!

EXERCISE SOLUTIONS FOR LESSON 10

185

Bonus in thousands 60

0

0

180

Losses in thousands

420

Figure 10.2: Bonus in exercise 10.8

10.8. The bonus is 60,000 if losses are below 180,000, and 0.25 (420,000 − X ) if losses, X, are between 180,000 and 420,000. The bonus is graphed in Figure 10.2. The best way to get an expression for the value of the bonus is to observe that the bonus is 60,000 minus one-quarter of the amount of the loss between 180,000 and 420,000. The expected value of the amount of the loss between 180,000 and 420,000 (this is the amount that an insurance policy with a deductible of 180,000 and a maximum covered loss of 420,000 would pay) is E[X ∧ 420,000] − E[X ∧ 180,000] so the value of the bonus is

60,000 − 0.25 E[X ∧ 420,000] + 0.25 E[X ∧ 180,000]

If you did not observe this, you could derive it algebraically. The expected value of the bonus is 60,000F (180,000) + 0.25

Z

420,000 180,000

(420,000 − x ) f ( x ) dx

In working this out, an important observation is that d

Z E[X ∧ d] 

0

x f ( x ) dx + d 1 − F ( d )





and therefore d

Z 0

b

Z a

x f ( x ) dx  E[X ∧ d] − d 1 − F ( d )





x f ( x ) dx  E[X ∧ b] − E[X ∧ a] − b 1 − F ( b ) + a 1 − F ( a )









Answer  60,000F (180,000) + 0.25 (420,000) F (420,000) − F (180,000)



− 0.25 E[X ∧ 420,000] − 420,000 1 − F (420,000)







+ 0.25 E ( X ∧ 180,000) − 180,000 1 − F (180,000)







 60,000 − 0.25 E[X ∧ 420,000] + 0.25 E[X ∧ 180,000]

Now we evaluate this expression.

Answer  60,000 − 0.25 (500,000) C/4 Study Manual—17th edition Copyright ©2014 ASM

420,000 180,000 + 0.25 (500,000) 920,000 680,000

!

!



10. BONUSES

186

 60,000 − 0.25 (228,261) + 0.25 (132,353)  36,023

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 11

Discrete Distributions Reading: Loss Models Fourth Edition 6

11.1

The ( a, b, 0) class

Discrete distributions are useful for modeling frequency. Three basic distributions are Poisson, negative binomial, and binomial. Probabilities, moments, and similar information for them are in Loss Models Appendix B, which you get at the exam. Let p n  Pr ( N  n ) . A Poisson distribution with parameter λ > 0 is defined by p n  e −λ

λn n!

λ>0

The mean and variance are λ. A sum of n independent Poisson P random variables N1 ,. . . , Nn with parameters λ1 , . . . ,λ n has a Poisson distribution whose parameter is ni1 λ i . A negative binomial distribution with parameters r and β is defined by n−1+r pn  n

!

1 1+β

!r

β 1+β

!n

β > 0, r > 0

The mean is rβ and the variance is rβ (1 + β ) . Note that the variance is always greater than the mean. The above parametrization is unusual; you may have learned a k, p parametrization in your probability course. However, this parametrization is convenient when we obtain the negative binomial as a gamma mixture of Poissons in the next lesson. Also note that r must be greater than 0, but need not be an integer. The general definition of a binomial coefficient for any real numerator and a non-negative integer denominator is x x ( x − 1) · · · ( x − n + 1)  n n!

!

The appendix to the textbook (which you get on the exam) writes out this product rather than using a binomial coefficient, but I will occasionally use the convenient binomial coefficient notation. A special case of the negative binomial distribution is the geometric distribution, which is a negative binomial distribution with r  1. By the above formulas, p0 

1 1+β

and β p n−1 pn  1+β

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

187

11. DISCRETE DISTRIBUTIONS

188

so the probabilities follow a geometric progression with ratio β/ (1 + β ) . To calculate Pr ( N ≥ n ) , we can sum the geometric series up: Pr ( N ≥ n ) 

∞ X in

1 1+β

!

β 1+β

!i

β  1+β

!n

(11.1)

Therefore, the geometric distribution is the discrete counterpart of the exponential distribution, and has a similar memoryless property. Namely Pr ( N ≥ n + k | N ≥ n )  Pr ( N ≥ k )

A sum of n independent negative binomial random variables N1 ,. . . ,Nn P having the same β and parameters r1 ,. . . ,r n has a negative binomial distribution with parameters β and ni1 r i . One way to remember that you need the same β’s is that the variance of the sum has to be the sum of the variances, and this wouldn’t work with summing β’s since the variance is a quadratic in β, namely rβ (1 + β ) . A binomial distribution with parameters m and q is defined by m n pn  q (1 − q ) m−n n

!

m a positive integer, 0 < q < 1

and has mean mq and variance mq (1−q ) . Thus its variance is less than its mean. Also, it has finite support; it is zero for n > m. A sum of n independent binomial random variables N1 ,. .P . ,Nn having the same q and parameters m 1 ,. . . ,m n has a binomial distribution with parameters q and ni1 m i . One way to remember that you need the same q’s is that the variance of the sum has to be the sum of the variances, and this wouldn’t work with summing q’s since the variance is a quadratic in q, mq (1 − q ) . Given the mean and variance of a distribution, you can back out the parameters. Moreover, you can tell which distribution to use by seeing whether the variance is greater than the mean (negative binomial), equal (Poisson), or less than the mean (binomial). So quickly, if one of these distributions has mean 10 and variance 30, which distribution is it and what are its parameters? (Answer below1) These three frequency distributions are the complete set of distributions in the ( a, b, 0) class. This class is defined by the following property. If we let p k  Pr ( X  k ) , then, pk b a+ p k−1 k β

a is positive for the negative binomial distribution (how ironic!) and equal to 1+β ; it is 0 for the Poisson; and it is negative for the binomial distribution. Thus, given that a random variable follows a distribution from the ( a, b, 0) class, the sign of a determines which distribution it follows. For the Poisson distribution, b  λ. For the geometric distribution, which is a special case of the negative binomial distribution, b  0. The formulas for a and b all appear in the tables you are given on the exam, so you don’t have to memorize them. For convenience, they are duplicated in Table 11.1 on page 190. For a distribution in the ( a, b, 0) class, the values of p k for three consecutive values of k allow you to determine the exact distribution, since you can set up two equations in two unknowns (a and b). Similarly, knowing p k for two pairs of consecutive values allows determination of a and b. Example 11A Claim frequency follows a distribution in the ( a, b, 0) class. You are given that (i) The probability of 4 claims is 0.066116. (ii) The probability of 5 claims is 0.068761. (iii) The probability of 6 claims is 0.068761. (Answer: Negative binomial, r  5, β  2)

1

C/4 Study Manual—17th edition Copyright ©2014 ASM

11.1. THE ( a, b, 0) CLASS

189

Calculate the probability of no claims. Answer: First we solve for a and b b 5 b a+ 6 b 30 b a+



0.068761 0.066116

1

0.068761 − 1  0.0400 0.066116  1.2 b a  1 −  0.8 6 

Positive a means that this is a negative binomial distribution. We now solve for r and β. a  0.8 

β 1+β

β4 b  ( r − 1)

β  ( r − 1) a 1+β

b  2.5 a !r 1  0.22.5  0.01789 p0  1+β r 1+

?



Quiz 11-1 For a discrete distribution in the ( a, b, 0) class, you are given (i) The probability of 2 is 0.258428. (ii) The probability of 3 is 0.137828. (iii) The probability of 4 is 0.055131. Determine the probability of 0. Sometimes the ( a, b, 0) formula will be more convenient for computing successive probabilities than computing them directly. This is particularly true for the Poisson distribution, for which a  0 and b  λ; for a Poisson with λ  6, if you already know that p 3  0.089235, then compute p4  p3 ( λ/4)  1.5 (0.089235)  0.133853.

Moments The tables you get on the exam provide means and variances for the discrete distributions we’ve discussed. If you need to calculate higher moments, the tables recommend using the probability generating function (the pgf). The factorial moments are derivatives of the pgf. See page 12 for the definition of factorial moments. For an example of calculation of moments using the pgf, see Example 1F, in which we calculated the third raw moment for a negative binomial using the pgf. Here’s another example: Example 11B Calculate the coefficient of skewness of a Poisson distribution with mean λ. Answer: The probability generating function for a Poisson is, from the tables, P ( z )  e λ ( z−1) C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

190

Table 11.1: Formula summary for members of the ( a, b, 0) class

Poisson

Binomial

Negative binomial n+r−1 n

1 1+β

!r

Geometric

!n

βn (1 + β ) n+1

pn

e −λ λn!

m n q (1 − q ) m−n n

Mean

λ

mq



β

Variance

λ

mq (1 − q )

rβ (1 + β )

β (1 + β )

a

0



β 1+β

β 1+β

b

λ

( m + 1)

n

!

q 1−q

q 1−q

!

( r − 1)

β 1+β

β 1+β

0

Differentiating three times, which means bringing a lambda down each time, P 000 ( z )  λ3 e λ ( z−1) P 000 (1)  λ3 That is the third factorial moment. The second raw moment is the mean squared plus the variance, or λ 2 + λ. Therefore, E[X ( X − 1)( X − 2) ]  λ3

E[X 3 ] − 3 E[X 2 ] + 2 E[X]  λ3

E[X 3 ]  λ3 + 3 ( λ2 + λ ) − 2λ  λ3 + 3λ 2 + λ

By formula (1.2), the third central moment is E[ ( X − λ ) 3 ]  E[X 3 ] − 3 E[X 2 ]λ + 2λ 3

 λ3 + 3λ 2 + λ − 3 ( λ2 + λ )( λ ) + 2λ 3 λ

The coefficient of skewness, the third central moment over the variance raised to the 1.5, is λ/λ1.5  √ 1/ λ .  The textbook’s distribution tables mention the following recursive formula for factorial moments for any ( a, b, 0) class distribution in terms of a and b: µ( j) 

( a j + b ) µ ( j−1) 1−a

Setting j  1 results in a formula for the mean. If N is the random variable having the ( a, b, 0) distribution, a+b 1−a a+b Var ( N )  (1 − a ) 2 E[N] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

(11.2) (11.3)

11.2. THE ( a, b, 1) CLASS

191

These formulas are not directly provided in the distribution tables you get at the exam. Instead, more general formulas are provided on page 10, in the first paragraph of Subsection B.3.1. If you set p0  0 in those formulas, you get the formulas mentioned here. If you’re going to memorize any of them, formula (11.2) is the more important, in that it may save a little work in a question like the following. Example 11C For a random variable N following a distribution in the ( a, b, 0) class, you are given (i) Pr ( N  2)  0.03072 (ii) Pr ( N  3)  0.04096 (iii) Pr ( N  4)  0.049152 Calculate E[N]. Answer: Back out a and b. b 0.04096 4   3 0.03072 3 b 0.049152 a+   1.2 4 0.04096   4 2 b  − 1.2  12 3 15 24 b  1.6 15 1.6 a  1.2 −  0.8 4 a+

Then the mean is E[N] 

?

a + b 0.8 + 1.6   12 1−a 1 − 0.8



Quiz 11-2 For a distribution from the ( a, b, 0) class, you are given that p1  0.256, p2  0.0768, and p3  0.02048. Determine the mean of the distribution.

11.2

The ( a, b, 1) class

Material on the ( a, b, 1) class was originally in Course 3 in the 2000 syllabus, but was removed in 2002. When they moved severity, frequency, and aggregate loss material to Exam C/4 in 2007, they added material on the ( a, b, 1) class back to the syllabus. They’ve expanded the formula tables you get at the exam to include information about ( a, b, 1) distributions. There were no questions on the ( a, b, 1) class (or the ( a, b, 0) class for that matter) on the Spring 2007 exam, the only released exam after the syllabus change. However, students have reported at least two distinct ( a, b, 1) questions from the question bank used in recent exams. So this topic has some probability of appearing on your exam. Often, special treatment must be given to the probability of zero claims, and the three discrete distributions of the ( a, b, 0) class are inadequate because they do not give appropriate probability to 0 claims. The ( a, b, 1) class consists of distributions for which p0 is arbitrary, but the ( a, b, 0) relationship holds above 1; in other words pk b  a + for k  2, 3, 4, . . . p k−1 k but not for k  1. C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

192

One way to obtain a distribution in this class is to take one from the ( a, b, 0) class and truncate it at 0, i.e., make the probability of 0 equal to zero, and then scale all the other probabilities so that they add up to 1. To avoid confusion, let p n be the probabilities from the ( a, b, 0) class distribution we start with and p Tn be the probabilities of the new distribution. We let p 0T  0, and multiply the probabilities p n , n ≥ 1, by 1 1−p0 so that they sum up to 1; in other words, p Tn 

pn for n > 0. 1 − p0

These distributions are called zero-truncated distributions. Now let’s use the notation p kM for the probabilities of the new distribution. We can generalize the above c so that they add up to 1. by assigning p0M some constant 1 − c and then multiplying p n , n ≥ 1, by 1−p 0 Such a distribution is called a zero-modified distribution, with the zero-truncated distributions a special case where c  1. Comparing truncated and modified probabilities: p nM  (1 − p0M ) p Tn

n>0

Example 11D For a distribution from the ( a, b, 1) class, p1M  0.4, p2M  0.2, p3M  0.1. Determine p0M . Answer: We will use superscript M for the probabilities of this distribution, and unsuperscripted variables will denote probabilities of the related ( a, b, 0) distribution. Since

p2M

p1M

 0.5  a + 2b and

p 3M p 2M

 0.5  a + 3b , we conclude that b  0 and a  0.5. This is a zero-modified

geometric distribution, since b  0 implies the distribution is geometric. For a non-modified geometric β β 1 distribution, a  1+β , so β  1 and for the geometric p0  1+β  0.5 and p1  (1+β ) 2  0.25. Then the ratio of each modified probability to the ( a, b, 0) probability is of c as 1 − p0M , 1 − p0M



1 − p0M



1 − p0

0.5 p0M

c 1−p 0 ,

so

c 1−p 0



p 1M p1



0.4 0.25

 85 . By the definition

8 5

8 5  0.2



The zero-truncated geometric distribution has the special property that it is the same as an unmodified geometric distribution shifted by 1. This means that its mean is 1 more than the mean of an unmodified distribution, or 1+ β, and its variance is the same as the variance of an unmodified distribution, or β (1+ β ) .

?

Quiz 11-3 For a random variable N with a zero-truncated geometric distribution has p 1T  0.6. Calculate Pr ( N > 10) . In addition to modifications of ( a, b, 0) distributions, the ( a, b, 1) class includes an extended negative binomial distribution with −1 < r < 0; the negative binomial distribution only allows r > 0. This extended distribution is called the ETNB (extended truncated negative binomial) , even though it may be zeromodified rather than zero-truncated. C/4 Study Manual—17th edition Copyright ©2014 ASM

11.2. THE ( a, b, 1) CLASS

193

Example 11E A random variable N follows an extended truncated negative binomial distribution. You are given: Pr ( N  0)  0 Pr ( N  1)  0.6195 Pr ( N  2)  0.1858 Pr ( N  3)  0.0836 Calculate the mean of the distribution. Answer: Set up the equations for a and b. b 2 b a+ 3 b 6 b

a+

0.1858  0.3 0.6195 0.0836   0.45 0.1858 

 −0.15  −0.9

a  0.75

Then solve for r and β, using the formulas in the tables. a  0.75 

β 1+β

0.75 3 1 − 0.75 b  −0.9  ( r − 1) a −0.9 r 1+  −0.2 0.75

β

Using the formula for the mean in the tables, E[N] 

rβ −0.2 (3)   1.8779 −r 1 − (1 + β ) 1 − 40.2



Taking the limit of r → 0, we get the logarithmic distribution; the zero-truncated form has one parameter β and probability function

 p Tn



β/ (1 + β )

n

n ln (1 + β )

for n  1, 2, 3, . . .

and the zero-modified form is obtained by setting p0M  1 − c and multiplying all other probabilities by c. This distribution is called logarithmic for good reason; if we let u  u n−1 

β 1+β ,

then each probability

 n . This sequence has a logarithmic pattern (think of the Taylor expansion of − ln (1 − u ) ). For the ETNB with −1 < r < 0 and taking the limit of β → ∞, the distribution is sometimes called the Sibuya distribution. No moments exist for it. This distribution is not listed separately in the tables, so let’s discuss the truncated version of it a little. To calculate p1T , the formula in the tables, factoring out 1 + β in the denominator, is rβ   p 1T  (1 + β ) (1 + β ) r − 1 p nM

p1M

When β → ∞, then β/ (1 + β ) → 1. Since r < 0, it follows that (1 + β ) r → 0. So we have p1T  −r for a Sibuya. Also note that a  1 and b  r − 1. C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

194

Example 11F A zero-truncated discrete random variable follows a Sibuya distribution with r  −0.5. Calculate p Tk for k  1, . . . , 5. Answer: We have p 1T  −r  0.5, a  1, and b  r − 1  −1.5, so

1.5  0.5 (0.25)  0.125 2   1.5 p3T  0.125 1 −  0.125 (0.5)  0.0625 3   1.5 p4T  0.0625 1 −  0.0625 (0.625)  0.0390625 4   1.5 p5T  0.0390625 1 −  0.0390625 (0.7)  0.02734375 5





p 2T  0.5 1 −



You should be able to go back and forth between a and b and the parameters of the distribution, using the tables. The means and variances of the zero-truncated distributions are given in the tables. Zero-modified distributions are a mixture of a zero-truncated distribution with weight c and a degenerate distribution at zero (one which has p0  1) with mean and variance 0 with weight 1 − c. Therefore, if the mean of the zero-truncated distribution is m and the variance v, the mean of the zero-modified distribution is cm. Letting I be the condition of being in the first or second component of the mixture, the variance, by the conditional variance formula, is Var ( N )  Var E[N | I] + E Var ( N | I )  Var (0, m ) + E[0, v]  c (1 − c ) m 2 + cv





f

g

where Var (0, m ) was computed by the Bernoulli shortcut (page 54). Here is a summary of the mean and variance formulas for zero-modified distributions: For N a zero-modified random variable, (11.4)

E[N]  cm 2

Var ( N )  c (1 − c ) m + cv

(11.5)

where • c is 1 − p 0M .

• m is the mean of the corresponding zero-truncated distribution. • v is the variance of the corresponding zero-truncated distribution. Anyhow, the tables include instructions for calculating the mean and variance of zero-modified distributions. Example 11G For a discrete random variable N, you are given (i) p k  Pr ( N  k ) . (ii) p0  0.8 (iii) p k  p k−1 / (4k ) for k > 1 Calculate the mean and variance of the distribution. Answer: a  0 making this a zero-modified Poisson. b  λ  14 . For the zero-truncated distribution, the mean is λ 0.25   1.130203 −λ 1 − e −0.25 1−e C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 11

195

and the variance is λ (1 − ( λ + 1) e −λ ) 0.25 (1 − 1.25e −0.25 ) 0.00662476    0.135395 0.0489291 (1 − e −0.25 ) 2 (1 − e −λ ) 2 The mean of the zero-modified Poisson is E[N]  (1 − p0 )(1.130203)  0.2 (1.130203)  0.22604 and the variance is Var ( N )  (0.2)(0.8)(1.1302032 ) + (0.2)(0.135395)  0.23146 

Exercises ( a, b, 0) class 11.1. I.

[4B-S92:1] (1 point) Which of the following are true? The mean of the binomial distribution is less than the variance.

II.

The mean of the Poisson distribution is equal to the variance.

III.

The mean of the negative binomial distribution is greater than the variance.

(A) I only (B) II only (C) III only (E) The correct answer is not given by (A) , (B) , (C) , or (D) . 11.2.

(D) I, II, and III

For a distribution in the ( a, b, 0) class, you are given that p1  0.6p0 and p2  0.4p1 .

Determine p0 . 11.3.

For a distribution from the ( a, b, 0) class you are given p2 /p1  0.25 and p4 /p 3  0.225.

Determine p2 . 11.4. [4B-F98:2] (2 points) The random variable X has a Poisson distribution with mean n − 12 , where n is a positive integer greater than 1. Determine the mode of X. (A) n − 2

(B) n − 1

(C) n

(D) n + 1

(E) n + 2

11.5. For a distribution from the ( a, b, 0) class, you are given that p1  0.02835, p2  0.1323, and p 3  0.3087. Determine p4 . 11.6. For an auto collision coverage, frequency of claims follows a distribution from the ( a, b, 0) class. You are given: (i) The probability of 1 claim is 0.19245. (ii) The probability of 2 claims is 0.09623. (iii) The probability of 3 claims is 0.05346. Determine the average number of claims.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

11. DISCRETE DISTRIBUTIONS

196

11.7.

[3-S01:25] For a discrete probability distribution, you are given the recursion relation p (k ) 

2 × p ( k − 1) , k

k  1, 2, . . .

Determine p (4) . (A) 0.07

(B) 0.08

(C) 0.09

(D) 0.10

(E) 0.11

11.8. [3-F02:28] X is a discrete random variable with a probability function which is a member of the ( a, b, 0) class of distributions. You are given: (i) (ii)

P ( X  0)  P ( X  1)  0.25 P ( X  2)  0.1875

Calculate P ( X  3) . (A) 0.120

(B) 0.125

(C) 0.130

(D) 0.135

(E) 0.140

11.9. [CAS3-F03:14] The Independent Insurance Company insures 25 risks, each with a 4% probability of loss. The probabilities of loss are independent. On average, how often would 4 or more risks have losses in the same year? (A) (B) (C) (D) (E)

Once in 13 years Once in 17 years Once in 39 years Once in 60 years Once in 72 years

11.10. [CAS3-F04:22] An insurer covers 60 independent risks. Each risk has a 4% probability of loss in a year. Calculate how often 5 or more risks would be expected to have losses in the same year. (A) (B) (C) (D) (E)

Once in 3 years Once in 7 years Once in 11 years Once in 14 years Once in 17 years

11.11. [CAS3-F03:12] A driver is selected at random. If the driver is a “good” driver, he is from a Poisson population with a mean of 1 claim per year. If the driver is a “bad” driver, he is from a Poisson population with a mean of 5 claims per year. There is equal probability that the driver is either a “good” driver or a “bad” driver. If the driver had 3 claims last year, calculate the probability that the driver is a “good” driver. (A) (B) (C) (D) (E)

Less than 0.325 At least 0.325, but less than 0.375 At least 0.375, but less than 0.425 At least 0.425, but less than 0.475 At least 0.475

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 11

197

11.12. [CAS3-F03:18] A new actuarial student analyzed the claim frequencies of a group of drivers and concluded that they were distributed according to a negative binomial distribution and that the two parameters, r and β, were equal. An experienced actuary reviewed the analysis and pointed out the following: “Yes, it is a negative binomial distribution. The r parameter is fine, but the value of β is wrong. Your parameters indicate that 91 of the drivers should be claim-free, but in fact, 94 of them are claim-free.” Based on this information, calculate the variance of the corrected negative binomial distribution. (A) 0.50

(B) 1.00

(C) 1.50

(D) 2.00

(E) 2.50

11.13. [CAS3-S04:32] Which of the following statements are true about the sums of discrete, independent random variables? 1.

The sum of two Poisson random variables is always a Poisson random variable.

2.

The sum of two negative binomial random variables with parameters ( r, β ) and ( r 0 , β0 ) is a negative binomial random variable if r  r 0.

3.

The sum of two binomial random variables with parameters ( m, q ) and ( m 0 , q 0 ) is a binomial random variable if q  q 0.

(A) (B) (C) (D) (E)

None of 1, 2, or 3 is true. 1 and 2 only 1 and 3 only 2 and 3 only 1, 2, and 3

11.14. [CAS3-F04:23] Dental Insurance Company sells a policy that covers two types of dental procedures: root canals and fillings. There is a limit of 1 root canal per year and a separate limit of 2 fillings per year. The number of root canals a person needs in a year follows a Poisson distribution with λ  1, and the number of fillings a person needs in a year is Poisson with λ  2. The company is considering replacing the single limits with a combined limit of 3 claims per year, regardless of the type of claim. Determine the change in the expected number of claims per year if the combined limit is adopted. (A) (B) (C) (D) (E)

No change More than 0.00, but less than 0.20 claims At least 0.20, but less than 0.25 claims At least 0.25, but less than 0.30 claims At least 0.30 claims

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

11. DISCRETE DISTRIBUTIONS

198

11.15. [4-F01:36] For an insurance policy, you are given: (i) The policy limit is 1,000,000 per loss, with no deductible. (ii) Expected aggregate losses are 2,000,000 annually. (iii) The number of losses exceeding 500,000 follows a Poisson distribution. (iv) The claim severity distribution has Prf(Loss > 500,000)  g0.0106 E min (Loss; 500,000)  20,133 E min (Loss; 1,000,000)  23,759

f

g

Determine the probability that no losses will exceed 500,000 during 5 years. (A) 0.01 11.16.

(B) 0.02

(C) 0.03

(D) 0.04

(E) 0.05

For a discrete probability distribution, you are given that p k  p k−1



1 1 − k 10



k>0

Calculate the mean of the distribution. 11.17. For a frequency distribution in the ( a, b, 0) class, you are given (i) (ii) (iii)

p k  0.0768 p k+1  p k+2  0.08192 p k+3  0.0786432

Determine the mean of this distribution. 11.18. [3-F00:13] A claim count distribution can be expressed as a mixed Poisson distribution. The mean of the Poisson distribution is uniformly distributed over the interval [0, 5]. Calculate the probability that there are 2 or more claims. (A) 0.61

(B) 0.66

(C) 0.71

(D) 0.76

(E) 0.81

11.19. The number of insurance applications arriving per hour varies according to a negative binomial distribution with parameters r  6 and β. The parameter β varies by hour according to a Poisson distribution with mean 4. Within any hour, β is constant. Using the normal approximation, estimate the probability of more than 600 applications arriving in 24 hours.

( a, b, 1) class 11.20. The random variable N follows a zero-modified Poisson distribution. You are given: (i) Pr ( N  1)  0.25. (ii) Pr ( N  2)  0.10. Calculate the probability of 0. 11.21. A zero-truncated geometric distribution has a mean of 3. Calculate the probability of 5. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 11

199

11.22. A claim count distribution can be expressed as an equally weighted mixture of two logarithmic distributions, one with β  0.2 and one with β  0.4. Determine the variance of claim counts. 11.23. A claim count distribution has a zero-truncated binomial distribution with m  4, q  0.2. Determine the probability of 2 or more claims. Use the following information for questions 11.24 and 11.25: For a zero-modified ETNB distribution, you are given (i) (ii) (iii)

p1M  0.72 p2M  0.06

p3M  0.01.

11.24. Determine the probability of 0. 11.25. Determine the variance of the distribution. 11.26. For a zero-modified geometric random variable N, (i) (ii)

E[N]  3 E[N 2 ]  63

Calculate Pr ( N  0) . 11.27. A random variable follows a zero-truncated Poisson distribution with λ  0.8. Calculate the third raw moment of the distribution. 11.28. For a zero-modified logarithmic distribution, p2M  0.2 and p3M  0.1. Determine p0M .

11.29. A discrete random variable follows a zero-modified Sibuya distribution. You are given that p1M  0.15 and p 2M  0.06. Determine p0M .

11.30. [3-S00:37] Given: (i) (ii)

p k denotes the probability that the number of claims equals k for k  0, 1, 2, . . . pn m!  , m ≥ 0, n ≥ 0 pm n!

Using the corresponding zero-modified claim count distribution with p0M  0.1, calculate p1M . (A) 0.1

(B) 0.3

(C) 0.5

(D) 0.7

(E) 0.9

Additional released exam questions: CAS3-S05:15,16,28 SOA M-S05:19, CAS3-S06:31, CAS3-F06:23

Solutions 11.1. For a binomial, E[N]  mq > Var ( N )  mq (1 − q ) since q < 1. For a Poisson, mean and variance are both λ. For a negative binomial, E[N]  rβ < Var ( N )  rβ (1 + β ) , since β > 0. (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

200

11.2.

Plug into the ( a, b, 0) class defining formula: a + b  0.6 b a +  0.4 2 a  0.2 b  0.4

a is positive so the distribution is negative binomial.

β 1+β

 0.2, so β  0.25. Then plugging into the formula

for b, r  3. p0  (1.25) −3  0.512 . 11.3.

We solve for a and b. b 2 b 0.225  a + 4 b  0.1 0.25  a +

a  0.2

Positive a means the distribution is negative binomial. b  0.5 a r  1.5

r−1

β  0.2 1+β 2.5 p2  (0.8) 1.5 (0.2) 2  0.05367 2

!

11.4. For a Poisson distribution, a  0 and b  λ, so p n  ( λ/n ) p n−1 . We want to know when λ/n < 1, since then p n < p n−1 and n − 1 is the mode. But λ/n < 1 means n > λ. So the mode will occur at the greatest integer less than λ. Here, λ  n − 12 , and the greatest integer less than n − 21 is n − 1 . (B) 11.5.

We solve for a and b.

p2 14  a+ p1 3 p3 7  a+ p2 3 b  14

b 2 b 3

a−

7 3

Negative a means the distribution is binomial q 7  3 1−q



b  − ( m + 1) a

q  0.7 ⇒

m

p4  5 (0.74 )(0.3)  0.36015 C/4 Study Manual—17th edition Copyright ©2014 ASM

14 −15 7/3

EXERCISE SOLUTIONS FOR LESSON 11

201

Actually, we do not have to back out m and q. We can use the ( a, b, 0) relationship to get p4 : p4  a + 11.6.

b 7 14 p3  − + (0.3087)  0.36015 4 3 4

!





We solve for a and b. b p2 0.09623 1    2 p1 0.19245 2 b p3 0.05346 5  a+   3 p2 0.09623 9 b 1 5 1  − − 6 2 9 18 6 1 b− − 18 3 1 b 2 a −  2 2 3 a+

By formula (11.2), E[N]  11.7.

Since a  0, this is a Poisson distribution, and b  λ  2. Then p (4)  e −2

11.8.

2/3 − 1/3  1 1 − 2/3

24  0.0902 4!

(C)

We have 0.25 1 0.25 b 0.1875 a+   0.75 2 0.25 1 1 a b 2 2 a+b

It is negative binomial, but we don’t even have to calculate r and β. We can proceed with the ( a, b, 0) class relationship. ! b 2 p3  a + p2  p2  0.125 (B) 3 3 11.9. This is a binomial distribution with m  25, q  0.04, so p 0  0.9625  0.360397. You could calculate the other p k ’s directly, or use the ( a, b, 0) class formula: a−

q  −0.041667 1−q

b  − ( m + 1) a  1.083333

p1  ( a + b ) p0  0.375413

b p2  a + p1  0.187707 2

!

p3  a + C/4 Study Manual—17th edition Copyright ©2014 ASM

b p2  0.059962 3

!

11. DISCRETE DISTRIBUTIONS

202

1 − p0 − p1 − p2 − p3  0.016522 So 4 or more risks would have losses in the same year once in 0.9660

1 0.016522

 60.53 years. (D)

11.10. This is binomial with m  60, q  0.04, p 0   0.086352. As in the previous problem, we can use the ( a, b, 0) class rather than calculating the probabilities directly. a−

q  −0.041667 1−q

b  − ( m + 1) a  2.541667

p1  ( a + b )(0.086352)  0.215881 p2  a +

b (0.215881)  0.265353 2

!

b p3  a + (0.265353)  0.213757 3

!

b (0.215881)  0.126918 p4  a + 4

!

Pr ( N ≥ 5)  1 − 0.086352 − 0.215881 − 0.265353 − 0.213757 − 0.126918  0.091738 The answer is once every 11.11.

1 0.091738

 10.9006 years. (C)

By Bayes’ Theorem, since Pr (good)  Pr (bad)  0.5, Pr (good | 3 claims) 

Pr (good) Pr (3 claims | good) Pr (good) Pr (3 claims | good) + Pr (bad) Pr (3 claims | bad)

Pr (3 claims | good)  e

−1

13  0.06131 3!

!

53 Pr (3 claims | bad)  e  0.14037 3! 0.06131 Pr (good | 3 claims)   0.3040 0.06131 + 0.14037 −5

!

(A)

r

1 11.12. p 0  1+β  19 . “By inspection”, since r  β, they both equal 2—that’s what the official solution says. You could try different values of r to find r  2 by trial and error. Then, since r doesn’t change, we need

!2

1 4  1+β 9 1 2  1+β 3 1 β 2 Var ( N )  rβ (1 + β )  1.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

EXERCISE SOLUTIONS FOR LESSON 11

203

11.13. If you didn’t remember which ones are true, you could look at the probability generating functions. For a Poisson, P ( z )  e λ ( z−1) . When you multiply two of these together, you get P ( z )  e ( λ1 +λ2 )( z−1) , which has the same form. For a negative binomial, P ( z )  [1 − β ( z − 1) ]−r . If the β’s are the same, then multiplying two of these together means summing up r’s, but you get the same form. On the other hand, if the β’s are different, multiplying 2 of these together does not get you a function of the same form, so 2 is false. For a binomial, P ( z )  [1 + q ( z − 1) ]m , so multiplying two with the same q results in adding the m’s and you get something of the same form. (C) 11.14. Before the change, the probability of 0 root canals is e −1 , so the expected number of root canals paid for is 1 − e −1  0.6321. The probability of 0 fillings is e −2 and the probability of 1 filling is 2e −2 , so the expected number of fillings paid for is p1 + 2 Pr ( N > 1)  2e −2 + 2 1 − e −2 − 2e −2  2 − 4e −2  1.4587





After the change, the combined annual number of fillings and root canals follows a Poisson distribution with mean 3. The expected number of paid claims is the probability of one event, plus twice the probability of two events, plus three times the probability of more than two events, where an event can be either a root canal or a filling and occurs at the rate of 3 per year: p 1 + 2p 2 + 3 Pr ( N > 2)  3e −3 + 2 (4.5e −3 ) + 3 1 − e −3 − 3e −3 − 4.5e −3  3 − 13.5e −3  2.3279





The change in the expected number of claims per year is 2.3279 − 0.6321 − 1.4587  0.2371 . (C)

11.15. Since aggregate losses are 2,000,000 and the average loss size is 23,759 (20,133 is irrelevant), the average number of losses per year is 2,000,000 23,759  84.1786. Since the probability of a loss above 500,000 is 0.0106, the average number of such losses is 84.1786 (0.0106)  0.8923. In 5 years, the average number of such losses is 5 (0.8923)  4.4615. They have a Poisson distribution, so the probability of none is e −4.4615  0.0115 . (A) 11.16.

pk p k−1

1  a + bk , so a  − 10 making this a binomial, and b  1. By formula (11.2),

E[N] 

−0.1 + 1 9  1 + 0.1 11

11.17. Dividing subsequent p k ’s, we have 16 b a+ 15 k+1 b 1a+ k+2 24 b a+ 25 k+3 Subtracting the second equation from the first and the third equation from the second 1 b  15 ( k + 1)( k + 2)  k+1 k+3  15 25 25k + 25  15k + 45 C/4 Study Manual—17th edition Copyright ©2014 ASM

1 25 ( k

+ 2)( k + 3)

11. DISCRETE DISTRIBUTIONS

204

k2 1 15 (3)(4)

 0.8 b 0.8 a 1− 1−  0.8 k+2 2+2

b

By formula (11.2),

0.8 + 0.8  8 1 − 0.8 11.18. We’ll calculate the probability of 0 claims or 1 claim. For a Poisson distribution with mean λ, this is e −λ (1 + λ ) . Now we integrate this over the uniform distribution: E[N] 

Z 0 5

Z 0

5

5

Z

Pr ( N < 2)  0.2

0

e −λ dλ +

5

Z

λe −λ dλ

0

!

e −λ dλ  1 − e −5

λe

−λ

dλ 

5 −λe −λ 0

+

5

Z

e −λ dλ

0 −5

 −5e −5 + 1 − e

 1 − 6e −5

Pr ( N < 2)  0.2 (1 − e −5 ) + (1 − 6e −5 )  0.2 2 − 7e −5  0.3906





Pr ( N ≥ 2)  1 − 0.3906  0.6094





(A)

11.19. Let S be the number of applications in 24 hours. The expected value of S is E[S]  24 6 (4)  576





To calculate the variance of S, we use the conditional variance formula, equation 4.2 on page 64, for one hour’s applications, by conditioning on β. Let S1 be the number of applications in one hour. Var ( S1 )  Var E[S1 | β] + E Var ( S1 | β )





f

g

Given β, the number of applications in one hour is negative binomial with mean rβ  6β and variance rβ (1 + β )  6β (1 + β ) . Var ( S1 )  Var (6β ) + E 6β (1 + β )

f

g

 62 Var ( β ) + 6 E[β] + E[β 2 ]





The parameter β is Poisson with mean 4 and variance 4 E[β]  Var ( β )  4 E[β2 ]  Var ( β ) + E[β]2  4 + 42  20 Var ( S1 )  36 (4) + 6 (4 + 20)  288 The variance of S is 24 times the variance of S1 . Var ( S )  24 (62 )(4) + 24 (6)(4 + 20)  24 (288) Using the normal approximation with continuity correction: 600.5 − 576  1 − Φ (0.29)  1 − 0.6141  0.3859 Pr ( S > 600)  1 − Φ √ (288)(24)

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 11 M 11.20. Since p kM  p k−1





λ k

205

for a zero-modified Poisson, we have λ 0.1  (0.25) 2

!

λ  0.8 For the corresponding zero-truncated distribution, p1T  so

λ 0.8  0.652773  e λ − 1 e 0.8 − 1 p1M  (1 − p0M ) p1T

0.25  (1 − p0M )(0.652773) 0.25 1 − p0M   0.382982 0.652773 p0M  0.617018 11.21. A zero-truncated geometric is a geometric shifted over by 1, or starting at 1 instead of at 0. Some authors consider this to be the true geometric distribution. (This is not true for the other truncated distributions. A zero-truncated negative binomial is not, in general, the same as a negative binomial distribution shifted by 1.) Therefore, the mean is 1 more than the mean of a geometric with the same parameter. The parameter of a geometric is its mean, so the parameter of the unshifted distribution here is 3 − 1  2. Then p5T  p 4 

β4 16   0.06584 5 243 (1 + β )

11.22. The means and variances of the two components of the mixture (N1 being the component with β  0.2 and N2 being the component with β  0.4) are

E[N2 ] 

0.2 1.2 − 0.2/ ln (1.2)



0.2 E[N1 ]   1.096963 ln 1.2

Var ( N1 ) 



ln 1.2  0.4 1.4 − 0.4/ ln (1.4)

 0.113028



0.4  1.188805 ln 1.4

Var ( N2 ) 

ln 1.4

 0.251069

So the second moments are E N12  0.113028 + 1.0969632  1.316356 and E N22  0.251069 + 1.1888052  1.664328. The mean of the mixture is

f

g

f

g

0.5 (1.096963 + 1.188805)  1.142884 and the second moment is

0.5 (1.316356 + 1.664328)  1.490342.

The variance is 1.490342 − 1.1428842  0.184157 . 3

)(0.8 )  0.693767. This is easy enough to figure out if you know the 11.23. The probability of 1 claim is 4 (0.2 1−0.84 binomial even without the tables, since the numerator is the non-truncated binomial and the denominator is the complement of the non-truncated binomial’s probability of 0. Therefore the probability of 2 or more claims is 1 − 0.693767  0.306233 . C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

206

11.24. We back out a and b. b 0.72 0.06  a + 2

!

0.01  a +

b 0.06 3 1  12 1  6

!

b 0.06  2 0.72 b 0.01 a+  3 0.06 b 1 − 6 12 1 b− 2 1 1/2 1 a +  12 2 3

a+

Then r − 1 

b a

 − 23 so r  − 21 . a  p1T 

(1 +

Since p1M  0.72, we have c 

β 1+β

 13 , so β  21 . This implies that

rβ −1/4 −0.25 √   0.908249 − (1 + β ) 3/2 − 3/2 −0.275255

β ) r+1

0.72 0.908249

 0.79274, so p0M  1 − 0.79274  0.20726 .

11.25. The distribution is a mixture of a degenerate distribution at 0, weight 0.20726 (as determined in the previous problem), which has mean 0 and variance 0, and a zero-truncated distribution with weight 0.79274. Using conditional variance, the variance will be 0.79274 times the variance of the zerotruncated distribution plus the variance of the means of the two components, which is (Bernoulli shortcut) (0.79274)(0.20726) times the mean of the zero-truncated distribution squared. The mean and variance of the zero-truncated distribution N are E[N] 

rβ (−0.5)(0.5) −0.25    1.11237 √ −r 1 − (1 + β ) −0.224745 1 − 1.5 rβ (1 + β ) − (1 + β + rβ )(1 + β ) −r



Var ( N ) 



1 − (1 + β ) −r 2  √  −0.5 (0.5) 1.5 − 1.25 1.5  √ (1 − 1.5) 2 0.0077328   0.15309 0.050510



The answer is (0.79273)(0.15309) + (0.79274)(0.20726)(1.112372 )  0.32467 . 11.26. Using the given information about the first two moments, Var ( N )  63 − 32  54. The zero-truncated geometric distribution corresponding to our zero-modified distribution is equivalent to an unmodified geometric distribution shifted 1, and the mean of an unshifted geometric distribution is β, so the mean of a zero-truncated geometric distribution is 1 + β, and its variance is the same as the variance of an unshifted geometric distribution, or β (1 + β ) . For a zero-modified distribution with p0M  1 − c, the mean is cm and the variance is c (1 − c ) m 2 + cv, using m and v of the zero-truncated distribution. Here, m  1 + β and v  β (1 + β ) , so the two equations C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 11

207

for mean and variance are: c (1 + β )  3 c (1 − c )(1 + β ) 2 + cβ (1 + β )  54 and we now solve for c in terms of β from the first equation and substitute into the second equation. 3 (1 + β ) + 3β  54 3 1− 1+β

!

1−

3 (1 + β ) + β  18 1+β

!

1 + β − 3 + β  18 2β  20

β  10 3 c 11 The answer is p0M  1 − c  8/11 .

11.27. The probability generating function of a zero-truncated Poisson, from the tables, is P (z )  Setting λ  0.8 and differentiating three times,

e λz − 1 eλ − 1

0.8e 0.8z e 0.8 − 1 0 P (1)  1.452773

P0 ( z ) 

P 00 (1)  0.8P 0 (1)  1.162218 P 000 (1)  0.8P 00 (1)  0.929775 These are the first three factorial moments. If N is the zero-truncated Poisson variable, then it follows that E[N]  1.452773 and E[N ( N − 1) ]  1.162218, so E[N 2 ]  1.452773 + 1.162218  2.614991. Then E[N ( N − 1)( N − 2) ]  0.929775

E[N 3 ] − 3 E[N 2 ] + 2 E[N]  0.929775

E[N 3 ]  0.929775 + 3 (2.614991) − 2 (1.452773)  5.8692 11.28. For a zero-modified logarithmic distribution, p kM 

0.1 

2 3

( u )(0.2) and u  43 . Then u  p2T 

implying that c 

p 2M p 2T



C/4 Study Manual—17th edition Copyright ©2014 ASM

0.2 0.202879

β 1+β

u M k p1

so β  3. We then have

for some u, so p3  23 up2 which implies

β2 9   0.202879 2 (1 + β ) 2 ln (1 + β ) 2 (16) ln 4

 0.985809 and p0M  1 − c  0.014191 .

11. DISCRETE DISTRIBUTIONS

208

11.29. Since a  1 in a Sibuya, the ( a, b, 1) equation is b 0.06 1+ 0.15 2 so b  −1.2. Then since b  r − 1 in a Sibuya, it follows that r  −0.2. In a Sibuya p 1T  −r, so p1T  0.2. We are given that p1M  0.15, so c  0.75 and p 0M  1 − c  0.25 . 11.30. You may recognize this as zero-modified Poisson with λ  1 because the pattern 1!1 , 2!1 , 3!1 satisfies pn )!  ( n−1  n1 , so a  0 and λ  b  1. Since condition (ii). If not, just take m  n − 1 and you have p n−1 n! p0M  0.1, c  1 − p0M  0.9. Then (letting p1 be the probability of 1 for a non-modified Poisson) p1T 

p1 e −1 0.3679    0.5820 −1 −1 0.6321 1−e 1−e

and p 1M  cp 1T  0.9 (0.5820)  0.5238 (C)

Quiz Solutions 11-1.

The ratios are b 3 b a+ 4 b 12 b

a+

p3 0.137828   0.533333 p2 0.258428 p4 0.055131    0.4 p3 0.137828 

 0.533333 − 0.4  0.133333

 12 (0.133333)  1.6 1.6 0 a  0.4 − 4

Therefore the distribution is Poisson with λ  b  1.6, and p 0  e −1.6  0.201897 . 11-2. b 2 b a+ 3 b 6 b

a+

0.0768  0.3 0.256 0.02048 8   0.0768 30 1  30  0.2 

a  0.2 The mean is E[N] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

a + b 0.2 + 0.2   0.5 1−a 1 − 0.2

QUIZ SOLUTIONS FOR LESSON 11

209

11-3. Setting M  N − 1, this is equivalent to calculating Pr ( M > 9) given that p0  0.6 for M. However, M is an unmodified geometric distribution, so p0  1/ (1 + β )  0.6. It follows that β/ (1 + β )  0.4. By formula (11.1), ! 10 β  0.410  0.000104858 Pr ( M > 9)  Pr ( M ≥ 10)  1+β

C/4 Study Manual—17th edition Copyright ©2014 ASM

210

C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

Lesson 12

Poisson/Gamma Reading: Loss Models Fourth Edition 6.3 The negative binomial can be derived as a gamma mixture of Poissons. Assume that in a portfolio of insureds, loss frequency follows a Poisson distribution with parameter λ, but λ is not fixed but varies by insured. Suppose λ varies according to a gamma distribution over the portfolio of insureds. The conditional loss frequency of an insured, if you are given who the insured is, is Poisson with parameter λ. The unconditional loss frequency for an insured picked at random is a negative binomial. The parameters of the negative binomial (r, β) are the same as the parameters of the gamma distribution (α, θ); that is, r  α and β  θ. Loss Models uses an unusual parametrization of the negative binomial in order to make this work. The special case of α  1, the exponential distribution, corresponds to the special case of r  1, the geometric distribution. Since the sum of negative binomials P with parameters r i and β (β the same for all of them) is a negative binomial whose parameters are r  r i and β, if the portfolio of insureds has n exposures, the distribution of total number of claims for the entire portfolio will be negative binomial with parameters nr and β. There used to be an average of one question per exam on the Poisson/gamma model. This topic is now too simple for the exam, and doesn’t appear as often. Still, you should be ready to quickly get the Poisson/gamma question out of the way if it appears. This information may also be useful in connection with Bayesian estimation and credibility (Lesson 47). To make things a little harder, you probably won’t be given the parameters, but instead will be given the mean and variance. The Loss Models appendix has the means and variances of the distributions (actually, the second moment of the gamma, from which you can derive the variance), but you may want to memorize them anyway to save yourself lookup time. For a gamma distribution with parameters α and θ, the mean is αθ and the variance is αθ2 . For a negative binomial distribution, the mean is rβ and the variance is rβ (1 + β ) . Example 12A The number of claims for a glass insurance policy follows a negative binomial distribution with mean 0.5 and variance 1.5. For each insured, the number of claims has a Poisson distribution with mean λ. The parameter λ varies by insured according to a gamma distribution. Determine the variance of this gamma distribution. Answer: Since rβ  0.5 and rβ (1 + β )  1.5, it follows that 1 + β  3, β  2, and r  0.25. Then the gamma distribution has parameters α  0.25 and θ  2. The variance of the gamma is αθ 2  0.25 (22 )  1 . However, a faster way to work this out is to use the conditional variance formula. If we let X be the unconditional number of claims, then Var ( X )  E[Var ( X | λ ) ] + Var (E[X | λ]) X | λ is Poisson, with mean and variance λ, so Var ( X )  E[λ] + Var ( λ ) where the moments are over the gamma distribution. But

f

g

E[X]  E E[X | λ]  E[λ] C/4 Study Manual—17th edition Copyright ©2014 ASM

211

12. POISSON/GAMMA

212

so

Var ( X )  E[X] + Var ( λ )

or In our case, Var ( λ )  1.5 − 0.5  1 .

?

Var ( λ )  Var ( X ) − E[X]

(12.1)



Quiz 12-1 The number of insurance applications arriving per hour follows a Poisson distribution with mean λ. The distribution of λ over all hours is a gamma distribution with α  0.2 and θ  20. Within any hour, λ is constant. Calculate the probability of receiving exactly 2 applications in one hour.

Exercises 12.1. [3-F01:27] On his walk to work, Lucky Tom finds coins on the ground at a Poisson rate. The Poisson rate, expressed in coins per minute, is constant during any one day, but varies from day to day according to a gamma distribution with mean 2 and variance 4. Calculate the probability that Lucky Tom finds exactly one coin during the sixth minute of today’s walk. (A) 0.22

(B) 0.24

(C) 0.26

(D) 0.28

(E) 0.30

Use the following information for questions 12.2 and 12.3: Customers arrive at a bank at a Poisson rate of λ per minute. The parameter λ varies by day. The distribution of λ over all days is exponential with mean 2. 12.2.

Calculate the probability of at least 2 customers arriving in a minute on a random day.

12.3.

Calculate the probability of at least 4 customers arriving in 2 minutes on a random day.

12.4. Customers arrive at a bank at a Poisson rate of λ per minute. The parameter λ varies by minute. The distribution of λ over all minutes is exponential with mean 2. Calculate the probability of at least 4 customers arriving in 2 minutes. 12.5. [4B-S91:31] (2 points) The number of claims a particular policyholder makes in a year has a Poisson distribution with mean λ. λ follows a gamma distribution with variance equal to 0.2. The resulting distribution of policyholders by number of claims is a negative binomial with parameters r and β such that the variance is equal to 0.5. What is the value of r (1 + β ) ? (A) (B) (C) (D) (E)

Less than 0.6 At least 0.6, but less than 0.8 At least 0.8, but less than 1.0 At least 1.0, but less than 1.2 At least 1.2

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 12

12.6.

213

[4B-S95:24] (2 points) You are given the following:



The random variable representing the number of claims for a single policyholder follows a Poisson distribution.



For each class of policyholders, the Poisson parameters follow a gamma distribution representing the heterogeneity of risks within that class.



For four distinct classes of risks, the random variable representing the number of claims of a policyholder, chosen at random, follows a negative binomial distribution with parameters r and β, as follows: r β



Class 1 5.88 1/49

Class 2 1.26 1/9

Class 3 10.89 1/99

Class 4 2.47 1/19

The lower the standard deviation of the gamma distribution, the more homogeneous the class. Which of the four classes is most homogeneous?

(A) Class 1 (B) Class 2 (C) Class 3 (E) Cannot be determined from the given information

(D) Class 4

12.7. [3-S01:15] An actuary for an automobile insurance company determines that the distribution of the annual number of claims for an insured chosen at random is modeled by the negative binomial distribution with mean 0.2 and variance 0.4. The number of claims for each individual insured has a Poisson distribution and the means of these Poisson distributions are gamma distributed over the population of insureds. Calculate the variance of this gamma distribution. (A) 0.20 12.8.

(B) 0.25

(C) 0.30

(D) 0.35

(E) 0.40

[4B-F96:15] (2 points) You are given the following:



The number of claims for a single policyholder follows a Poisson distribution with mean λ.



λ follows a gamma distribution.



The number of claims for a policyholder chosen at random follows a distribution with mean 0.10 and variance 0.15. Determine the variance of the gamma distribution.

(A) 0.05 12.9.

(B) 0.10

(C) 0.15

(D) 0.25

(E) 0.30

For a fire insurance coverage, you are given



Claim frequency per insured on the coverage over the entire portfolio of insureds follows a negative binomial distribution with mean 0.2 and variance 0.5.



For each insured, claim frequency follows a Poisson distribution with mean λ.



λ varies by insured according to a gamma distribution. Calculate the variance of λ over the portfolio of insureds.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

12. POISSON/GAMMA

214

12.10. For an insurance coverage, the number of claims per insured follows a negative binomial. You are given the following information on probabilities of claims: Number of claims

Probability

0 1 2

0.8 0.144 0.03888

Claims for each insured follow a Poisson distribution with mean λ. The distribution of λ over the insureds is a gamma distribution. Determine the variance of the gamma distribution. 12.11. The number of claims on an insurance coverage follows a Poisson distribution with mean λ for each insured. The means λ vary by insured and overall follow a gamma distribution. You are given: (i) The probability of 0 claims for a randomly selected insured is 0.04. (ii) The probability of 1 claim for a randomly selected insured is 0.064. (iii) The probability of 2 claims for a randomly selected insured is 0.0768. Determine the variance of the gamma distribution. 12.12. [4B-F96:26] (2 points) You are given the following: •

The probability that a single insured will produce 0 claims during the next exposure period is e −θ .



θ varies by insured and follows a distribution with density function f ( θ )  36θe −6θ

0 < θ < ∞.

Determine the probability that a randomly selected insured will produce 0 claims during the next exposure period. (A) (B) (C) (D) (E)

Less than 0.72 At least 0.72, but less than 0.77 At least 0.77, but less than 0.82 At least 0.82, but less than 0.87 At least 0.87

12.13. You are given: (i)

The number of claims on an auto comprehensive policy follows a geometric distribution with mean 0.6. (ii) The number of claims for each insured follows a Poisson distribution with parameter λ. (iii) λ varies by insured according to a gamma distribution. Calculate the proportion of insureds for which the expected number of claims per year is less than 1.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 12

215

12.14. [1999 C3 Sample:12] The annual number of accidents for an individual driver has a Poisson distribution with mean λ. The Poisson means, λ, of a heterogeneous population of drivers have a gamma distribution with mean 0.1 and variance 0.01. Calculate the probability that a driver selected at random from the population will have 2 or more accidents in one year. (A) 1/121

(B) 1/110

(C) 1/100

(D) 1/90

(E) 1/81

12.15. [3-S00:4] You are given: (i) The claim count N has a Poisson distribution with mean Λ. (ii) Λ has a gamma distribution with mean 1 and variance 2. Calculate the probability that N  1. (A) 0.19

(B) 0.24

(C) 0.31

(D) 0.34

(E) 0.37

12.16. [3-S01:3] Glen is practicing his simulation skills. He generates 1000 values of the random variable X as follows: (i)

He generates the observed value λ from the gamma distribution with α  2 and θ  1 (hence with mean 2 and variance 2). (ii) He then generates x from the Poisson distribution with mean λ. (iii) He repeats the process 999 more times: first generating a value λ, then generating x from the Poisson distribution with mean λ. (iv) The repetitions are mutually independent. Calculate the expected number of times that his simulated value of X is 3. (A) 75

(B) 100

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 125

(D) 150

(E) 175

Exercises continue on the next page . . .

12. POISSON/GAMMA

216

12.17. [CAS3-F03:15] Two actuaries are simulating the number of automobile claims for a book of business. For the population they are studying: (i) The claim frequency for each individual driver has a Poisson distribution. (ii) The means of the Poisson distributions are distributed as a random variable, Λ. (iii) Λ has a gamma distribution. In the first actuary’s simulation, a driver is selected and one year’s experience is generated. This process of selecting a driver and simulating one year is repeated N times. In the second actuary’s simulation, a driver is selected and N years of experience are generated for that driver. Which of the following is/are true? I.

The ratio of the number of claims the first actuary simulates to the number of claims the second actuary simulates should tend towards 1 as N tends to infinity.

II.

The ratio of the number of claims the first actuary simulates to the number of claims the second actuary simulates will equal 1, provided that the same uniform random numbers are used.

III.

When the variances of the two sequences of claim counts are compared the first actuary’s sequence will have a smaller variance because more random numbers are used in computing it.

(A) (B) (C) (D) (E)

I only I and II only I and III only II and III only None of I, II, or III is true

12.18. [3-F02:5] Actuaries have modeled auto windshield claim frequencies. They have concluded that the number of windshield claims filed per year per driver follows the Poisson distribution with parameter λ, where λ follows the gamma distribution with mean 3 and variance 3. Calculate the probability that a driver selected at random will file no more than 1 windshield claim next year. (A) 0.15

(B) 0.19

(C) 0.20

(D) 0.24

(E) 0.31

12.19. The number of customers arriving in a store in an hour follows a Poisson distribution with mean λ. The parameter λ varies by day, and has a Weibull distribution with parameters τ  1, θ  8.5. Determine the probability of 8 or more customers arriving in an hour. Additional released exam questions: CAS3-S05:10

Solutions 12.1. Since αθ  2 and αθ 2  4, β  θ  2 and r  α  1. Then the probability of finding a coin in one minute (nothing special about the sixth minute) is p1 

C/4 Study Manual—17th edition Copyright ©2014 ASM

rβ β 2    0.2222 2 r+1 9 (1 + β ) 1! (1 + β )

(A)

EXERCISE SOLUTIONS FOR LESSON 12

217

12.2. An exponential distribution is a gamma distribution with α  1. It follows that the corresponding negative binomial distribution is a geometric distribution (r  1) with β  θ  2. Using formula (11.1), Pr ( N ≥ 2) 

β 1+β

!2 

2 3

!2

4 . 9



12.3. Since λ only varies by day, it will not vary over the 2 minute period. Thus conditional on λ, the customer count over 2 minutes is Poisson with mean µ  2λ. Since λ is exponential with mean 2 and exponentials are scale families, µ is exponential with mean 4, and the exponential mixture has a geometric distribution with parameter β  4. Using formula (11.1), β 1+β

Pr ( N ≥ 4) 

!4 

4 5

!4  0.4096

12.4. The number of customers arriving in one minute has a geometric distribution with β  2, as derived two exercises ago. Each minute is independent, and the sum of two of these geometric random variables is a negative binomial with parameters r  2, β  2. (In general, the sum of negative binomial random variables having the same β is obtained by summing the r’s.) The probability of at least 4 is Pr ( N ≥ 4)  1 − p0 − p1 − p 2 − p3

!2

!2

!2

1 2 1 2 3 1 2 1− − − 3 1 3 3 2 3 3 1 4 4 32 1− − − −  0.4609 9 27 27 243

12.5.

!

!

!

!2

4 − 3

!

1 3

!2

2 3

!3

From the variance of the gamma, we have αθ 2  0.2, so rβ 2  0.2

(*)

We want the variance of the negative binomial to equal 0.5, or rβ (1 + β )  0.5  rβ 2 + rβ

(**)

Combining these two equations, rβ  0.3. Plugging this into (*), we get β  2/3. Then from (**), r (1 + β )  0.5/ (2/3)  0.75 . (B) 12.6. We don’t have to take the square root of the variance; we can just compare the variances. The variance of the gamma distribution is αθ2  rβ 2 . The variances are: Class 1 0.002449

Class 2 0.015556

Class 3 0.001111

Class 4 0.006842

Class 3 is lowest. (C) 12.7. As discussed in Example 12A, the variance of the gamma is the overall variance minus the overall mean (equation (12.1)), or 0.4 − 0.2  0.2 . (A) 12.8. As discussed in Example 12A, the variance of the gamma is the overall variance minus the overall mean (equation (12.1)), or 0.15 − 0.10  0.05 . (A) 12.9.

By formula (12.1), the answer is 0.5 − 0.2  0.3 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

12. POISSON/GAMMA

218

12.10. Dividing 0.144 by 0.8 and 0.03888 by 0.144, we have 0.18  a + b b 0.27  a + 2 b  −0.18 a  0.36 

β 1+β

0.36  0.64β β  0.5625  θ r  0.5  α The variance is 0.5 (0.56252 )  0.158203125 . 12.11. First we calculate a and b. 0.0768 b  1.2  a + 0.064 2 0.064  1.6  a + b 0.04 So a  0.8, b  0.8. We now calculate r and β of the negative binomial distribution. β  0.8 ⇒β4 1+β ! β b  ( r − 1)  0.8⇒ r  2 1+β a

The parameter α of the gamma distribution is r of the negative binomial distribution and the parameter θ of the gamma distribution is β of the negative binomial distribution. So α  2, θ  4, and the variance of the gamma is αθ 2  2 (42 )  32 . 12.12. Although we are not given that the model is Poisson, it could be Poisson with parameter θ, so we might as well cheat and assume it. The mixing density f ( θ ) is a gamma with α  2 and θ  1/6, so r  2 and β  1/6 for the negative binomial, and 1 p0  1+β

!r

6  7

!2 

36  0.7347 49

(B)

If we didn’t want to cheat, we would have to calculate the probability directly: Pr ( X  0) 



Z 0

 36

e −θ 36θe −6θ dθ ∞

Z 0

θe −7θ dθ

We recognize the integral as a gamma with parameters α  2 and β  1/7. The constant to make it integrate to 1 is 72 /Γ (2)  49. So we have 36 Pr ( X  0)  49

C/4 Study Manual—17th edition Copyright ©2014 ASM



Z 0

1 Γ (2)

θe 1 2 7

−7θ

dθ 

36  0.7347 49

EXERCISE SOLUTIONS FOR LESSON 12

219

12.13. r  1  α and β  0.6  θ. The parameter λ therefore has an exponential distribution with mean 0.6. The probability that λ is below 1 is F (1)  1 − e −1/0.6  0.8111 .

12.14. Since for the gamma distribution αθ  0.1 and αθ 2  0.01, it follows that θ  0.1 and α  1. Then for the corresponding negative binomial r  1 and β  0.1. It’s a geometric distribution. By formula (11.1), the probability of at least 2 accidents is β Pr ( N ≥ 2)  1+β

!2

0.1  1.1

!2

1 121



(A)

12.15. Since for the gamma distribution αθ  1 and αθ 2  2, it follows that β  θ  2 and r  α  1/2 for the negative binomial distribution. Then 0.5 Pr ( N  1)  1

!

1 1+β

! 0.5

β 1  0.5 1+β 3

!

! 0.5

2  0.1925 3

!

(A)

12.16. The resulting distribution of the simulated values is negative binomial with r  2, β  1. Then 4 3

Pr ( X  3) 

!

1 2

!2

1 2

!3 

1 8

Hence the expected number of 3’s in 1000 times is 1000/8  125 . (C) 12.17. The first actuary is simulating a negative binomial distribution with a mean of E[Λ]. The second actuary is simulating a Poisson distribution with a random mean, namely whatever the mean of the selected driver is. That mean is not necessarily the same as E[Λ] (after all, it is selected randomly), so I is false. Statement II is also false; in fact, it’s hard to say what “the same uniform random numbers are used” means, since the first actuary uses almost twice as many numbers to generate the same number of years. The variances of the random sequences are random numbers, so there’s no way to make a statement like III in general. However, to the extent a statement can be made, one would expect the variance of the first model to be larger, because the negative binomial variance includes both the variance of the driver and the variance of the parameter Λ; as the conditional variance theorem states, the variance of the first model is E[Λ] + Var (Λ) , whereas the variance of the second model is just Var (Λ) . (E) 12.18. Since αθ  3 and αθ2  3, it follows that β  θ  1 and r  α  3. Then 1 p0  1+1 p1  3 p0 + p1 

1 2

5 16

!3

!3 

1 8

1 3  2 16

!

(E)

12.19. A Weibull with τ  1 is an exponential, but an exponential is also a gamma distribution with α  1. Accordingly, the exponential mixture of Poissons is negative binomial with r  1, and in this case β  8.5, or in other words a geometric distribution with β  8.5. The probability of 8 or more is then, by formula (11.1), !8 !8 β 8.5 Pr ( N ≥ 8)    0.4107 1+β 9.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

12. POISSON/GAMMA

220

Quiz Solutions 12-1.

The negative binomial parameters are r  0.2 and β  20, so the probability of 2 applications is1 1.2 p2  2

!

1 1 + 20

! 0.2

20 1 + 20

!2

 0.12 (0.543946)(0.907029)  0.0592

1See page 187 for a discussion on how to evaluate a binomial coefficient in which the numerator is not an integer. C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 13

Frequency Distributions—Exposure and Coverage Modifications Reading: Loss Models Fourth Edition 8.6

13.1

Exposure modifications

Exposure modifications are off the syllabus. But they are easy enough to handle, and you should be able to distinguish between them and coverage modifications, so I will discuss them briefly. Exposure can refer to the number of members of the insured group or the number of time units (e.g., years) they are insured for. Doubling the size of the group or the period of coverage will double the number of claims. If the number of exposures change, how can we adapt our frequency model? Suppose the model is based on n1 exposures, and you now want a model for n2 exposures. If the original model is Poisson with parameter λ, the new model is Poisson with parameter λn2 /n1 . If the original model is negative binomial, the new model is negative binomial with the first parameter, r, multiplied by n2 /n1 . If the original model is binomial, you are supposed to multiply the first parameter, m, by n 2 /n1 , but this is only acceptable if the revised m is still an integer. Otherwise, the binomial model cannot be used with the new exposures.

13.2

Coverage modifications

The topic of coverage modifications is on the syllabus, and exams frequently have questions about this topic. The most common type of coverage modification is changing the deductible, so that the number of claims for amounts greater than zero changes. Another example would be uniform inflation, which would affect frequency if there’s a deductible. If you need to calculate aggregate losses, and don’t care about payment frequency, one way to handle a coverage modification is to model the number of losses (rather than the number of paid claims), in which case no modification is needed for the frequency model. Instead, use the payment per loss random variable, Y L . This variable is adjusted, and will be zero with positive probability. For distributions of the ( a, b, 0) and ( a, b, 1) classes, however, an alternative method is available. The frequency distribution is modified. The modified frequency—the frequency of positive claims—has the same form as the original frequency, but with different parameters. Suppose the probability of paying a claim, i.e., severity being greater than the deductible, is v. If the model for loss frequency is Poisson with parameter λ, the new parameter for frequency of paid claims is vλ. (Having only one parameter makes things simple.) If the original model is negative binomial, the modified frequency is negative binomial with the same parameter r but with β multiplied by v. Notice how this contrasts with exposure modification. If the original model is binomial, once again the modified frequency is binomial with the same parameter m but with q multiplied by v. This time, since q doesn’t have to be an integer, the binomial model can always be used with the new severity distribution. If you need to calculate aggregate losses, C/4 Study Manual—17th edition Copyright ©2014 ASM

221

13. FREQUENCY— EXPOSURE & COVERAGE MODIFICATIONS

222

the modified frequency is used in conjunction with the modified payment per payment random variable, YP . Example 13A Losses on a major medical coverage on a group of 50 follow a Pareto distribution with parameters α  1 and θ  1000. The number of claims submitted by the group in a year has a negative binomial distribution with mean 16 and variance 20. The group then expands to 60, and a deductible of 250 per loss is imposed. Calculate the variance of claim frequency for the group for the revised coverage. Answer: The original negative binomial parameters are β  0.25 and r  64. The exposure modification 60/50 sends r to r 0  64 (60/50)  76.8. The probability that a loss will be greater than 250 is 1 − FX (250) 

1000  0.8. 1250

The deductible sends β to β0  0.25 (0.8)  0.2. The variance of claim frequency is then 76.8 (0.2)(1.2)  18.432



As the next example demonstrates, it is also possible to adjust claim frequency based on an adjustment to the modification. Example 13B For an auto collision coverage with ordinary deductible 500, ground-up loss amounts follow a Weibull distribution with τ  0.2 and θ  2000. The number of losses follows a geometric distribution. The expected number of claims for non-zero payment amounts per year is 0.3. Calculate the probability of exactly one claim for a non-zero payment amount in one year if the deductible is changed to 1000. Answer: The number of claims over 500 per year is geometric with β  0.3. The number of claims over 1000 per year is also geometric, with a revised β. The revised β can be computed by scaling down the β for a 500 deductible with the ratio of S (1000) /S (500) , where the survival function is for the Weibull distribution. S (500)  e − (500/2000) S (1000)  e

0.2

− (1000/2000) 0.2

Revised β  0.3

 0.468669  0.418721

0.418721  0.268028 0.468669

!

The revised probability of exactly one claim in a year is p1 

?

1 1.268028

!

0.268028  0.1667 1.268028

!

Quiz 13-1 For an insurance coverage, you are given: (i) Losses in 2009 follow an inverse exponential distribution with θ  1000. (ii) Losses are subject to a 500 deductible. (iii) The annual number of losses follows a binomial distribution with m  5. (iv) The expected number of paid claims in 2009 is 0.5. (v) Losses in 2010 are inflated by 10% over losses in 2009. Calculate the variance of the number of paid claims in 2010. C/4 Study Manual—17th edition Copyright ©2014 ASM


Let's discuss the (a,b,1) class now. Please note that in the following discussion, the word "modified" is used in the sense of an (a,b,1) class zero-modified distribution with p_k^M; we will say "revised" when referring to the severity modification.

The same parameter that gets multiplied by v in the (a,b,0) class gets multiplied by v in the (a,b,1) class. The balancing item is then p_0^M = 1 − Σ_{k=1}^∞ p_k. The textbook (Table 8.3) gives formulas for directly calculating p_0^{M*} in all cases. Rather than memorizing that table, use the following formula:

   1 − p_0^{M*} = (1 − p_0^M) (1 − p_0*)/(1 − p_0)    (13.1)

where asterisks indicate distributions with revised parameters. In other words, Pr(N > 0) in the revised distribution equals the original Pr(N > 0) times the ratio of the Pr(N > 0)'s for the corresponding unmodified (a,b,0) class distributions. This formula works even when the unmodified distribution is improper (so that unmodified probabilities are negative or greater than 1), as in the ETNB family. For the logarithmic distribution, use the following:

   1 − p_0^{M*} = (1 − p_0^M) ln(1 + vβ)/ln(1 + β)    (13.2)

The use of formula (13.1) for an improper distribution is illustrated in the following example:

Example 13C Frequency of claims per year follows a zero-modified negative binomial distribution with r = −0.5, β = 1, and p_0^M = 0.7. Claim size follows a Pareto with α = 1, θ = 1000, and is independent of claim frequency. A deductible of 500 is imposed.

Calculate the probability of no claims payments in a year.

Answer: The probability of a payment given a claim is the Pareto survival function at 500:

   S(500) = θ/(θ + 500) = 1000/1500 = 2/3

The revised negative binomial parameters are r* = −0.5, β* = 2/3. By equation (13.1),

   1 − p_0^M = 0.3
   1 − p_0 = 1 − (1/(1 + β))^r = 1 − (1/2)^{−0.5} = −0.4142
   1 − p_0* = 1 − (1/(5/3))^{−0.5} = −0.2910
   1 − p_0^{M*} = 0.3(−0.2910/−0.4142) = 0.2108
   p_0^{M*} = 1 − 0.2108 = 0.7892
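Equation (13.1) is easy to script. The following sketch (the helper name is mine, not the manual's) reproduces Example 13C, including the improper intermediate "probabilities":

```python
def adjust_zero_modified(p0M, p0, p0_star):
    """1 - p0^{M*} = (1 - p0^M)(1 - p0*)/(1 - p0); valid even when the
    unmodified values are improper, as for the ETNB."""
    return 1 - (1 - p0M) * (1 - p0_star) / (1 - p0)

# Example 13C: ETNB with r = -0.5, beta = 1, p0M = 0.7, v = 2/3
r, beta, v = -0.5, 1.0, 2 / 3
p0 = (1 / (1 + beta)) ** r            # = 2^0.5 > 1, improper
p0_star = (1 / (1 + v * beta)) ** r   # = (5/3)^0.5 > 1, improper
print(round(adjust_zero_modified(0.7, p0, p0_star), 4))   # 0.7892
```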


Exercises

13.1. The losses on an auto comprehensive coverage follow a Pareto distribution with parameters α = 2 and θ = 1000. The number of losses follows a Bernoulli distribution with an average of 0.2 losses per year. Loss sizes are affected by 10% inflation. A 250 deductible is imposed.

Calculate the variance of the frequency of paid losses after inflation and the deductible.


Table 13.1: Formula summary—Exposure and coverage modifications

                       Original Parameters   Exposure Modification   Coverage Modification
                       Exposure n1           Exposure n2             Exposure n1
   Model               Pr(X > 0) = 1         Pr(X > 0) = 1           Pr(X > 0) = v
   Poisson             λ                     (n2/n1)λ                vλ
   Binomial (a)        m, q                  (n2/n1)m, q             m, vq
   Negative binomial   r, β                  (n2/n1)r, β             r, vβ

   (a) (n2/n1)m must be an integer for the exposure modification formula to work.

These adjustments work for (a,b,1) distributions as well as (a,b,0) distributions. For (a,b,1) distributions, p_0^M is adjusted as follows:

   1 − p_0^{M*} = (1 − p_0^M) (1 − p_0*)/(1 − p_0)    (13.1)

13.2. The losses on an auto comprehensive coverage follow a Pareto distribution with parameters α = 2 and θ = 1000. The number of losses follows a geometric distribution with an average of 0.2 losses per year. Loss sizes are affected by 10% inflation. A 250 deductible is imposed.

Calculate the variance of the frequency of paid losses after inflation and the deductible.

13.3. Aggregate claim frequency for an employee dental coverage covering 10 individuals follows a negative binomial distribution with mean 2 and variance 5. Loss size follows an exponential distribution with mean 500. The group expands to 20 individuals and a deductible of 100 is imposed.

Calculate the probability of 2 or more paid claims from the group after these revisions.

13.4. The number of losses for the insurer follows a Poisson distribution with parameter λ. λ varies by year according to a gamma distribution with parameters α = 6, θ = 1/3. Loss sizes are independent of the number of losses and are lognormally distributed with parameters µ = 10, σ = 2. A reinsurance agreement provides that the reinsurer reimburses the insurer for the excess of each loss over 1,000,000.

Determine the probability that the reinsurer will pay exactly one loss in a year.

13.5. The number of students taking an actuarial exam has a negative binomial distribution with parameters r = 10, β = 1.5. Each student has a 0.4 probability of passing.

Determine the probability that 3 or more students pass.

13.6. You are given the following information regarding windstorms:
(i) The frequency of windstorms causing more than 1,000,000 damage before inflation follows a negative binomial distribution with mean 0.2 and variance 0.4.
(ii) Uniform inflation of 5% affects the amount of damage.
(iii) The severity of windstorms before inflation follows an exponential distribution with mean 1,000,000.
Calculate the probability that there will be exactly 1 windstorm in a year causing more than 2,000,000 damage after one year's inflation.


13.7. [SOA3-F03:19] Aggregate losses for a portfolio of policies are modeled as follows:
(i) The number of losses before any coverage modifications follows a Poisson distribution with mean λ.
(ii) The severity of each loss before any coverage modifications is uniformly distributed between 0 and b.
The insurer would like to model the impact of imposing an ordinary deductible, d (0 < d < b), on each loss and reimbursing only a percentage, c (0 < c ≤ 1), of each loss in excess of the deductible. It is assumed that the coverage modifications will not affect the loss distribution. The insurer models its claims with modified frequency and severity distributions. The modified claim amount is uniformly distributed on the interval [0, c(b − d)].

Determine the mean of the modified frequency distribution.

(A) λ   (B) λc   (C) λ(b/d)   (D) λ(b − d)/b   (E) λc(b − d)/b

13.8. [CAS3-F04:17] You are given:
• Claims are reported at a Poisson rate of 5 per year.
• The probability that a claim will settle for less than $100,000 is 0.9.
What is the probability that no claim of $100,000 or more is reported during the next 3 years?

(A) 20.59%   (B) 22.31%   (C) 59.06%   (D) 60.63%   (E) 74.08%

13.9. [CAS3-S04:17] Payfast Auto insures sub-standard drivers.
• Each driver has the same non-zero probability of having an accident.
• Each accident does damage that is exponentially distributed with θ = 200.
• There is a $100 per accident deductible and insureds only report claims that are larger than the deductible.
• Next year each individual accident will cost 20% more.
• Next year Payfast will insure 10% more drivers.
Determine the percentage increase in the number of reported claims next year.

(A) Less than 15%
(B) At least 15%, but less than 20%
(C) At least 20%, but less than 25%
(D) At least 25%, but less than 30%
(E) At least 30%


13.10. [SOA3-F04:8] For a tyrannosaur with a taste for scientists:
(i) The number of scientists eaten has a binomial distribution with q = 0.6 and m = 8.
(ii) The number of calories of a scientist is uniformly distributed on (7000, 9000).
(iii) The numbers of calories of scientists eaten are independent, and are independent of the number of scientists eaten.
Calculate the probability that two or more scientists are eaten and exactly two of those eaten have at least 8000 calories each.

(A) 0.23   (B) 0.25   (C) 0.27   (D) 0.30   (E) 0.35

13.11. An insurance coverage is subject to an ordinary deductible of 500. You are given:
(i) The number of losses above the deductible follows a binomial distribution with m = 10, q = 0.05.
(ii) Payment sizes follow a paralogistic distribution with α = 3 and θ = 800.
The deductible is raised to 1000. Calculate the probability of making exactly one payment with the revised deductible.

13.12. You are given:
(i) An insurance coverage is subject to an ordinary deductible of 500.
(ii) The number of losses per year above the deductible follows a negative binomial distribution with r = 2, β = 0.8.
(iii) The severity of each loss before coverage modifications is uniform on (0, 2500].
The deductible is raised to x. The probability of zero claims under the new deductible is 0.390625. Determine x.

13.13. An insurance coverage is subject to an ordinary deductible of 100 during 2010. You are given:
(i) The annual number of losses above 100 in 2010 follows a negative binomial distribution with r = 1.5, β = 1.
(ii) Ground up loss sizes in 2010 follow a lognormal distribution with µ = 4, σ = 2.
(iii) Losses are subject to 10% uniform inflation in 2011.
(iv) The deductible is raised to 150 in 2011.
Calculate the probability of 3 or more nonzero payments on this insurance coverage in 2011.

13.14. You are given:
(i) An automobile collision coverage has a 500 deductible.
(ii) The number of paid claims follows a negative binomial distribution with parameters r = 0.5 and β = 0.4.
(iii) Ground-up loss sizes follow a single-parameter Pareto distribution with parameters θ = 100 and α = 1.
Determine the deductible needed in order to reduce the variance of paid claim counts to 0.2.


13.15. [151-82-92:9] N2 is the number of claims of size 2 in a compound negative binomial distribution in which:
(i) The primary distribution is negative binomial with parameters r = 5 and β = 1/2.
(ii) The claim size distribution is:

   x    Pr(x)
   1    1/2
   2    1/4
   3    1/8
   4    1/8

Determine Var(N2).

(A) 5/64   (B) 5/16   (C) 5/8   (D) 45/64   (E) 15/4

13.16. You are given:
(i) The number of losses follows a zero-modified Poisson distribution with λ = 0.2 and p_0^M = 0.4.
(ii) Loss size follows a single-parameter Pareto distribution with θ = 100, α = 0.5.
(iii) An insurance coverage has a deductible of 250.
Calculate the probability of one paid claim.

13.17. You are given:
(i) The number of losses follows a zero-modified negative binomial distribution with r = 2, β = 0.5, and p_0^M = 0.25.
(ii) Loss size follows a loglogistic distribution with γ = 0.2, θ = 1000.
(iii) An insurance coverage has a deductible of 300.
Calculate the probability of no paid claims.

13.18. For an insurance coverage:
(i) Each loss is subject to a 250 deductible.
(ii) The annual number of nonzero payments follows a zero-modified geometric distribution with β = 4.
(iii) The average number of nonzero payments per year is 2.
(iv) Ground-up severity follows an inverse exponential distribution with θ = 100.
The deductible is raised to 300. Calculate the probability of 3 or more nonzero payments with the revised deductible.


13.19. Losses follow an inverse Pareto distribution with τ = 2 and θ = 300. The number of nonzero payments N on a coverage with ordinary deductible of 200 has the following distribution:

   n    Pr(N = n)
   0    0.60
   1    0.30
   2    0.05
   3    0.05

The deductible is raised to 500. Calculate the probability of exactly one nonzero payment on the coverage with the revised deductible.

13.20. Losses follow a Weibull distribution with τ = 0.5 and θ = 1000. The number of nonzero payments N on a coverage with ordinary deductible 100 follows a negative binomial distribution with r = 3 and β = 0.1. The deductible is increased to 200.

Calculate the probability of exactly one nonzero payment on the coverage with the revised deductible.

13.21. Losses follow a uniform distribution on (0, 1000). The number of losses greater than 200 follows a binomial distribution with m = 10, q = 0.1.

Calculate the variance of the number of losses above 400.

13.22. Losses follow a uniform distribution on (0, 1000). The number of losses greater than 200 follows a zero-modified logarithmic distribution with p_0^M = 0.6, β = 2.

Calculate the probability that the number of losses greater than 500 is 0.

Additional released exam questions: CAS3-F05:24, CAS3-S06:32, CAS3-F06:24,31, C-S07:39

Solutions

13.1. The inflated Pareto has parameters α = 2 and θ = 1000(1.1) = 1100. S(250) = (1100/1350)² = 0.6639. The revised Bernoulli parameter is then q = (0.6639)(0.2) = 0.1328. The variance is (0.1328)(1 − 0.1328) = 0.1151.

13.2. The Pareto parameters are the same as in the previous exercise. We calculated there that S(250) = 0.6639. Therefore, the revised geometric parameter β is 0.2(0.6639) = 0.1328. The variance of the geometric is β(1 + β) = (0.1328)(1.1328) = 0.1504.

13.3. The original parameters of the negative binomial are r = 4/3, β = 3/2. The probability that a loss is greater than 100 is S(100) = e^{−100/500} = 0.8187. Doubling the size of the group doubles r so that the new r is 8/3. Eliminating claims with the deductible multiplies β by the probability that a loss will be greater than 100, so the new β is (3/2)(0.8187) = 1.2281. Using these new parameters, and letting N be the number of claims for the group,

   Pr(N = 0) = (1/2.2281)^{8/3} = 0.11807
   Pr(N = 1) = (8/3)(1/2.2281)^{8/3}(1.2281/2.2281) = 0.17356
   Pr(N ≥ 2) = 1 − 0.11807 − 0.17356 = 0.7084
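These negative binomial probabilities are easy to double-check with a short script. This is my own sketch, using the standard (r, β) parametrization and the gamma-function form of the binomial coefficient:

```python
from math import exp, gamma

r, beta = 8 / 3, 1.5 * exp(-100 / 500)   # modified parameters from 13.3

def nb_pmf(k, r, beta):
    coef = gamma(r + k) / (gamma(r) * gamma(k + 1))
    return coef * (1 / (1 + beta)) ** r * (beta / (1 + beta)) ** k

p0, p1 = nb_pmf(0, r, beta), nb_pmf(1, r, beta)
print(round(1 - p0 - p1, 4))   # 0.7082, matching 0.7084 up to rounding
```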

13.4. The probability that a loss will be greater than 1,000,000 is

   1 − Φ((ln 1,000,000 − 10)/2) = 1 − Φ((13.8155 − 10)/2) = 1 − Φ(1.91) = 1 − 0.9719 = 0.0281

The parameters of the negative binomial distribution for the number of all losses are r = 6, β = 1/3. For losses above 1,000,000, the revised parameters are r = 6, β = (1/3)(0.0281) = 0.0093667. We then have

   p1 = 6(1/1.0093667)⁶(0.0093667/1.0093667) = 0.0526

13.5. The modified negative binomial distribution has r = 10 and β = 1.5(0.4) = 0.6. The probabilities of 0, 1, and 2 students passing, p0, p1, p2, are

   p0 = (1/1.6)^10 = 0.009095
   p1 = (10 choose 1)(1/1.6)^10 (0.6/1.6) = 0.034106
   p2 = (11 choose 2)(1/1.6)^10 (0.6/1.6)² = 0.070344

The probability that 3 or more students pass is 1 − 0.009095 − 0.034106 − 0.070344 = 0.886455.

13.6. The inflated variable has θ = 1,050,000. Also, the frequency of windstorms causing inflated damage of 1,050,000 or more is a negative binomial with mean 0.2 and variance 0.4, which translates into r = 0.2 and β = 1. The probability of a windstorm greater than 2,000,000 divided by the probability of a windstorm greater than 1,050,000 is S(2,000,000)/S(1,050,000) = e^{−2/1.05}/e^{−1.05/1.05} = e^{−19/21}. So the revised β is e^{−19/21}. The probability of exactly 1 windstorm is

   p1 = rβ/(1 + β)^{r+1} = 0.2e^{−19/21}/(1 + e^{−19/21})^{1.2} = 0.05383

13.7. Coinsurance (c) does not affect claim frequency, but the deductible does. Only (b − d)/b of losses will result in claims. So the answer is (D).

13.8. The modified Poisson parameter for claims of $100,000 or more is 0.1(5) = 0.5. The probability of no claims over $100,000 in 1 year is e^{−0.5}, and the probability for three years is e^{−1.5} = 0.2231. (B)

13.9. For each driver, the expected number of reported accidents is proportional to S(100) = e^{−100/200}. Next year, it will be proportional to e^{−100/240}, and there will be 1.1 times as many drivers, so the ratio is

   1.1e^{−100/240}/e^{−100/200} = 1.1956

There will be a 19.56% increase. (B)


13.10. The probability that a scientist has 8000 calories or more is 1/2, so the number of such scientists eaten is a binomial distribution with modified q = 0.3. Then

   (8 choose 2)(0.3²)(0.7⁶) = 0.296475   (D)

If exactly two such scientists are eaten, it is automatically true that at least two scientists were eaten, so the fact that at least two scientists are eaten does not add any additional condition.

13.11. Since both frequency and severity pertain to payments rather than to losses, this situation is equivalent to an insurance coverage having no deductible originally, having the same frequency and severity distributions for ground-up losses, and then imposing a 500 deductible. The probability that a payment is greater than 500 under the original 500 deductible is (see the distribution tables for the definition of u)

   S(500) = u^α = (1/(1 + (500/θ)^α))^α = (1/(1 + (500/800)³))³ = 0.803768³ = 0.519268

Therefore, the revised frequency for a 1000 deductible has m = 10 and q = 0.05(0.519268) = 0.025963. The revised probability of one payment is

   Pr(N = 1) = (10 choose 1)(0.025963)(1 − 0.025963)⁹ = 0.2049

13.12. Let primes indicate values modified for a deductible of x. Then

   p0′ = (1/(1 + β′))² = 0.390625
   1/(1 + β′) = 0.625
   β′ = 1/0.625 − 1 = 0.6

Since β′/β = 0.6/0.8 = 0.75, the probability that a loss is above x is 0.75 times the probability that it is over 500. Under the uniform distribution, S(x) = 1 − x/2500, so S(500) = 0.8 and S(x) = 0.6. We conclude that x = 1000.

13.13. Let's calculate the probability that a loss is above 100 in 2010 and the probability that a loss is above 150 in 2011. For the latter, remember that to scale a lognormal distribution, you add ln(1 + r) to µ, as mentioned on page 29. In the following, primes indicate inflated variables.

   Pr(X > 100) = 1 − Φ((ln 100 − 4)/2) = 1 − Φ(0.30) = 0.3821
   Pr(X′ > 150) = 1 − Φ((ln 150 − ln 1.1 − 4)/2) = 1 − Φ(0.46) = 0.3228

The relative probability is 0.3228/0.3821 = 0.844805. We multiply β by this number, resulting in β′ = 0.844805. Then the probabilities of 0, 1, and 2 are

   p0 = (1/1.844805)^{1.5} = 0.399093


   p1 = 1.5(1/1.844805)^{1.5}(0.844805/1.844805) = 0.274139
   p2 = ((1.5)(2.5)/2)(1/1.844805)^{1.5}(0.844805/1.844805)² = 0.156923

The probability of 3 or more nonzero payments is 1 − 0.399093 − 0.274139 − 0.156923 = 0.1698.

13.14. We want r(vβ)(1 + vβ) = 0.2. Let's solve for v.

   0.5(0.4v)(1 + 0.4v) = 0.2
   0.4v(1 + 0.4v) = 0.4
   v(1 + 0.4v) = 1
   0.4v² + v − 1 = 0
   v = (−1 + √(1 + 1.6))/0.8 = 0.765564

Now we calculate the deductible such that the probability of a claim is 0.765564 times the current probability.

   v = (100/d)/(100/500) = 500/d
   d = 500/v = 500/0.765564 = 653.11

13.15. Counting only claims of size 2 is a severity modification. Thus we multiply β by p2 = 1/4 to obtain β′ = (1/2)(1/4) = 1/8. The variance of the modified negative binomial distribution is

   rβ′(1 + β′) = (5)(1/8)(9/8) = 45/64   (D)

13.16. The probability of a loss greater than 250 is (100/250)^{0.5} = 0.632456. The modified parameter of the Poisson is λ* = 0.632456(0.2) = 0.126491. The modified value of the probability of at least one payment is

   1 − p_0^{M*} = (1 − p_0^M)(1 − p_0*)/(1 − p_0) = 0.6(1 − e^{−0.126491})/(1 − e^{−0.2}) = 0.393287

The probability of one claim is

   Pr(N = 1) = 0.393287 p_1^{T*} = 0.393287(λ*/(e^{λ*} − 1)) = 0.393287(0.126491/(e^{0.126491} − 1)) = 0.3689

!

13.17. The probability of a loss greater than 300 is Pr ( X > 300)  1 −

(300/1000) 0.2  0.559909 1 + (300/1000) 0.2

so the revised β is β∗  0.5 (0.559909)  0.279955. Thus the revised p0M is calculated from the following: p0  C/4 Study Manual—17th edition Copyright ©2014 ASM

1 1.5

!2

 0.444444

13. FREQUENCY— EXPOSURE & COVERAGE MODIFICATIONS

232

p0∗  1−

p 0M∗

1 1.279955

!2

 0.610395

1 − 0.610395  (1 − 0.25)  0.525967 1 − 0.444444

!

p 0M∗  1 − 0.525967  0.474033

13.18. Let's calculate the probabilities that a loss is above 250 and 300.

   Pr(X > 250) = 1 − e^{−100/250} = 0.329680
   Pr(X > 300) = 1 − e^{−100/300} = 0.283469

The revised β is β* = 4(0.283469/0.329680) = 3.439320. The mean of a zero-truncated geometric with β = 4 is β + 1 = 5. The mean of our geometric is 2, which is 1 − p_0^M times the mean of a zero-truncated geometric. It follows that 1 − p_0^M = 0.4. By formula (13.1), the revised probability that the frequency is nonzero is calculated as follows:

   p0 = 1/5 = 0.2
   p0* = 1/(1 + 3.439320) = 0.225260
   1 − p_0^{M*} = 0.4(1 − 0.225260)/(1 − 0.2) = 0.387370

For a geometric distribution, Pr(N ≥ n) = (β/(1 + β))^n (see formula (11.1)). A zero-truncated geometric distribution is a geometric distribution shifted 1, so for zero-truncated N^T, and n ≥ 1, Pr(N^T ≥ n) = (β/(1 + β))^{n−1}. For a zero-modified distribution, these probabilities are multiplied by 1 − p_0^M. Therefore, in our case, the probability of 3 or more nonzero payments is

   Pr(N^{M*} ≥ 3) = (1 − p_0^{M*})(β*/(1 + β*))² = (0.387370)(3.439320/4.439320)² = 0.2325

13.19. The probability of one loss above 500 is the sum of the probabilities of:
1. 1 loss above 200 which is also above 500.
2. 2 losses above 200, one below 500 and one above 500.
3. 3 losses above 200, two below 500 and one above 500.

Using the tables for an inverse Pareto, the probability that a loss is greater than 500 given that it is greater than 200 is

   Pr(X > 500)/Pr(X > 200) = (1 − (500/800)²)/(1 − (200/500)²) = 0.609375/0.84 = 0.725446

Now we'll evaluate the probabilities of the three cases enumerated above. Let N1 be the number of losses above 200 and N2 the number of losses above 500.

   Pr(N1 = 1 & N2 = 1) = 0.30(0.725446) = 0.217634
   Pr(N1 = 2 & N2 = 1) = 0.05 (2 choose 1)(0.725446)(1 − 0.725446) = 0.019917
   Pr(N1 = 3 & N2 = 1) = 0.05 (3 choose 1)(0.725446)(1 − 0.725446)² = 0.008203

The answer, the sum of the three probabilities, is 0.217634 + 0.019917 + 0.008203 = 0.24575.

13.20. Calculate the relative probability of a loss greater than 200, given that it is greater than 100.

   Pr(X > 200 | X > 100) = S(200)/S(100) = e^{−0.2^{0.5}}/e^{−0.1^{0.5}} = 0.877230

Thus the modified β of the negative binomial is β′ = 0.1(0.877230) = 0.087723. Then the probability of one nonzero payment is

   (r choose 1)(1/(1 + β′))^r (β′/(1 + β′)) = 3(1/1.087723)³(0.087723/1.087723) = 0.18800

13.21. The probability of a loss greater than 400 given that it is greater than 200 is (1 − 0.4)/(1 − 0.2) = 0.75. Therefore the modified q in the binomial is (0.1)(0.75) = 0.075, and the variance of the modified distribution is 10(0.075)(0.925) = 0.69375.

13.22. The probability of a loss greater than 500 given that it is greater than 200 is (1 − 0.5)/(1 − 0.2) = 0.625. Thus the revised β is 2(0.625) = 1.25. For a logarithmic distribution, use formula (13.2):

   1 − p_0^{M*} = (1 − p_0^M) ln(1 + vβ)/ln(1 + β) = 0.4(ln 2.25/ln 3) = 0.295256
   p_0^{M*} = 0.704744

Quiz Solutions

13-1. The probability of a loss over 500 in 2009 is 1 − e^{−1000/500} = 0.864665. In 2010, the new θ = 1.1(1000) = 1100, and the probability of a loss over 500 is 1 − e^{−1100/500} = 0.889197. The original q for the binomial distribution of paid claims was 0.5/5 = 0.1. Therefore, the q for the inflated binomial distribution of paid claims is

   0.1(0.889197/0.864665) = 0.102837

and the variance of the inflated distribution of paid claims is 5(0.102837)(1 − 0.102837) = 0.461308.


Lesson 14

Aggregate Loss Models: Compound Variance

Reading: Loss Models Fourth Edition 9.1–9.3, 9.5, 9.8.1

14.1 Introduction

Aggregate losses are the total losses paid by an insurer for a defined set of insureds in one period, say a year. They are the sum of the individual losses for the year. There are two ways to model aggregate losses.

One way is to consider only the number of claims and the size of each claim. In other words, the size of the group is only relevant to the extent it affects the number of claims. In this model, aggregate losses S can be expressed as:

   S = Σ_{i=1}^N X_i

where N is the number of claims and X_i is the size of each claim. We make the following assumptions:

1. The X_i's are independent identically distributed random variables. In other words, every claim size has the same probability distribution and is independent of any other claim size.
2. The X_i's are independent of N. The claim counts are independent of the claim sizes.

This model is called the collective risk model. S is a compound distribution: a distribution formed by summing up a random number of identical random variables. For a compound distribution, N is called the primary distribution and X is called the secondary distribution.

The alternative is to let n be the number of insureds in the group, and X_i be the aggregate claims of each individual member. We assume:

1. The X_i's are independent, but not necessarily identically distributed random variables. Different insureds could have different distributions of aggregate losses. Typically, Pr(X_i = 0) > 0, since an insured may not submit any claims. This is unlike the collective risk model, where X_i is a claim and therefore not equal to 0.
2. There is no random variable N. Instead, n is a fixed number, the size of the group.

The equation for aggregate losses is then

   S = Σ_{i=1}^n X_i

This model is called the individual risk model. For certain severity distributions, the individual risk model's aggregate losses have a familiar parametric distribution. For example, if the X_i's are all exponential with the same θ, then S has a gamma distribution with parameters n and θ.


Proof of Compound Variance Formula

Condition S on the number of claims N. By equation (4.2), the conditional variance formula,

   Var(S) = E_N[Var_S(S | N)] + Var_N(E_S[S | N])
          = E_N[Var(Σ_{i=1}^N X_i | N)] + Var_N(E[Σ_{i=1}^N X_i | N])

By mutual independence of the X_i, we can calculate the variance of the sum as the sum of the variances. We can always calculate the expected value of the sum as the sum of the expected values. We will drop the subscript on X since the X_i are identically distributed. We will drop the condition on N, since the X_i's are independent of N.

   E_N[Var(Σ_{i=1}^N X_i | N)] + Var_N(E[Σ_{i=1}^N X_i | N]) = E_N[Σ_{i=1}^N Var(X)] + Var_N(Σ_{i=1}^N E[X])
                                                             = E_N[N Var(X)] + Var_N(N E[X])

Since Var(X) and E[X] are constants, they can be factored from the expression. When factoring a constant from variance, it gets squared.

   E_N[N Var(X)] + Var_N(N E[X]) = E[N] Var(X) + Var(N) E[X]²

We're done.
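The proof can also be checked empirically. Below is a minimal Monte Carlo sketch of my own (not from the manual), using a Poisson(3) primary and exponential(200) severity, for which the formula gives E[S] = 600 and Var(S) = λE[X²] = 3(2 · 200²) = 240,000:

```python
import random

random.seed(1)
lam, mean_x = 3.0, 200.0

def poisson_draw(lam):
    # count arrivals of a unit-rate Poisson process before time lam
    n, t = 0, random.expovariate(1.0)
    while t < lam:
        n += 1
        t += random.expovariate(1.0)
    return n

sims = []
for _ in range(200_000):
    n = poisson_draw(lam)
    sims.append(sum(random.expovariate(1 / mean_x) for _ in range(n)))

m = sum(sims) / len(sims)
v = sum((s - m) ** 2 for s in sims) / len(sims)
print(round(m, 1), round(v))   # close to 600 and 240000
```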

Calculating the distribution function for the aggregate distribution in a collective risk model, on the other hand, is difficult. An alternative is to use an approximating distribution. In order to do so, we need the mean and variance of the aggregate loss distribution. We proceed to discuss how to calculate them.

14.2 Compound variance

This topic appears frequently on exams, and it's easy. Assume we have a collective risk model. We assume that aggregate losses have a compound distribution, with frequency being the primary distribution and severity being the secondary distribution. If N is the frequency random variable, X the severity random variable, and S = Σ_{n=1}^N X_n, and the X_n's are identically distributed and independent of each other and of N, then

   E[S] = E[N] E[X]    (14.1)
   Var(S) = E[N] Var(X) + Var(N) E[X]²    (14.2)
   E[(S − E[S])³] = E[N] E[(X − E[X])³] + 3 Var(N) E[X] Var(X) + E[(N − E[N])³] E[X]³    (14.3)
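Formulas (14.1) and (14.2) translate directly into code. The following is a minimal sketch (my own, not the manual's); the function names are assumptions for illustration:

```python
def compound_mean(EN, EX):
    return EN * EX                         # (14.1)

def compound_var(EN, VarN, EX, VarX):
    return EN * VarX + VarN * EX ** 2      # (14.2)

# e.g. Poisson primary with mean 2 (so Var(N) = 2), severity mean 5,
# severity variance 25; this agrees with lambda * E[X^2] = 2(25 + 25):
print(compound_mean(2, 5), compound_var(2, 2, 5, 25))   # 10 100
```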

A proof of equation (14.2) is given in a sidebar. Equation (14.2) is important, so let's repeat it:

Compound Variance Formula

   Var(S) = E[N] Var(X) + Var(N) E[X]²    (14.2)


For a compound Poisson distribution, one where the primary distribution is Poisson with parameter λ, the compound variance formula reduces to

Compound Variance Formula for Poisson Primary

   Var(S) = λ E[X²]    (14.4)

Equation (14.3), which enables calculating skewness, is unlikely to be required on an exam, especially since the formulas for the third moments of discrete distributions are not in the Loss Models appendix.

Example 14A For a group of 100 insureds, the number of losses per insured follows a negative binomial distribution with r = 3, β = 0.01. Claim sizes follow an inverse gamma distribution with α = 6, θ = 1000. The number of losses is independent of claim sizes, and claim sizes are independent of each other.

Determine the mean and variance of aggregate losses.

Answer: To model frequency of losses on 100 insureds, we will use a negative binomial with r = 300, β = 0.01. This is not strictly necessary; an alternative would be to calculate the mean and variance for one insured, and then multiply them by 100. We have

   E[N] = rβ = 300(0.01) = 3
   Var(N) = rβ(1 + β) = 300(0.01)(1.01) = 3.03
   E[X] = θ/(α − 1) = 1000/5 = 200
   E[X²] = θ²/((α − 1)(α − 2)) = 1000²/((5)(4)) = 50,000
   Var(X) = 50,000 − 200² = 10,000
   E[S] = E[N] E[X] = (3)(200) = 600
   Var(S) = E[N] Var(X) + Var(N) E[X]² = (3)(10,000) + (3.03)(200²) = 151,200
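Example 14A can be verified with a few lines of arithmetic in code. A sketch of mine, using the inverse gamma moment formulas quoted in the example:

```python
r, beta = 300, 0.01            # negative binomial frequency
alpha, theta = 6, 1000         # inverse gamma severity

EN, VarN = r * beta, r * beta * (1 + beta)
EX = theta / (alpha - 1)
EX2 = theta ** 2 / ((alpha - 1) * (alpha - 2))
VarX = EX2 - EX ** 2

print(round(EN * EX), round(EN * VarX + VarN * EX ** 2))   # 600 151200
```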


In Lesson 9, we mentioned that we can sometimes use the compound variance formula to calculate the variance of a coverage with a deductible. To use it, we set the frequency random variable to be Bernoulli with probability equal to the probability of a claim being greater than 0, and the severity random variable is the loss variable left-shifted and truncated by the deductible, or the payment amount for nonzero payments. This method is especially useful if severity is exponential, since left shifting has no effect on the memoryless exponential distribution or its variance.

Example 14B (Repeat of Example 9B) The loss severity random variable follows an exponential distribution with mean 1000. A coverage for this loss has a deductible of 500.

Calculate the variance of the payment per loss random variable.

Answer: Let X be the loss random variable and Y^L the payment per loss random variable. Also let p = Pr(X > 500). Then Y^L is a compound distribution with primary Bernoulli with parameter p and secondary exponential with mean 1000, since the excess payment per payment is exponential with mean 1000. The variance of Y^L is therefore

   Var(Y^L) = p Var(X) + p(1 − p) E[X]² = e^{−0.5}(1000²) + e^{−0.5}(1 − e^{−0.5})(1000²) = 845,182
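The same calculation as code, as a quick sketch of mine (the Bernoulli-primary trick from the example):

```python
import math

p = math.exp(-500 / 1000)          # Pr(X > 500), the Bernoulli parameter
mean_X, var_X = 1000, 1000 ** 2    # excess payment is still exponential(1000)
var_YL = p * var_X + p * (1 - p) * mean_X ** 2
print(round(var_YL))               # 845182
```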


The following example goes further and uses what we learned in Lesson 13 as well.

Example 14C [1999 C3 Sample:20] You are given:


• An insured's claim severity distribution is described by an exponential distribution: F(x) = 1 − e^{−x/1000}.
• The insured's number of claims is described by a negative binomial distribution with β = 2 and r = 2.
• A 500 per claim deductible is in effect.
Calculate the standard deviation of the aggregate losses in excess of the deductible.

Answer: In this question, the word "claim" is used synonymously with "loss", and is before the deductible is removed. The frequency of losses has a negative binomial distribution with β = 2 and r = 2. However, only losses above 500 are paid. The probability that a loss will be paid is e^{−500/1000} = e^{−0.5}. The distribution of the number of paid claims is then negative binomial with parameters β = 2e^{−0.5} and r = 2. The distribution of individual losses above the deductible is exponential with mean 1000, since the exponential distribution is memoryless. Let N be the distribution of paid claims, and X the distribution of individual losses above the deductible. Then

   E[X] = 1000
   Var(X) = 1000²
   E[N] = 2(2e^{−0.5}) = 2.42612
   Var(N) = 2(2e^{−0.5})(1 + 2e^{−0.5}) = 5.36916
   Var(S) = 2.42612(1000²) + 5.36916(1000²) = 7,795,281

The standard deviation of S is √7,795,281 = 2792.


The compound variance formula can only be used when N and the X_i are independent. If N | θ and X | θ are conditionally independent, they are probably not unconditionally independent. However, the compound variance formula may be used on N | θ and X | θ to evaluate Var(S | θ), and then the conditional variance formula can be used to evaluate Var(S). The next example illustrates this.

Example 14D You are given:
(i) Claim counts, N, follow a Poisson distribution with mean 2.
(ii) Claim sizes, X, are exponential with mean θ, and are independent given θ.
(iii) θ varies by insured, and is uniform on [0, 12].
(iv) Claim counts and claim sizes are independent.
Calculate the variance of aggregate losses.

Wrong answer:

   Var(S) = E[N] Var(X) + Var(N) E[X]² = 2(Var(X) + E[X]²)
   E[X] = E[E[X | θ]] = 6
   Var(X) = E[Var(X | θ)] + Var(E[X | θ]) = E[θ²] + Var(θ) = 12²/3 + 12²/12 = 60

because for a uniform distribution, the second moment is the range squared over 3 and the variance is the range squared over 12. Everything above is correct except for the first line. That formula may not be used, since X is only independent of N given θ.


The correct answer is:

Answer: First use the conditional variance formula.

   Var(S) = E[Var(S | θ)] + Var(E[S | θ])
   E[S | θ] = 2θ
   Var(S | θ) = E[N | θ] Var(X | θ) + Var(N | θ) E[X | θ]² = 2θ² + 2θ² = 4θ²

(This is the right place to use the compound variance formula.)

   Var(S) = E[4θ²] + Var(2θ) = 4(12²/3) + 4(12²/12) = 240

If frequency and severity are conditioned on a parameter that varies by insured, do not use the compound variance formula on the overall compound distribution. You may use the compound variance formula on the conditional distribution, and then use the conditional variance formula to calculate the overall variance.
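Example 14D's two-stage computation is short enough to script. A sketch of mine, with the uniform(0, 12) moments written out:

```python
# N|theta ~ Poisson(2), X|theta ~ exponential(theta), theta ~ uniform(0, 12)
E_theta2 = 12 ** 2 / 3          # E[theta^2] for uniform(0, 12)
Var_theta = 12 ** 2 / 12        # Var(theta)

E_var_S_given = 4 * E_theta2    # E[Var(S|theta)], since Var(S|theta) = 4 theta^2
var_E_S_given = 4 * Var_theta   # Var(E[S|theta]), since E[S|theta] = 2 theta
print(E_var_S_given + var_E_S_given)   # 240.0
```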

Quiz 14-1 The number of losses on an automobile comprehensive coverage has the following distribution:

   Number of losses    Probability
   0                   0.4
   1                   0.3
   2                   0.2
   3                   0.1

Loss sizes follow a Pareto distribution with parameters α = 5 and θ = 1200 and are independent of loss counts and of each other. Calculate the variance of aggregate losses.

Exercises

14.1. The number of claims on a homeowner's policy has a binomial distribution with parameters m = 3 and q. The parameter q varies by policyholder and has a uniform distribution on [0, 1/2].

Calculate the probability of no claims for a policy.

14.2. The number of claims on an insurance policy has a Poisson distribution with mean λ. λ varies by insured according to a uniform distribution on [0, 3]. Calculate the probability of 2 or more claims for a policy.


14.3. [4B-S99:8] You are given the following:
(i) Each loss event is either an aircraft loss or a marine loss.
(ii) The number of aircraft losses has a Poisson distribution with a mean of 0.1 per year. Each loss is always 10,000,000.
(iii) The number of marine losses has a Poisson distribution with a mean of 0.2 per year. Each loss is always 20,000,000.
(iv) Aircraft losses occur independently of marine losses.
(v) From the first two events each year, the insurer pays the portion of the combined losses that exceeds 10,000,000.
Determine the insurer's expected annual payments.

(A) Less than 1,300,000
(B) At least 1,300,000, but less than 1,800,000
(C) At least 1,800,000, but less than 2,300,000
(D) At least 2,300,000, but less than 2,800,000
(E) At least 2,800,000

N has a Poisson distribution with mean Λ . Λ has a uniform distribution on the interval (0, 4) .

Calculate E[W]. (A) 5

(B) 7

(C) 9

(D) 11

(E) 13

14.5. [151-81-96:15] (2 points) An insurer issues a portfolio of 100 automobile insurance policies. Of these 100 policies, one-half have a deductible of 10 and the other half have a deductible of zero. The insurance policy pays the amount of damage in excess of the deductible subject to a maximum of 125 per accident. Assume: (i)

the number of automobile accidents per year per policy has a Poisson distribution with mean 0.03; and (ii) given that an accident occurs, the amount of vehicle damage has the distribution: x

Pr ( X  x )

30 150 200

1/3 1/3 1/3

Compute the total amount of claims the insurer expects to pay in a single year. (A) 270

(B) 275

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 280

(D) 285

(E) 290

Exercises continue on the next page . . .

EXERCISES FOR LESSON 14

241

[4B-S92:31] (2 points) You are given that N and X are independent random variables where:

14.6.

N is the number of claims, and has a binomial distribution with parameters m  3 and q  16 . X is the size of claim and has the following distribution: Pr ( X  100)  2/3

Pr ( X  1100)  1/6

Pr ( X  2100)  1/6

Determine the coefficient of variation of the aggregate loss distribution. (A) (B) (C) (D) (E) 14.7.

Less than 1.5 At least 1.5, but less than 2.5 At least 2.5, but less than 3.5 At least 3.5, but less than 4.5 At least 4.5 [151-81-96:4] (1 point) For an insurance portfolio:

(i)

the number of claims has the probability distribution n Pr ( N  n ) 0 1 2 3

0.4 0.3 0.2 0.1

(ii) each claim amount has a Poisson distribution with mean 4; and (iii) the number of claims and claim amounts are mutually independent. Determine the variance of aggregate claims. (A) 8 14.8.

(B) 12

(C) 16

(D) 20

[151-82-98:13] (2 points) For aggregate claims S 

(i)

(ii)

X i has distribution

PN

i1

(E) 24

X i , you are given:

x

Pr ( X  x )

1 2

p 1−p

Λ is a Poisson random variable with parameter p1 ;

(iii) given Λ  λ, N is Poisson with parameter λ; (iv) the number of claims and claim amounts are mutually independent; and (v) Var ( S )  19 2 . Determine p (A) 1/6

(B) 1/5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1/4

(D) 1/3

(E) 1/2

Exercises continue on the next page . . .

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

242

14.9. (i) (ii)

[4B-S90:43] (2 points) You are given: N is a random variable for the claim count with Pr ( N  4)  41 , Pr ( N  5)  12 , and Pr ( N  6)  14 . X is a random variable for claim severity with probability density function f ( x )  3 · x −4 for 1 ≤ x < ∞.

Determine the coefficient of variation, R, of the aggregate loss distribution, assuming that claim severity and frequency are independent. (A) (B) (C) (D) (E)

R < 0.35 0.35 ≤ R < 0.50 0.50 ≤ R < 0.65 0.65 ≤ R < 0.70 0.70 ≤ R

14.10. [151-82-93:7] (2 points) For an insured, Y is the total time spent in the hospital in a year. The distribution of the number of hospital admissions in a year is: Number of Admissions

Probability

0 1 2

0.60 0.30 0.10

The distribution of the length of stay for each admission is gamma with α  1 and θ  5. Determine the variance of Y. (A) 20

(B) 24

(C) 28

(D) 32

(E) 36

14.11. For an auto bodily injury coverage, the number of accidents each year has a binomial distribution with parameters m  5, q  0.02. The number of people injured per accident has the following distribution: P (0)  0.7 P (1)  0.2 P (2)  0.1 The number of people injured in an accident is independent of the number of accidents. The loss size for each person injured has a lognormal distribution with parameters µ  10, σ  2. Loss size is independent of number of people injured. Calculate the variance in aggregate losses for a year.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 14

243

14.12. [4-F02:36] You are given: Number of Claims

Probability

0

1/5

1

3/5

2

1/5

Claim Size

Probability

25

1/3

150

2/3

50

2/3

200

1/3

Claim sizes are independent. Determine the variance of the aggregate loss. (A) 4,050

(B) 8,100

(C) 10,500

(D) 12,510

(E) 15,612

14.13. [3-S00:19] An insurance company sold 300 fire insurance policies as follows: Number of Policies

Policy Maximum

Probability of Claims Per Policy

100 200

400 300

0.05 0.06

You are given: (i) The claim amount for each policy is uniformly distributed between 0 and the policy maximum. (ii) The probability of more than one claim per policy is 0. (iii) Claim occurrences are independent. Calculate the variance of the aggregate claims. (A) 150,000

(B) 300,000

(C) 450,000

(D) 600,000

(E) 750,000

14.14. [3-F00:8] The number of claims, N, made on an insurance portfolio follows the following distribution: n

Pr ( N  n )

0 2 3

0.7 0.2 0.1

If a claim occurs, the benefit is 0 or 10 with probability 0.8 and 0.2, respectively. The number of claims and the benefit for each claim are independent. Calculate the probability that aggregate benefits will exceed expected benefits by more than 2 standard deviations. (A) 0.02

(B) 0.05

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.07

(D) 0.09

(E) 0.12

Exercises continue on the next page . . .

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

244

14.15. [3-S01:29] You are the producer of a television quiz show that gives cash prizes. The number of prizes, N, and the prize amounts, X, have the following distributions: n

Pr ( N  n )

x

Pr ( X  x )

1 2

0.8 0.2

0 100 1000

0.2 0.7 0.1

Your budget for prizes equals the expected prizes plus the standard deviation of prizes. Calculate your budget. (A) 306

(B) 316

(C) 416

(D) 510

(E) 518

14.16. [3-S01:36] The number of accidents follows a Poisson distribution with mean 12. Each accident generates 1, 2, or 3 claimants with probabilities 21 , 13 , 16 , respectively. Calculate the variance in the total number of claimants.

(A) 20

(B) 25

(C) 30

(D) 35

(E) 40

14.17. [CAS3-F04:31] The mean annual number of claims is 103 for a group of 10,000 insureds. The individual losses have an observed mean and standard deviation of 6,382 and 1,781, respectively. The standard deviation of the aggregate claims is 22,874. Calculate the standard deviation for the annual number of claims. (A) 1.47

(B) 2.17

(C) 4.72

(D) 21.73

(E) 47.23

14.18. For an insurance coverage, you are given: (i) Claim frequency follows a geometric distribution with mean 0.15. (ii) Claim severity follows a distribution that is a mixture of two lognormal distributions, the first with parameters µ  3 and σ  1 and the second with parameters µ  5 and σ  2. The first distribution is given 70% weight. (iii) Claim frequency and severity are independent. Calculate the variance of aggregate claims. 14.19. [CAS3-S04:22] An actuary determines that claim counts follow a negative binomial distribution with unknown β and r. It is also determined that individual claim amounts are independent and identically distributed with mean 700 and variance 1,300. Aggregate losses have mean 48,000 and variance 80 million. Calculate the values for β and r. (A) (B) (C) (D) (E)

β β β β β

 1.20, r  57.19  1.38, r  49.75  2.38, r  28.83  1,663.81, r  0.04  1,664.81, r  0.04

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 14

245

14.20. [CAS3-F03:24] Zoom Buy Tire Store, a nationwide chain of retail tire stores, sells 2,000,000 tires per year of various sizes and models. Zoom Buy offers the following road hazard warranty: “If a tire sold by us is irreparably damaged in the first year after purchase, we’ll replace it free, regardless of the cause.” The average annual cost of honoring this warranty is $10,000,000, with a standard deviation of $40,000. Individual claim counts follow a binomial distribution, and the average cost to replace a tire is $100. All tires are equally likely to fail in the first year, and tire failures are independent. Calculate the standard deviation of the replacement cost per tire. (A) (B) (C) (D) (E)

Less than $60 At least $60, but less than $65 At least $65, but less than $70 At least $70, but less than $75 At least $75

14.21. [CAS3-F03:25] Daily claim counts are modeled by the negative binomial distribution with mean 8 and variance 15. Severities have mean 100 and variance 40,000. Severities are independent of each other and of the number of claims. Let σ be the standard deviation of a day’s aggregate losses. On a certain day, 13 claims occurred, but you have no knowledge of their severities. Let σ0 be the standard deviation of that day’s aggregate losses, given that 13 claims occurred. Calculate (A) (B) (C) (D) (E)

σ σ0

− 1.

Less than −7.5% At least −7.5%, but less than 0 0 More than 0, but less than 7.5% At least 7.5%

14.22. [3-F00:21] A claim severity distribution is exponential with mean 1000. An insurance company will pay the amount of each claim in excess of a deductible of 100. Calculate the variance of the amount paid by the insurance company for one claim, including the possibility that the amount paid is 0. (A) 810,000

(B) 860,000

(C) 900,000

(D) 990,000

(E) 1,000,000

14.23. Loss sizes for an insurance coverage, before taking any deductible into account, are uniformly distributed on [0, 100]. Coverage is subject to an ordinary deductible of 5. Calculate the variance of payments after the deductible, taking into account payments of 0 on losses at or below the deductible. 14.24. Loss sizes for an insurance coverage, before taking any deductible into account, are exponentially distributed with a mean of 50. Coverage is subject to an ordinary deductible of 5. Calculate the variance of payments after the deductible, taking into account payments of 0 on losses at or below the deductible.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

246

14.25. [151-81-96:7] (2 points) For a certain insurance, individual losses in 1994 were uniformly distributed over (0, 1000) . A deductible of 100 is applied to each loss. In 1995, individual losses have increased 5%, and are still uniformly distributed. A deductible of 100 is still applied to each loss. Determine the percentage increase in the standard deviation of amount paid per loss. (A) 5.00%

(B) 5.25%

(C) 5.50%

(D) 5.75%

(E) 6.00%

14.26. [4-F01:29] In order to simplify an actuarial analysis Actuary A uses an aggregate distribution S  X1 + · · · + X N , where N has a Poisson distribution with mean 10 and X i  1.5 for all i. Actuary A’s work is criticized because the actual severity distribution is given by Pr ( Yi  1)  Pr ( Yi  2)  0.5,

for all i,

where the Yi ’s are independent. Actuary A counters this criticism by claiming that the correlation coefficient between S and S∗  Y1 + · · · + YN is high. Calculate the correlation coefficient between S and S∗ .

(A) 0.75

(B) 0.80

(C) 0.85

(D) 0.90

(E) 0.95

Use the following information for questions 14.27 and 14.28: The probability function of claims per year for an individual risk is Poisson with a mean of 0.10. There are four types of claims. The annual number of claims of each type has a Poisson distribution. The table below describes the characteristics of the four types of claims. Mean frequency indicates the average number of claims of that type per year. Severity Type of Claim

Mean Frequency

Mean

Variance

W X Y Z

0.02 0.03 0.04 0.01

200 1,000 100 1,500

2,500 1,000,000 0 2,000,000

You are also given: •

Claim sizes and claim counts are independent for each type of claim.



Claim sizes are independent of each other.

14.27. Calculate the variance of a single claim whose type is unknown. 14.28. [4B-S91:26] (2 points) Calculate the variance of annual aggregate losses. (A) (B) (C) (D) (E)

Less than 70,000 At least 70,000, but less than 80,000 At least 80,000, but less than 90,000 At least 90,000, but less than 100,000 At least 100,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 14

247

14.29. For an insurance coverage, there are four types of policyholder: W, X, Y, and Z. For each type of policyholder, the number of claims per year has a Poisson distribution with a mean of 0.10. The probability that a policyholder is of specific type and the mean and variance of severity of claims for each type of policyholder is described in the following table. Policyholder

Severity

Type

Probability

Mean

Variance

W X Y Z

0.2 0.3 0.4 0.1

200 1,000 100 1,500

2,500 1,000,000 0 2,000,000

Calculate the variance of aggregate losses for a policyholder whose type is unknown. Additional released exam questions: SOA M-S05:17,31, CAS3-S05:7,8,9, SOA M-F05:38,39, CAS3-F06:29

Solutions 14.1.

If N is the number of claims, Pr ( N  0)  2 2

14.2.

1/2

Z 0

3

1/2

Z

(1 − q ) dq  −0.5 (1 −

0

(1 − q ) 3 dq.

1/2 q ) 4 0

!4 15 1 + *  0.5 1 −  2 32 ,

Let N be the number of claims. 1 3

Z

1 Pr ( N  1)  3

Z

Pr ( N  0) 

3 0

0

3

e −λ dλ  13 (1 − e −3 )  0.3167 λe −λ dλ

3 1 3  −λe −λ + e −λ dλ 0 3 0  1  −3e −3 + 1 − e −3 3  1  1 − 4e −3  0.2670 3 Pr ( N ≥ 2)  1 − 0.3167 − 0.2670  0.4163

Z

!

14.3. Losses are a mixture distribution with a 1/3 weight on aircraft losses and a 2/3 weight on marine losses. The expected value of each loss is 31 (10,000,000) + 23 (20,000,000)  16,666,666 23 . If there is one loss the insurer’s expected annual payment is 6,666,666 32 . If there are two or more losses, the insurer’s expected annual payment is 2 (16,666,666 23 ) − 10,000,000  23,333,333. The number of losses is Poisson with λ  0.3. Let p n be the probability of n losses. Expected annual losses are p1 (6,666,666 23 ) + (1 − p0 − p1 )(23,333,333 13 ) C/4 Study Manual—17th edition Copyright ©2014 ASM

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

248

with p 1  0.3e −0.3  0.222245 and 1 − p0 − p1  1 − e −0.3 − 0.3e −0.3  1 − 1.3e −0.3  0.036936. Expected annual payments are 0.222245 (6,666,666 23 ) + 0.036936 (23,333,333 13 )  2,343,484 14.4.

(D)

We use the conditional expectation formula: E[W]  E 2N  EΛ E[2N | Λ]

f

g

f

g

The inner expectation, E[2N | Λ], is the probability generating function evaluated at 2. From the tables, the pgf of a Poisson with mean Λ is P ( z )  e Λ ( z−1) so P (2)  e Λ . The expectation of W is therefore

f

E[W]  E e

Λ

g

 0.25

 0.25 e 4 − 1



4

Z 0

e λ dλ



 0.25 (53.5982)  13.3995

(E)

14.5. We can calculate the expectation for the two halves (with and without the deductible) separately. For each of the 50 policies without a deductible, E[X]  30 (1/3) +125 (2/3)  280/3. The way the question is phrased, the policies with the deductible can get as much as 125; 125 is the policy limit, not the maximum covered loss. So for policies with a deductible, E[X]  20 (1/3) + 125 (2/3)  90. We then add everything up: ! 280 50 (0.03) + 50 (0.03)(90)  275 (B) 3 14.6.

Let S be the aggregate loss random variable. Then E[N]  mq  3

1 1  6 2

!

1 5 5 Var ( N )  mq (1 − q )   2 6 12 2 1 E[X]  (100) + (1100 + 2100)  600 3 6  1,750,000 2 1 Var ( X )  (100 − 600) 2 + (1100 − 600) 2 + (2100 − 600) 2  3 6 3

!

Using the compound variance formula, equation (14.2), E[S]  E[N] E[X]  12 (600)  300

Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]2 1 1,750,000 5 5,300,000  + (360,000)  2 3 12 12 √ √ Var ( S ) 5,300,000/12   2.2153 (B) E[S] 300

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 14

249

14.7. The only tricky thing here is that the secondary, rather than the primary, distribution is Poisson. We calculate E[N]  0.3 (1) + 0.2 (2) + 0.1 (3)  1 E[N 2 ]  0.3 (12 ) + 0.2 (22 ) + 0.1 (32 )  2 Var ( N )  1 Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]2  1 (4) + 1 (42 )  20 14.8. tion,

(D)

In this exercise, the frequency distribution is a compound distribution. For the frequency distribu-

f

g

E[N]  E E[N | Λ]  E[Λ] 

1 . p

We evaluate variance using the conditional variance formula: Var ( N )  Var E[N | Λ] + E Var ( N | Λ)





f

g

 Var (Λ) + E[Λ] 1 1 2  +  p p p

The severity distribution has mean ( p )(1) + (1 − p )(2)  2 − p. It is Bernoulli (shifted by 1), so its variance is p (1 − p ) . Frequency and severity are unconditionally independent, so we can use the compound variance formula. 1 2 19  Var ( S )  ( p )(1 − p ) + (2 − p ) 2 2 p p 19p  2p − 2p 2 + 16 − 16p + 4p 2 2p 2 − 33p + 16  0

(2p − 1)( p − 16)  0 p  16 is impossible, leaving the other possibility, p  14.9.

1 2

. (E)

N is a binomial variable, shifted 4. Shifting does not affect variance. E[N]  5 Var ( N )  2

1 1 2 2



1 2

X has a single parameter Pareto distribution with α  3, θ  1. E[X] 

3 2

E X2  3

f

g

Var ( X )  3 − E[S]  5

3 2

3 2  34 2  15 2 1 3 2 +2 2

Var ( S )  5 43  39 8 √ 39/8 R  0.2944 15/2



C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

250

14.10. Use the compound variance formula. E[N]  0.3 (1) + 0.1 (2)  0.5 E[N 2 ]  0.3 (1) + 0.1 (4)  0.7 Var ( N )  0.7 − 0.52  0.45 E[X]  5

Var ( X )  25

Var ( Y )  0.5 (25) + 0.45 (52 )  23.75

(B)

14.11. The number of losses is itself a compound distribution with primary distribution (number of accidents, which we’ll call L) binomial and secondary distribution (number of people injured in an accident, which we’ll call M) having the specified discrete distribution. If we let N be the number of people injured per year, then E[L]  mq  5 (0.02)  0.1 Var ( L )  mq (1 − q )  5 (0.02)(0.98)  0.098 E[M]  0.2 (1) + 0.1 (2)  0.4 E[N]  E[L] E[M]  (0.1)(0.4)  0.04 E M 2  0.2 (12 ) + 0.1 (22 )  0.6

f

g

Var ( M )  0.6 − 0.42  0.44

Var ( N )  E[L] Var ( M ) + Var ( L ) E[M]2  (0.1)(0.44) + (0.098)(0.42 )  0.05968

Loss size, X, has the following mean and variance: E[X]  exp ( µ + σ2 /2)  e 12 E[X 2 ]  exp (2µ + 2σ2 )  e 28 Var ( X )  e 28 − e 12



2

 e 28 − e 24

We use the compound variance formula once again to calculate the variance of aggregate losses: Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]2  0.04 e 28 − e 24 + 0.05968 e 24  58,371,588,495









14.12. Claim size and number of claims are not independent, so the compound variance formula cannot be used. Either the conditional variance formula (4.2) can be used, or the problem can be done directly by calculating the first and second moments. The official solution does the latter, so we shall do the former so you can look at both solutions and decide which method you prefer. The variance of aggregate claims given number of claims is, by the Bernoulli shortcut (see Section 3.3 on page 54): •

If there are 0 claims, 0.



If there is 1 claim, (150 − 25) 2

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 2 3 3



2 (1252 ) . 9

EXERCISE SOLUTIONS FOR LESSON 14 •

251

If there are 2 claims, 2 (200 − 50) 2 23 31  10,000. (We have to multiply the variance of each claim by 2, since there are 2 claims and we want the variance of aggregate claims, not of claim size.)



 

The expected value of the variance is 3 5

2 (1252 ) 1 + (10000)  4083 13 9 5

!

!

!

The average aggregate claims is 0 if zero claims, 325/3 if one claim, and 200 if there are two claims. The variance (no Bernoulli shortcut since there are three possibilities) is calculated by calculating the mean and the second moment. The mean is 3 5 and the second moment

3 5

!

325 1 + (200)  105 3 5

325 3

!

!

!2

!

1 + (200) 2  15,041 32 5

!

so the variance of the means is 15,041 23 −1052  4016 23 . So the variance of aggregate losses is 4083 13 +4016 23  8100 . (B) 14.13. The number of claims is Bernoulli, variance q (1 − q ) . For a uniform distribution, the mean is the maximum over 2 and the variance is the maximum squared over 12. Therefore, for the first 100 policies, if S is aggregate claims for one policy, 4002 Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]  (0.05) + (0.05)(0.95)(2002 )  2566 32 12

!

2

and for one policy from the second 200 policies: 3002 Var ( S )  (0.06) + (0.06)(0.94)(1502 )  1719 12

!

For all 300 policies, add up the variances: 100 (2566 32 ) + 200 (1719)  600,467 . (D)

14.14. Let claim size be X and aggregate benefits S. X is Bernoulli.

E[N]  0.2 (2) + 0.1 (3)  0.7 E[N 2 ]  0.2 (4) + 0.1 (9)  1.7 Var ( N )  1.7 − 0.72  1.21 E[X]  2

Var ( X )  102 (0.2)(0.8)  16 E[S]  (0.7)(2)  1.4 Var ( S )  0.7 (16) + 1.21 (22 )  16.04 E[S] + 2 Var ( S )  9.4100

p

The only possibility for claims less than 10 is 0, whose probability is Pr ( N  0) + Pr ( N  2) Pr ( X  0)



2

+ Pr ( N  3) Pr ( X  0)

 0.7 + (0.2)(0.64) + (0.1)(0.512)  0.8792 The probability of non-zero claims is 1 − 0.8792  0.1208 (E) C/4 Study Manual—17th edition Copyright ©2014 ASM



3

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

252

14.15. We calculate the mean and variance of N and X and use the compound variance formula. For the variance of N we use the Bernoulli shortcut. In the following, S is aggregate prizes. E[N]  0.8 (1) + 0.2 (2)  1.2 Var ( N )  (2 − 1) 2 (0.8)(0.2)  0.16

E[X]  0.7 (100) + 0.1 (1000)  170

E[X 2 ]  0.7 (10,000) + 0.1 (1,000,000)  107,000 Var ( X )  107,000 − 1702  78,100 E[S]  (1.2)(170)  204

Var ( S )  1.2 (78,100) + 0.16 (1702 )  98,344 E[S] + σS  204 + 98,344  517.60

p

(E)

14.16. By the variance formula for compound Poisson distributions, equation (14.4), the varif compound g 2 ance is 12 E X , so we just have to calculate the second moment of the severity distribution. 1 1 1 4 9 10 1 2 (1 ) + (22 ) + (32 )  + +  2 3 6 2 3 6 3 Then Var ( S )  12

10 3

 40 . (E)

14.17. Using the usual N, X, S notation, 2 22,8742  103 (1,7812 ) + σN (6,3822 ) 2 σN 

22,8742 − 103 (1,7812 )  4.8247 6,3822

σN  2.1965

(B)

2.17 may be funny rounding. 14.18. E[N]  0.15 E[X]  0.7e

3.5

Var ( N )  0.15 (1.15)  0.1725 + 0.3e 7  352.17

E[X 2 ]  0.7e 8 + 0.3e 18  19,700,077.41 Var ( X )  E[X 2 ] − 352.172  19,576,053.70

Var ( S )  0.15 (19,576,053.70) + 0.1725 (352.172 )  2,957,802.15

14.19. First of all, 48,000  700 E[N], so E[N]  rβ 

480 7 .

Then

Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]2 480 480 (1,300) + (1 + β )(7002 ) 80 × 106  7 7 480 79,910,857.14  (1 + β )(7002 ) 7 ! ! 7 1 1 + β  79,910,857.14  2.38 480 7002 So β  1.38 and the answer is (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 14

253

14.20. To make the numbers easier to handle, we’ll work them out per tire. Since 2,000,000 tires are sold, letting S be the aggregate cost per tire, 10,000,000 5 2,000,000 40,0002 Var ( S )   800 2,000,000 E[S] 

Also, since total expected cost is $10,000,000 and $100 per tire, an average of 100,000 tires are damaged, or a probability of 100,000/2,000,000  0.05 of damage for a single tire. Using the compound variance formula, with X the damage per tire, 2 + (0.05)(0.95)(1002 ) 800  (0.05) σX 2  (0.05) σX + 475 325 2 σX   6500 0.05 σX  80.6226 (E)

14.21. The variance with no knowledge of claim count is Var ( S )  8 (40,000) + 15 (1002 )  470,000 With knowledge of 13 claims, the variance is 13 times the individual variance, or 13 (40,000)  520,000. √ The ratio of standard deviations is 47/52  0.9507, making the answer 0.9507 − 1  −0.0493 . (B)

14.22. The best way to do this is to treat it as a compound distribution with a Bernoulli frequency. The frequency of claims over 100 is e −100/1000  0.904837, and the severity of claims above 100 is exponential with mean 1000 since exponentials are forgetful. By the compound variance formula Var ( S )  0.904837 (10002 ) + (0.904837)(1 − 0.904837)(10002 )  990,944

(D)

14.23. N is the frequency random variable. It is Bernoulli, and equal to 1 if the loss is greater than 5. E[N]  0.95, Var ( N )  (0.95)(0.05)  0.0475. X is the payment random variable, and it is uniform on [0, 95], so E[X]  47.5, Var ( X )  952 /12  752.0833. Then Var ( S )  0.95 (752.0833) + 0.0475 (47.52 )  821.6510 . 14.24. Now E[N]  e −0.1  0.90484 and Var ( N )  e −0.1 (1 − e −0.1 )  0.08611. X after the deductible is still exponential with mean 50, variance 2500. So Var ( S )  0.90484 (2500) + 0.08611 (2500)  2477.36 . 14.25. The number of payments per loss, N, is Bernoulli and equals 1 whenever the loss is greater than 100 and 0 otherwise. Payment size given a payment is made is uniform on (0, 900) or (0, 950) after inflation. If Y is the payment random variable before inflation, and Y 0 after inflation. For a uniform variable, the variance is the range squared divided by 12. Var ( Y )  E[N] Var ( Y | N ) + Var ( N ) E[Y | N]2  0.9 C/4 Study Manual—17th edition Copyright ©2014 ASM

9002 + (0.9)(0.1)(4502 )  78,975 12

!

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

254

Var ( Y 0 )  E[N 0] Var ( Y 0 | N 0 ) + Var ( N 0 ) E[Y 0 | N 0]2 

p

19 21

!

9502 19 + 12 21

!

87,487/78,975 − 1  0.052513

!

2 (4752 )  87,487 21

!

(B)

14.26. The correlation coefficient ρ is E[SS ∗ ] − E[S] E[S∗ ] ρ √ Var ( S ) Var ( S∗ ) We first calculate the denominator. Var ( S∗ )  E ( N ) Var ( Y ) + Var ( N ) E ( Y ) 2  10 (0.52 ) + 10 (1.52 )  25 Var ( S )  1.52 Var ( N )  22.5 E[S]  E[S∗ ]  15

X  N N X   E[SS ]  E  E[X i Yj ]  i1 j1  ∗

E[X i Yj ]  E[X i ] E[Yj ]

because X j is constant and therefore independent of any other random variable

 (1.5)(1.5)  2.25

X  N N X   E[SS ]  E  2.25  i1 j1  f g   2.25 E N 2  (2.25)(100 + 10)  247.5 ∗

The last line is because E[N 2 ]  Var ( N ) + E[N]2 , and for a Poisson Var ( N )  E[N]  10. So 247.5 − 152  0.9487 ρ√ (25)(22.5)

(E)

14.27. We’ll use the conditional variance formula. Given that a claim occurs, the probability that it is type W is 0.02/0.1  0.2 the probability that it is type X is 0.03/0.1  0.3 the probability that it is type Y is 0.04/0.1  0.4 the probability that it is type Z is 0.01/0.1  0.1 Let U be claim size. We would like to calculate Var (U )  VarI EU [U | I] + EI VarU (U | I )





Let’s calculate VarI EU [U | I] .





EI EU [U | I]  EI [200, 1000, 100, 1500]

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

f

g

EXERCISE SOLUTIONS FOR LESSON 14

255

 0.2 (200) + 0.3 (1,000) + 0.4 (100) + 0.1 (1,500)  530 2

EI EU [U | I]

f

g

 EI [2002 , 10002 , 1002 , 15002 ]  0.2 (2002 ) + 0.3 (1,0002 ) + 0.4 (1002 ) + 0.1 (1,5002 )  537,000

VarI EU [U | I]  EI EU [U | I]2 − EI EU [U | I]





f

g

f

g2

 537,000 − 5302  256,100

Now let’s calculate EI VarU (U | I ) .

f

g

EI VarU (U | I ) ]  EI [2500, 1,000,000, 0, 2,000,000]

f

 0.2 (2,500) + 0.3 (1,000,000) + 0.4 (0) + 0.1 (2,000,000)  500,500

The variance of a claim U is 256,100 + 500,500  756,600 . 14.28. Annual aggregate losses is the sum of four compound Poisson distributions. Claim counts and claim sizes are independent, and claim sizes are independent of each other, so we can apply the compound variance formula (14.4) to each type of claim: Var (W )  0.02 (2002 + 2,500)  850 Var ( X )  0.03 (1,0002 + 1,000,000)  60,000 Var ( Y )  0.04 (1002 )  400 Var ( Z )  0.01 (1,5002 + 2,000,000)  42,500 The variance of annual aggregate losses is the sum, 850 + 60,000 + 400 + 42,500  103,750 . (E) 14.29. Unlike in the previous exercise, there are four types of policyholder, not one, and the claim size distributions are not identical for all policyholders; they vary by type of policyholder. We can condition aggregate losses on the type of policyholder since the distribution of aggregate losses is distinct for each type of policyholder. Therefore, we use the conditional variance formula. Let I be the type of policyholder, and let S be aggregate losses. E[S]  E E[S | I]  0.2 (0.1)(200) + 0.3 (0.1)(1,000) + 0.4 (0.1)(100) + 0.1 (0.1)(1,500)

f

g

















 0.2 (20) + 0.3 (100) + 0.4 (10) + 0.1 (150)  53

f

E E[S | I]

2

g

 0.2 (202 ) + 0.3 (1002 ) + 0.4 (102 ) + 0.1 (1502 )  5,370

Var E[S | I]  5370 − 532  2,561





To calculate the conditional variance for each type of policyholder, we use the Poisson compound variance formula. Var ( S | W )  0.1 (2002 + 2,500)  4,250

Var ( S | X )  0.1 (1,0002 + 1,000,000)  200,000 Var ( S | Y )  0.1 (1002 )  1,000

Var ( S | Z )  0.1 (1,5002 + 2,000,000)  425,000

E Var ( S | I )  0.2 (4,250) + 0.3 (200,000) + 0.4 (1,000) + 0.1 (425,000)  103,750

f

g

The variance of aggregate losses is then

Var E[S | I] + E Var ( S | I )  2,561 + 103,750  106,311

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

f

g

256

14. AGGREGATE LOSS MODELS: COMPOUND VARIANCE

Quiz Solutions 14-1. E[N]  0.3 + 0.2 (2) + 0.1 (3)  1 Var ( N )  0.4 (0 − 1) 2 + 0.2 (2 − 1) 2 + 0.1 (3 − 1) 2  1 1200 E[X]   300 4 2 (12002 ) Var ( X )  − 3002  150,000 4·3 Var ( S )  (1)(150,000) + (1)(3002 )  240,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 15

Aggregate Loss Models: Approximating Distribution Reading: Loss Models Fourth Edition 9.3, 9.8.2 The aggregate distribution may be approximated with a normal distribution. This may be theoretically justified by the Central Limit Theorem if the group is large. If severity is discrete, then the aggregate loss distribution is discrete, and a continuity correction is required. This means that whenever the distribution X assumes values a and b (a < b) but no value in between, all of the following statements are equivalent: • X>a • X≥b

• X > c for any c ∈ ( a, b )

To assure they all resultin the same answer, you evaluate the probability that X is greater than the mid point of the interval, Pr X > ( a + b ) /2 . Similarly the following statements are all equivalent: • X≤a

• X 100) , you instead evaluate Pr ( S > 105) , and if you want to evaluate Pr ( S ≤ 100) , you instead evaluate Pr ( S ≤ 105) .





Example 15A For a group insurance policy, the number of claims from the group has a binomial distribution with mean 100 and variance 20. The size of each claim has the following distribution: Claim Size Probability 1 2 3 4

0.50 0.35 0.10 0.05

Using the approximating normal distribution, calculate the probability that aggregate claims from this group will be greater than 180. Answer: The mean of the secondary distribution is E[X]  0.50 (1) + 0.35 (2) + 0.10 (3) + 0.05 (4)  1.70. The second moment is E[X 2 ]  0.50 (12 ) + 0.35 (22 ) + 0.10 (32 ) + 0.05 (42 )  3.60. C/4 Study Manual—17th edition Copyright ©2014 ASM

257

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

258

Therefore Var ( X )  3.6 − 1.72  0.71. We then calculate the moments of S: E[S]  100 (1.7)  170 Var ( S )  100 (0.71) + 20 (1.72 )  128.8 Then the probability that aggregate losses are greater than 180, with the continuity correction, is 1−Φ

180.5 − 170  1 − Φ (0.93)  1 − 0.8238  0.1762 √ 128.8

!

  √ Without the continuity correction, the answer would have been 1 − Φ (180 − 170) / 128.8  1 − Φ (0.88)  1 − 0.8106  0.1894. 

If severity has a continuous distribution, no continuity correction is made since S has a continuous distribution when X is continuous. Example 15B [1999 C3 Sample:25] For aggregate losses S  X1 + X2 + · · · + X N , you are given: • N has a Poisson distribution with mean 500. • X1 , X2 , . . . have mean 100 and variance 100. • N, X1 , X2 . . . are mutually independent. You are also given:

• For a portfolio of insurance policies, the loss ratio is the ratio of aggregate losses to aggregate premiums collected. • The premium collected is 1.1 times the expected aggregate losses. Using the normal approximation to the compound Poisson distribution, calculate the probability that the loss ratio exceeds 0.95. Answer: E[S]  E[N] E[X]  500 (100)  50,000. Therefore, premium is 1.1 (50,000)  55,000. For the loss ratio to equal 0.95, losses must be (0.95)(55,000)f  52,250. Next we calculate the variance of S. For a g 2 compound Poisson distribution, this reduces to λ E X  500 (100 + 1002 )  5,050,000. The probability we require is then 1−Φ

?

52,250 − 50,000 2,250 1−Φ  1 − Φ (1.00)  1 − 0.8413  0.1587 . √ 2,247.22 5,050,000

!

!



Quiz 15-1 Claim counts follow a geometric distribution with β  0.15. Claim sizes follow an inverse gamma distribution with parameters α  3, θ  1000. Calculate the Value-at-Risk of aggregate losses at the 95% security level using the normal approximation. When the sample isn’t large enough and has a heavy tail, the symmetric normal distribution isn’t appropriate. Sometimes the lognormal distribution is used instead, even though there is no theoretical justification. The parameters of the lognormal are selected by matching the mean and the variance. This is a special case of fitting a parametric distribution using the method of moments. Fitting a distribution using the method of moments will be discussed in more generality in Lesson 30. C/4 Study Manual—17th edition Copyright ©2014 ASM

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

259

Example 15C The number of claims on a policy has a Poisson distribution with λ  0.1. Claim sizes have a gamma distribution with α  0.5, θ  1000. Aggregate losses S for the policy are approximated with a lognormal distribution matching the mean and variance of the aggregate distribution. Calculate FS (80) using this approximation. Answer: The primary distribution is Poisson, so we use equation (14.4) to compute the variance of the aggregate distribution. E[X]  (0.5)(1000)  500 E[X 2 ]  θ 2 ( α + 1)( α )  (10002 )(1.5)(0.5)  750,000 E[S]  λ E[X]  (0.1)(500)  50 Var ( S )  λ E X 2  (0.1)(750,000)  75,000

f

g

E[S2 ]  75,000 + 502  77,500 Equating the first two raw moments of the distributions is equivalent to equating the means and variances of the distributions. We now equate the first two raw moments of a lognormal to the corresponding moments of the aggregate distribution. 2

E[S]  e µ+0.5σ  50 2

E[S2 ]  e 2µ+2σ  77,500 Taking logarithms, µ + 0.5σ2  ln 50 2µ + 2σ2  ln 77,500 Subtracting twice the first expression from the second expression, σ2  ln 77,500 − 2 ln 50  3.4340 √ σ  3.4340  1.8531 Solving the first expression for µ, µ  ln 50 − 0.5 (3.4340)  2.1950 So FS (80)  Φ

ln 80 − 2.1950  Φ (1.18)  0.8810 . 1.8531

!



So far, we have been discussing the collective risk model. Similar approximating distributions can be used for the individual risk model. The following example deals with frequency only. Example 15D For a group life insurance policy, you have the following statistics for the group:

C/4 Study Manual—17th edition Copyright ©2014 ASM

Age group

Number in group

Assumed mortality rate

30–34 35–39 40–44 45–49

22 18 15 10

0.005 0.006 0.009 0.013

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

260

Using a normal approximation and a lognormal approximation, calculate the probability of 2 or more deaths in a year. Answer: Since the number of deaths is discrete, a continuity correction is appropriate, so we will calculate the probability Pr ( N ≥ 1.5) . The individual distributions are binomial with m  1 and q varying. All lives are assumed to be independent. Assume this even if the question doesn’t say it, unless the question says otherwise. Therefore, the mean is the sum of the means and the variance is the sum of the variances. For a binomial, the mean is mq and the variance is mq (1 − q ) . We have E[N]  22 (0.005) + 18 (0.006) + 15 (0.009) + 10 (0.013)  0.483 Var ( N )  22 (0.005)(0.995) + 18 (0.006)(0.994) + 15 (0.009)(0.991) + 10 (0.013)(0.987)  0.478897 E[N 2 ]  0.4832 + 0.478897  0.712186 For the normal distribution, the probability of 1.5 or more deaths is 1.5 − 0.483  1 − Φ (1.47)  1 − 0.9292  0.0708 Pr ( N ≥ 1.5)  1 − Φ √ 0.478897

!

For the lognormal distribution, as in the previous example, µ + 0.5σ2  ln 0.483 2µ + 2σ2  ln 0.712186 σ2  ln 0.712186 − 2 ln 0.483  1.1161 √ σ  1.1161  1.0564 µ  ln 0.483 − 0.5 (1.1161)  −1.2858

ln 1.5 − (−1.2858)  1 − Φ (1.60)  1 − 0.9452  0.0548 Pr ( S ≥ 1.5)  1 − Φ 1.0564

!

You may be curious how close these approximations are to the actual probability. You can calculate the probability of 0 deaths (0.99522 )(0.99418 )(0.99115 )(0.98710 )  0.6157 and the probability of 1 death, which works out to 0.3000. The probability of 2 or more deaths is then 0.0843. So neither approximation was that good. 

Exercises 15.1. The annual number of losses for each insured follows a Poisson distribution with parameter λ. The parameter λ varies by insured according to a gamma distribution with mean 12 , variance 12 , but does not vary by year for any single insured. There are 1500 insureds. Using the normal approximation, calculate the probability of more than 1600 claims in two years.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 15

261

[4B-S96:7] (3 points) You are given the following:

15.2. •

The number of claims follows a negative binomial distribution with mean 800 and variance 3,200.



Claim sizes follow a transformed gamma distribution with mean 3,000 and variance 36,000,000.



The number of claims and claim sizes are independent.

Using the Central Limit Theorem, determine the approximate probability that the aggregate losses will exceed 3,000,000. (A) (B) (C) (D) (E)

Less than 0.005 At least 0.005, but less than 0.01 At least 0.01, but less than 0.10 At least 0.10, but less than 0.50 At least 0.50 [4B-F96:31] (2 points) You are given the following:

15.3. •

A portfolio consists of 1,600 independent risks.



For each risk, the probability of at least one claim is 0.5.

Using the Central Limit Theorem, determine the approximate probability that the number of risks in the portfolio with at least one claim will be greater than 850. (A) (B) (C) (D) (E)

Less than 0.01 At least 0.01, but less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.20 At least 0.20

15.4. The number of claims for a portfolio of insureds has a negative binomial distribution with parameters r  10 and β  3. The size of claims has the following distribution: 2000 F ( x )  1 − 0.5 2000 + x

!3

10,000 − 0.5 10,000 + x

!3

x ≥ 0.

Claim counts and claim sizes are independent. Using the normal approximation, determine the probability that aggregate claims will be greater than 100,000. 15.5. The number of claims for an insurance coverage averages 3 per month, with standard deviation 2. Each claim has a gamma distribution with parameters α  20 and θ  0.1. Claim counts and claim sizes are independent. Using the normal approximation, determine the probability that the aggregate losses for a year will be less than 60.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

262

15.6. The number of claims for an insurance coverage has a Poisson distribution with mean λ. Claim size has the following distribution: F (x )  1 −

3000 3000 + x

!3

x ≥ 0.

Claim counts and claim sizes are independent. Using the normal approximation, the probability that aggregate losses will be greater than 4000 is 0.2743. Determine λ. 15.7. For a group of insureds, you are given the following information regarding losses per individual per year: Mean Standard deviation

Number

Amount

0.2 0.1

1000 700

Number of losses and amount of losses are independent. The insurance group has 76 insureds. Using the normal approximation, determine the probability that aggregate losses for a year will exceed 15,000. 15.8. The annual number of claims on an insurance coverage has a Poisson distribution with mean 4. Claim size is uniformly distributed on [0, u]. Number of claims and claim sizes are independent. Using the normal approximation, you calculate that the probability that aggregate claims for a year will be greater than 50,000 is 0.2743. Determine u. 15.9. [151-83-94:17] (3 points) An insurer offers group term life insurance coverage on 250 mutually independent lives for a premium of 350. The probability of a claim is 0.02 for each life. The distribution of number of lives by face amount is: Face Amount

Number of Lives

20 50 100

100 100 50

Reinsurance is purchased which costs 120% of expected claims above a retention limit of 40 per life. Using the normal approximation to the distribution of retained claims, determine the probability that the total of retained claims and reinsurance premiums exceeds the premium. (A) 0.10

(B) 0.12

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.14

(D) 0.16

(E) 0.18

Exercises continue on the next page . . .

EXERCISES FOR LESSON 15

263

Use the following information for questions 15.10 and 15.11: You are given the following: (i) The number of claims per year follows a Poisson distribution with mean 300. (ii) Claim sizes follow a Generalized Pareto distribution with parameters θ  1000, α  3, and τ  2. (iii) The number of claims and claim sizes are independent. 15.10. [4B-F99:12] (2 points) Determine the probability that annual aggregate losses will exceed 360,000. (A) (B) (C) (D) (E)

Less than 0.01 At least 0.01, but less than 0.03 At least 0.03, but less than 0.05 At least 0.05, but less than 0.07 At least 0.07

15.11. [4B-F99:13] (2 points) After a number of years, the number of claims per year still follows a Poisson distribution, but the expected number of claims per year has been cut in half. Claim sizes have increased uniformly by a factor of two. Determine the probability that annual aggregate losses will exceed 360,000. (A) (B) (C) (D) (E)

Less than 0.01 At least 0.01, but less than 0.03 At least 0.03, but less than 0.05 At least 0.05, but less than 0.07 At least 0.07

15.12. [3-S00:16] You are given:

Number of Claims Individual Losses

Mean 8 10,000

Standard Deviation 3 3,937

Using the normal approximation, determine the probability that the aggregate loss will exceed 150% of the expected loss. (A) Φ (1.25)

(B) Φ (1.5)

(C) 1 − Φ (1.25)

(D) 1 − Φ (1.5)

(E) 1.5Φ (1)

15.13. [3-F00:2] In a clinic, physicians volunteer their time on a daily basis to provide care to those who are not eligible to obtain care otherwise. The number of physicians who volunteer in any day is uniformly distributed on the integers 1 through 5. The number of patients that can be served by a given physician has a Poisson distribution with mean 30. Determine the probability that 120 or more patients can be served in a day at the clinic, using the normal approximation with continuity correction. (A) 1 − Φ (0.68)

(B) 1 − Φ (0.72)

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1 − Φ (0.93)

(D) 1 − Φ (3.13)

(E) 1 − Φ (3.16)

Exercises continue on the next page . . .

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

264

15.14. [3-F00:32] For an individual over 65: (i) The number of pharmacy claims is a Poisson random variable with mean 25. (ii) The amount of each pharmacy claim is uniformly distributed between 5 and 95. (iii) The amounts of the claims and the number of claims are mutually independent. Determine the probability that aggregate claims for this individual will exceed 2000 using the normal approximation. (A) 1 − Φ (1.33)

(B) 1 − Φ (1.66)

(C) 1 − Φ (2.33)

(D) 1 − Φ (2.66)

(E) 1 − Φ (3.33)

15.15. [3-F02:6] The number of auto vandalism claims reported per month at Sunny Daze Insurance Company (SDIC) has mean 110 and variance 750. Individual losses have mean 1101 and standard deviation 70. The number of claims and the amounts of individual losses are independent. Using the normal approximation, calculate the probability that SDIC’s aggregate auto vandalism losses reported for a month will be less than 100,000. (A) 0.24

(B) 0.31

(C) 0.36

(D) 0.39

(E) 0.49

15.16. [3-S01:16] A dam is proposed for a river which is currently used for salmon breeding. You have modeled: (i)

For each hour the dam is opened the number of salmon that will pass through and reach the breeding grounds has a distribution with mean 100 and variance 900. (ii) The number of eggs released by each salmon has a distribution with mean of 5 and variance of 5. (iii) The number of salmon going through the dam each hour it is open and the numbers of eggs released by the salmon are independent. Using the normal approximation for the aggregate number of eggs released, determine the least number of whole hours the dam should be left open so the probability that 10,000 eggs will be released is greater than 95%. (A) 20

(B) 23

(C) 26

(D) 29

(E) 32

15.17. [3-F02:27] At the beginning of each round of a game of chance the player pays 12.5. The player then rolls one die with outcome N. The player then rolls N dice and wins an amount equal to the total of the numbers showing on the N dice. All dice have 6 sides and are fair. Using the normal approximation, calculate the probability that a player starting with 15,000 will have at least 15,000 after 1000 rounds. (A) 0.01

(B) 0.04

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.06

(D) 0.09

(E) 0.12

Exercises continue on the next page . . .

EXERCISES FOR LESSON 15

265

15.18. [3-F01:7] You own a fancy light bulb factory. Your workforce is a bit clumsy—they keep dropping boxes of light bulbs. The boxes have varying numbers of light bulbs in them, and when dropped, the entire box is destroyed. You are given: Expected number of boxes dropped per month: Variance of the number of boxes dropped per month: Expected value per box: Variance of the value per box:

50 100 200 400

You pay your employees a bonus if the value of light bulbs destroyed in a month is less than 8000. Assuming independence and using the normal approximation, calculate the probability that you will pay your employees a bonus next month. (A) 0.16

(B) 0.19

(C) 0.23

(D) 0.27

(E) 0.31

15.19. [SOA3-F03:4] Computer maintenance costs for a department are modeled as follows: (i)

The distribution of the number of maintenance calls each machine will need in a year is Poisson with mean 3. (ii) The cost for a maintenance call has mean 80 and standard deviation 200. (iii) The number of maintenance calls and the costs of the maintenance calls are all mutually independent. The department must buy a maintenance contract to cover repairs if there is at least a 10% probability that aggregate maintenance costs in a given year will exceed 120% of the expected costs. Using the normal approximation for the distribution of the aggregate maintenance costs, calculate the minimum number of computers needed to avoid purchasing a maintenance contract. (A) 80

(B) 90

(C) 100

(D) 110

(E) 120

15.20. [SOA3-F03:33] A towing company provides all towing services to members of the City Automobile Club. You are given: (i)

Towing Distance 0–9.99 miles 10–29.99 miles 30+ miles

Towing Cost 80 100 160

Frequency 50% 40% 10%

(ii)

The automobile owner must pay 10% of the cost and the remainder is paid by the City Automobile Club. (iii) The number of towings has a Poisson distribution with mean of 1000 per year. (iv) The number of towings and the cost of individual towings are all mutually independent. Using the normal approximation for the distribution of aggregate towing costs, calculate the probability that the City Automobile Club pays more than 90,000 in any given year. (A) 3%

(B) 10%

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 50%

(D) 90%

(E) 97%

Exercises continue on the next page . . .

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

266

15.21. [CAS3-F03:30] Speedy Delivery Company makes deliveries 6 days a week. The daily number of accidents involving Speedy vehicles follows a Poisson distribution with mean 3 and are independent. In each accident, damage to the contents of Speedy’s vehicles is distributed as follows: Amount of damage

Probability

$ 0 $2,000 $8,000

1/4 1/2 1/4

Using the normal approximation, calculate the probability that Speedy’s weekly aggregate damages will not exceed $63,000. (A) 0.24

(B) 0.31

(C) 0.54

(D) 0.69

(E) 0.76

15.22. [CAS3-F04:32] An insurance policy provides full coverage for the aggregate losses of the Widget Factory. The number of claims for the Widget Factory follows a negative binomial distribution with mean 25 and coefficient of variation 1.2. The severity distribution is given by a lognormal distribution with mean 10,000 and coefficient of variation 3. To control losses, the insurer proposes that the Widget Factory pay 20% of the cost of each loss. Calculate the reduction in the 95th percentile of the normal approximation of the insurer’s loss. (A) (B) (C) (D) (E)

Less than 5% At least 5%, but less than 15% At least 15%, but less than 25% At least 25%, but less than 35% At least 35%

15.23. [SOA3-F04:15] Two types of insurance claims are made to an insurance company. For each type, the number of claims follows a Poisson distribution and the amount of each claim is uniformly distributed as follows: Type of Claim

Poisson Parameter λ for Number of Claims

Range of Each Claim Amount

I II

12 4

(0, 1) (0, 5)

The numbers of claims of the two types are independent and the claim amounts and claim numbers are independent. Calculate the normal approximation to the probability that the total of claim amounts exceeds 18. (A) 0.37

(B) 0.39

(C) 0.41

(D) 0.43

(E) 0.45

15.24. For an insurance coverage, the number of claims follows a Poisson distribution with mean 2 and the size of claims follows a Poisson distribution with mean 10. Number of claims and claim sizes are independent. Calculate the probability that aggregate losses will be less than or equal to 5, using the normal approximation.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 15

267

15.25. For an insurance coverage, you are given (i) The number of claims for each insured follows a Poisson distribution with mean λ. (ii) λ varies by insured according to a gamma distribution with parameters α  4, θ  0.2. (iii) Claim size, before application of policy limits, follows a single-parameter Pareto with parameters α  3, θ  10,000. (iv) Coverage is subject to a maximum covered loss of 20,000. (v) Number of claims and claim sizes are independent. Calculate the probability that aggregate losses will be greater than 30,000, using the normal approximation. 15.26. Aggregate losses follow a collective risk model. Each loss follows a lognormal distribution with parameters µ  5, σ  1.2. The number of losses per year follows a Poisson distribution with mean 0.7. Estimate the probability that aggregate losses will exceed 300 using the lognormal approximation of aggregate losses. 15.27. [CAS3-S04:38] You are asked to price a Workers’ Compensation policy for a large employer. The employer wants to buy a policy from your company with an aggregate limit of 150% of total expected loss. You know the distribution for aggregate claims is Lognormal. You are also provided with the following: Number of claims Amount of individual loss

Mean

Standard Deviation

50 4,500

12 3,000

Calculate the probability that the aggregate loss will exceed the aggregate limit. (A) (B) (C) (D) (E)

Less than 3.5% At least 3.5%, but less than 4.5% At least 4.5%, but less than 5.5% At least 5.5%, but less than 6.5% At least 6.5%

15.28. For an insurance company, claim counts per policy follow a Poisson distribution with λ  0.4. Claim sizes follow a Pareto distribution with α  3 and θ  10. Claim counts and claim sizes are independent. There are 500 policies in force. Using the normal approximation, estimate the Tail-Value-at-Risk at the 90% security level for the company. Additional released exam questions: SOA M-S05:40, CAS3-F05:30,34, SOA M-F05:18, C-S07:17

Solutions 15.1. For each insured, the Poisson parameter over two years is Λ  2λ. Since E[Λ]  2 E[λ] and Var (Λ)  4 Var ( λ ) , the parameter Λ follows a gamma distribution withg mean 1 and variance 2. Let N be the number f of losses over the two-year period. Then E[N]  E E[N | Λ]  E[Λ]  1 and the variance of N is Var ( N )  E Var ( N | Λ) + Var E[N | Λ]  E[Λ] + Var (Λ)  1 + 2  3

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g





15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

268

For 1500 insureds, the aggregate mean is 1500 and the aggregate variance is 1500 (3)  4500. We make a continuity correction and check the probability that a normal distribution with these parameters is greater than 1600.5: Pr ( N > 1600)  1 − Φ

1600.5 − 1500  1 − Φ (1.50)  1 − 0.9332  0.0668 √ 4500

!

15.2. The transformed gamma distribution is not in the tables, but all you need to do the calculation is the mean and variance of claim sizes. Let S be aggregate losses. Using equation (14.2), E[S]  E[N] E[X]  (800)(3000)  2,400,000 Var ( S )  E[N] Var ( X ) + E[X]2 Var ( N )

p

 800 (36 · 106 ) + (9 · 106 ) 3200  57,600 × 106

57,600 × 106  240,000

Pr ( S > 3,000,000)  1 − Φ

600,000  1 − Φ (2.5)  1 − 0.9938  0.0062 240,000

!

(B)

15.3. This is a binomial distribution with parameters m  1600, q  0.5. The mean is 800; the variance is 400. We will make a continuity correction. 1−Φ

850.5 − 800  1 − Φ (2.53)  1 − 0.9943  0.0057 √ 400

!

(A)

Even if you (by mistake) didn’t make a continuity correction, you would get the right range: 850 − 800 1−Φ √  1 − Φ (2.5)  0.0062. 400

!

15.4.

For a negative binomial distribution, the mean is rβ and the variance is rβ (1 + β ) . E[N]  (10)(3)  30 Var ( N )  (10)(3)(4)  120

Severity is a mixture of two Pareto distributions with parameters and (3, 10,000). For .  ( α, θ )  (3, 2000)  2 each one, the mean is θ/ ( α − 1) and the second moment is 2θ ( α − 1)( α − 2) . 2000 10,000 +  3000 2 2 ! 2 (20002 ) 2 (10,000) 2 2 E[X ]  0.5 +  52 (106 ) 2 2 E[X]  0.5





Var ( X )  52 (106 ) − 30002  43 (106 ) E[S]  (30)(3000)  90,000

Var ( S )  30 43 (106 ) + 120 (30002 )  2370 (106 )



Pr ( S > 100,000)  1 − Φ

C/4 Study Manual—17th edition Copyright ©2014 ASM



100,000 − 90,000

p

2370 (106 )

!

 1 − Φ (0.21)  1 − 0.5832  0.4168

EXERCISE SOLUTIONS FOR LESSON 15

269

15.5. Let N be the number of claims in a year. Then E[N]  12 (3)  36 and Var ( N )  12 (4)  48. Let X be claim severity. Then E[X]  (20)(0.1)  2 and Var ( X )  (20)(0.1) 2  0.2 E[S]  (36)(2)  72 Var ( S )  36 (0.2) + 48 (4)  199.2 60 − 72 Φ √  Φ (−0.85)  0.1977 199.2

!

15.6. Claim size X is Pareto with parameters α  3, θ  3000, so E[X]  3000/2  1500 and E[X 2 ]  30002 . Thus for aggregate losses S, E[S]  1500λ Var ( S )  (30002 ) λ Φ−1 (0.2743)  −0.6. Hence:

4000 − 1500λ

p



(30002 ) λ

by formula (14.4)

 0.6

(*)

√ √ 4000 − 1500λ  0.6 (3000) λ  1800 λ √ 1500λ + 1800 λ − 4000  0

−1800 ± 18002 + 4 (1500)(4000)  1.1397, −2.3397 λ 2 (1500) λ  1.299, 5.474

p

However, 5.474 gives −0.6 when plugged back into (*), so the correct answer is 1.299 . 15.7.

Let S be the aggregate losses for 76 insureds.

E[S]  76 (200)  15,200 Var ( S )  76 0.2 (7002 ) + 0.01 (10002 )  8,208,000





15,200 − 15,000  Φ (0.07)  0.5279 Pr ( S > 15,000)  Φ √ 8,208,000

!

15.8.

Let X be severity. E[X] 

u 2

and E X 2 

f

g

u2 3 .

Let S be aggregate losses.

E[S]  4 Var ( S )  4

u 2  2u 4u 2 u2  3  3

Since Φ−1 (0.2743)  −0.60, we have

50,000 − 2u  0.60 √ 2u/ 3





(0.60)(2u )  50,000 3 − 2 3u u

C/4 Study Manual—17th edition Copyright ©2014 ASM

√ 50,000 3

√  18,568

(0.60)(2) + 2 3

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

270

15.9.

Claims for each life have a binomial distribution with q  0.02. Expected retained claims are 0.02 20 (100) + 40 (150)  160





Expected reinsured claims are 0.02 100 (10) + 50 (60)  80





Reinsurance premium is therefore 1.2 (80)  96, and total expected expenses are 160 + 96  256. Variance of retained claims is   (0.02)(0.98) 100 (202 ) + 150 (402 )  5488 Using the normal approximation, we want 350 − 256  1 − Φ (1.27)  1 − 0.8980  0.1020 1−Φ √ 5488

!

(A)

Strictly speaking, we should make a continuity correction. Expenses are always 96 plus a multiple of 20. In order to be more than 350, they would have to be 356, or 20 more than 336. We should therefore calculate the probability that they are more than the midpoint, or 346. Then we get 346 − 256  1 − Φ (1.21)  0.1131 1−Φ √ 5488

!

Since this is not one of the five answer choices, they apparently didn’t expect you to make a continuity correction. 15.10. We need to use the normal approximation, although the question didn’t mention the normal approximation. 1000Γ (3) Γ (2)  1000 Γ (3) Γ (2) 10002 Γ (4) Γ (1) 10002 (6) E[X 2 ]    3,000,000 Γ (3) Γ (2) 2 E[S]  300 (1000)  300,000 E[X] 

p

Var ( S ) 

p

300 (3,000,000)  30,000

Pr ( S > 360,000)  1 − Φ

360,000 − 300,000  1 − Φ (2)  0.0228 30,000

!

(B)

15.11. We multiply θ by 2 to inflate claim sizes, so θ  2000. E[S] doesn’t change. E[X 2 ] is multiplied by 22 so Var ( S ) is multiplied by 22 /2  2. Then 60,000 Pr ( X > 360,000)  1 − Φ √  1 − Φ (1.41)  0.0793 30,000 2

!

(E)

15.12. The overall mean is 8 (10,000)  80,000. The overall variance is 8 (39372 ) + 32 (10,000) 2  1,023,999,752 √ and the standard deviation is 1,023,999,752  32,000. The probability of exceeding 120,000 is 1 −  Φ 40,000 32,000  1 − Φ (1.25) . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 15

271

15.13. The discrete uniform distribution has a mean of 3 and a variance of

(1 − 3) 2 + (2 − 3) 2 + (4 − 3) 2 + (5 − 3) 2 5

2

The aggregate mean is 3 (30)  90 and the aggregate variance is Var ( S )  3 (30) + 2 (302 )  1890 Since the number of patients is discrete, we must make a continuity correction and calculate the probability that the normal random variable is greater than 119.5. The probability of 119.5 or more is 119.5 − 90 1−Φ √  1 − Φ (0.68) 1890

!

(A)

15.14. The mean of the uniform distribution is 50. The mean of aggregate claims is the product of the Poisson mean and the severity mean, or 25 (50)  1250. The second moment of the uniform distribution is mean squared plus variance ( (95 − 5) 2 /12). For aggregate claims S, by equation (14.4), the variance is the Poisson mean times the severity second moment, or 902 Var ( S )  25 50 +  25 (2500 + 675)  79,375 12 2

!

The probability of exceeding 2000, using the normal approximation, is 2000 − 1250  1 − Φ (2.66) 1−Φ √ 79,375

!

(D)

15.15. E[S]  (110)(1101)  121,110 Var ( S )  (110)(702 ) + 750 (11012 )  909,689,750

p

909,689,750  30,161

100,000 − 121,110 Pr ( S < 100,000)  Φ 30,161  Φ (−0.7)  0.2420

!

(A)

15.16. The mean number of eggs in 1 hour is (100)(5)  500 and the variance, by the compound variance formula, is Var ( S )  100 (5) + 900 (52 )  23,000 For n hours the mean is 500n and the variance is 23,000n. We’d like 10,000 to be the 5th percentile (so that the number of eggs will be higher 95% of the time), so we solve: 10,000 − 500n  −1.645 √ 23,000n p √ 500n − 1.645 23,000 n − 10,000  0 √ 500n − 249.477 n − 10,000  0 C/4 Study Manual—17th edition Copyright ©2014 ASM

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

272



√ 249.77 ± 20,062,239 n  4.729, −4.230 1000 n  22.359, 17.890

Naturally, the lower number is the 5th percentile and the higher number is the 95th percentile. You can in fact plug 17.890 back into the original equation and you will get 1.645 instead of −1.645. So the answer is 23 hours are needed. (B) 15.17. For one die, the mean toss is 3.5 and the variance is 35/12. For one round, using the compound variance formula: E[S]  3.52 − 12.5  −0.25

35  2  35 + 3.5  45.9375 Var ( S )  3.5 12 12

!

!

In 1000 rounds, the average gain is 1000 (−0.25)  −250. The probability of gaining at least 0 after 1000 rounds is (we make a continuity adjustment, but this has very little effect): Pr ( S > 0)  1 − Φ √

249.5

!

1000 (45.9375)

 1 − Φ (1.16)  1 − 0.8770  0.1230

(E)

15.18. Let S be the value of the light bulbs. E[S]  (50)(200)  10,000 Var ( S )  (50)(400) + (100)(2002 )  4,020,000 8000 − 10,000 Pr ( S < 8000)  Φ √  Φ (−1)  0.1587 4,020,000

!

(A)

15.19. Let S1 be the maintenance cost for one machine. The mean cost per machine is E[S1 ]  3 (80)  240. and the variance, by the compound variance formula, is E[S1 ]  3 (2002 ) + 3 (802 )  139,200 So the mean cost for n machines is 240n and the variance is 139,200n. If S is aggregate maintenance costs, we want Pr ( S < 1.2 E[S]) ≤ 0.9 Using the normal approximation,

1.2 E[S] − E[S] Φ √ Var ( S ) ! 0.2 (240n ) Φ √ 139,200n √ 48 n √ 139,200 √ 48 n

!

> 0.9 > 0.9 > 1.282 since Φ−1 (0.9)  1.282 > 1.282 139,200

n> 100 machines are needed. (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

p

1.2822 (139,200)  99.3 482

EXERCISE SOLUTIONS FOR LESSON 15

273

15.20. Let X be towing cost per tow, S aggregate towing cost. Then E[X]  0.5 (80) + 0.4 (100) + 0.1 (160)  96 E[X 2 ]  0.5 (802 ) + 0.4 (1002 ) + 0.1 (1602 )  9760 E[S]  1000 (96)  96,000 Var ( S )  1000 (9760)  9,760,000 Since the club pays 90%, claims must be more than 100,000 before the club pays 90,000. To be precise, since the severity distribution is discrete, we should add a continuity correction. The interval between towing fees is 20, so we should add half of that, or 10, to 100,000. It hardly makes a difference with numbers of this magnitude, especially with SOA rounding rules on the argument of the standard normal’s cumulative distribution function. 100,010 − 96,000  1 − Φ (1.28)  1 − 0.8997  0.1003 Pr ( S > 100,010)  1 − Φ √ 9,760,000

!

(B)

15.21. We’ll calculate the mean and variance for 6 days. Let S be weekly aggregate damage, and X the amount of damage per accident. Then 1 1 (2000) + (8000)  3000 2 4

!

E[X] 

!

1 1 E[X ]  (20002 ) + (80002 )  18,000,000 2 4 2

!

!

E[S]  18 (3000)  54,000 Var[S]  18 (18,000,000)  324,000,000 63,000 − 54,000  Φ (0.5)  0.6915 Pr ( S ≤ 63,000)  Φ √ 324,000,000

!

(D)

Since severity is discrete, a continuity correction should be made. The loss random variable is a multiple of 2000. Thus in order to be more than 63,000, it must be at least 64,000, or 2000 more than the next possible value of 62,000. The midpoint is 63,000. Thus the continuity correction has no effect. 15.22. No calculations are needed. The aggregate mean and standard deviation are both reduced by 20%, so the normal approximation, which is a linear combination of the mean and standard deviation, is also reduced by 20% . (C) 15.23. The second moment of a uniform distribution starting at 0 is the upper bound squared over 3. Let S1 be aggregate claims of type I, S2 aggregate claims of type II, and S  S1 + S2 . For type I, E[S1 ]  12 Var ( S1 )  12

1 2 1 3

6 4

For type II, E[S2 ]  4 Var ( S2 )  4 Adding together, total mean is 16 and total variance is

5 2  10 25 100 3  3 112 3 .

Then the normal approximation gives

18 − 16 Pr ( S > 18)  1 − Φ √  1 − Φ (0.33)  1 − 0.6293  0.3707 112/3

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

274

15.24. E[X 2 ]  100 + 10  110, so Var ( S )  2 (110)  220. We must make a continuity correction (use 5.5 instead of 5), so ! 5.5 − 20  Φ (−0.98)  0.1635 Pr ( S < 5.5)  Φ √ 220 15.25. Recall from the first paragraph of Lesson 12 that a gamma mixture of Poisson distributions is a negative binomial distribution with the same parameters: r  α (which is 4 here) and β  θ (which is 0.2 here). It follows that E[N]  0.8 Var ( N )  0.96 Alternatively, for variance, use the conditional variance formula, equation (4.2) on page 64, and the fact that for a gamma distribution with parameters α, θ, the mean is αθ and the variance is αθ2 : Var ( N )  E Var ( N | λ ) + Var E[N | λ]  E[λ] + Var ( λ )  (4)(0.2) + (4)(0.22 )  0.96

f

g





Now for severity. 3 · 10,000 10,0003 −  13,750 2 2 (20,0002 ) 3 · 10,0002 2 · 10,0003 E[ ( X ∧ 20,000) 2 ]  −  2 · 108 1 20,000 Var ( X )  2 · 108 − 13,7502  10,937,500 E[X ∧ 20,000] 

E[S]  (0.8)(13,750)  11,000

Var ( S )  0.8 (10,937,500) + 0.96 (13,7502 )  190,250,000 30,000 − 11,000 Pr ( X > 30,000)  1 − Φ √  1 − Φ (1.38)  0.0838 190,250,000

!

15.26. We calculate aggregate mean and variance. 2

E[S]  0.7e 5+0.5 (1.2 )  213.4334 2

Var ( S )  0.7e 10+2 (1.2 )  274,669.8 E[S2 ]  274,669.8 + 213.43342  320,223.7 We solve for the µ and σ parameters of the lognormal having this mean and second moment. µ + 0.5σ2  ln 213.4334 2µ + 2σ2  ln 320,223.7 σ2  ln 320,223.7 − 2 ln 213.4334  1.9501 σ  1.3965

µ  ln 213.4334 − 0.5 (1.9501)  4.3883

Now apply the lognormal approximation.

ln 300 − 4.3883 Pr ( S > 300)  1 − Φ  1 − Φ (0.94)  0.1736 1.3965

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 15

275

15.27. For the aggregate distribution S: E[S]  50 (4,500)  225,000 Var ( S )  50 (30002 ) + 122 (45002 )  3,366,000,000 We must fit a lognormal to this mean and variance. We have for the lognormal’s µ and σ µ + 0.5σ2  ln 225,000  12.32386 2µ + 2σ2  ln (3,366,000,000 + 225,0002 )  24.71208 σ2  24.71208 − 2 (12.32386)  0.06436 σ  0.2537

µ  12.32386 − 0.5 (0.06436)  12.2917 The probability that S is greater than 1.5 (225,000)  337,500 is ln 337,500 − 12.2917 1−Φ  1 − Φ (1.72)  1 − 0.9573  0.0427 0.2537

!

(B)

15.28. The formula for TVaR for a normal distribution is equation (8.7): TVaR0.9 ( X )  µ + σ

φ ( z0.9 ) 1 − 0.9

In our case, the 90th percentile of a standard normal distribution is z0.9  1.282, and 2

φ (1.282) 

e −1.282 /2 0.4399   0.1754 √ 2.5066 2π

φ (1.282)  1.754 1 − 0.9

Let n  500 be the sample size and let S be aggregate losses. Mean aggregate losses are 10 θ  (500)(0.4)  1000 E[S]  nλ α−1 2

!

!

and letting X be the claim size distribution, the variance of aggregate losses is, using the compound variance formula for a compound Poisson distribution, Var ( S )  nλ E[X 2 ]  (500)(0.4)

2θ 2 2 (102 )  (500)(0.4)  20,000 ( α − 1)( α − 2) (2)(1)

!

Therefore, TVaR at the 90% security level is approximated as 1000 + 1.754 20,000  1248

p

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION

276

Quiz Solutions 15-1.

Calculate the aggregate mean and variance. E[N]  0.15 E[X] 

Var ( N )  (0.15)(1.15)  0.1725

1000  500 2 E[S]  0.15 (500)  75

Var ( X ) 

10002 − 5002  250,000 2·1

Var ( S )  0.15 (250,000) + 0.1725 (5002 )  80,625 √ The 95th percentile of aggregate losses is 75 + 1.645 80,625  542.09

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 16

Aggregate Losses: Severity Modifications Reading: Loss Models Fourth Edition 9.7 Individual losses may be subject to deductibles, limits, or coinsurance. These modifications reduce the expected annual payments made on losses, or the expected annual aggregate costs. In previous lessons, when losses were subject to per-claim deductibles, we needed to distinguish between expected payment per loss and expected payment per payment, since not every loss resulted in a payment. In the previous paragraph, we introduced a third concept: expected annual payments, or expected annual aggregate costs. This concept is neither payment per payment nor payment per loss; it is payment per year. The word “annual” means “per year”. Expected annual payments are the average total payments divided by the number of years. Do not divide by number of losses, nor by number of payments. Do not exclude years in which no payments are made from the denominator. However, the concept of “expected annual payments” is related to the concepts of “payment per payment” and “payment per loss”. You may calculate expected annual aggregate payments (sometimes called “expected annual aggregate costs”) in one of two ways: 1. You may calculate expected number of losses times expected payment per loss. In other words, do not modify the frequency distribution for the deductible. Calculate expected number of losses (not payments). But modify the severity distribution. Use the payment-per-loss random variable Y L for the severity distribution, so that there will be a non-zero probability of 0. Multiply expected payment per loss times expected number of losses. 2. You may calculate expected number of payments times expected payment per payment. In other words, modify the frequency distribution for the deductible. Calculate the expected number of payments (not losses). Modify the severity distribution; use the payment per payment variable Y P . Payments of 0 are excluded. Multiply expected payment per payment times the number of payments. To repeat, you have two choices: Expected payment per loss × Expected number of losses per year OR

Expected payment per payment × Expected number of payments per year Both of these formulas will result in the same answer. The one thing to avoid is mixing the two formulas. You may not use expected payment per payment times expected number of losses per year! In the first formula there is no modification to frequency. Therefore, it is usually easier to use the first formula for discrete severity distributions. In the second formula frequency must be modified. However, expected payment per payment is easier to calculate than expected payment per loss if severity is exponential, Pareto, or uniform, making the second formula preferable in those cases. C/4 Study Manual—17th edition Copyright ©2014 ASM

277

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

278

Example 16A Annual frequency of losses follows a Poisson distribution with mean 0.4. Sizes of loss have the following discrete distribution: Loss size Probability 5 10 20 40

0.4 0.3 0.2 0.1

An insurance coverage has a per-loss ordinary deductible of 8. Calculate expected annual aggregate payments on this coverage. Answer: The expected payment per loss after the deductible is (0.3)(10 − 8) + (0.2)(20 − 8) + (0.1)(40 − 8)  6.2. The expected number of losses per year is 0.4. Hence expected annual aggregate payments are (0.4)(6.2)  2.48 . This was the easier way to calculate it, but you could also use the other formula. Expected payment per payment, since only 0.6 of losses get any payment, is 6.2/0.6. Expected number of payments per year is (0.6)(0.4)  0.24. Hence expected annual aggregate payments are 0.24 (6.2/0.6)  2.48 .  Example 16B Annual number of losses follows a negative binomial distribution with parameters r  2, β  0.1. Size of individual loss follow a two-parameter Pareto distribution with α  2, θ  10. An insurance coverage has a per-claim ordinary deductible of 8. Calculate expected annual aggregate payments on this coverage. Answer: The expected payment per payment is ( θ + d ) / ( α − 1)  (10 + 8) / (2 − 1)  18. The modified frequency distribution for the number of payments per year has mean rβ Pr ( X > 8)  (2)(0.1)(10/18) 2 . Expected annual aggregate payments are 18 (0.2)(10/18) 2  10/9 . It was easier to use this formula than the expected payment per loss times number of losses formula in this case.  Everything we said for expected values holds for variances as well. You may either use N (number of losses) in conjunction with Y L , or N P (number of payments) in conjunction with Y P . Either way, you use the compound variance formula.

?

Quiz 16-1 Annual frequency of losses follows a negative binomial distribution with parameters r  1.5, β  0.2. Individual loss sizes follow an exponential distribution with θ  40. An insurance coverage has a per-loss ordinary deductible of 25. Calculate the variance of annual aggregate payments on this coverage.

Exercises 16.1. A company insures a fleet of vehicles. Aggregate losses have a compound Poisson distribution. The expected annual number of losses is 10. Loss amounts, regardless of vehicle type, have a two-parameter Pareto distribution with parameters α  1, θ  200. Insurance is subject to a per-loss deductible of 100 and a per-loss maximum payment of 500. Calculate expected annual aggregate payments under this insurance.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 16

279

16.2. [3-S01:26] A company insures a fleet of vehicles. Aggregate losses have a compound Poisson distribution. The expected number of losses is 20. Loss amounts, regardless of vehicle type, have exponential distribution with θ  200. In order to reduce the cost of the insurance, two modifications are to be made: (i)

a certain type of vehicle will not be insured. It is estimated that this will reduce loss frequency by 20%. (ii) a deductible of 100 per loss will be imposed. Calculate the expected aggregate amount paid by the insurer after the modifications. (A) 1600

(B) 1940

(C) 2520

(D) 3200

(E) 3880

Use the following information for questions 16.3 and 16.4: •

Losses follow a Pareto distribution with parameters θ  1000 and α  2.



10 losses are expected each year.



The number of losses and the individual loss amounts are independent.



For each loss that occurs, the insurer’s payment is equal to the entire amount of the loss if the loss is greater than 100. The insurer makes no payment if the loss is less than or equal to 100.

16.3. (A) (B) (C) (D) (E)

[4B-S95:22] (2 points) Determine the insurer’s expected annual payments. Less than 8000 At least 8000, but less than 9000 At least 9000, but less than 9500 At least 9500, but less than 9900 At least 9900

16.4. [4B-S95:23] (2 points) Determine the insurer’s expected number of annual payments if all loss amounts increased uniformly by 10%. (A) (B) (C) (D) (E)

Less than 7.9 At least 7.9, but less than 8.1 At least 8.1, but less than 8.3 At least 8.3, but less than 8.5 At least 8.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

280

16.5. [CAS3-S04:39] PQR Re provides reinsurance to Telecom Insurance Company. PQR agrees to pay Telecom for all losses resulting from “events”, subject to a $500 per event deductible. For providing this coverage, PQR receives a premium of $250. Use a Poisson distribution with mean equal to 0.15 for the frequency of events. Event severity is from the following distribution:



Loss

Probability

250 500 750 1,000 1,250 1,500

0.10 0.25 0.30 0.25 0.05 0.05

i  0%

Using the normal approximation to PQR’s annual aggregate losses on this contract, what is the probability that PQR will pay out more than it receives? (A) (B) (C) (D) (E)

Less than 12% At least 12%, but less than 13% At least 13%, but less than 14% At least 14%, but less than 15% At least 15%

Use the following information for questions 16.6 and 16.7: Auto collision coverage is subject to a 100 deductible. Claims on this coverage occur at a Poisson rate of 0.3 per year. Claim size after the deductible has the following distribution: Claim size

Probability

100 300 500 700

0.4 0.3 0.2 0.1

Claim frequency and severity are independent. 16.6.

The deductible is raised to 200.

Calculate the variance in claim frequency with the revised deductible. 16.7.

The deductible is raised to 400.

Calculate the variance in aggregate payments per year with the revised deductible.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 16

281

[4B-F98:28] (2 points) You are given the following:

16.8. •

Losses follow a lognormal distribution with parameters µ  10 and σ  1.



One loss is expected each year.



For each loss less than or equal to 50,000, the insurer makes no payment.



For each loss greater than 50,000, the insurer pays the entire amount of the loss up to the policy limit of 100,000. Determine the insurer’s expected annual payments.

(A) (B) (C) (D) (E)

Less than 7,500 At least 7,500, but less than 12,500 At least 12,500, but less than 17,500 At least 17,500, but less than 22,500 At least 22,500

Use the following information for questions 16.9 and 16.10: An insurer has excess-of-loss reinsurance on auto insurance. You are given: (i) Total expected losses in the year 2001 are 10,000,000. (ii) In the year 2001 individual losses have a Pareto distribution with

!2

2000 F (x )  1 − , x + 2000

x>0

(iii) Reinsurance will pay the excess of each loss over 3000. (iv) Each year, the reinsurer is paid a ceded premium, C year , equal to 110% of the expected losses covered by the reinsurance. (v) Individual losses increase 5% each year due to inflation. (vi) The frequency distribution does not change. 16.9.

[3-F00:41] Calculate C 2001 .

(A) 2,200,000

(B) 3,300,000

(C) 4,400,000

(D) 5,500,000

(E) 6,600,000

(C) 1.06

(D) 1.07

(E) 1.08

16.10. [3-F00:42] Calculate C 2002 /C 2001 . (A) 1.04

(B) 1.05

16.11. The number of losses per year on an insurance coverage follows a binomial distribution with m  9, q  0.1. The size of each loss is uniformly distributed on (0, 60]. The size of loss is independent of the number of losses. The insurance coverage has a per-loss ordinary deductible of 12. Calculate the variance of annual aggregate payments on the coverage. 16.12. Losses follow an exponential distribution with mean 1000. Insurance pays losses subject to a deductible of 500 and a maximum covered loss. The expected annual number of losses is 10. Determine the maximum covered loss needed so that expected aggregate annual payments equal 5000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

282

16.13. You are given the following: (i)

The underlying loss distribution for an individual claim amount is a single-parameter Pareto with α  1 and θ  500. (ii) An insurance coverage has a deductible of 1,000 and a maximum covered loss of 10,000. (iii) The expected number of losses per year for each policyholder is 3. (iv) Annual claim counts and claim amounts are independent. Calculate the expected annual claim payments for a single policyholder on this coverage. Use the following information for questions 16.14 and 16.15: You are given the following: •

Loss sizes for Risk 1 follow a Pareto distribution with parameters θ and α, α > 2.



Loss sizes for Risk 2 follow a Pareto distribution with parameters θ and 0.8α, α > 2.



The insurer pays all losses in excess of a deductible of k.



1 loss is expected for each risk each year.

16.14. [4B-F97:22] (2 points) Determine the expected amount of annual losses paid by the insurer for Risk 1. θ+k (A) α−1 θα (B) (θ + k )α αθ α (C) ( θ + k ) α+1 θ α+1 (D) ( α − 1)( θ + k ) α θα (E) ( α − 1)( θ + k ) α−1 16.15. [4B-F97:23] (1 point) Determine the limit of the ratio of the expected amount of annual losses paid by the insurer for Risk 2 to the expected amount of annual losses paid by the insurer for Risk 1 as k goes to infinity. (A) 0

(B) 0.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1

(D) 1.25

(E) ∞

Exercises continue on the next page . . .

EXERCISES FOR LESSON 16

283

16.16. [CAS3-S04:19] A company has a machine that occasionally breaks down. An insurer offers a warranty for this machine. The number of breakdowns and their costs are independent. The number of breakdowns each year is given by the following distribution: # of breakdowns

Probability

0 1 2 3

50% 20% 20% 10%

The cost of each breakdown is given by the following distribution: Cost

Probability

1,000 2,000 3,000 5,000

50% 10% 10% 30%

To reduce costs, the insurer imposes a per claim deductible of 1,000. Compute the standard deviation of the insurer’s losses for this year. (A) 1,359

(B) 2,280

(C) 2,919

(D) 3,092

(E) 3,434

16.17. The annual number of losses on an insurance coverage has the following distribution: Number of losses

Probability

0 1 2 3

0.45 0.25 0.20 0.10

Size of loss follows a two-parameter Pareto distribution with α  4, θ  100. The insurance coverage has a deductible of 80. Calculate the variance of aggregate annual payments on the coverage. 16.18. Annual claim counts follow a geometric distribution with β  0.2. Loss sizes follow a uniform distribution on [0, 1000]. Losses are independent of each other and are independent of claim counts. A per-claim deductible of 200 is imposed. Calculate the raw second moment of annual aggregate payments.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

284

16.19. [3-F01:6] A group dental policy has a negative binomial claim count distribution with mean 300 and variance 800. Ground-up severity is given by the following table: Severity 40 80 120 200

Probability 0.25 0.25 0.25 0.25

You expect severity to increase 50% with no change in frequency. You decide to impose a per claim deductible of 100. Calculate the expected total claim payment after these changes. (A) (B) (C) (D) (E)

Less than 18,000 At least 18,000, but less than 20,000 At least 20,000, but less than 22,000 At least 22,000, but less than 24,000 At least 24,000

16.20. [SOA3-F04:17] The annual number of losses follows a Poisson distribution with a mean of 5. The size of each loss follows a two-parameter Pareto distribution with θ  10 and α  2.5. Claims counts and sizes are independent. An insurance for the losses has an ordinary deductible of 5 per loss. Calculate the expected value of the aggregate annual payments for this insurance. (A) 8

(B) 13

(C) 18

(D) 23

(E) 28

16.21. On an auto collision coverage, the number of losses per year follows a Poisson distribution with mean 0.25. Loss size is exponentially distributed with mean 1200. An ordinary deductible of 500 is applied to each loss. Loss sizes and loss counts are independent. Calculate the probability that aggregate claim payments for a year will be greater than 100, using the normal approximation. 16.22. On an auto collision coverage, you are given: (i)

The number of claims per year per individual has the following distribution: f (0)  0.7 f (1)  0.2 f (2)  0.1

(ii) Loss sizes are exponentially distributed with mean 1200. (iii) Loss sizes and claim counts are independent. An ordinary deductible of 500 is applied to each loss. Calculate the probability that aggregate claim payments for a year will be greater than 100, using the normal approximation.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 16

285

Use the following information for questions 16.23 and 16.24: On an insurance coverage, loss size has the following distribution: 2000 F (x )  1 − x

!3

x ≥ 2000.

The number of claims has a negative binomial distribution with mean 0.3, variance 0.6. Claim counts and loss sizes are independent. 16.23. A deductible of 2000 is applied to each claim. Calculate the variance of aggregate payments. 16.24. A deductible of 3000 is applied to each claim. Calculate the variance of aggregate payments. 16.25. For an insurance coverage, you are given •

Losses in 2009, before application of any deductible or limit, have a distribution with density function 1      3000 fX ( x )   − ( x−1000) /2000    e  3000

0 < x ≤ 1000 x ≥ 1000



Losses in 2010, before application of any deductible or limit, are impacted by 10% uniform inflation.



Insurance coverage is subject to a deductible of 500.



Loss frequency is the same in both years. You may use the following: ∞

Z 0

Z 0



e −x/β dx  β xe −x/β dx  β 2

Determine the percentage increase in aggregate claim payments in 2010 over 2009. Additional released exam questions: CAS3-S05:6

Solutions 16.1.

Expected payment per loss is E[X ∧ 600] − E[X ∧ 100]  −200 ln (200/800) + 200 ln (200/300)  196.166

Multiplying by 10 losses per year, the answer is 1961.66 . You could also calculate expected payment per payment and modify frequency, but that is a harder way.

C/4 Study Manual—17th edition Copyright ©2014 ASM

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

286

16.2. The first change will reduce frequency to 16. The second change will multiply frequency by S (100)  e −100/200  0.606531. For an exponential, expected payment per payment with the deductible— mean excess loss—is still θ  200. So expected aggregate losses are expected payment per payment times number of payments, or 16 (0.606531)(200)  1940.90 . (B) You could also calculate expected payment per loss and multiply that by unmodified frequency of 20, but that is harder. 16.3.

There are two ways to calculate annual payments:



Multiply expected number of losses times expected payment per loss.



Multiply expected number of payments times expected payment per payment.

In this question, there is an easy formula for mean excess loss for a Pareto, equation (6.10), so the second method is preferable. The mean excess loss at 100 is ( θ + d ) / ( α − 1)  1100/1  1100. The expected payment per payment for a franchise deductible of 100 is therefore 1100 + 100  1200. The expected number of payments per year is !2   1000 10 Pr ( X > 100)  10 1 − F (100)  10 1100 Therefore, expected annual payments are 10 16.4.

1000 2 1100 (1200)

The inflated θ for the Pareto is 1000 (1.1)  1100. The expected number of annual payments is 10 Pr ( X > 100)  10

16.5.

 9917.36 . (E)

1100 1100 + 100

!2  8.403

(D)

Mean loss size and second moment are, after the $500 deductible: E[X]  0.3 (250) + 0.25 (500) + 0.05 (750) + 0.05 (1000)  287.5 E[X 2 ]  0.3 (2502 ) + 0.25 (5002 ) + 0.05 (7502 ) + 0.05 (10002 )  159,375

Expected losses are 0.15 (287.5)  43.125, and variance is 0.15 (159,375)  23,906.25. We need the probability of paying out more than 250. Since the aggregate distribution is discrete, this is the same as the probability of paying out at least 500, and we need to make a continuity correction. We’ll calculate the probability of paying out more than 375, the midpoint of (250, 500) . 375 − 43.125 1−Φ √  1 − Φ (2.15)  1 − 0.9842  0.0158 23,906.25

!

(A)

The answer is way out of the range of the choices. Even without the continuity correction it would be far out of the range. We’ll never know what the CAS was thinking about. 16.6. The probability of paying a claim is 0.6 of what it was before, so the new Poisson rate is (0.6)(0.3)  0.18 . 16.7. The deductible of 100 was increased to 400, so all claims are 300 less. Thus the revised payment per payment distribution is a 2/3 probability of paying 200 and a 1/3 probability of paying 400. The revised Poisson rate is (0.3)(0.3)  0.09. We calculate the variance using the compound Poisson formula. Let S be aggregate payments and Y P individual payments.) Var ( S )  λ P (E[X 2 ]) 2 1 E[ ( Y P ) 2 ]  (2002 ) + (4002 )  80,000 3 3 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 16

287

Var ( S )  0.09 (80,000)  7200 Alternatively, and easier, you could let the compound model be number of losses and payment per loss. Then the Poisson rate is 0.3 and the payment per loss has a 0.7 probability of 0, 0.2 probability of 200, and 0.1 probability of 400. Let Y L be individual payments. E[ ( Y L ) 2 ]  0.2 (2002 ) + 0.1 (4002 )  24,000 Var ( S )  0.3 (24,000)  7200 16.8.

We want the payment per loss for a coverage with a franchise deductible. Answer  E[X ∧ 100,000] − E[X ∧ 50,000] + 50,000 1 − F (50,000)





The last term is added in to cover the franchise deductible. As the following development shows, it cancels one of the terms in E[X ∧ 50,000]. E[X ∧ 100,000]  e 10.5 Φ (ln 100,000 − 11) + 100,000 1 − Φ (ln 100,000 − 10)



E[X ∧ 50,000]  e 10.5 Φ (ln 50,000 − 11) + 50,000 1 − Φ (ln 50,000 − 10)



50,000 1 − F (50,000)  50,000 1 − Φ (ln 50,000 − 10)













Answer  e 10.5 Φ (ln 100,000 − 11) + 100,000 1 − Φ (ln 100,000 − 10)



− e 10.5 Φ (ln 50,000 − 11)



 e 10.5 Φ (0.51) + 100,000 1 − Φ (1.51) − e 10.5 Φ (−0.18)





 36,316 (0.6950) + 100,000 (1 − 0.9345) − 36,316 (0.4286)  25,239 + 6,550 − 15,565  16,224

(C)

16.9. Each individual loss has a Pareto distribution has parameters α  2 and θ  2000, so the mean individual loss is θ E[X]   2000 α−1

Since expected (aggregate) losses are 10,000,000, this means the expected number of losses is 10,000,000  2000 5000. Frequency doesn’t change, so the expected number of losses in 2001 and 2002 is also 5000. We need to calculate the expected reinsurance payment per loss, not per payment, since fwe know theg expected number of losses, not the expected number of reinsurance payments. We need E ( X − 3000)+ . This can be calculated by using E[ ( X − 3000)+ ]  E[X] − E[X ∧ 3000]

using the formula in the Loss Models appendix for E[X ∧ 3000], or using the special formula for the mean excess loss of a Pareto, equation (6.10) on page 100, in conjunction with formula (6.7). Let’s do it the latter way, since these two formulas are easy to remember and use. θ + 3000  5000 α−1 !2 2000 S (3000)   0.16 2000 + 3000 e (3000) 

E[ ( X − 3000)+ ]  e (3000) S (3000)  800 Therefore, C 2001  5000 (800)(1.1)  4,400,000 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

288

16.10. The new Pareto parameters for expected individual loss size are α  2, θ  2100. Proceeding as in the previous exercise, e (3000)  5100 2100 S (3000)  5100

!2

21002  864.7059 5100 864.7059  1.080882  800

E[ ( X − 3000)+ ]  C 2002 C 2001

(E)

Note that the other factors (5000, 1.1) are the same in both years, and therefore don’t affect the ratio. 16.11. Payment per payment is uniform on (0, 48]. The number of payments per year follows a binomial distribution with m  9, q  0.1 Pr ( X > 12)  0.08. Using the compound variance formula for payments: E[N P ]  mq  9 (0.08)  0.72 Var ( N P )  mq (1 − q )  9 (0.08)(0.92)  0.6624 48  24 E[X P ]  2 482 Var ( X P )   192 12 Var ( S )  (0.72)(192) + 0.6624 (242 )  519.7824 16.12. The average payment per loss is E[X ∧ u] − E[X ∧ 500]  1000 1 − e −u/1000 − 1 − e −500/1000





 1000 e −1/2 − e −u/1000









and we set this final expression equal to 500 so that with 10 expected losses it will equal 5000. 1000 e −1/2 − e −u/1000  500





e −u/1000  e −1/2 −

1 2

 0.10653

u  2239.32

16.13. The formulas in the tables don’t work for a single-parameter Pareto with α  1, so we work this out from first principles by integration. We integrate the survival function from 1000 to 10,000.

Z

10,000 1000

500 dx 10,000  500 (ln 10,000 − ln 1000)  1151.29  500 ln x 1000 x

With 3 expected losses per year, expected annual claim payments are 3 (1151.29)  3453.87 . 16.14. Expected annual losses are expected number of losses per year (1) times expected payment per loss. Thus we want E[X] − E[X ∧ k], which from the tables is θ θ * θ − 1− α−1 α−1 θ+k

,

which is (E). C/4 Study Manual—17th edition Copyright ©2014 ASM

! α−1

θ + θ α−1 θ+k -

! α−1

EXERCISE SOLUTIONS FOR LESSON 16

289

16.15. Using the previous exercise’s solution, the ratio is

( α − 1)( θ + k ) α−1 θ 0.8α ∼ ( θ + k ) 0.2α → ∞ . θα (0.8α − 1)( θ + k ) 0.8α−1

(E)

16.16. To make the numbers more manageable, we’ll express cost in thousands. Let N be number of breakdowns, X cost per breakdown, S aggregate losses. E[N]  0.2 (1) + 0.2 (2) + 0.1 (3)  0.9 E[N 2 ]  0.2 (12 ) + 0.2 (22 ) + 0.1 (32 )  1.9 Var ( N )  1.9 − 0.92  1.09 The deductible makes each cost 1 (thousand) less. E[X]  0.1 (1) + 0.1 (2) + 0.3 (4)  1.5 E[X 2 ]  0.1 (12 ) + 0.1 (22 ) + 0.3 (42 )  5.3 Var ( X )  5.3 − 1.52  3.05

Var ( S )  0.9 (3.05) + 1.09 (1.52 )  5.1975 √ σS  5.1975  2.2798

The standard deviation is 1000 times 2.2798, or 2,280 . (B) 16.17. Frequency is not in the ( a, b, i ) family (i  0, 1), so it cannot be modified. We will use a compound model with number of losses and payment per loss. Frequency of losses has mean 0.25 + 2 (0.2) + 3 (0.1)  0.95 and second moment 0.25 + 4 (0.2) + 9 (0.1)  1.95, hence variance 1.95 − 0.952  1.0475. The modified distribution for payment per payment is a Pareto with α  4, θ  100 + 80  180, as we learned on page 100. Its mean is 180/3  60 and its variance is 2 (1802 ) / (3)(2) −602  7200. The payment per loss is a two-component mixture with probability (100/180) 4 of a payment and probability 1 − (100/180) 4 of 0. Let’s calculate the mean and variance of this twocomponent mixture random variable Y L , with I being the component of the mixture. 100 E[Y ]  60 180 L

!4

 5.715592

Var ( Y L )  E[Var ( Y L | I ) ] + Var (E[Y L | I])

!4



100 100 (7200) + 180 180

 996.1386

!4

!4 *1 − 100 + (602 ) 180 , -

Using the compound variance formula on our compound model, Var ( S )  (0.95)(996.1386) + (1.0475)(5.7155922 )  980.55 16.18. We’ll calculate the raw second moment as the variance plus the mean squared. The payment per payment is uniform on (0, 800]. The probability of a payment is 0.8, so the modified

C/4 Study Manual—17th edition Copyright ©2014 ASM

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

290

claim count distribution is geometric with β  0.2 (0.8)  0.16. Then E[S]  0.16 (400)  64 8002 Var ( S )  0.16 + (0.16)(1.16)(4002 )  38,229 31 12

!

E[S2 ]  38,229 13 + 642  42,325 13 16.19. The new expectation for severity, the modified payment per loss, is 0.25 1.5 (80) − 100 + 0.25 1.5 (120) − 100 + 0.25 1.5 (200) − 100  75













Then 75 (300)  22,500 . (D) 16.20. The easier method for solving this is to calculate the expected number of payments, the expected number of claims above the deductible, and multiply by the expected payment per payment,  e ( d ) ,which is easy to calculate for a Pareto. The expected number of claims above the deductible is 5 1 − F (5) , and e (d ) 

15 θ+d   10, α − 1 1.5

by equation (6.10) on page 100. Putting it together, the answer is 5

10 15

! 2.5

(10)  18.1444

(C)

The alternative is to calculate the expected number of losses and multiply by the expected payment per loss. 16.21. Using the ideas in Lesson 13, we modify the Poisson parameter to reflect the frequency of claim payments. For each loss, a payment will be made only if it is over 500, i.e. with probability S (500)  e −5/12 . Hence claim payment frequency has a Poisson distribution with parameter 0.25e −5/12 . The size of the payment, given a payment is made, is exponential with mean 1200, since the exponential distribution is memoryless. Letting N be the frequency of payments and X the claim payment distribution, we calculate Var ( S ) , aggregate payments, the usual way. E[S]  0.25e −5/12 (1200)  197.77 Var ( S )  0.25e −5/12 (12002 ) + 0.25e −5/12 (12002 )  474,653.25 197.77 − 100 Pr ( S > 100)  Φ √  Φ (0.14)  0.5557 474,653.25

!

16.22. Let N P be the number of payments. This exercise is like the last exercise, except that the modified distribution of payment frequency is harder to calculate. There are two ways of making one payment: 1. 2.

One loss greater than 500; the probability of this is Pr ( N P  1) 1 − F (500)  0.2e −5/12  0.13185.





Two losses, one greater than 500 and one less, which can happen in either order; the probability of this is   2 Pr ( N P  2) F (500) 1 − F (500)  2 (0.1) e −5/12 (1 − e −5/12 )  0.04493

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 16

291

The probability of one payment is therefore 0.13185 + 0.04493  0.17678. The probability of two payments is the probability of two losses times S (500) 2 , or 0.1e −10/12  0.04346. E[N P ]  1 (0.17678) + 2 (0.04346)  0.26370 E[ ( N P ) 2 ]  1 (0.17678) + 4 (0.04346)  0.35062 Var ( N P )  0.35062 − 0.263702  0.28108 Since the exponential is memoryless, the distribution of payments given that a payment is made is exponential with mean 1200. E[S]  0.26370 (1200)  316.44 Var ( S )  0.26370 (12002 ) + 0.28108 (12002 )  784,478 Pr ( S > 100)  Φ

316.44 − 100  Φ (0.24)  0.5948 √ 784,478

!

The alternative is to calculate number of losses and payment per loss. Let N be number of losses. Then E[N]  0.4, E[N 2 ]  0.2 + 0.1 (22 )  0.6, and Var ( N )  0.6 − 0.42  0.44. The payment per loss distribution is a two-point mixture of 0 and an exponential with mean 1200, with weight S (500) on the latter. Let Y L be the payment per loss random variable. E[Y L ]  1200e −5/12 Var ( Y L )  e −5/12 (12002 ) + e −5/12 (1 − e −5/12 )(12002 )  1,272,792 where Var ( Y L ) was computed using the compound variance formula with a Bernoulli primary with q  e −5/12 and an exponential secondary with θ  1200. Then E[S]  0.4 (1200e −5/12 )  316.44 Var ( S )  0.4 (1,272,792) + 0.44 (1200e −5/12 ) 2  784,478 as before. 16.23. Severity has a single parameter Pareto distribution with α  3, θ  2000. The deductible of 2000 doesn’t affect the frequency of claims since all losses are greater than 2000. Therefore, the variance of payment size is unaffected, and the expected value of payment size is reduced by the deductible of 2000. Let X be the loss and Y P be the payment size.

(3)(2000)

 3000 2 E[X 2 ]  (3)(20002 )  12 (106 ) E[X] 

Var ( Y P )  Var ( X )  12 (106 ) − 30002  3 (106 ) E[Y P ]  E[X] − 2000  1000

Var ( S )  0.3 3 (106 ) + 0.6 (106 )  1,500,000





16.24. This exercise is harder than the previous one, since the deductible now affects claim frequency. There are two ways we can do the exercise: 1.

We can let N P be the number of payments and Y P the payment per payment.

C/4 Study Manual—17th edition Copyright ©2014 ASM

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

292

2.

We can let N be the number of losses and Y L the payment per loss, or

Both methods require work. I think the first method is easier, but will demonstrate both ways. First method The negative binomial has rβ  0.3 and rβ (1 + β )  0.6, so β  1, r  0.3. The probability of a loss above 3000 is !α !3 θ 2000 8 Pr ( X > 3000)    3000 3000 27 By Table 13.1, the modified negative binomial has r  0.3, β  8/27, so its moments are 2.4  0.088889 27

E[N P ] 

0.3 (8)(35)  0.115226 272

Var ( N P ) 

As we mentioned on page 100, Y P is a two-parameter Pareto with modified parameters with parameters θ  3000 and α  3. Using the tables to calculate its mean and variance: θ 3000   1500 α−1 3−1 f g 2 (30002 ) 2θ 2   30002 E (Y P ) 2  ( α − 1)( α − 2) 2 Var ( Y P )  30002 − 15002  6,750,000 E[ ( Y P ) ] 

The variance of aggregate payments is Var ( S )  E[N P ] Var ( Y P ) + Var ( N P ) E[Y P ]2  (0.088889)(6,750,000) + (0.115226)(15002 )  859,259 Second method We computed the mean and variance of Y P in the first method. Therefore, 8 E[Y ]  E[Y ] Pr ( X > 3000)  (1500)  444.444 27 L

!

P

The variance is computed by treating Y L as a compound distribution. The primary distribution is Bernoulli with q  Pr ( X > 3000) and the secondary is Y P . 8 8 Var ( Y )  (6,750,000) + 27 27 L

!

!

19 (15002 )  2,469,136 27

!

The variance of aggregate payments is Var ( S )  0.3 (2,469,136) + 0.6 (444.4442 )  859,259 16.25. Let X be the loss random variable in 2009, and Y the loss random variable in 2010. One way to compute the expected values is as follows: 1000

Z E[X]  1000

Z 0

C/4 Study Manual—17th edition Copyright ©2014 ASM

0

x dx + 3000



Z

1000

1000 1000 x dx   3000 6000 0 6 x2

xe − ( x−1000)/2000 dx 3000

QUIZ SOLUTIONS FOR LESSON 16 Substituting y  x − 1000:

Z



1000

293

1 xe − ( x−1000)/2000 dx  3000 3000 

So E[X] 

1000 6



Z 0

( y + 1000) e −y/2000 dy

20002 + 1000 (2000)  2000 3000

+ 2000  2166 23 . Now let’s compute E[X ∧ 500]. E[X ∧ 500] 

500

Z 0

  x dx + 500 1 − F (500) 3000

5 5002 + 500  6000 6

!

 458 13 So the average payment per loss E[X] − E[X ∧ 500]  2166 23 − 458 13  1708 13 . The number of losses doesn’t change with inflation, so if we calculate the average payment per loss after inflation, we’re done. But E[Y]  E[1.1X]  1.1 E[X]  2383.333 E[Y ∧ 500]  E[1.1X ∧ 500]  1.1 E X ∧

f

f

E X∧

500 1.1

g

500/1.1

Z 

0

500 1.1

x dx 500 * 500 + .1 − F / + 3000 1.1 1.1

!

, 500 1.1

g

-

< 1000, so the first definition of f ( x ) applies. 500 F  1.1

!

f

E X∧

500 1.1

g



500/1.1

Z 0

dx 500 1   3000 1.1 (3000) 6.6

(500/1.1) 2 6000

+

500 * 1 + .1 − / 1.1 6.6

500  34.4353 + 1.1

! , !

-

5.6  420.1102 6.6

E[Y ∧ 500]  1.1 (420.1102)  462.1212

So the average payment per loss is 2383.333 − 462.1212  1921.212. The percentage increase is 12.46% .

1921.212 1708.333

−1 

Quiz Solutions 16-1. Because of the exponential, it is easier to use the compound variance formula on the payment distribution rather than on the loss distribution. The modified negative binomial has parameters r  1.5 and β  0.2 Pr ( X > 25)  0.2e −25/40  0.107052. Payments are exponential with mean 40. Then the variance of annual aggregate payments is (using N P for the modified frequency and Y P for the modified severity) E[N P ] Var ( Y P ) + Var ( N P ) E[Y P ]2  1.5 (0.107052)(402 ) + (1.5)(0.107052)(1.107052)(402 )  541.36

C/4 Study Manual—17th edition Copyright ©2014 ASM

294

C/4 Study Manual—17th edition Copyright ©2014 ASM

16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS

Lesson 17

Aggregate Loss Models: The Recursive Formula Reading: Loss Models Fourth Edition 9.6 (before Section 9.6.1), 9.6.2–9.6.4 (skip 9.6.1) In order to help us calculate the aggregate loss distribution, we will assume the distribution of the severities, the X i ’s, is discrete. This may sound strange, since usually we assume severities are continuous. However, a continuous distribution may be approximated by a discrete distribution. We will discuss how to do that in Section 19.2. Once we make the assumption that severities are discrete, aggregate losses are also discrete. To obtain the distribution of aggregate losses, we only need to calculate Pr ( S  n ) for every possible value n of the aggregate loss distribution. Using the law of total probability, Pr ( S  n ) 

X k

k X * Xi  n+ Pr ( N  k ) Pr , i1

(*)

In principle, this expression can be calculated, although for high values of n it may require an inordinate amount of computing power. We will use the following notations for the probability functions of the three distributions—frequency, severity, aggregate loss. p n  Pr ( N  n )  f N ( n ) f n  Pr ( X  n )  f X ( n ) g n  Pr ( S  n )  fS ( n ) To calculate g n , we must sum up over all possibilities of k claims summing up to n. In other words, we have the following version of (*): gn 

∞ X k0

pk

X i 1 +···+i k n

f i1 f i2 · · · f i k

(**)

The product of the f i t ’s is called the k-fold convolution of the f ’s, or f ∗k . If the probability of a claim size of 0, or f0 , is zero, then the outer sum is finite, but if f0 , 0, the outer sum is an infinite sum, since any number of zeroes can be included in the second sum. However, if the primary distribution is in the ( a, b, 0) class, we showed in Lesson 13 how the primary distribution can be modified so that the probability of X  0 can be removed. On exams, you will only need to calculate Pr ( S  n ) for very low values of n. You will be able to use (*) or (**). However, for the ( a, b, 0) and ( a, b, 1) classes, there is a recursive formula for calculating g n that is more efficient than the (**) for higher values of n and automatically takes care of the probability that X  0. For the ( a, b, 0) class, the formula is given in Corollary 9.9 of the textbook:

! k X bj 1 a+ f j g k−j gk  1 − a f0 k j1

C/4 Study Manual—17th edition Copyright ©2014 ASM

295

k  1, 2, 3, . . .

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

296

For a Poisson distribution, where a  0 and b  λ, the formula simplifies to gk 

k λX j f j g k− j k

k  1, 2, 3, . . .

j1

For the ( a, b, 1) class, Theorem 9.8 of the textbook provides the following formula:

 gk 

p1 − ( a + b ) p0 f k +



Pk

j1 ( a

+ b j/k ) f j g k− j

1 − a f0

k  1, 2, 3, . . .

To start the recursion, we need g0 . If Pr ( X  0)  0, this is p0 . In the general case, we can use the formula of Theorem 6.14, g0  PN ( f0 ) , where PN ( z ) is the probability generating function of the primary distribution. Formulas for PN ( z ) for the ( a, b, 0) and ( a, b, 1) classes are included in the tables you get with the exam. However, Theorem 6.14 is not on the syllabus. Should you memorize the recursive formula? The recursive formula is more efficient than convolution for calculating g n for large n, but for n ≤ 3, there’s hardly a difference, and exam questions are limited to this range. Pre-2000 syllabus exams required your knowledge of the formula for a Poisson primary. There were frequent problems requiring your backing out f k ’s or other numbers using the formula. Exams since 2000, however, have not asked any question requiring it. You could solve any question on these exams through convolution, possibly eliminating f0 using the techniques of Lesson 13, as we will illustrate in Example 17B. It is also noteworthy that the formula for g0 , as well as the p n , f n , g n notation, is off the syllabus. Therefore it is not worthwhile memorizing the recursive formula. You will never use it. The following example is worked out both ways. Example 17A For an insurance coverage, the number of claims has a negative binomial distribution with mean 4 and variance 8. Claim size is distributed as follows: Claim Size

Probability

1 2 3

0.50 0.40 0.10

Calculate Pr ( S ≤ 3) . Answer: First we will use the convolution method. We can have from 0 to 3 claims. The negative binomial has parameters β  1 and r  4. We have: 1 p0  2

!4

4 p1  1

!

5 p2  2

!

6 3

!

p3 



1 16

1 2

!5

1 2

!6

1 2

!7



1 8



5 32



5 32

1 1 Then g 0  p0  16  0.0625. For S to equal 1, there must be one claim of size 1, so g1  p 1 f1  18 (0.5)  16  5 1 2 0.0625. For S to equal 2, there must be one claim of size 2 or two claims of size 1, so g2  32 (0.5 ) + 8 (0.4)  C/4 Study Manual—17th edition Copyright ©2014 ASM

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

297

0.0890625. For S to equal 3, there must be one claim of size 3, or two claims of sizes 2 and 1 which can happen in two orders, or three claims of size 1. Then g3 

1 5 5 (0.1) + (0.4)(0.5)(2) + (0.53 )  0.09453125. 8 32 32

Finally, Pr ( S ≤ 3)  g0 + g1 + g 2 + g3  0.0625 + 0.0625 + 0.0890625 + 0.09453125  0.30859375 . 1 Next we redo the example using the recursive method. g0  p0  16 . f0  0, so the fraction in front of the sum of the formula is 1. a  g1 



β 1+β



1 2

and b  ( r − 1) a  23 .

1 1 1 3 + (0.5)   0.0625 2 2 16 16



!

1 3 1 1 3 1 g2  + (0.5) + + (0.4)  0.0890625 2 4 16 2 2 16





!



!



1 3 1 6 1 3 1 1 g3  (0.5)(0.0890625) + (0.4) + (0.1)  0.09453125 + + + 2 6 2 6 16 2 2 16







!







!

Finally, Pr ( S ≤ 3)  g0 + g1 + g2 + g3  0.0625 + 0.0625 + 0.0890625 + 0.09453125  0.30859375 .



We can use the methods of Lesson 13 to work out questions regarding the aggregate claim distribution without the recursive formula if Pr ( X  0) , 0. We modify the frequency distribution and the severity distribution so that Pr ( X  0)  0. Example 17B The number of claims on an insurance coverage follows a binomial distribution with parameters m  3, q  0.2. The size of each claim has the following distribution: x Pr ( X  x ) 0 1 2

0.50 0.35 0.15

Calculate the probability of aggregate claims of 3 or more. Answer: The probability of aggregate claims of 3 or more is 1 − Pr ( S ≤ 2) , so we will calculate Pr ( S ≤ 2) . First we will calculate it using convolutions. To eliminate the probability of 0, we revise the binomial distribution by multiplying q by 1−0.5. The modified binomial has parameters m 0  3, q 0  (0.2)(1−0.5)  0.1. The revised severity distribution has f1  0.35/ (1 − 0.5)  0.7 and f2  0.15/ (1 − 0.5)  0.3. We now compute the p j ’s. p 0  0.93  0.729 3 p1  (0.92 )(0.1)  0.243 1

!

3 (0.9)(0.12 )  0.027 2

!

p2  Then

g0  p0  0.729 g1  (0.243)(0.7)  0.1701 g2  (0.243)(0.3) + (0.027)(0.72 )  0.08613 C/4 Study Manual—17th edition Copyright ©2014 ASM

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

298

So Pr ( S ≤ 2)  0.729 + 0.1701 + 0.08613  0.98523 and the answer is 1 − 0.98523  0.01477 . In this particular case, since the binomial distribution has only a finite number of possibilities with nonzero probability, we can calculate the probability as a finite sum even without modifying the frequency distribution. For the unmodified distribution p0  0.83  0.512 3 (0.82 )(0.2)  0.384 p1  1

!

3 (0.8)(0.22 )  0.096 p2  2

!

p3  0.23  0.008 The probability of aggregate claims of 0 is the probability of no claims, or one 0, or two 0’s, or three 0’s: g0  0.512 + 0.384 (0.5) + 0.096 (0.52 ) + 0.008 (0.53 )  0.729 The probability of aggregate claims of 1 is the probability of one 1, or two claims of 1 and 0 (in two possible orders), or three claims of 1,0,0 (in three possible orders): g1  0.384 (0.35) + 0.096 (2)(0.5)(0.35) + 0.008 (3)(0.52 )(0.35)  0.1701 The probability of aggregate claims of 2 is the probability of one 2, or two claims of 2 and 0 (in two possible orders) or 1 and 1, or three claims of 1,1,0 (three orders) or 2,0,0 (three orders): g2  0.384 (0.15) + 0.096 (2)(0.5)(0.15) + 0.352 + 0.008 (3) (0.352 )(0.5) + (0.15)(0.52 )  0.08613









Even though it is possible to work it out this way, it’s easier to work it out with the modified distribution. For comparison, we will work it out with the recursive formula too. a  − 14 and b  1. Then g0  P (0.5)  1 + 0.2 (−0.5)



1

3

 0.93  0.729

g1 

1 − + 1 (0.35)(0.729)  0.1701 1 4 1 + 4 (0.5)

g2 

    *. − 1 + 1 (0.35)(0.1701) + − 1 + 1 (0.15)(0.729) +/  0.08613 4 2 4 1 + 14 (0.5) , -





1

We see that we got the same probabilities and therefore the same answer.

?



Quiz 17-1 Claim counts and sizes on an insurance coverage are independent and have the following distribution: Claim Sizes Claim Counts Number of claims

Probability

Claim size

Probability

0 1 2

0.4 0.2 0.4

100 200 400 1000

0.3 0.5 0.1 0.1

Let S be aggregate claims. Calculate Pr ( S ≤ 300) . C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 17

299

In Subsections 9.6.2–9.6.4, the textbook discusses the following considerations of the recursive method. 1. The recursion starts at the probability of 0, fS (0) . However, fS (0) is likely to be small for a large group, possibly smaller than the smallest number a computer can store. To get around this, you can start at fS ( k ) , where k is selected so that fS ( k ) doesn’t underflow. For example, k could be selected 6 standard deviations below the mean. Assign an arbitrary set of values to fS (0) ,. . . fS ( k ) , like (0, 0, . . . , 1) . These values are used to start the recursion. Compute fS ( n ) for n > k until n is so large that fS ( n ) < fS ( k ) . After computing all the probabilities, divide each of them by their sum so that they add up to 1. An alternative is to calculate the distribution for a smaller parameter, like λ/2n instead of λ if frequency is Poisson, and then to convolve the results on themselves n times. 2. The algorithm is stable for Poisson and negative binomial since all factors in the summands are bj positive, but for a binomial there is a potential for numerical instability because a + k could be negative. 3. There is an integral formula if severity is continuous. I doubt the exam will test on these considerations.

Exercises Use the following information for questions 17.1 and 17.2: Customers arrive in a store at a Poisson rate of 0.5 per minute. The amount of profit the store makes on each customer is randomly distributed as follows: Profit Probability 0 1 2 3

0.7 0.1 0.1 0.1

17.1.

Determine the probability of making no profit in 10 minutes.

17.2.

Determine the probability of making profit of 2 in 10 minutes.

17.3.

[1999 C3 Sample:14] You are given:



An aggregate loss distribution has a compound Poisson distribution with expected number of claims equal to 1.25.



Individual claim amounts can take only the values 1, 2, or 3, with equal probability. Determine the probability that aggregate losses exceed 3.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

300

Use the following information for questions 17.4 and 17.5: Taxis pass by a hotel. The number of taxis per minute has a binomial distribution with parameters m  4 and q  0.25. The number of passengers each taxi picks up has the following distribution:

17.4. is 0. 17.5. is 1.

Number of passengers

Probability

0 1 2

0.5 0.3 0.2

Determine the probability that the total number of passengers picked up by the taxis in a minute Determine the probability that the total number of passengers picked up by the taxis in a minute

17.6. [3-F02:36] The number of claims in a period has a geometric distribution with mean 4. The amount of each claim X follows Pr ( X  x )  0.25, x  1, 2, 3, 4. The number of claims and the claim amounts are independent. S is the aggregate claim amount in the period. Calculate FS (3) . (A) 0.27

(B) 0.29

(C) 0.31

(D) 0.33

(E) 0.35

17.7. The number of taxis arriving at an airport in a minute has a Poisson distribution with mean 4. Each taxi picks up 1 to 4 passengers, with the following probabilities: Number of passengers

Probability

1 2 3 4

0.70 0.20 0.05 0.05

Calculate the probability that in one minute 4 or more passengers leave the airport by taxi. Use the following information for questions 17.8 and 17.9: For an insurance coverage, you are given: (i)

Claim frequency, before application of deductibles and limits, follows a geometric distribution with mean 5. (ii) Claim size, before application of deductibles and limits, follows a Poisson distribution with mean 1. (iii) Claim frequency and claim size are independent. (iv) There is a per-claim deductible of 1 and the maximum covered loss is 2 per loss. 17.8.

Calculate average aggregate payments per year.

17.9.

Calculate the probability that aggregate payments are greater than 3.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 17

301

Use the following information for questions 17.10 and 17.11: For an insurance coverage, you are given: (i)

Claim frequency, before application of deductibles and limits, follows a geometric distribution with mean 5. (ii) Claim size, before application of deductibles and limits, follows a Poisson distribution with mean 1. (iii) Claim frequency and claim size are independent. (iv) There is a per-claim deductible of 1 and a maximum covered loss of 3 per loss. 17.10. Calculate average aggregate payments per year. 17.11. Calculate the probability that aggregate payments are greater than 2. 17.12. [151-82-93:12] (2 points) You are given: (i) Aggregate claims has a compound Poisson distribution with λ  0.8. (ii) Individual claim amount distribution is

(iii)

x

Pr ( X  x )

1 2 3

0.5 0.3 0.2

The probabilities for certain values of the aggregate claims, S, are: x

Pr ( S  x )

2 3 5

0.1438 0.1198 0.0294

Determine Pr ( S  4) . (A) 0.051

(B) 0.064

(C) 0.076

(D) 0.089

(E) 0.102

17.13. [151-82-93:10] (2 points) Aggregate claims S has a compound Poisson distribution with individual claim amount distribution: x

Pr ( X  x )

1 3

1/3 2/3

Also, Pr ( S  4)  Pr ( S  3) + 6 Pr ( S  1) . Determine Var ( S ) . (A) 76

(B) 78

(C) 80

(D) 82

Additional released exam questions: SOA M-F05:27, CAS3-S06:36, C-S07:8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(E) 84

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

302

Solutions 17.1. Modify the Poisson to eliminate 0. The Poisson parameter for 10 minutes is (10)(0.5)  5, and the probability of non-zero profit is 0.3, so the Poisson parameter for non-zero profits is (5)(0.3)  1.5. Then the probability of 0 profit is e −1.5  0.2231 . 17.2. Modify the secondary distribution to eliminate the probability of 0. The conditional probabilities of 1 and 2 given profit greater than 0 are then 1/3 and 1/3 respectively. The modified Poisson parameter is 1.5, as we stated in the solution to the previous exercise. Then the probability of 2 is the probability of one customer with profit of 2 or two customers with profit 1 apiece: g 2  p 1 f2 +

p2 f12

 1.5e

−1.5

1 1.52 e −1.5 + 3 2

!

!

1 3

!2  0.1395

You can also work this out with the recursive formula combined with g0 from the previous exercise: g1  (5)(0.1)(0.2231)  0.1116 g2  25 (0.1)(0.1116) + 5 (0.1)(0.2231)  0.1395 but this method is inferior. 17.3. The probability of 0 is e −1.25  0.2865048. We’ll factor this out of the other probabilities. To get 1, there must be exactly 1 claim of size 1, so the probability is e −1.25

1.25  0.416667e −1.25 . 3

To get 2, there must be 2 claims of size 1 or 1 claim of size 2, so the probability is

* 1.252

e −1.25 .

,

2

!

1 1.25 + /  0.503472e −1.25 + 9 3

!

-

To get 3, there must be 3 claims of size 1 or 2 claims of sizes 1 and 2 (in either order) or 1 claim of size 3, so the probability is

! ! ! ! * 1.253 1 + 2 1.252 1 + 1.25 +/  0.602334e −1.25 6 27 2 9 3 ,

e −1.25 .

Also, e −1.25  0.2865048. So the probability of more than 3 claims is 1 − 0.2865048 (1 + 0.41667 + 0.503472 + 0.602334)  1 − 0.2865048 (2.52247)  0.2773 17.4. Modify the binomial distribution to eliminate 0 passengers by multiplying q by 1 − p0  0.5; the revised q  0.125. Then the probability of 0 passengers is (1 − 0.125) 4  0.5862 .

17.5. The modified binomial has m  4 and q  0.125, as mentioned in the solution to the last exercise. The modified probability of 1 passenger per taxi, given more than 0 passengers per taxi, is 0.3/0.5  0.6. The probability that the total number of passengers is 1 is 4 (0.8753 )(0.125)(0.6)  0.2010 1

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 17

303

17.6. You could use the recursive formula. An alternative is to work out all ways to get 3 or less. 1  0.2 and each successive p is obtained by multiplying the For the geometric distribution, p0  1+β previous one by 0.8: p1  0.16, p2  0.128, p3  0.1024. With one claim, there are three ways to get 3 or less (probability 0.75). With two claims, there are three ways to get a sum 3 or less (1 + 2, 2 + 1, or 1 + 1), and with three claims, there’s one way to get a sum of 3 or less (1 + 1 + 1). Adding these all up: FS (3)  0.2 + 0.16 (3)(0.25) + 0.128 (3)(0.252 ) + 0.1024 (1)(0.253 )  0.3456 17.7.

(E)

Using the recursive formula: g0  e −4 g1  4 (0.7) e −4  2.8e −4 g2  2 (0.7)(2.8) + 2 (0.2) e −4  4.72e −4



g3 

4 3





 (0.7)(4.72) + 2 (0.2)(2.8) + 3 (0.05) e −4  6.098667e −4

Using convolutions: g1  (4e −4 )(0.7)  2.8e −4 g2  (8e −4 )(0.72 ) + (4e −4 )(0.2)  4.72e −4 −4 3 −4 −4 −4 g3  ( 32 3 e )(0.7 ) + (8e )(2)(0.7)(0.2) + 4e (0.05)  6.098667e

Either way, Pr ( N ≥ 4)  1 − (1 + 2.8 + 4.72 + 6.098667) e −4  1 − 14.61867e −4  0.73225 .

17.8. For the Poisson, P (0)  P (1)  e −1 . Hence p0 , the probability of a payment of 0, is 2e −1 . The only other possible payment is 1, and p1  1 − 2e −1 . The average number of claims is 5, so the answer is 5 (1 − 2e −1 )  1.321

17.9. Aggregate payments greater than 3 means more than three payments. Using what we learned about coverage modifications, the frequency distribution for losses of 2 or more is a geometric distribution with β equal to the original β times the probability of a loss 2 or more, or 5 (1−2e −1 )  1.321. The probability of more than three payments is !4 !4 β 1.321   0.1050 1+β 2.321 17.10. We will use payment size as the subscript. When we take the deductible and limit into account, and apply a Poisson distribution with parameter 1: Payment size

Claim size

Probability

0 1 2

0 or 1 2 3 or more

f0  Pr ( X  0) + Pr ( X  1)  e −1 + e −1  0.73576 f1  Pr ( X  2)  0.5e −1  0.18394 f2  Pr ( X > 2)  1 − 0.73576 − 0.18394  0.08030

Therefore average payment size per claim is 0.18394 (1) + 0.08030 (2)  0.34454. The average number of claims per year is 5, making average aggregate payments 5 (0.34454)  1.7228 . 17.11. You can modify the geometric distribution to remove individual payments of 0. Use the formula in Table 13.1 on page 224 to adjust the geometric distribution for severity modifications. Here, the probability of a payment for each loss is 1 − f0  1 − 0.73576  0.26424, so the adjusted geometric distribution for non-zero payments has parameter β0  0.26424β  0.26424 (5)  1.3212, and the payment distribution conditional on non-zero payments would be (using Y for the payment variable) Pr ( Y  1)  C/4 Study Manual—17th edition Copyright ©2014 ASM

f1 0.18394   0.69611 0.26424 0.26424

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

304

Pr ( Y  2) 

f2 0.08030   0.30389 0.26424 0.26424

Then the revised geometric distribution has the following probabilities. We use the ( a, b, 0) formula for the geometric distribution, β0 1.3212 p k−1  p k−1  0.5692p k−1 pk  0 1+β 2.3212

!

!

to calculate successive probabilities: p0  1 −

β0  1 − 0.5692  0.4308 1 + β0

p 1  (0.5692)(0.4308)  0.2452

p 2  (0.5692)(0.2452)  0.1396 The probability of zero aggregate payments is 0.4308. The probability of aggregate payments of 1 is the probability of one claim of size 1, or g1  (0.2452)(0.69611)  0.1707. The probability of aggregate payments of 2 is the probability of two claims of size 1 or one claim of size 2: g2  (0.1396)(0.696112 ) + (0.2452)(0.30389)  0.1422 Pr ( S ≥ 3)  1 − 0.4308 − 0.1707 − 0.1422  0.2563 . The alternative is to use the recursive formula. First of all, the probability aggregate claims is 0 is, using the off-syllabus formula, g0  PN ( f0 )  PN (0.73576) , where we’ve take f0 from the solution to the previous exercise. By Loss Models Appendix, for a geometric distribution, P ( z )  1 − β ( z − 1)



g0  1 − 5 (0.73576 − 1)



 −1

 −1

. So

 0.4308

Then g1  g2 

1 1−

1

1 − 56 (2e −1 )

 2.5849

5 1 −1 6 2e

 

5 −1 6 (2e )

5 6



5 6





 (0.4308)  0.1707

1 −1 2 e (0.1707)

+ (1 − 2.5e −1 )(0.4308)



 (0.1839)(0.1707) + (0.0803)(0.4308)  0.1422

Then Pr ( S ≥ 3)  1 − 0.4308 − 0.1707 − 0.1422  0.2563 .

17.12. This is an exercise where they virtually force you to use the recursive formula. One way to do it is to calculate g 1 with the recursive formula: g0  Pr ( S  0)  e −0.8  0.449329 g1  λ (1)(0.5) g0  0.4 (0.449329)  0.179732 Then calculate g 4 using the recursive formula:

!

g4 

 λ  (1) f1 g 3 + (2) f2 g 2 + (3) f3 g 1 4

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 17

305

 0.2 (1)(0.5)(0.1198) + (2)(0.3)(0.1438) + (3)(0.2)(0.179732)  0.05080





(A)

With this method, g5  Pr ( S  5) is not needed. The official method backed out g 4 using the recursive formula and the values of g5 , g3 , and g2 :

 0.8  (1)(0.5)( g4 ) + (2)(0.3)(0.1198) + (3)(0.2)(0.1438) 5 0.0294 − 0.07188 − 0.08628 0.5g 4  0.16 0.5g 4  0.02559

0.0294 

!

g 4  0.05118

(A)

The difference between the two answers is because of the use of rounded values for g2 and g3 . To do it without the recursive formula, consider all ways S could equal 4. Either four 1’s, one 2 and two 1’s (three orders), two 2’s, or one 3 and one 1 (two orders). Pr (1, 1, 1, 1)  e

−0.8

Pr (2, 1, 1)  3e −0.8 Pr (2, 2)  e

−0.8

Pr (3, 1)  2e

−0.8

0.84 (0.54 )  0.001067e −0.8 4!

!

0.83 (0.3)(0.52 )  0.0192e −0.8 3!

!

0.82 (0.32 )  0.0288e −0.8 2!

!

0.82 (0.2)(0.5)  0.064e −0.8 2!

!

Pr ( S  4)  (0.001067 + 0.0192 + 0.0288 + 0.064) e −0.8  0.113067e −0.8  0.050804 17.13. This is an example of the old style exam question which virtually required knowing the recursive formula, at least for the Poisson distribution. We must back out λ from the given relationship g4  g3 +6g1 . From the recursive formula 1 λ 2 g3 + 3λ g1 4 3 3 λ λ  g3 + g1 12 2

g4 





Since we are given g4  g3 +6g1 , we must have both λ/2  6 and λ/12  1, either one implying that λ  12. Without using the recursive formula, you’d have to set up an equation for Pr ( S  4) as follows: Pr ( S  1)  Pr ( N  1) Pr ( X  1) 

λe −λ 3

2 −λ 1 λ3 e −λ λe + 3 3 6 3 2 e −λ λ 1 λ4 e −λ 4 Pr ( S  4)  2 Pr ( N  2) | Pr ( X  1) Pr ( X  3) + Pr ( N  4) Pr ( X  1) 4  + 4 9 2 3 24

Pr ( S  3)  Pr ( N  1) Pr ( X  3) + Pr ( N  3) Pr ( X  1) 3 

Using Pr ( S  4)  Pr ( S  3) + 6 Pr ( S  1) , we get, after substituting the above expressions, multiplying out the denominators, and dividing by λe −λ , λ 3 − 12λ2 + 432λ − 5184 C/4 Study Manual—17th edition Copyright ©2014 ASM

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

306

You can use the cubic equation if you know it, or notice that λ − 12 is a factor, so we get

( λ − 12)( λ2 + 432)  0

and λ  12. We now calculate the second moment of the secondary distribution: E[X 2 ]  13 (1) + 32 (9)  Var ( S )  12

19  76 3

19 3 .

Finally

!

(A)

Quiz Solutions 17-1. The probability of aggregate claims of 0 is 0.4. The probability of aggregate claims of 100 is (0.2)(0.3)  0.06. The probability of aggregate claims of 200 is the sum of the probabilities of two claims of 100 and one claim of 200, or fS (200)  0.2 (0.5) + 0.4 (0.32 )  0.136 The probability of aggregate claims of 300 is the probability of two claims of 100 and 200, or fS (300)  2 (0.4)(0.3)(0.5)  0.12 The sum of these probabilities is Pr ( S ≤ 300)  0.4 + 0.06 + 0.136 + 0.12  0.716 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 18

Aggregate Losses—Aggregate Deductible Reading: Loss Models Fourth Edition 9.3, 9.5 This lesson discusses the expected aggregate payments in the presence of an aggregate deductible. An aggregate deductible is a deductible that is applied to aggregate losses rather than individual losses. Stoploss reinsurance is structured this way. A stop-loss reinsurance contract reimburses aggregate losses only above a certain level. When we refer to aggregate losses, we mean the sum of individual claim payments after individual claim modifications such as policy limits and deductibles, but before consideration of aggregate modifications. In the following, we shall continue to use the notation N for the loss frequency distribution, X for the loss severity distribution, and S for the aggregate loss distribution. If frequency and severity are independent, then E[S]  E[N] E[X]. The expected value of aggregate losses above the deductible is called the net stop-loss premium. If S is f g aggregate losses and d is the deductible, we denote it by E ( S − d )+ , the same notation as in Lesson 6. To simplify matters, we will assume that severity is discrete. Since frequency is discrete, aggregate loss is discrete. By equation (6.4), f g E ( S − d )+  E[S] − E[S ∧ d], and since we can evaluate E[S] as the product of E[N] and E[X] if N and X are independent, we only have to deal with E[S ∧ d], a finite integral. We have two formulas for limited expected value, equation (5.3) and equation (5.6). In the following, we shall continue using the notation of Lesson 17: p n  Pr ( N  n ) f n  Pr ( X  n ) g n  Pr ( S  n ) We will assume that for some h, Pr ( S  n ) is nonzero only for n a multiple of h. The first step in evaluating E[S ∧ d] is to calculate g n for n < d using one of the methods of Lesson 17. This gets harder and harder as d increases. Exams will not make you evaluate more than four values of gn . After doing that, there are three methods you can use to evaluate E[S ∧ d], and none of them require formula memorization. It is helpful to see these formulas graphically. I will explain the three methods as part of solving the next example. Example 18A On an insurance coverage, the number of claims has a geometric distribution with mean 4. The distribution of claim sizes is as follows: x Pr ( X  x ) 2 4 6 8

0.45 0.25 0.20 0.10

(i) Calculate E[ ( S − 2.8)+ ]. C/4 Study Manual—17th edition Copyright ©2014 ASM

307

18. AGGREGATE LOSSES—AGGREGATE DEDUCTIBLE

308

(ii) Calculate E[ ( S − 4)+ ]. Answer: In this example, h  2; all severities are multiples of 2. Let’s start by calculating E[S], and the probabilities of aggregate losses of 0 and 2, since this will be necessary regardless of the method used to calculate E[S ∧ d]. We have E[X]  0.45 (2) + 0.25 (4) + 0.20 (6) + 0.10 (8)  3.9 E[S]  4 (3.9)  15.6 Claim counts follow a geometric distribution with mean β  4. The probabilities of N  0 and N  1 are 1  0.2 1+4 ! 4  0.16 p 1  0.2 1+4

p0 

The probability that aggregate losses are 0 is the same as the probability of no claims, or g 0  Pr ( S  0)  0.2. The probability that aggregate losses are 2 is the probability of one claim of size 2, or g 2  Pr ( S  2)  0.16 (0.45)  0.072. We can then calculate the aggregate survival function S ( x ) for x < 4: SS (0)  Pr ( S > 0)  1 − g0  1 − 0.2  0.8

SS (2)  Pr ( S > 2)  SS (0) − g2  0.8 − 0.072  0.728

and SS ( x )  SS (0) for x < 2, SS ( x )  SS (2) for 2 ≤ x < 4. 1. Using the definition of E[S ∧ d]

Equation (5.3) is the definition of E[S ∧ d]. For a discrete distribution in which the only possible values are multiples of h, it becomes E[S ∧ d] 

u X j0

h j g h j + d Pr ( S ≥ d )

where u  dd/he − 1, so the sum is over all multiples of h less than d.1 The sum can actually start at j  1 instead of j  0, since the j  0 term is 0. If u − 1 < 0 (for example if d  1 and h  2), the sum is empty and the limited expected value is d Pr ( S ≥ d ) . For part (i), we sum up the probabilities of aggregate losses equal to values below 2.8 times the values, and then add 2.8 times the probability of aggregate losses greater than 2.8. In Subfigure 18.1a, this means summing rectangles A and B. The area of rectangle A is (0.072)(2) and the area of rectangle B is (0.728)(2.8) . So we get E[S ∧ 2.8]  0.2 (0) + 0.072 (2) + 0.728 (2.8)  2.1824

E[ ( S − 2.8)+ ]  15.6 − 2.1824  13.4176

For part (ii), we sum up the probabilities of getting values 0 and 2 times the values, and then add 4 times the probability of losses greater than or equal to 4. In Subfigure 18.1b, this means summing rectangles A and B. The area of rectangle A is (0.072)(2) and the area of rectangle B is (0.728)(4) . So we get E[S ∧ 4]  0.2 (0) + 0.072 (2) + 0.728 (4)  3.056

E[ ( S − 4)+ ]  15.6 − 3.056  12.544

1 dxe means the least integer above x. Thus if x  3, dxe  3, while if x  3.1, dxe  4. C/4 Study Manual—17th edition Copyright ©2014 ASM

18. AGGREGATE LOSSES—AGGREGATE DEDUCTIBLE

309

S (x )

S (x )

1

1

0.8 0.728

0.8 0.728

A

0.6

0.6

0.4

0.4

B

0.2 0

A

B

0.2 0

2

2.8

x

6

4

(a) Example 18A(i)

0

0

2

4

(b) Example 18A(ii)

6

x

Figure 18.1: Calculating E[S ∧ d] using the definition

2. Calculating E[S ∧ d] by integrating the survival function

Equation (5.6) for a discrete distribution in which the only possible values are multiples of h becomes E[S ∧ d] 

u−1 X j0

hS ( h j ) + ( d − hu ) S ( hu )

where once again u  dd/he − 1. If u − 1 < 0 (for example if d  1 and h  2), the sum is empty and the limited expected value is ( d − hu ) S ( hu ) . This formula sums up the probabilities of S being above each of the possible values below d times the distance between the possible values (the first term), and also the distance between jthekhighest possible value below d and d (the second term). In Example 18A(i), d  2.8 and h  2, so u  2.8 2 − 1  1 and there is one term in the sum plus one additional term. This is shown in Subfigure 18.2a, where we sum up rectangles C and D. Please don’t memorize the formula; try to understand it graphically and you will be easily able to reproduce it. The area of rectangle C is 2 (0.8)  1.6, and the area of rectangle D is 0.8 (0.728)  0.5824. So we have E[S ∧ 2.8]  2S (0) + 0.8S (2)

 1.6 + 0.5824  2.1824

E[ ( S − 2.8)+ ]  15.6 − 2.1824  13.4176 In Example 18A(ii), d  4 and h  2, so u 

4 2

j k

− 1  1, and

E[S ∧ 4]  2 (0.8) + 2 (0.728)  3.056

E[ ( S − 4)+ ]  15.6 − 3.056  12.544

Because S ( h j ) may be computed recursively (for example, S (4)  S (2) − g4 ), this is called the recursive formula, but has no direct relationship to the recursive formula of Lesson 17. C/4 Study Manual—17th edition Copyright ©2014 ASM

1

0.8 0.728

0.8 0.728

0.6

0.6

0.4

C

0.4

D

0.2 0

S (0)

1

S (2)

S (x )

S (0)

S (x )

C

S (2)

18. AGGREGATE LOSSES—AGGREGATE DEDUCTIBLE

310

D

0.2 0

2

2.8

4

(a) Example 18A(i)

6

x

0

0

2

4

(b) Example 18A(ii)

6

x

Figure 18.2: Calculating E[S ∧ d] by integrating the survival function

3. Proceeding backwards A variant of the second method for calculating E[S ∧ d] is to express it as d minus something. The something is the sum of the probabilities of S  k for some k < d, times d − k. Thus in part (i), the deductible of 2.8 saves the insurance company 2.8 unless aggregate losses are 0 or 2. If aggregate losses are 0, the expected amount not saved is 2.8g0 . If aggregate losses are 2, the expected amount not saved is 0.8g 2 , since the deductible only saves the company 2 rather than 2.8 in this case. So we have E[S ∧ 2.8]  2.8 − 2.8g0 − 0.8g 2  2.8 − 2.8 (0.2) − 0.8 (0.072)  2.1824

The graph is Subfigure 18.3a. We start with the rectangle from (0, 0) to (2.8, 1) and then subtract rectangles E and F. In part (ii), the savings are 4g0 + 2g2 , so E[S ∧ 4]  4 − 4g0 − 2g2  4 − 4 (0.2) − 2 (0.072)  3.056

The graph is Subfigure 18.3b. We start with the rectangle from (0, 0) to (4, 1) and then subtract rectangles E and F. 

?

Quiz 18-1 Claim counts and sizes on an insurance coverage are independent and have the following distribution: Claim Sizes Claim Counts Number of claims

Probability

Claim size

Probability

0 1 2 3

0.5 0.3 0.1 0.1

100 200 300 400

0.2 0.4 0.3 0.1

A stop-loss reinsurance contract pays the excess of aggregate losses over 200. The premium for the contract is 150% of expected payments. Calculate the premium. C/4 Study Manual—17th edition Copyright ©2014 ASM

18. AGGREGATE LOSSES—AGGREGATE DEDUCTIBLE

311

S (x )

S (x )

1

1 E

E

0.8 0.728

0.8 0.728

F

0.6

0.6

0.4

0.4

0.2

0.2

0

0

2

2.8

4

(a) Example 18A(i)

6

x

0

F

0

2

4

(b) Example 18A(ii)

6

x

Figure 18.3: Calculating E[S ∧ d] as d minus excesses of d over values of S

The second method is the best one for problems where you must solve for the deductible. Example 18B (Continuation of previous example.) On an insurance coverage, the number of claims has a geometric distribution with mean 4. The distribution of claim sizes is as follows: x Pr ( X  x ) 2 0.45 4 0.25 6 0.20 8 0.10 A stop-loss reinsurance contract sets the deductible so that expected payments under the contract are 12. Calculate the deductible needed to satisfy this condition. Answer: We continue calculating recursively E[ ( S − d )+ ] until it is below 12. Since E ( S )  15.6, we need E[S ∧ d] ≥ 15.6 − 12  3.6. We use the second method, and continue adding rectangles until the area adds up to 3.6. Using just S (2) , we can calculate E[S ∧ 4]  E[S ∧ 2] + 2S (2)  1.6 + 2 (0.728)  3.056

In order to proceed, we will have to calculate additional g n ’s. For the geometric distribution, which is the same as in Example 18A, we calculated that the probabilities of 0 and 1 are p0  0.2 and p1  0.16 respectively. Then p2  0.8p1  0.128, and g4  0.16 (0.25) + 0.128 (0.452 )  0.06592 S (4)  0.728 − 0.06592  0.66208

We solve for d:

E[S ∧ 6]  E[S ∧ 4] + 2S (4)  3.056 + 2 (0.66208) > 3.6 3.6  E[S ∧ d]  3.056 + ( d − 4)(0.66208) 3.6 − 3.056 d 4+  4.8217 0.66208

C/4 Study Manual—17th edition Copyright ©2014 ASM



18. AGGREGATE LOSSES—AGGREGATE DEDUCTIBLE

312

Note that between possible values of S, E[ ( S − u )+ ] is a linear function of u.

Example 18C For aggregate losses S, you are given:

(i) E[ ( S − 100)+ ]  4200 (ii) E[ ( S − 150)+ ]  4180 (iii) S does not assume any value in the interval (100, 150) . Calculate E[ ( S − 120)+ ]. Answer: E[ ( S − 120)+ ]  0.6 (4200) + 0.4 (4180)  4192



Exercises 18.1. [151-83-94:15] (2 points) Aggregate claims has a compound Poisson distribution with λ  2, Pr ( X  1)  0.4, and Pr ( X  2)  0.6, where X is individual claim size. An insurer charges a premium of 4.5 and returns a dividend of the excess, if any, of 2.7 over claims. Determine the excess of premiums over expected claims and dividends. (A) 0.4

(B) 0.5

(C) 0.6

(D) 0.7

(E) 0.8

18.2. The number of claims on an insurance coverage has a negative binomial distribution with mean 2 and variance 6. The claim size distribution is binomial with parameters m = 3 and q = 0.4. A reinsurance contract pays aggregate claims over an aggregate deductible of 2. Determine the expected aggregate loss paid by reinsurance.
18.3. [151-81-96:16] (2 points) A stop-loss reinsurance pays 80% of the excess of aggregate claims above 20, subject to a maximum payment of 5. All claim amounts are non-negative integers. For aggregate claims S, you are given:
E[(S − 16)+] = 3.89    E[(S − 20)+] = 3.33    E[(S − 24)+] = 2.84
E[(S − 25)+] = 2.75    E[(S − 26)+] = 2.69    E[(S − 27)+] = 2.65
Determine the total amount of claims the reinsurer expects to pay.
(A) 0.46   (B) 0.49   (C) 0.52   (D) 0.54   (E) 0.56


18.4. The number of claims on an insurance coverage has a negative binomial distribution with parameters r = 2 and β = 1. The claim size distribution is as follows:

Amount    Probability
1         0.4
2         0.3
3         0.2
4         0.1

Reinsurance covers the aggregate loss with an aggregate deductible of d. The deductible d is set so that the expected reinsurance payment is 2. Determine d.
18.5. The Late Night Quiz Show gives out 4 prizes per night. Each prize has a 0.8 probability of being 1000 and a 0.2 probability of being 2000. Determine the probability that total prizes for a night will be 6000.
18.6. The number of claims per year on a homeowner's policy follows a Poisson distribution with mean 0.2 per year. The claim size distribution has the following probability function:
f(x) = (1/4)(4/5)^x,    x = 1, 2, 3, . . .

Reinsurance pays 90% of claims for a year subject to an aggregate deductible of 2. The deductible is applied after multiplying the claims by 90%. Determine the expected reinsurance payment per year.
Use the following information for questions 18.7 and 18.8:
The Late Night Quiz Show gives out prizes each night. The number of prizes given out is randomly distributed as follows:

Number of Prizes    Probability
1                   0.20
2                   0.35
3                   0.30
4                   0.15

Each prize has a 0.8 probability of being 1000 and a 0.2 probability of being 2000.
18.7. Determine the probability that total prizes are 6000.

18.8. The prize payment for a night, minus an aggregate deductible of 1500, is insured. The premium for the insurance equals 120% of expected claims. Calculate the premium.


18.9. For an insurance coverage, aggregate losses have the following distribution:

Amount          Probability
0               0.50
1000            0.10
2000            0.15
3000            0.10
4000 or more    0.15

Average aggregate losses are 2800. An insurance coverage pays 80% of aggregate losses, minus a deductible of 1000. The deductible is applied after the 80% factor. Determine the average payment of the insurance coverage.
18.10. You have a $10 gift card for use at Amigo's Department Store. You plan to buy several items there. The number of items you will buy has the following distribution:

Number of items    Probability
0                  0.2
1                  0.4
2                  0.3
3                  0.1

The price of each item is $4(X + 1), where X is a random variable having a binomial distribution with parameters m = 2, q = 0.2. You will use the gift card, regardless of the value of the amount spent. Calculate the expected amount of money you spend net of the value of the gift card.
18.11. A group life insurance policy covers 40 lives. Each policy is for 100,000. The probability of death for each life covered is 0.01. A reinsurance contract reimburses the insurance company for 80% of aggregate losses, subject to an aggregate deductible of 200,000. The aggregate deductible is subtracted after multiplying aggregate losses by 80%. The reinsurance premium is 110% of expected reinsurance payments. Calculate the reinsurance premium.
18.12. [151-82-92:16] A group policyholder's aggregate claims, S, has a compound Poisson distribution with λ = 1 and all claim amounts equal to 2. The insurer pays the group the following dividend:

D = 6 − S if S < 6;  D = 0 if S ≥ 6.

Pr(X = 1 | X > 0) = 0.432/0.784 = 0.5510
Therefore, the probability of aggregate claims of size 1 is
g1 = (0.2377)(0.5510) = 0.1310
Now we can compute E[(S − 2)+].
E[S] = (rβ)(mq) = (1)(2)(3)(0.4) = 2.4
Pr(S ≥ 2) = 1 − 0.3894 − 0.1310 = 0.4796
Using the third method, the reinsurance pays 2.4, minus 1 if S = 1, minus 2 if S ≥ 2, or
E[(S − 2)+] = 2.4 − 0.1310(1) − 0.4796(2) = 1.3098
18.3. First of all, ignore the 80% coinsurance. Then the reinsurer pays for the amount of each claim between 20 and 26.25, since 0.8(6.25) = 5. All claims are integers, so the reinsurer pays
min(S, 26) − min(S, 20) + 0.25[min(S, 27) − min(S, 26)]





To express this in terms of (S − n)+ variables, remember that min(x, y) = x + y − max(x, y) = x − max(0, x − y), so the payment equals
(S − max(0, S − 26)) − (S − max(0, S − 20)) + 0.25[(S − max(0, S − 27)) − (S − max(0, S − 26))]
= (S − 20)+ − (S − 26)+ + 0.25[(S − 26)+ − (S − 27)+]
and the expected value of this is
E[(S − 20)+] − E[(S − 26)+] + 0.25(E[(S − 26)+] − E[(S − 27)+]) = (3.33 − 2.69) + 0.25(2.69 − 2.65) = 0.65

Multiplying this by the coinsurance factor 0.8, we get 0.8(0.65) = 0.52. (C)
18.4. Let N be the number of claims and S the aggregate loss. We will calculate the probabilities of values of S directly and use the recursive method for calculating E[(S − x)+].
E[S] = 2(2) = 4
Pr(N = 0) = (1/2)² = 0.25
Pr(S = 0) = 0.25        SS(0) = 1 − 0.25 = 0.75
E[(S − 1)+] = 4 − 0.75 = 3.25
Pr(N = 1) = 2(1/2)³ = 0.25
Pr(S = 1) = 0.25(0.4) = 0.1        SS(1) = 0.75 − 0.1 = 0.65
E[(S − 2)+] = 3.25 − 0.65 = 2.60
Pr(N = 2) = 3(1/2)⁴ = 0.1875
Pr(S = 2) = 0.25(0.3) + 0.1875(0.4²) = 0.105        SS(2) = 0.65 − 0.105 = 0.545
E[(S − 3)+] = 2.60 − 0.545 = 2.055
Pr(N = 3) = 4(1/2)⁵ = 0.125
Pr(S = 3) = 0.25(0.2) + 0.1875(2)(0.4)(0.3) + 0.125(0.4³) = 0.103        SS(3) = 0.545 − 0.103 = 0.442
E[(S − 4)+] = 2.055 − 0.442 = 1.613
Interpolating between 3 and 4,
d = 3 + (2.055 − 2)/(2.055 − 1.613) = 3.124
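The same bookkeeping is easy to automate. The following Python sketch (an illustration of mine, not the manual's own method; helper names are invented) redoes this solution with the Panjer recursion for the negative binomial and the recursive walk down E[(S − x)+]:

```python
# Sketch: reproduce solution 18.4 for N ~ negative binomial(r=2, beta=1).
def panjer_negbin(r, beta, sev, max_s):
    a = beta / (1 + beta)
    b = (r - 1) * a
    g = [0.0] * (max_s + 1)
    g[0] = (1 + beta) ** -r                      # severity has no mass at 0
    for s in range(1, max_s + 1):
        g[s] = sum((a + b * x / s) * sev.get(x, 0.0) * g[s - x]
                   for x in range(1, s + 1))
    return g

sev = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
g = panjer_negbin(2, 1, sev, 3)

mean, surv, stop = 4.0, 1.0, [4.0]               # E[S] = r*beta*E[X] = 4
for x in range(4):                               # E[(S-x-1)+] = E[(S-x)+] - SS(x)
    surv -= g[x]
    stop.append(stop[-1] - surv)
# stop = [4, 3.25, 2.60, 2.055, 1.613]
print(3 + (stop[3] - 2) / (stop[3] - stop[4]))   # 3.124
```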

18.5. If X is binomial with m = 4, q = 0.2, the loss variable is 1000X + 4000. The probability that this equals 6000 is the probability that X = 2, or C(4, 2)(0.8²)(0.2²) = 6(0.64)(0.04) = 0.1536.

18.6. The size of claims X is 1 plus a geometric distribution with β = 4. Using this, or directly summing the geometric series, we have E[X] = 5. The probabilities are f1 = 0.2, f2 = 0.16. Letting S be aggregate losses, E[S] = 0.2(5) = 1. Subtracting 2 after multiplying by 90% is equivalent to first subtracting 2/0.9 = 20/9 and then multiplying by 90%, so the reinsurance payment is 0.9(S − 20/9)+. The aggregate probabilities are
g0 = e^{−0.2} = 0.8187
g1 = 0.2(0.2)e^{−0.2} = 0.04e^{−0.2} = 0.0327
g2 = ((0.2²/2)(0.2²) + (0.2)(0.16))e^{−0.2} = 0.02685
Pr(S > 2) = 1 − 0.8187 − 0.0327 − 0.02685 = 0.1217
E[(S − 20/9)+] = 1 − 0.0327 − 0.02685(2) − 0.1217(20/9) = 0.6432
0.9 E[(S − 20/9)+] = (0.9)(0.6432) = 0.5788

18.7. Total prizes are 6000 if three prizes are 2000 or if two prizes are 2000 and two are 1000. The probability of this is (0.3)(0.2³) + (0.15)C(4, 2)(0.2²)(0.8²) = 0.02544.

18.8. If S is total prizes, then Pr(S = 1000) = 0.2(0.8) = 0.16. The average prize is 1200 and the average number of prizes is 1(0.20) + 2(0.35) + 3(0.30) + 4(0.15) = 2.4, so E[S] = 2.4(1200) = 2880. Then E[(S − 1500)+] = 2880 − 1000(0.16) − 1500(0.84) = 1460. Multiplying by 120%, (1460)(1.2) = 1752.

18.9. If S is the loss variable, the payment variable is 0.8(S − 1250)+. We calculate E[(S − t)+] recursively; E[(S − 1250)+] is obtained by linear interpolation between E[(S − 1000)+] and E[(S − 2000)+].
S(0) = 0.5        S(1000) = 0.4
E[(S − 1000)+] = 2800 − 0.5(1000) = 2300
E[(S − 2000)+] = 2300 − 0.4(1000) = 1900
E[(S − 1250)+] = 0.75(2300) + 0.25(1900) = 2200
E[0.8(S − 1250)+] = 0.8(2200) = 1760
An alternative solution (shown to me by Bryn Clarke) is to calculate E[0.8S] − E[0.8S ∧ 1000]. Since we are given that E[S] = 2800, then E[0.8S] = 0.8(2800) = 2240. Now, 0.8S ∧ 1000 is 0 with probability 0.5, 800 with probability 0.1, and 1000 with probability 0.4, so
E[0.8S ∧ 1000] = 0.1(800) + 0.4(1000) = 480
The answer is 2240 − 480 = 1760.

18.10. E[N] = 1(0.4) + 2(0.3) + 3(0.1) = 1.3, so without the card expected spending would be (1.3)(1.4)(4) = 7.28. The gift card reduces the amount spent by 10, except that it reduces the amount spent by 0 if S = 0, etc. We calculate (using the third method)
g0 = 0.2
g4 = (0.8²)(0.4) = 0.256
g8 = (0.4)(0.32) + (0.3)(0.64²) = 0.25088
So the expected amount covered by the gift card is 10 − 0.2(10) − 0.256(6) − 0.25088(2) = 5.96224. The expected amount spent is 7.28 − 5.96224 = 1.31776.

18.11. We have
p0 = 0.99^40 = 0.66897
p1 = 40(0.99^39)(0.01) = 0.27029
p2 = 780(0.99^38)(0.01²) = 0.05324
The mean payment before deductible is (40)(100,000)(0.01) = 40,000. Ignoring the coinsurance, the deductible removes 250,000 (which is multiplied by 80%), unless there are 0, 1, or 2 deaths. We have
E[max(0, 250,000 − X)] = 0.66897(250,000) + 0.27029(150,000) + 0.05324(50,000) = 210,448.65
Hence the expected reinsurance payment before coinsurance is 40,000 − 250,000 + 210,448.65 = 448.65. Multiplying this by 80% and then by 110%, we get 394.81.
18.12. The dividend is 6 if no claims, 4 if one claim, 2 if two claims, otherwise 0, so the expected value is
6e^{−1} + 4e^{−1} + 2(e^{−1}/2) = 11/e    (D)

18.13. The probability that the secondary distribution is 0 is e^{−0.5}. We modify the primary geometric's parameter to 3(1 − e^{−0.5}) and zero-truncate the Poisson. Then β/(1 + β) = 3(1 − e^{−0.5})/(4 − 3e^{−0.5}) = 0.541370. We will use p_k for primary probabilities, f_k for secondary probabilities. We will use (a, b, 1) methods to calculate probabilities recursively: for the geometric, repeated multiplication by 0.541370; for the Poisson, multiplication by λ/k.
p0 = 1 − 0.541370 = 0.458630
p1 = 0.541370(0.458630) = 0.248289
p2 = 0.541370(0.248289) = 0.134416
p3 = 0.541370(0.134416) = 0.072769
f1 = 0.5e^{−0.5}/(1 − e^{−0.5}) = 0.770747
f2 = 0.25(0.770747) = 0.192687
f3 = (0.5/3)(0.192687) = 0.032114


Let N be the total number of losses.
Pr(N = 0) = 0.458630
Pr(N = 1) = (0.248289)(0.770747) = 0.191368
Pr(N = 2) = (0.134416)(0.770747²) + (0.248289)(0.192687) = 0.127692
Pr(N = 3) = (0.072769)(0.770747³) + 2(0.134416)(0.770747)(0.192687) + (0.248289)(0.032114) = 0.081217
Pr(N ≥ 4) = 1 − 0.458630 − 0.191368 − 0.127692 − 0.081217 = 0.141094

18.14. Using the results of the previous exercise, we calculate E[N ∧ 4] and multiply it by 100.
E[N ∧ 4] = 0.458630(0) + 0.191368(1) + 0.127692(2) + 0.081217(3) + 0.141094(4) = 1.254779
The expected value of N is 3(0.5) = 1.5, so
100(E[N] − E[N ∧ 4]) = 100(1.5 − 1.254779) = 24.5221





18.15. Without the deductible, E[S] = 0.3(10) + 0.3(20) + 0.4(50) = 29. The probabilities of aggregate losses of 0, 10, and 20 (computed directly, without the recursive formula) are:
g0 = e^{−1} = 0.367879
g10 = e^{−1}(0.3) = 0.110364
g20 = e^{−1}(0.3) + (e^{−1}/2)(0.3²) = 0.126918
The probability of 30 or greater is 1 − 0.367879 − 0.110364 − 0.126918 = 0.394838. So E[S ∧ 30], using the first method, is
E[S ∧ 30] = 0.110364(10) + 0.126918(20) + 0.394838(30) = 15.4872
The expected amount of claims over 30 is 29 − 15.4872 = 13.5128. (E)

18.16. The probabilities that aggregate losses are 0 or 1 are
g0 = e^{−2} = 0.135335
g1 = e^{−2}(2)(1/3) = 0.090224
So the probability that aggregate losses are 2 or more is 1 − 0.135335 − 0.090224 = 0.774441. Then
E[S] = 2(2) = 4
E[S ∧ 2] = 0.090224(1) + 0.774441(2) = 1.639106
E[(S − 2)+] = 4 − 1.639106 = 2.360894    (B)

18.17. E[S] = 3(0.3(1) + 0.2(2) + 0.1(3)) = 3
Pr(S = 0) = 0.4³ = 0.064
E[S ∧ 1] = 1 − 0.064 = 0.936
E[(S − 1)+] = 3 − 0.936 = 2.064    (C)


18.18. E[S] = E[N] E[X] = (1.2)(170) = 204. Now we calculate the probabilities that S, aggregate prizes, is 0 or 100.
g0 = 0.8(0.2) + 0.2(0.2²) = 0.168
g100 = 0.8(0.7) + 2(0.2)(0.2)(0.7) = 0.616
In the calculation of g100 = Pr(S = 100), we added the probability of one prize of 100 and of two prizes, either the first 0 and the second 100 or the first 100 and the second 0 (hence the multiplication by 2). Pr(S ≥ 200) = 1 − 0.168 − 0.616 = 0.216. So
E[S ∧ 200] = 0.616(100) + 0.216(200) = 104.8
E[(S − 200)+] = 204 − 104.8 = 99.2

The answer is 2.75(99.2) = 272.8. (D)
18.19. Retained claims are claims not paid by the reinsurer. Reinsurance pays 0 unless a member of the group has claims of 3, in which case it pays 1. The probability of a claim of 3 is 0.1, so the average reinsurance payment per person is (0.1)(1) = 0.1, and total reinsurance claims for the group are 2(0.1) = 0.2. The reinsurance premium is therefore (1.1)(0.2) = 0.22.
The dividend is 3 minus administrative expenses of (0.2)(3) = 0.6, reinsurance premium of 0.22, and claims, or 3 − 0.6 − 0.22 − retained claims = 2.18 − retained claims, but not less than 0. If retained claims are greater than 2, the dividend will be 0.
The probability of retained claims of 0 is the probability that both members have claims of 0, which is (0.4)(0.4) = 0.16. The probability of retained claims of 1 is the probability that one member has claims of 0 and the other has claims of 1, or 2(0.4)(0.3) = 0.24. Since reinsurance pays the amount of the claim above 2, the probability that retained claims for a member are 2 is the probability that claims are 2 or greater, or 0.3. The probability that retained claims for the group are 2 is the probability of 0 for one and 2 retained for the other, or 2(0.4)(0.3) = 0.24, plus the probability of 1 for both, or (0.3)(0.3) = 0.09, so the total probability of retained claims of 2 is 0.24 + 0.09 = 0.33. Summarizing:

Retained Claims    Probability    Dividend
0                  0.16           2.18
1                  0.24           1.18
2                  0.33           0.18

The expected value of the dividend is therefore

(0.16)(2.18) + (0.24)(1.18) + (0.33)(0.18) = 0.6914    (A)

18.20. At each factory, expected repair costs are E[X] = 0.3(1) + 0.2(2) + 0.1(3) = 1, and the limited expected value at 1 is E[X ∧ 1] = 0.6. Therefore, the insurance premium is 1.1(1 − 0.6) = 0.44. Nonrandom profit is then 3 − 0.15(3) − 2(0.44) = 1.67. Profits will be non-negative if
1. both factories have 0 repair cost, probability 0.4² = 0.16, or
2. one factory has non-zero repair costs and the other has 0 repair costs, probability 2(0.4)(0.6) = 0.48.
Expected value of non-negative profit is 0.16(1.67) + 0.48(0.67) = 0.5888. (E)


18.21. E[S] = 4(40) = 160. To calculate E[S ∧ 100] we need the probabilities that S is 0, 40, and 80, which are the geometric distribution's probabilities p0, p1, and p2; these can be calculated recursively by repeated multiplication by β/(1 + β) = 0.8:
p0 = 1/(1 + 4) = 0.2
p1 = 0.8(0.2) = 0.16
p2 = 0.8(0.16) = 0.128
So Pr(N > 2) = 1 − 0.2 − 0.16 − 0.128 = 0.512. Then
E[S ∧ 100] = 0.16(40) + 0.128(80) + 0.512(100) = 67.84
E[(S − 100)+] = 160 − 67.84 = 92.16    (C)

18.22. Expected aggregate claims, S, is E[N] E[X] = (1.3)(4) = 5.2. To exceed 4(5.2) = 20.8 there must be two claims, either one for 20 and one for 10, probability 2(0.4)(0.1)(0.2) = 0.016, or two for 20, probability 0.4(0.1²) = 0.004, so the total probability is 0.016 + 0.004 = 2%. (A)
18.23. The probability of paying less than or equal to $150 is the probability that all losses will be less than or equal to 500, since an individual loss greater than 500 is at least 800, on which 200 is paid after the per-event and annual deductibles. The modified frequency distribution can be calculated by multiplying the Poisson parameter λ = 0.15 by the probability of a claim above 500, as we learned in Lesson 13, so the new parameter is 0.15(1 − 0.10 − 0.25) = 0.0975. Then the probability of no losses above 500 is e^{−0.0975} = 0.9071 and the probability of at least one loss above 500 is 1 − 0.9071 = 9.29%. (E)

Quiz Solutions
18-1. We'll use the second method.
E[S] = (0.3 + 0.1(2) + 0.1(3))(0.2(100) + 0.4(200) + 0.3(300) + 0.1(400)) = (0.8)(230) = 184
g0 = 0.5        SS(0) = 0.5
g100 = (0.3)(0.2) = 0.06        SS(100) = 0.5 − 0.06 = 0.44
E[S ∧ 200] = (100)(0.5) + (100)(0.44) = 94
E[(S − 200)+] = 184 − 94 = 90
The stop-loss reinsurance premium is 1.5(90) = 135.
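Because the support here is tiny, the quiz answer can also be confirmed by brute-force enumeration. Here is a short Python check (my own sketch, not part of the manual):

```python
# Enumerate claim counts and size combinations, then price the stop-loss
# layer above 200 at 150% of expected payments.
from itertools import product

counts = {0: 0.5, 1: 0.3, 2: 0.1, 3: 0.1}
sizes = {100: 0.2, 200: 0.4, 300: 0.3, 400: 0.1}

expected_payment = 0.0
for n, p_n in counts.items():
    for combo in product(sizes, repeat=n):       # all size combinations
        p = p_n
        for x in combo:
            p *= sizes[x]
        expected_payment += p * max(sum(combo) - 200, 0)

print(1.5 * expected_payment)                    # 135.0
```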


Lesson 19
Aggregate Losses: Miscellaneous Topics
Reading: Loss Models Fourth Edition 9.4, 9.6.5
This lesson is devoted to topics on the syllabus not conveniently fitting into the other lessons.
19.1 Exact Calculation of Aggregate Loss Distribution

In some special cases, one can combine the frequency and severity models into a closed form for the distribution function of aggregate losses. The distribution function of aggregate losses at x is the sum over n of the probabilities that the claim count equals n and the sum of n loss sizes is less than or equal to x. When the sum of loss random variables has a simple distribution, it may be possible to calculate the aggregate loss distribution. Two cases for which the sum of independent random variables has a simple distribution are:
1. Normal distribution. If the X_i are normal with mean µ and variance σ², their sum is normal.
2. Exponential or gamma distribution. If the X_i are exponential or gamma, their sum has a gamma distribution.
We shall now discuss these distributions in greater detail.

19.1.1 Normal distribution

If n random variables X_i are independent and normally distributed with parameters µ and σ², their sum is normally distributed with parameters nµ and nσ². Thus we can calculate the probability that the sum is less than a specific value by referring to the normal distribution table.
Example 19A For an insurance coverage, the number of losses follows a binomial distribution with m = 2, q = 0.3. Loss sizes are normally distributed with mean 1000 and variance 40,000.
(1) Determine the probability that aggregate losses are less than 1200. Do not use the normal approximation.
(2) Repeat (1) with the normal approximation.
Answer: 1. The probabilities of 0, 1, and 2 losses are p0 = 0.7² = 0.49, p1 = 2(0.3)(0.7) = 0.42, and p2 = 0.09. If there is no loss, then aggregate losses are certainly below 1200. If there is N = 1 loss, then for the aggregate distribution S,
Pr(S < 1200 | N = 1) = Φ((1200 − 1000)/√40,000) = Φ(1) = 0.8413
If there are N = 2 losses, the sum of those 2 losses is normal with mean 2000 and variance 80,000, so
Pr(S < 1200 | N = 2) = Φ((1200 − 2000)/√80,000) = Φ(−2.83) = 0.0023
The probability that aggregate losses are less than 1200 is
Pr(S < 1200) = 0.49 + 0.42(0.8413) + 0.09(0.0023) = 0.8436
2. With the normal approximation, the aggregate mean is (0.6)(1000) = 600 and the aggregate variance is
Var(S) = E[N] Var(X) + Var(N) E[X]² = 0.6(40,000) + 0.42(1000²) = 444,000
The probability that aggregate losses are less than 1200 is
Pr(S < 1200) = Φ((1200 − 600)/√444,000) = Φ(0.90) = 0.8159
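If you want to verify part (1) numerically, the conditional-normal mixture is a few lines of Python (an illustrative sketch of mine; statistics.NormalDist supplies the Φ lookups):

```python
# Cross-check of Example 19A(1): mix the conditional normal probabilities
# with the binomial(2, 0.3) claim-count weights.
from statistics import NormalDist

p = [0.49, 0.42, 0.09]                        # Pr(N = 0), Pr(N = 1), Pr(N = 2)
prob = (p[0]                                  # no losses: S = 0 < 1200 surely
        + p[1] * NormalDist(1000, 40_000 ** 0.5).cdf(1200)
        + p[2] * NormalDist(2000, 80_000 ** 0.5).cdf(1200))
print(round(prob, 4))                         # 0.8436
```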

19.1.2 Exponential and gamma distributions

The Erlang distribution function
The sum of n exponential random variables with common mean θ has a gamma distribution with parameters α = n and θ. When a gamma distribution's α parameter is an integer, the gamma distribution is also called an Erlang distribution. I will use the letter n instead of the letter α as the first parameter of an Erlang distribution.
Let's discuss how to compute the distribution function F(x) for an Erlang distribution with parameters n and θ. If n = 1, the Erlang distribution is an exponential distribution, and F(x) = 1 − e^{−x/θ}. But let's develop a formula that works for any n.
In a Poisson process¹ with parameter 1/θ, the time between events is exponential with mean θ. Therefore, the time until the nth event occurs is Erlang with parameters n and θ. In other words, the probability that n events occur before time x is F_X(x), where X is Erlang(n, θ). Equivalently, the probability of at least n events occurring before time x in a Poisson process is equal to F_X(x). By Poisson probability formulas, the probability of exactly j events occurring before time x in this Poisson process is e^{−x/θ}(x/θ)^j/j!, so the Erlang distribution function is

F_X(x) = 1 − Σ_{j=0}^{n−1} e^{−x/θ} (x/θ)^j / j!        (19.1)

Example 19B Loss sizes on an insurance coverage are exponentially distributed with mean 1000. Three losses occur in a day. Calculate the probability that the total insurance reimbursement for these three losses is greater than 4000.
¹Poisson processes are covered in Exam ST. All Poisson processes mentioned in this lesson are homogeneous. If you have not taken Exam ST, just be aware of the following:
• In a Poisson process with parameter λ, the number of events occurring by time t has a Poisson distribution with mean λt.
• In a Poisson process with parameter λ, the time between events follows an exponential distribution with parameter 1/λ.


Answer: Let X be the total insurance reimbursement. Total insurance reimbursement follows an Erlang distribution with parameters n = 3 and θ = 1000. Notice that the example asks for the survival function at 4000 rather than the distribution function. As discussed above, the corresponding Poisson process has parameter 1/1000. The probability that the reimbursement is greater than 4000 is the probability of at most two events occurring before time 4000, or
Pr(X > 4000) = e^{−4000/1000} Σ_{n=0}^{2} (4000/1000)^n/n! = e^{−4}(1 + 4 + 8) = 0.2381
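Equation (19.1) is mechanical to implement. Here is a small Python helper (my own illustration, not from the text) that reproduces this example:

```python
# Sketch of equation (19.1): the Erlang(n, theta) CDF via the complementary
# Poisson sum. Names are illustrative.
from math import exp, factorial

def erlang_cdf(x, n, theta):
    """Pr(X <= x) for the sum of n exponentials with mean theta."""
    lam = x / theta                      # Poisson mean over [0, x]
    return 1 - sum(exp(-lam) * lam**j / factorial(j) for j in range(n))

print(1 - erlang_cdf(4000, 3, 1000))     # Example 19B: 0.2381
```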



Since we can calculate the Erlang distribution function, we can in principle calculate FS(x) for any compound distribution with exponential severities. Suppose the mean of the exponential is θ. To calculate FS(x), sum up over all n the probabilities of n events times the probability that an Erlang with parameters n and θ is no greater than x. In symbols:

FS(x) = Σ_{n=0}^{∞} Pr(N = n) F_{X_n}(x)

where X_n follows an Erlang distribution with parameters n and θ. The only problem with this method is that the sum is usually infinite, and to decide when to stop summing, we'd have to determine when the sum has converged to the desired accuracy. However, if the frequency distribution has only a finite number of values (for example, if frequency is binomial), this formula provides an exact answer.
Example 19C Claim counts have a binomial distribution with m = 2, q = 0.2. Claim sizes are exponential with mean 1000. Calculate the probability that aggregate claims are less than their mean.
Answer: The mean is 2(0.2)(1000) = 400. We must calculate FS(400). (The aggregate distribution is continuous except at 0, so "less than 400" and "no greater than 400" are interchangeable here.) The probability of zero claims is p0 = 0.8² = 0.64. The probability of one claim is p1 = 2(0.8)(0.2) = 0.32. The probability that one exponential claim is less than 400 is 1 − e^{−400/1000} = 0.329680. The probability of two claims is 0.04. The probability that an Erlang distribution with parameters n = 2 and θ = 1000 is less than 400 is the probability of at least two events by time 400 in a Poisson process with parameter 1/1000, or 1 − e^{−0.4}(1 + 0.4) = 0.061552. Summing up, the probability that aggregate claims are less than their mean is
FS(400) = 0.64 + 0.32(0.329680) + 0.04(0.061552) = 0.74796
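The finite-frequency sum above is equally short in code. A sketch (illustrative names; it inlines the same Erlang helper for self-containment):

```python
# Numerical check of Example 19C: binomial frequency mixed over Erlang
# severities, using equation (19.1) for the Erlang CDF.
from math import comb, exp, factorial

def erlang_cdf(x, n, theta):
    lam = x / theta
    return 1 - sum(exp(-lam) * lam**j / factorial(j) for j in range(n))

def compound_binom_exp_cdf(x, m, q, theta):
    """F_S(x) when N ~ binomial(m, q) and severities are exponential(theta)."""
    return sum(comb(m, n) * q**n * (1 - q)**(m - n) * erlang_cdf(x, n, theta)
               for n in range(m + 1))

print(compound_binom_exp_cdf(400, 2, 0.2, 1000))   # 0.74796
```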



We can similarly calculate the distribution of the compound model if the severities are Erlang (instead of exponential) with the same parameter θ, since the sum of n such Erlangs is an Erlang whose first parameter is the sum of the n first parameters and whose second parameter is θ. But since n will be a big number (for example, summing up two Erlangs each with n_i = 2 results in n = 4), this is not a realistic set-up for an exam question.
Negative binomial/exponential compound models
The following material is unlikely to appear on an exam. In the fourth edition of Loss Models, it only appears in an example.


Suppose a compound model has negative binomial frequency with parameters r and β, and exponential severities with parameter θ. Moreover, suppose r is an integer. The textbook proves the remarkable result that this model is equivalent to a compound model with binomial frequency with parameters m = r and q = β/(1 + β), and exponential severities with parameter θ(1 + β). We can then use the method discussed above to calculate the compound model's probabilities. The textbook proves this by algebraic manipulation on the probability generating function of the compound distribution. You need not learn this derivation.
To remember the above-mentioned parameters:
• The binomial m equals the negative binomial r. (Memorize this fact.)
• The probability of 0 must not change. For the negative binomial, p0 = 1/(1 + β)^r. For which binomial would this be p0? Since m = r, we must have q = β/(1 + β) to make (1 − q)^m equal this p0.
• The expected value of the compound model must not change. For the original negative binomial/exponential model, it is rβθ. Therefore, for the binomial/exponential model with binomial mean rβ/(1 + β) as derived in the previous paragraph, the exponential must have mean θ(1 + β).
The easiest case (and considering how unlikely exam questions are on this topic, perhaps the only one that would appear on an exam) is when the r parameter of the negative binomial distribution is equal to 1. In this case, the negative binomial distribution is a geometric distribution, and the Erlang distribution is an exponential distribution. The binomial distribution in the equivalent model is a Bernoulli with p0 = 1/(1 + β), p1 = β/(1 + β). Thus the compound geometric/exponential model's distribution function is a two-point mixture of a degenerate distribution at 0 with weight 1/(1 + β) and an exponential distribution with mean θ(1 + β) with weight β/(1 + β).
Example 19D Claim counts follow a geometric distribution with mean 0.2. Claim sizes follow an exponential distribution with mean 1000. A stop-loss reinsurance contract pays all losses with an aggregate deductible of 5000.
1. Determine the probability that aggregate losses will be greater than 5000.
2. Determine the expected losses paid under the contract.
Answer: The model is equivalent to a model having
• Bernoulli frequency with parameter β/(1 + β) = 0.2/1.2 = 1/6.
• Exponential severity with parameter (1 + β)θ = 1.2(1000) = 1200.
1. An exponential distribution with mean 1200 has survival function S(5000) = e^{−5000/1200} = 0.015504. Since the probability of a loss is 1/6, the probability that aggregate losses are greater than 5000 is 0.015504/6 = 0.002584.
2. We want E[(S − 5000)+] = E[S] − E[S ∧ 5000]. For an exponential, E[X ∧ x] = θ(1 − e^{−x/θ}) by the formulas in the distribution tables. In our case, since the probability of a loss in the Bernoulli/exponential model is 1/6, the limited expected value E[S ∧ 5000] is 1/6 of the corresponding value for an exponential. Therefore,
E[(S − 5000)+] = (1/6)(1200) − (1/6)(1200)(1 − e^{−5000/1200}) = 200e^{−5000/1200} = 3.1008
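The two-point-mixture view makes Example 19D a three-line computation. A hedged Python sketch (variable names are mine):

```python
# Numerical confirmation of Example 19D under the two-point-mixture view.
from math import exp

beta, theta, d = 0.2, 1000, 5000
q = beta / (1 + beta)                # Bernoulli probability of a loss, 1/6
scale = theta * (1 + beta)           # exponential mean in equivalent model

print(q * exp(-d / scale))           # Pr(S > 5000) = 0.002584
print(q * scale * exp(-d / scale))   # E[(S - 5000)+] = 3.1008
```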

Now let’s do a non-geometric example. C/4 Study Manual—17th edition Copyright ©2014 ASM



 

Example 19E You are given:
(i) Claim counts follow a negative binomial distribution with r = 2, β = 0.25.
(ii) Claim sizes follow an exponential distribution with mean 800.
Calculate the probability that aggregate claims are less than 400.
Answer: The equivalent binomial model has binomial frequency m = 2, q = β/(1 + β) = 0.25/1.25 = 0.2 and exponential severity with mean θ(1 + β) = 800(1.25) = 1000. So this example reduces to Example 19C, and the answer is the same: 0.74796.

19.1.3 Compound Poisson models
If the S_j are a set of compound Poisson distributions with Poisson parameters λ_j and severity random variables X_j, the sum S = Σ_{j=1}^{n} S_j is a compound Poisson distribution with Poisson parameter λ = Σ_{j=1}^{n} λ_j and severity equal to a weighted average, or a mixture, of the individual severities X_j. The weights are λ_j/λ. This means that if you are interested in the distribution function of S, you can calculate it directly rather than calculating the distribution functions of the S_j separately and convolving them.
The syllabus goes to the trouble of excluding the textbook's example of compound Poisson models, although it doesn't exclude the textbook's discussion preceding the example, which I've summarized in the above paragraph. Perhaps they think the textbook's example (which involves a sum of ten compound Poisson models with Erlang severities) is too complicated, so here's a simple example.
Example 19F An automobile liability policy covers bodily injury and property damage losses. Annual aggregate losses from bodily injury claims follow a compound Poisson process with mean 0.1 per year. Loss sizes follow an exponential distribution with mean 10,000. Annual aggregate losses from property damage claims follow a compound Poisson process with mean 0.3 per year. Loss sizes follow an exponential distribution with mean 5,000.
Calculate the median size of the next loss from this policy.
Answer: The combined losses form a compound Poisson process with mean 0.4 per year, in which loss sizes are a mixture of the two exponential distributions with weight 0.1/(0.1 + 0.3) = 0.25 on the distribution with mean 10,000 and weight 0.75 on the distribution with mean 5,000. Setting the survival function equal to 0.5,
0.25e^{−x/10,000} + 0.75e^{−x/5,000} = 0.5
Let u = e^{−x/10,000}, and multiply through by 4:
3u² + u − 2 = 0
u = (−1 + √25)/6 = 2/3
e^{−x/10,000} = 2/3
x = −10,000 ln(2/3) = 4055
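Rather than solving the quadratic, one can also find the median of the mixture numerically. A sketch using bisection (my own illustration, not the manual's method):

```python
# Check Example 19F: solve S(x) = 0.5 for the exponential mixture.
from math import exp

def surv(x):
    return 0.25 * exp(-x / 10_000) + 0.75 * exp(-x / 5_000)

lo, hi = 0.0, 50_000.0
for _ in range(60):                      # bisection on the decreasing S(x)
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if surv(mid) > 0.5 else (lo, mid)
print(round((lo + hi) / 2))              # 4055
```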

19.2 Discretizing
This topic was briefly on the exam syllabus in the early 2000s, then removed, then returned in 2005. To my knowledge, no exam questions have ever been asked on it, so you're safe skipping it. At most, you probably only need to know the method of rounding, which is easy.

The recursive method for calculating the aggregate distribution, as well as the direct convolution method, requires a discrete severity distribution. Usually the severity distribution is continuous. There are two methods for discretizing the distribution. In both methods, you pick a span, the distance between the points that will have positive probability in the discretized distribution. For example, you may create a distribution with probabilities at all integers (span = 1), or only at multiples of 1000 (span = 1000). In the following, recall our notational convention that f_n is the probability that the severity equals n. (We use p_n for the frequency and g_n for aggregate losses, but those won't come up in the following.)

19.2.1 Method of rounding
In the method of rounding, the severity values within a span are rounded to the endpoints. If h is the span, f_{kh} is set equal to F((k + 0.5)h − 0) − F((k − 0.5)h − 0), where −0 indicates that the lower bound is included but the upper bound isn't; as usual in rounding, 0.5 rounds up. This rounding convention makes no difference if F is continuous everywhere.
Example 19G Loss sizes follow a Pareto distribution with α = 2, θ = 3. The distribution will be discretized by the method of rounding with a span of 4.
Calculate the resulting probabilities of 0, 4, 8, and 12: f0, f4, f8, and f12.
Answer: Anything below 2 gets rounded to 0, so f0 = F(2) = 1 − (3/5)² = 0.64.
Anything between 2 and 6 gets rounded to 4, so f4 = F(6) − F(2) = (3/5)² − (3/9)² = 0.24889.
Anything between 6 and 10 gets rounded to 8, so f8 = F(10) − F(6) = (3/9)² − (3/13)² = 0.05786.
Anything between 10 and 14 gets rounded to 12, so f12 = F(14) − F(10) = (3/13)² − (3/17)² = 0.02211.
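The method of rounding is a one-line recipe per point. The following Python sketch (illustrative helper names, not from the manual) reproduces Example 19G:

```python
# Method of rounding: discretize a Pareto(alpha=2, theta=3) with span h = 4.
def pareto_cdf(x, alpha=2.0, theta=3.0):
    return 1 - (theta / (theta + x)) ** alpha if x > 0 else 0.0

def discretize_rounding(cdf, h, n_points):
    probs = {0: cdf(h / 2)}                       # [0, h/2) rounds to 0
    for k in range(1, n_points):
        probs[k * h] = cdf((k + 0.5) * h) - cdf((k - 0.5) * h)
    return probs

print(discretize_rounding(pareto_cdf, 4, 4))
# {0: 0.64, 4: 0.24889, 8: 0.05786, 12: 0.02211}
```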

19.2.2 Method of local moment matching

The method of rounding will usually result in a distribution whose mean is different from the original mean. The method of local moment matching guarantees that the discretized distribution will have the same mean as the original distribution.
In the method of local moment matching, the probabilities and partial moments within a span are distributed between the two endpoints. Let h be the span, the distance between endpoints. We'll only discuss the simplest case, in which only the first moments are matched. In this case, the endpoints of an interval are x_k and x_{k+1}, where x_k = x_0 + kh. (Usually x_0 = 0 for the distributions we use.) We assign masses m_k^0 (at the left endpoint x_k) and m_k^1 (at the right endpoint x_{k+1}) such that
1. m_k^0 + m_k^1 = F((k + 1)h) − F(kh). This means the probabilities match.
2. x_k m_k^0 + x_{k+1} m_k^1 = ∫_{kh}^{(k+1)h} x f(x) dx. This means that locally, the contribution of the span to the mean matches.
These are two equations in two unknowns, the two masses. After the masses are determined, they are added together to get the probabilities. In other words, f_{kh} is equal to the sum of m_k^0 and m_{k−1}^1, the masses from the starting endpoint of the span and the ending endpoint of the previous span. As in the method of rounding, the convention (if F is not fully continuous) is to include the left endpoint but not the right endpoint in calculating the probabilities and moments of the spans.
Example 19H Repeat example 19G using the method of local moment matching, matching the first moment. Calculate f0 and f4.


Answer: In this example, h = 4. We must calculate m_0^0, m_0^1, and m_1^0.
The sum of the two masses m_0^0 and m_0^1 for the first span should equal
Pr(0 ≤ X < 4) = 1 − (3/7)² = 0.81633
Note that for this Pareto,
E[X ∧ x] = (θ/(α − 1))(1 − (θ/(x + θ))^{α−1}) = 3(1 − 3/(3 + x)) = 3x/(3 + x)
so that E[X ∧ 4] = 12/7 = 1.71429 and E[X ∧ 8] = 24/11. Also, in general,
∫_a^b x f(x) dx = (E[X ∧ b] − bS(b)) − (E[X ∧ a] − aS(a))
The sum 0·m_0^0 + 4·m_0^1 should equal
∫_0^4 x f(x) dx = E[X ∧ 4] − 4S(4) = 1.71429 − 4(0.18367) = 0.97959
Then m_0^1 = 0.97959/4 = 0.24490 and m_0^0 = 0.81633 − 0.24490 = 0.57143.
For the second span, the sum of the two masses m_1^0 and m_1^1 should equal F(8) − F(4) = 0.10929. The first moment matching is
4m_1^0 + 8m_1^1 = (E[X ∧ 8] − 8S(8)) − (E[X ∧ 4] − 4S(4)) = (24/11 − 8(3/11)²) − 0.97959 = 0.60719
Solving:
m_1^0 = 0.10929 − m_1^1
4(0.10929 − m_1^1) + 8m_1^1 = 0.60719
0.10929 + m_1^1 = 0.60719/4 = 0.15180
m_1^1 = 0.15180 − 0.10929 = 0.04250
m_1^0 = 0.10929 − 0.04250 = 0.06679
Then f0 = m_0^0 = 0.57143 and f4 = m_0^1 + m_1^0 = 0.24490 + 0.06679 = 0.31169. To calculate f8, you'd need the left mass of the span [8, 12).



The textbook provides the general formulas
m_k^0 = −∫_{x_k − 0}^{x_k + h − 0} ((x − x_k − h)/h) dF(x)
m_k^1 = ∫_{x_k − 0}^{x_k + h − 0} ((x − x_k)/h) dF(x)
when matching one moment, where −0 as before means include the point mass at the lower bound only. For a continuous F, dF(x) = f(x) dx.


The textbook provides the following simpler set of formulas directly for the probabilities f_k (rather than for the masses m_k^0 and m_k^1) in the exercises:
f_0 = 1 − E[X ∧ h]/h
f_{ih} = (2 E[X ∧ ih] − E[X ∧ (i − 1)h] − E[X ∧ (i + 1)h])/h,    i = 1, 2, . . .        (19.2)
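Formulas (19.2) translate directly into code. A sketch (my own helper names, under the assumption of a two-parameter Pareto severity as in Example 19H) that reproduces f0 and f4 from the limited expected value:

```python
# Mean-matching discretization of the Pareto(2, 3) from Example 19H, span h = 4.
def pareto_lev(x, alpha=2.0, theta=3.0):
    """E[X ^ x] for a two-parameter Pareto."""
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

def discretize_mean_matching(lev, h, n_points):
    probs = {0: 1 - lev(h) / h}                   # formula (19.2) for f_0
    for i in range(1, n_points):
        probs[i * h] = (2 * lev(i * h) - lev((i - 1) * h)
                        - lev((i + 1) * h)) / h
    return probs

print(discretize_mean_matching(pareto_lev, 4, 3))
# {0: 0.57143, 4: 0.31169, 8: ...}
```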

The textbook also generalizes to any number of moments. In this case, you select points uniformly throughout the span, p + 1 points including the endpoints for p moments, and get p + 1 equations in p + 1 unknowns. This means that probability gets assigned to the intermediate points as well; if you were matching second moments with a span of 4, you’d assign probabilities to 0, 2, 4, . . . . The probabilities assigned to the intermediate points are the masses, while at the endpoints you’d add up the masses for the two intervals ending and starting at the endpoint, as above. When matching first moments, you will never get negative masses, but you could get negative masses when matching higher moments. Due to the difficulty of the calculation, I doubt you will be asked to discretize matching higher moments on an exam. Actually, I doubt there will be any discretizing questions of any type on the exam.

Exercises
Exact Calculation of Aggregate Loss Distribution
19.1. In a collective risk model for aggregate losses, claim sizes follow a normal distribution with parameters µ = 25, σ² = 50. Claim sizes are independent of each other. Claim counts have the following distribution:

n    p_n
0    0.5
1    0.3
2    0.2

Claim sizes are independent of claim counts. Calculate the probability that aggregate claims are greater than 40.
19.2. In a collective risk model for aggregate losses, claim counts follow a binomial distribution with m = 3, q = 0.4. Claim sizes follow a normal distribution with parameters µ = 100, σ² = 1000. Claim sizes are independent of each other and of claim counts. Calculate the probability that aggregate claims are greater than their mean.
19.3. You are given:
(i) Claim counts follow a binomial distribution with parameters m = 2, q = 0.2.
(ii) Claim sizes follow an exponential distribution with mean 5000.
(iii) Claim sizes are independent of each other and of claim counts.
Determine the probability that aggregate losses are greater than 3000.


19.4. Claim sizes follow an exponential distribution with θ = 5, and are independent of each other. Claim counts are independent of claim sizes, and have the following distribution:

n    p_n
0    0.7
1    0.2
2    0.1

Calculate FS(3).
19.5. For a block of 50 policies:
(i) Claim sizes on a coverage follow an exponential distribution with mean 1500.
(ii) Claim counts for each policy follow a negative binomial distribution with parameters r = 0.02 and β = 25.
(iii) Claim sizes are independent of each other. Claim counts and claim sizes are independent.
Determine the probability that aggregate claims are within 0.842 standard deviations of the mean.
19.6. Claim sizes on a coverage follow an exponential distribution with mean 500. 100 lives are covered under the contract. Claim counts for each life follow a negative binomial distribution with mean 0.1 and variance 1.1. Claim counts and claim sizes are independent. A stop-loss reinsurance contract is available at 150% of expected claim cost. You are willing to pay 1500 for the contract. Determine the aggregate deductible needed to make the cost of the contract 1500.
19.7. For a collective risk model:
(i) Claim counts follow a geometric distribution with β = 0.2.
(ii) Claim sizes follow an exponential distribution with θ = 8000.
(iii) Claim sizes are independent of each other and of claim counts.
Calculate TVaR0.9(S) for the aggregate loss distribution S.
Discretizing
19.8. X has an exponential distribution with mean 1. Calculate p2 of the distribution discretized using the method of rounding with a span of 1.
19.9. Claim counts follow a Poisson distribution with mean 3. Claim sizes follow an exponential distribution with θ = 2. Claim counts and claim sizes are independent. The severity distribution is discretized using the method of rounding with span 1. A stop-loss reinsurance contract has an aggregate deductible of 1.6. Calculate expected losses paid by the reinsurance contract.
19.10. X has a single-parameter Pareto distribution with θ = 2, α = 1. The distribution of X is discretized using the method of local moment matching with span 2, matching the first moment. Calculate p4.

Exercises continue on the next page . . .

19. AGGREGATE LOSSES: MISCELLANEOUS TOPICS

334

Use the following information for questions 19.11 and 19.12: You are given: (i) Claims counts follow a negative binomial distribution with r  2, β  0.5. (ii) Claim sizes follow an exponential distribution with θ  3. (iii) Claim counts and claim sizes are independent. 19.11. The severity distribution is discretized using the method of local moment matching with span 1. Calculate FS (1) . 19.12. Using the actual severity distribution, calculate FS (1) . Additional released exam questions: CAS3-S06:36

Solutions 19.1.

Let N be the number of claims and S aggregate claims. If there is one claim Pr ( S > 40 | N  1)  Φ

25 − 40  Φ (−2.12)  0.0170 √ 50

!

If there are two claims, their sum is normal with mean 50 and variance 100, so the probability that aggregate losses are greater than 40 is 50 − 40 Pr ( S > 40 | N  2)  Φ √  Φ (1)  0.8413 100

!

The probability aggregate claims are greater than 40 is 0.3 (0.0170) + 0.2 (0.8413)  0.1734 . 19.2. Mean aggregate claims is 3 (0.4)(100)  120. Conditional probability of aggregate claims greater than 120 for each number of claims N is 120 − 100  Φ (−0.63)  0.2643 √ 1000 ! 120 − 200 Pr ( S > 120 | N  2)  1 − Φ √  Φ (1.79)  0.9633 2000 ! 120 − 300 Pr ( S > 120 | N  3)  1 − Φ √  Φ (3.29)  0.9995 3000

Pr ( S > 120 | N  1)  1 − Φ

!

The probabilities of 1, 2, 3 claims are p1  3 (0.62 )(0.4)  0.432, p2  3 (0.42 )(0.6)  0.288, and p 3  0.43  0.064. The probability aggregate losses are greater than their mean is 0.432 (0.2643) + 0.288 (0.9633) + 0.064 (0.9995)  0.4556 . 19.3.

If there is one claim, the probability that aggregate losses are greater than 3000 is Pr ( S > 3000 | N  1)  e −3000/5000  0.548812

If there are two claims, aggregate losses are Erlang with n  2 and Pr ( S > 3000 | N  2)  e −3000/5000 (1 + 0.6)  0.878099 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 19

335

The distribution of claim counts is p 1  2 (0.2)(0.8)  0.32 and p2  0.22  0.04. The probability that aggregate losses are greater than 3000 is Pr ( S > 3000)  0.32 (0.548812) + 0.04 (0.878099)  0.2107 19.4. The probability of 0 is 0.7. The probability that one claim is less than 3 is the probability that an exponential with mean 5 is less than 3, or 1 − e −3/5  0.45119. The probability that two claims are less than 3 is the probability that an Erlang with parameters n  2 and θ  5 is less than 3. That probability is the same as the probability that at least two events occur by time 3 in a Poisson process with parameter 1/5, or 1 − e −0.6 (1 + 0.6)  0.12190 Summing up the three probabilities, FS (3)  0.7 + 0.2 (0.45119) + 0.1 (0.12190)  0.80243 19.5. For the block of 50 policies, claim counts follow a negative binomial with r  50 (0.02)  1 and β  25, or a geometric. The mean and variance of the geometric are β  25 and β (1 + β )  650 respectively. Letting S be aggregate losses for the entire block, E[S]  25 (1500)  37,500. By the √ compound variance formula, Var ( S )  25 (15002 ) + 650 (15002 ) , and the standard deviation is σ  1500 25 + 650  38,971. Then 37,500 − 0.842 (38,971)  4686 and 37,500 + 0.842 (38,971)  70,314, so we want the probability of 4686 ≤ S ≤ 70,314. β In the equivalent Bernoulli/exponential distribution, the Bernoulli has parameter (1+β )  25 26 and the exponential has parameter θ (1 + β )  (26)(1500)  39,000. The probabilities of the aggregate distribution being below 4686 and 70,314 are 25 −4686/39,000 e  0.1473 26 25 FS (70,314)  1 − e −70,314/39,000  0.8415 26 FS (4686)  1 −

So the probability of being in the interval is 0.8415 − 0.1473  0.6942 . If the normal approximation had been used, the probability would be 0.6 since Φ (0.842)  0.8. 19.6. For each life, rβ  0.1 and rβ (1 + β )  1.1, so β  10 and r  0.01. For the group, r  100 (0.01)  1 and β  10, making the distribution geometric. f g In order to make the cost 1500, expected claim costs, E ( S − d )+ , must be 1000. We have

f

g

∞

Z

E ( S − d )+ 

d

1 − FS ( x ) dx



∞ β  e −x/θ (1+β ) dx 1+β d Z ∞ 10  e −x/5500 dx 11 d  10   5500e −d/5500 11

Z

C/4 Study Manual—17th edition Copyright ©2014 ASM

19. AGGREGATE LOSSES: MISCELLANEOUS TOPICS

336

 5000e −d/5500 We set E ( S − d )+ equal to 1000.

f

g

5000e −d/5500  1000 e −d/5500  0.2 d  −5500 ln 0.2  8852 19.7. The aggregate loss distribution is equivalent to Bernoulli claim counts with q  β/ (1+β )  0.2/1.2  1/6 and exponential claim sizes with mean θ (1 + β )  8000 (1.2)  9600. To find the 90th percentile of S, since the probability of a loss is 1/6, we need x for which Pr ( S > x )  0.1 and Pr ( S > x )  Pr ( N  1) Pr ( X > x )  61 Pr ( X > x ) , so we need Pr ( X > x )  0.6. Thus e −x/9600  0.6 x  −9600 ln 0.6  4903.93 The average value of S given S > 4903.93, due to lack of memory of the exponential, is 9600, so TVaR0.9 ( S )  4903.93 + 9600  14,503.93 . 19.8.

The interval [1.5, 2.5) goes to 2, so p2  e −1.5 − e −2.5  0.1410 .

19.9. The discretized distribution will have p0  Pr ( X < 0.5)  1 − e −0.5/2  0.221199 and p1  e −0.5/2 − e −1.5/2  0.306434. One problem with the method of rounding is that the mean of the discretized distribution is not the same as the mean of the original distribution. Fortunately, it is easy to calculate for an exponential (but not for other distributions). Since p k  e − ( k−0.5)/2 − e − ( k+0.5)/2 except for p0 , we have E[X] 

∞ X

kp k

k0

 e −1/4 − e −3/4 + 2 e −3/4 − e −5/4 + 3 e −5/4 − e −7/4 + · · ·









 e −1/4 + e −3/4 + e −5/4 + · · · 

e −1/4  1.979318 1 − e −1/2

Expected aggregate losses are E[S]  3 (1.979318)  5.937953. We’ll modify the Poisson to be the frequency of claims above 0, so that we don’t have to use the recursive formula; we’ll make the parameter λ0  λ (1 − p0 )  3 (1 − 0.221199)  2.336402. Then, letting p n be the probability of n claims, 0

p0  e −λ  e −2.336402  0.0966745 0

p1  λ0 e −λ  2.336402 (0.0966745)  0.225871 The modified probability of a claim size of 1 (given that the claim size is not 0) is f1  0.306434/ (1 − 0.221199)  0.393469. Letting g n be the probability that aggregate losses are n, g0  p0  0.0966745 g1  p1 f1  (0.225871)(0.393469)  0.0888734 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 19

337

Thus the survival function of aggregate losses is SS (0)  1 − 0.0966745  0.903325

SS (1)  0.903325 − 0.0888734  0.814452

Then E[S∧1.6]  0.903325+0.6 (0.814452)  1.391996. The expected losses paid by the reinsurance contract are E[S] − E[S ∧ 1.6]  5.937953 − 1.391996  4.545957 19.10. We’ll use the textbook’s formulas for the masses, with x1  2 and x2  4. m11

Z 

2

Z 

2

Z 

4

x−2 f ( x ) dx 2

4

x−2 2

4

! !

2 dx x2

!

( x − 2) dx x2

2

2 4 x 2  ln 4 + 12 − ln 2 − 1  ln 2 −  ln x +

Z

m02  −

4 6

Z 

4

6

x−4−2 2

!

1 2

 0.19315

2 dx x2

!

(6 − x ) dx x2

6 6 − ln x x 4  −1 − ln 6 + 23 + ln 4 −



1 2

+ ln 32  0.09453

So p4  0.19315 + 0.09453  0.28768 . Alternatively, we can use formulas (19.2). For a single-parameter Pareto with α  1, x ≥ θ, x

Z E[X ∧ x] 

0

S ( x ) dx 

θ

Z 0

1dx +

x

Z θ

x dx x  θ 1 + ln θ θ





so E[X ∧ 2]  2

E[X ∧ 4]  2 (1 + ln 2) E[X ∧ 6]  2 (1 + ln 3)

2 2 (1 + ln 2) − 2 − 2 (1 + ln 3)



p4 

C/4 Study Manual—17th edition Copyright ©2014 ASM



2

 2 ln 2 − ln 3  0.28768

19. AGGREGATE LOSSES: MISCELLANEOUS TOPICS

338

19.11. We’ll use formulas (19.2) for the probabilities of severities. E[X ∧ 1]  3 1 − e −1/3  0.85041





E[X ∧ 2]  3 1 − e −2/3  1.45975





f0  1 − 0.85041  0.14959

f1  2 (0.85041) − 1.45975  0.24106 We’ll modify the negative binomial to have non-zero claims only by multiplying β by 1 − f0 , obtaining β0  (0.5)(1 − 0.14959)  0.42520. We must also modify the probabilities of severities to make them conditional on severity not being zero by dividing by 1 − f0 ; the modified f1 is 0.24106/ (1 − 0.14959)  0.28346. Then 1 Pr ( S  0)  1 + β0

!r

1  1.42520

!2

 0.49232

rβ0 2 (0.42520)  (0.49232)  0.29376 p1  p0 0 1+β 1.42520

!

!

Pr ( S  1)  (0.28346)(0.29376)  0.08327 Then FS (1)  0.49232 + 0.08327  0.57559 . 19.12. In the equivalent binomial/exponential compound model, the binomial has parameters m  2, q  β/ (1 + β )  1/3 and the exponential has parameter θ (1 + β )  3 (1.5)  4.5. The probability of 0 claims is (2/3) 2  4/9. In the equivalent model: • •

The probability of 1 claim is 2 (2/3)(1/3)  4/9. If there is one claim, the probability of the claim being less than 1 is 1 − e −1/4.5  0.19926.

The probability of 2 claims is 1/9. If there are 2 claims, the sum of the two claims is Erlang with parameters n  2, θ  4.5. The probability of the sum being less than 1 is the probability of at least two events by time 1 in a Poisson process with parameter 1/4.5, or 1 − e −1/4.5 (1 + 1/4.5)  0.02132.

Summing up these probabilities, FS ( 1 ) 

4 4 1 + (0.19926) + (0.02132)  0.53537 9 9 9

The discretized estimate was 7.5% too high.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 20

Supplementary Questions: Severity, Frequency, and Aggregate Loss 20.1. Loss sizes follow a spliced distribution. In the range (0, 100) , the probability density function is of the form f ( x )  c 1 e −x/θ1 . In the range (100, ∞) , the probability density function is of the form f ( x )  c 2 θ22 / ( θ2 + x ) 3 . The parameters c 1 , c 2 , θ1 , and θ2 are chosen so that F (50)  0.5 F (100)  0.7 F (200)  0.9 Determine F (150) . (A) 0.80 20.2.

(B) 0.81

(C) 0.82

(D) 0.83

(E) 0.84

For a zero-modified random variable N from the ( a, b, 1) class, you are given

(i) Pr ( N  1)  0.6 (ii) Pr ( N  2)  0.18 (iii) Pr ( N  3)  0.072 Determine Pr ( N  0) . (A) 0.06 20.3.

(B) 0.07

(C) 0.08

(D) 0.09

(E) 0.10

(C) 237

(D) 244

(E) 250

For loss size, you are given:

(i) (ii)

α , h ( x )  100+x E[X]  50.

x > 0.

Calculate TVaR0.9 ( X ) . (A) 223

(B) 230

20.4. Claim costs for an insurance coverage follow a gamma distribution with parameters α  4 and θ  10. Claim adjustment costs are a proportion of the claim costs. The proportion is uniformly distributed on (0.05, 0.15) , and is independent of claim costs. Determine the variance of total claim costs including claim adjustment costs. (A) 483

(B) 484

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 485

(D) 486

339

(E) 487

Exercises continue on the next page . . .

20. SUPPLEMENTARY QUESTIONS: SEVERITY, FREQUENCY, AND AGGREGATE LOSS

340

20.5. Loss counts follow a negative binomial distribution with r  2, β  5. Loss sizes follow an inverse exponential distribution with θ  10. Let N be the number of losses of amounts less than 20. Determine the coefficient of variation of N. (A) 0.40 20.6.

(B) 0.60

(C) 0.64

(D) 0.66

(E) 0.82

For each of five tyrannosaurs with a taste for scientists:

(i) The number of scientists eaten has a binomial distribution with parameters m  1, q  0.6. (ii) The number of calories of a scientist is uniformly distributed on (7000, 9000) . (iii) The number of calories of each scientist is independent of the others and independent of the number of scientists eaten. Determine the probability that two or more scientists are eaten and that no more than two have at least 8000 calories each. (A) 0.50 20.7.

(B) 0.60

(C) 0.63

(D) 0.65

(E) 0.75

The conditional hazard rate of a random variable X given Θ is h ( x | Θ)  0.1Θ

The probability density function of Θ is f ( θ )  1002 θe −100θ

θ>0

Calculate the median of X. (A) (B) (C) (D) (E)

Less than 150 At least 150, but less than 250 At least 250, but less than 350 At least 350, but less than 450 At least 450

20.8. The number of snowstorms in January has a binomial distribution with m  8, q  0.5. The distribution of the number of inches of snow is: Inches

Probability

1 2 3 4 5 6

0.2 0.3 0.2 0.1 0.1 0.1

The number of snowstorms and the number of inches of snow are independent. Determine the expected amount of snow in January given that at least 4 inches of snow fall. (A) 11.7

(B) 11.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 11.9

(D) 12.0

(E) 12.1

Exercises continue on the next page . . .

20. SUPPLEMENTARY QUESTIONS: SEVERITY, FREQUENCY, AND AGGREGATE LOSS

20.9.

341

You are given:

(i) The number of claims follows a binomial distribution with m  3, q  0.2. (ii) Claim sizes follow the following distribution:

(iii)

Claim size

Claim probability

0 1 2 3

0.2 0.5 0.2 0.1

A reinsurance policy has an aggregate deductible of 6.

Determine the expected aggregate amount paid by the reinsurer. (A) 0.000300

(B) 0.000312

(C) 0.000324

(D) 0.000336

(E) 0.000348

20.10. You are given: (i) Claim counts follow a negative binomial distribution with r  0.5, β  1 per year. (ii) Claim sizes follow a two-parameter Pareto distribution with α  3, θ  1000. (iii) Claim counts and claim sizes are independent. Using the normal approximation, determine the probability that annual aggregate claims are less than 150. (A) 0.15

(B) 0.25

(C) 0.35

(D) 0.45

(E) 0.55

20.11. For an insurance coverage loss sizes follow a Pareto distribution and are independent of the deductible. You are given: (i) With an ordinary deductible of 100, average payment size per paid claim is 2600. (ii) With an ordinary deductible of 500, average payment size per paid claim is 2800. Calculate the average payment size per paid claim for a policy with a franchise deductible of 1000. (A) (B) (C) (D) (E)

Less than 3000 At least 3000, but less than 3500 At least 3500, but less than 4000 At least 4000, but less than 4500 At least 4500

20.12. Losses follow a Pareto distribution with parameters θ  1000 and α. The loss elimination ratio at 600 is 0.4. Determine α. (A) (B) (C) (D) (E)

Less than 1.9 At least 1.9, but less than 2.0 At least 2.0, but less than 2.1 At least 2.1, but less than 2.2 At least 2.2

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

20. SUPPLEMENTARY QUESTIONS: SEVERITY, FREQUENCY, AND AGGREGATE LOSS

342

20.13. For a random variable N following zero-modified negative binomial distribution, r  2, Pr ( N  0)  0.8, and Pr ( N  1)  0.02. Determine Pr ( N  2) . (A) 0.0125

(B) 0.0150

(C) 0.0175

(D) 0.0200

(E) 0.0225

20.14. Earned premium for an insurance coverage is 10,000. An agent gets a bonus of 20% of the amount by which losses are below the level generating a loss ratio of x, but the bonus may not be less than 0. The loss ratio is the ratio of losses to earned premium. Losses follow a Pareto distribution with parameters α  2, θ  6000. The expected value of the bonus is 500. Determine x. (A) 53%

(B) 55%

(C) 58%

(D) 61%

(E) 63%

20.15. At a train station, two train lines stop there, the A and the C. You take the first train that arrives. The probability that the A comes first is 50%. The number of friends you meet on the train, given the train line, has the following distribution: Number of friends

Probability for A train C train

0 1 2 3

0.5 0.2 0.2 0.1

0.6 0.3 0.1 0

Let X be the number of friends you meet. Which of the following intervals constitutes the range of all 80th percentiles of X? (A) [1, 1]

(B) [2, 2]

(C) [1, 2)

(D) (1, 2]

(E) [1, 2]

20.16. Losses follow a lognormal distribution with µ  3, σ  2. Calculate the Tail-Value-at-Risk for losses at the 90% security level. (A) 539

(B) 766

(C) 951

(D) 1134

(E) 1301

20.17. Earned premium for an insurance coverage is 7,500. An agent gets a bonus of 50% of the amount by which losses are below the level generating a loss ratio of 60% but not less than 0, where the loss ratio is the ratio of losses to earned premium. Losses on an insurance coverage follow a Pareto distribution with parameters α  1, θ  5000. Determine the expected value of the bonus. (A) 382

(B) 645

(C) 764

(D) 1068

(E) 1605

20.18. An agent sells 10,000 of premium. He gets a bonus of 20% of the premium times the excess of 80% over the loss ratio, but not less than 0, where the loss ratio is the quotient of losses over premium. Losses follow a single-parameter Pareto distribution with θ  4000, α  3. Calculate the variance of the agent’s bonus. (A) 70,000

(B) 140,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 210,000

(D) 280,000

(E) 350,000 Exercises continue on the next page . . .

20. SUPPLEMENTARY QUESTIONS: SEVERITY, FREQUENCY, AND AGGREGATE LOSS

343

20.19. The random variable X follows a Pareto distribution with parameters α  2 and θ  4. X e is a random variable having the equilibrium distribution for X. Calculate FX e (5) . (A)

4 81

(B)

16 81

(C)

25 81

(D)

5 9

(E)

65 81

Solutions 20.1. [Section 4.3] F (50) is extraneous, since we only need the distribution function from 100 on, which is a multiple of a Pareto; let k be the multiplier. To match F (100) and F (200) , we need θ 0.3  S (100)  k θ + 100 θ 0.1  S (200)  k θ + 200

!2 !2

Dividing the second equation into the first,

!2

θ + 200 3 θ + 100 θ + 200 √  3 θ + 100 √ √ θ 3 + 100 3  θ + 200

θ θ + 200

So F (150)  1 − k 20.2.

θ θ + 150

!2

√ 200 − 100 3  36.6025 θ √ 3−1

!2

!2

36.6025   0.02393 236.6025 0.1 k  4.179 0.02393

 1 − 4.179

36.6025 186.6025

!2

 1 − 0.16077  0.83923

20.2. [Section 11.2] Back out a and b:
$$a + \frac{b}{2} = \frac{0.18}{0.6} = 0.3 \qquad a + \frac{b}{3} = \frac{0.072}{0.18} = 0.4$$
Subtracting the second equation from the first,
$$\frac{b}{6} = -0.1 \qquad b = -0.6 \qquad a - 0.3 = 0.3 \qquad a = 0.6$$
Then β/(1 + β) = 0.6, so β = 1.5, and since b = −a, then r − 1 = −1 and r = 0, making N logarithmic. For a logarithmic distribution,
$$p_1^T = \frac{\beta}{(1+\beta)\ln(1+\beta)} = \frac{0.6}{\ln 2.5} = 0.654814$$
and since $p_1^M = (1 - p_0^M)\,p_1^T$, it follows that $1 - p_0^M = 0.6/0.654814$ and $p_0^M = 0.0837$. (C)

20.3. [Section 8.3] The survival function is developed as follows:
$$H(x) = \int_0^x h(u)\,du = \int_0^x \frac{\alpha\,du}{100+u} = \alpha\ln(100+u)\Big|_0^x = \alpha\ln\frac{100+x}{100}$$
$$S(x) = e^{-H(x)} = \left(\frac{100}{100+x}\right)^{\alpha}$$
We recognize this as a two-parameter Pareto with θ = 100. Since E[X] = 50, then θ/(α − 1) = 50, so α = 3. Using the tables,
$$\mathrm{VaR}_{0.9}(X) = 100\left(0.1^{-1/3} - 1\right) = 115.44$$
$$\mathrm{TVaR}_{0.9}(X) = 115.44 + \frac{100 + 115.44}{2} = 223.17 \quad \text{(A)}$$

20.4. [Lesson 3] Let X be claim costs and Y the proportion of claim adjustment costs. We want the variance of X(1 + Y). We will compute the first and second moments, then the variance.
$$E[X(1+Y)] = E[X]\,E[1+Y] = (\alpha\theta)(1+0.1) = (4)(10)(1.1) = 44$$
$$E\big[(X(1+Y))^2\big] = E[X^2]\,E[(1+Y)^2]$$
$$E[X^2] = \alpha(\alpha+1)\theta^2 = (4)(5)(10^2) = 2000$$
$$E[(1+Y)^2] = E[1+Y]^2 + \mathrm{Var}(1+Y) = 1.1^2 + \frac{0.1^2}{12} = 1.210833$$
because the variance of a uniform distribution is the range squared divided by 12. Then
$$E\big[(X(1+Y))^2\big] = 2000(1.210833) = 2421.667$$
$$\mathrm{Var}\big(X(1+Y)\big) = 2421.667 - 44^2 = 485.667 \quad \text{(D)}$$
An alternative solution¹ uses conditional variance, and also uses the fact that the gamma distribution is a scale distribution with scale parameter θ, so total costs follow a gamma distribution with parameters α = 4 and θ = u, where u is uniformly distributed on (10.5, 11.5), since it is 10 times a random variable that is uniform on (1.05, 1.15). Let Z be total claim costs.
$$\mathrm{Var}(Z) = E[\mathrm{Var}(Z \mid u)] + \mathrm{Var}(E[Z \mid u]) = E[4u^2] + \mathrm{Var}(4u)$$
The second moment of u is
$$\int_{10.5}^{11.5} v^2\,dv = \frac{11.5^3 - 10.5^3}{3}$$
so
$$\mathrm{Var}(Z) = 4\left(\frac{11.5^3 - 10.5^3}{3}\right) + 4^2\left(\frac{1}{12}\right) = 485.667$$

¹Suggested by Francois Demers-Telmosse

20.5. [Lesson 13] For an inverse exponential,
$$F(20) = e^{-\theta/20} = e^{-10/20} = 0.60653$$
The coverage modification multiplies β by this factor, so β = 5(0.60653) = 3.0327. Then the coefficient of variation is the standard deviation over the mean, or
$$\frac{\sqrt{r\beta(1+\beta)}}{r\beta} = \sqrt{\frac{1+\beta}{r\beta}} = \sqrt{\frac{4.0327}{2(3.0327)}} = 0.8154 \quad \text{(E)}$$

20.6. [Lesson 11] For the combination of 5 tyrannosaurs, the distribution is binomial(5, 0.6). Each scientist has a 0.5 probability of being over 8000 calories. There are 4 ways to satisfy the conditions:

1. Eat 2 scientists, probability $\binom{5}{2}(0.6^2)(0.4^3) = 0.2304$.

2. Eat 3 scientists, at least one below 8000 calories. Probability of 3 is $\binom{5}{3}(0.6^3)(0.4^2) = 0.3456$. Probability that all 3 are above 8000 calories is $0.5^3$. Multiplying, $(0.3456)(1 - 0.5^3) = 0.3024$.

3. Eat 4 scientists, at least 2 below 8000 calories. Probability of 4 is $\binom{5}{4}(0.6^4)(0.4) = 0.2592$. Probability that 3 or 4 out of 4 are above 8000 calories is $\binom{4}{3}(0.5^4) + \binom{4}{4}(0.5^4) = \frac{5}{16}$. Multiplying, $(0.2592)(1 - \frac{5}{16}) = 0.1782$.

4. Eat 5 scientists, at least 3 below 8000 calories. Probability of 5 is $0.6^5 = 0.07776$. Probability that 3, 4, or 5 scientists are at least 8000 is $\frac{1}{2}$. Multiplying, $0.07776(1 - \frac{1}{2}) = 0.03888$.

The total probability is 0.2304 + 0.3024 + 0.1782 + 0.03888 = 0.74988. (E)
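The case-by-case logic above follows a single pattern: if k scientists are eaten, the solution requires at least k − 2 of them to be below 8000 calories. A short illustrative sketch (not from the manual) that enumerates all the cases:

```python
# Enumeration check of solution 20.6: K eaten ~ binomial(5, 0.6); given K, the
# number below 8000 calories is binomial(K, 0.5); require at least K - 2 below.
from math import comb

total = 0.0
for k in range(2, 6):
    p_k = comb(5, k) * 0.6**k * 0.4**(5 - k)
    p_below = sum(comb(k, j) for j in range(max(k - 2, 0), k + 1)) * 0.5**k
    total += p_k * p_below
print(total)   # 0.74988
```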

20.7. [Subsection 4.1.3] The conditional survival function is
$$S(x \mid \theta) = e^{-0.1\theta x}$$
and integrating over Θ,
$$S(x) = E[e^{-0.1\Theta x}] = M_{\Theta}(-0.1x)$$
Since Θ's distribution is gamma with α = 2 and θ = 0.01,
$$S(x) = (1 + 0.001x)^{-2}$$
Setting this equal to 0.5,
$$(1 + 0.001x)^2 = 2 \qquad 0.001x = \sqrt{2} - 1 \qquad x = 1000(\sqrt{2} - 1) = 414.2 \quad \text{(D)}$$

20.8. [Lesson 17] Let X be the number of inches of snow per snowstorm. Then
$$E[X] = 0.2 + 2(0.3) + 3(0.2) + 4(0.1) + 5(0.1) + 6(0.1) = 2.9$$
The average number of snowstorms is (8)(0.5) = 4. The average amount of snow is 4(2.9) = 11.6. We need the probabilities of 0, 1, 2, and 3 inches of snow. We will calculate them directly, although the recursive formula could also be used. First we calculate the binomial probabilities for the number of snowstorms.
$$p_0 = 0.5^8 \qquad p_1 = \binom{8}{1}(0.5^8) = 8(0.5^8) \qquad p_2 = \binom{8}{2}(0.5^8) = 28(0.5^8) \qquad p_3 = \binom{8}{3}(0.5^8) = 56(0.5^8)$$
Then for the aggregate distribution of the number of inches of snow,
$$g_0 = 0.5^8 = 0.00390625$$
$$g_1 = 8(0.5^8)(0.2) = 0.00625$$
$$g_2 = 0.5^8\big[(28)(0.2^2) + 8(0.3)\big] = 0.01375$$
$$g_3 = 0.5^8\big[(56)(0.2^3) + 28(2)(0.3)(0.2) + 8(0.2)\big] = 0.021125$$
The probability of 4 or more inches is $1 - g_0 - g_1 - g_2 - g_3 = 0.95496875$. The expected amount of snow, treating years with less than 4 inches as contributing zero, is
$$11.6 - 0.00625 - 0.01375(2) - 0.021125(3) = 11.502875$$
The expected amount of snow conditioned on 4 inches or more is the quotient, or $\frac{11.502875}{0.95496875} = 12.0453$. (D)
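The direct calculation generalizes to a full convolution over all severity amounts. Here is an illustrative sketch (not from the manual) using the severity pmf and binomial(8, 0.5) storm count from the solution:

```python
# Numeric check of solution 20.8 by full convolution of the compound distribution.
import numpy as np
from math import comb

n, q = 8, 0.5
sev = np.array([0, 0.2, 0.3, 0.2, 0.1, 0.1, 0.1])  # f(1)..f(6) inches per storm

g = np.zeros(n * 6 + 1)        # aggregate pmf of total inches S
conv = np.array([1.0])         # k-fold convolution of sev, starting with f^{*0}
for k in range(n + 1):
    p_k = comb(n, k) * q**k * (1 - q)**(n - k)
    g[:len(conv)] += p_k * conv
    conv = np.convolve(conv, sev)

print(g[:4])                   # 0.00390625, 0.00625, 0.01375, 0.021125
s = np.arange(len(g))
print((g[4:] * s[4:]).sum() / g[4:].sum())   # E[S | S >= 4] = 12.0453...
```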

20.9. [Lesson 18] It's easiest to calculate this directly: the probability of aggregate claims of 7 times 1, plus the probability of 8 times 2, plus the probability of 9 times 3. Three claims are needed for any reinsurance payment, and the probability of 3 claims is $0.2^3 = 0.008$, which we'll multiply at the end. Aggregate claims of 7 can be obtained by two 3's and one 1: $3(0.1^2)(0.5) = 0.015$; or by two 2's and one 3: $3(0.2^2)(0.1) = 0.012$. Aggregate claims of 8 can be obtained by two 3's and one 2: $3(0.1^2)(0.2) = 0.006$. Aggregate claims of 9 can be obtained by three 3's: $0.1^3 = 0.001$. The answer is:
$$0.008\big[(0.015 + 0.012) + 2(0.006) + 3(0.001)\big] = 0.008(0.042) = 0.000336 \quad \text{(D)}$$

20.10. [Lesson 15] E[N] = 0.5, Var(N) = 0.5(1)(2) = 1, E[X] = 500, E[X²] = 1,000,000, Var(X) = 750,000.
$$E[S] = (0.5)(500) = 250$$
$$\mathrm{Var}(S) = (0.5)(750{,}000) + (1)(500^2) = 625{,}000$$
$$\frac{150 - E[S]}{\sqrt{\mathrm{Var}(S)}} = \frac{-100}{790.569} = -0.126$$
$$\Phi(-0.126) \approx \Phi(-0.13) = 1 - 0.5517 = 0.4483 \quad \text{(D)}$$
Note: the normal approximation would not be used in real life in this example due to its high probability of being below 0.

20.11. [Lesson 6] The mean excess loss at d for a Pareto is $\frac{\theta+d}{\alpha-1}$, so
$$\frac{\theta + 100}{\alpha - 1} = 2600 \qquad \frac{\theta + 500}{\alpha - 1} = 2800$$
Subtracting the first equation from the second,
$$\frac{400}{\alpha - 1} = 200 \qquad \alpha = 3 \qquad \theta = 5100$$
For an ordinary deductible of 1000, the mean excess loss is $\frac{5100+1000}{2} = 3050$. Add to this 1000, since under a franchise deductible the entire loss is paid once it is above the deductible, and we get the answer 4050. (D)

20.12. [Lesson 7] For a Pareto, the loss elimination ratio (dividing E[X ∧ d] by E[X]) is $1 - \left(\frac{\theta}{\theta+x}\right)^{\alpha-1}$. Therefore
$$1 - \left(\frac{1000}{1600}\right)^{\alpha-1} = 0.4$$
$$(\alpha - 1)\ln\tfrac{5}{8} = \ln 0.6 \qquad \alpha - 1 = \frac{\ln 0.6}{\ln\frac{5}{8}} = 1.0869$$
$$\alpha = 2.0869 \quad \text{(C)}$$

20.13. [Lesson 11] From the tables,
$$p_1^T = \frac{r\beta}{(1+\beta)^{r+1} - (1+\beta)}$$
and since $p_0 = 0.8$, the modified probability is $0.02 = p_1^M = p_1^T(1 - 0.8) = 0.2\,p_1^T$, so
$$\frac{0.02}{0.2} = 0.1 = \frac{2\beta}{(1+\beta)^3 - (1+\beta)} = \frac{2}{\beta^2 + 3\beta + 2}$$
$$\beta^2 + 3\beta + 2 = 20 \qquad \beta^2 + 3\beta - 18 = 0 \qquad \beta = \frac{-3 + \sqrt{3^2 + 72}}{2} = 3$$
The negative solution to the quadratic is rejected since β can't be negative. Now, $a = \frac{\beta}{1+\beta} = \frac{3}{4} = 0.75$ and $b = (r-1)a = 0.75$, so
$$p_2^M = \left(a + \frac{b}{2}\right)p_1^M = \frac{9}{8}\,p_1^M = \frac{9}{8}(0.02) = 0.0225 \quad \text{(E)}$$

20.14. [Lesson 10] If X is the loss random variable, the bonus is $0.2\max(0,\,10{,}000x - X)$ and its expected value is
$$0.2\,E[\max(0,\,10{,}000x - X)] = 0.2\big(10{,}000x - E[X \wedge 10{,}000x]\big) = 500$$
We divide by 0.2 and by 1000 (to make the numbers easier to handle).
$$10x - 0.001\left(\frac{\theta}{\alpha-1}\right)\left(1 - \frac{\theta}{10{,}000x+\theta}\right) = 2.5$$
$$10x - 6\left(1 - \frac{6000}{10{,}000x + 6000}\right) = 2.5$$
$$10x - 6\left(\frac{10x}{10x+6}\right) = 2.5$$
$$100x^2 + 60x - 60x = 25x + 15$$
$$100x^2 - 25x - 15 = 0$$
$$x = \frac{25 + \sqrt{625 + 6000}}{200} = 0.5320 \quad \text{(A)}$$

20.15. [Section 1.2] Since the trains are equally likely, the joint probabilities of the number of friends are 0.5 times the conditional probabilities, and adding these up gives the marginal probabilities of meeting each number of friends:

    Number of    Joint probability    Total         Cumulative
    friends        A        C         probability   probability
       0          0.25     0.30       0.55          0.55
       1          0.10     0.15       0.25          0.80
       2          0.10     0.05       0.15          0.95
       3          0.05     0          0.05          1.00

Thus the cumulative distribution function F(x) is 0.8 for x ∈ [1, 2). Thus for any x ∈ [1, 2], Pr(X < x) ≤ 0.8 and Pr(X ≤ x) ≥ 0.8, and any x ∈ [1, 2] is an 80th percentile. (E)

20.16. [Lesson 8] The 90th percentile of the lognormal is $e^{3 + 1.282(2)} = e^{5.564}$. The partial expectation for $x > e^{5.564}$ is
$$e^{\mu + 0.5\sigma^2}\left(1 - \Phi\left(\frac{\ln x - \mu - \sigma^2}{\sigma}\right)\right) = e^{3 + 0.5(2^2)}\left(1 - \Phi\left(\frac{5.564 - 3 - 2^2}{2}\right)\right)$$
$$= 148.413\big(1 - \Phi(-0.72)\big) = 148.413(0.7642) = 113.4$$
Dividing by the probability of being above the 90th percentile (0.1), we get 1134 as the final answer. (D)
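The lognormal TVaR computation is easy to cross-check numerically. An illustrative sketch (not from the manual) using scipy for Φ and its inverse:

```python
# Numeric check of solution 20.16: TVaR at the 90% level, lognormal mu=3, sigma=2.
from math import exp
from scipy.stats import norm

mu, sigma, p = 3.0, 2.0, 0.90
z = norm.ppf(p)                                    # 1.2816
var_p = exp(mu + sigma * z)                        # VaR_0.90 = e^5.563
partial = exp(mu + 0.5 * sigma**2) * (1 - norm.cdf(z - sigma))
print(partial / (1 - p))   # ~1133.5; the manual's 1134 comes from rounding
                           # Phi(-0.72) to table values -- answer (D) either way
```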

20.17. [Lesson 10] The bonus is
$$0.5\max\big(0,\,0.6(7500) - X\big) = 0.5\big(4500 - \min(X, 4500)\big) = 2250 - 0.5(X \wedge 4500)$$
and its expected value is
$$2250 - 0.5\,E[X \wedge 4500] = 2250 + 0.5\theta\ln\left(\frac{\theta}{4500+\theta}\right) = 2250 + 2500\ln\left(\frac{5000}{9500}\right) = 2250 - 1604.63 = 645.37 \quad \text{(B)}$$

20.18. [Lesson 10] If X is losses, then the bonus is
$$\max\left(0,\,0.2(10{,}000)\left(0.8 - \frac{X}{10{,}000}\right)\right) = 0.2\max(0,\,8000 - X) = 0.2\big(8000 - X \wedge 8000\big)$$
The variance of the parenthesized expression, since 8000 has no variance, is the variance of X ∧ 8000. Using the tables, we calculate the moments.
$$E[X \wedge 8000] = \frac{\alpha\theta}{\alpha-1} - \frac{\theta^\alpha}{(\alpha-1)\,8000^{\alpha-1}} = \frac{3(4000)}{2} - \frac{4000^3}{2(8000^2)} = 5500$$
$$E[(X \wedge 8000)^2] = \frac{\alpha\theta^2}{\alpha-2} - \frac{2\theta^\alpha}{(\alpha-2)\,8000^{\alpha-2}} = \frac{3(4000^2)}{1} - \frac{2(4000^3)}{8000} = 32{,}000{,}000$$
$$\mathrm{Var}(X \wedge 8000) = 32{,}000{,}000 - 5500^2 = 1{,}750{,}000$$
The variance of 0.2(X ∧ 8000) is $0.2^2(1{,}750{,}000) = 70{,}000$. (A)
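The limited moments can also be cross-checked by integrating the survival function, since $E[X \wedge u] = \int_0^u S(x)\,dx$ and $E[(X \wedge u)^2] = \int_0^u 2x\,S(x)\,dx$. An illustrative sketch (not from the manual), using the single-parameter Pareto survival function $S(x) = (\theta/x)^\alpha$ for $x \ge \theta$:

```python
# Numeric cross-check of solution 20.18 via integration of the survival function.
from scipy.integrate import quad

theta, alpha, u = 4000.0, 3.0, 8000.0
S = lambda x: 1.0 if x < theta else (theta / x) ** alpha

m1 = quad(S, 0, u, points=[theta])[0]                   # E[X ^ 8000] = 5500
m2 = quad(lambda x: 2 * x * S(x), 0, u, points=[theta])[0]   # 32,000,000
print(m1, m2, 0.2**2 * (m2 - m1**2))                    # variance of bonus: 70,000
```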

20.19. [Section 8.4] $E[X] = \frac{\theta}{\alpha-1} = \frac{4}{2-1} = 4$, while $S(x) = \left(\frac{\theta}{\theta+x}\right)^\alpha = \left(\frac{4}{4+x}\right)^2$. Then the equilibrium distribution has density function
$$f_e(x) = \frac{S(x)}{E[X]} = \frac{\big(4/(4+x)\big)^2}{4} = \frac{4}{(4+x)^2}$$
which you should recognize as the density function of a two-parameter Pareto with α = 1, θ = 4, so
$$F_{X_e}(5) = 1 - \frac{4}{4+5} = \frac{5}{9} \quad \text{(D)}$$
If you didn't recognize it, you could calculate $F_{X_e}(5)$ by integrating $f_e(x)$ from 0 to 5:
$$F_{X_e}(5) = \int_0^5 \frac{4\,dx}{(4+x)^2} = -\frac{4}{4+x}\Big|_0^5 = 1 - \frac{4}{9} = \frac{5}{9} \quad \text{(D)}$$


Part II

Empirical Models


In empirical models, a distribution is fitted to data without specifying an underlying model. The distribution is data-dependent; it can only be described by referring to all of the observations. This contrasts with parametric models, where the assumed distribution can be described by listing a short set of parameters.

We begin with a fast review of statistics, which is used both here and when discussing parametric estimators. We then discuss the empirical fit, first with complete data, then with incomplete data. The empirical distribution function will have jumps; kernel smoothing is a method for smoothing the empirical distribution. We finish off by discussing approximations to the empirical model when there are large amounts of data, such as when constructing mortality tables.


Lesson 21

Review of Mathematical Statistics

Reading: Loss Models Fourth Edition 10

This lesson is a short review of statistics. If you've never studied statistics, you may have difficulty with it. However, a full-fledged introduction to statistics is beyond the scope of this manual. On exams, there is typically only one question on the material of this lesson, almost always on estimator quality, but much of the work we do later depends on the other concepts in this lesson, hypothesis testing and confidence intervals.

A statistic is a number calculated based purely on observed data. No assumptions or hypotheses are needed to calculate it. Examples of statistics are the sample mean, the sample variance, and the sample median. The field of statistics deals with two tasks: testing hypotheses and estimating parameters. We shall deal with the latter first.

21.1 Estimator quality

One of the tasks of statistics is to estimate parameters. Typically, a statistical method will provide a formula which defines an estimated parameter in terms of sample statistics. For example, suppose you have a random variable that is believed to follow a normal distribution. You would like to estimate the parameters µ and σ² of the distribution. You have a sample of n observations from the population. Here are examples of statistical methods (a hat on a variable indicates an estimate):

1. Ignore the sample data. Estimate $\hat\mu = 100$ and $\hat\sigma^2 = 25$.

2. Estimate $\hat\mu = 100$, and estimate $\hat\sigma^2$ as the average square difference from 100, or $\sum (x_i - 100)^2 / n$.

3. Estimate $\hat\mu = \bar{x}$, the sample mean, and $\hat\sigma^2 = \sum (x_i - \bar{x})^2 / n$.

4. Estimate $\hat\mu = \bar{x}$ and $\hat\sigma^2 = \sum (x_i - \bar{x})^2 / (n-1)$.

In the subsequent discussion, we will refer to this example as the "four-statistical-method" example. All of these methods are possible estimators for µ and σ. Which estimator is the best? What are the advantages and disadvantages of each of them? We somehow suspect that the first two estimators aren't good, since they ignore the data, at least in estimating µ. However, they have the advantage of giving you a clear-cut answer. The estimate of µ is 100, regardless of what the data says. When using the last two estimators, you may estimate $\hat\mu = 50$ after 1000 trials, yet after the 1001st trial you will almost surely end up with a different estimate.

There are many non-mathematical reasons that an estimator may be bad. It may be based on the wrong population. It may be based on bad assumptions. We cannot quantify these errors, and will not discuss them further. What we can quantify is the intrinsic quality of an estimator. Assuming that all of our hypotheses are correct, there is built-in error in the estimator. We will discuss three measures of estimator quality. In the following discussion, $\hat\theta$ is the estimator, $\hat\theta_n$ is the estimator based on n observations, and θ is the parameter being estimated.


21.1.1 Bias

Bias is the excess of the expected value of the estimator over its true value:
$$\mathrm{bias}_{\hat\theta}(\theta) = E[\hat\theta \mid \theta] - \theta$$
An estimator is unbiased if $\mathrm{bias}_{\hat\theta}(\theta) = 0$, which means that based on our assumptions, the average value of the estimator will be the true value, obviously a desirable quality. Even if an estimator is biased, it may be asymptotically unbiased, meaning that as the sample size goes to infinity, the bias of the estimator goes to zero:
$$\hat\theta \text{ asymptotically unbiased estimator for } \theta \iff \lim_{n\to\infty} \mathrm{bias}_{\hat\theta_n}(\theta) = 0$$
An unbiased estimator is automatically asymptotically unbiased. In the four-statistical-method example above, let's calculate the bias of each estimator.

1. The bias of the estimator for µ is 100 − µ. If µ happens to be 100, then the estimator is unbiased. If µ is 0, then the bias is 100. This method may be very biased or it may be unbiased; it all depends on the true, presumably unknown, value of µ. Similar remarks apply to $\hat\sigma^2$: if σ² = 25, the estimator is unbiased; otherwise it is biased.

2. The remarks for $\hat\mu$ of the first estimator apply equally well here. Let's postpone discussing the bias of $\hat\sigma^2$.

3. The expected value of the sample mean is
$$E[\bar{x}] = E\left[\frac{\sum_{i=1}^n x_i}{n}\right] = \frac{n\,E[X]}{n} = E[X]$$
and E[X] = µ, so the bias of the sample mean is E[X] − µ = µ − µ = 0. In general, for any distribution, the sample mean is an unbiased estimator of the true mean.

A theorem of probability states that in general (not just for normal populations), for a sample of size n with sample mean $\bar{x}$,
$$E\left[\sum_{i=1}^n (x_i - \bar{x})^2\right] = (n-1)\,\mathrm{Var}(X)$$
Here, Var(X) = σ², so
$$E[\hat\sigma^2] = E\left[\frac{\sum (x_i - \bar{x})^2}{n}\right] = \frac{n-1}{n}\,\sigma^2$$
Therefore, the bias of the estimator for σ² is
$$\mathrm{bias}_{\hat\sigma^2}(\sigma^2) = \frac{n-1}{n}\,\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}$$
While $\hat\sigma^2$ is biased, it is asymptotically unbiased: as n → ∞, the bias goes to 0.

4. Once again, the sample mean is an unbiased estimator of the true mean. The estimator for σ² has expected value
$$E\left[\frac{\sum (x_i - \bar{x})^2}{n-1}\right] = \frac{(n-1)\,\mathrm{Var}(X)}{n-1} = \mathrm{Var}(X)$$
and therefore $\hat\sigma^2$ is an unbiased estimator for σ². In general (not just for normal populations), the sum of the square differences from the sample mean divided by n − 1 is an unbiased estimator of the variance.


Let's get back to $\hat\sigma^2$ of the second method. A useful trick to calculate its expected value is to break out $(x_i - \mu)^2$:
$$\sum_{i=1}^n (x_i - 100)^2 = \sum_{i=1}^n \big((x_i - \mu) + (\mu - 100)\big)^2 = \sum_{i=1}^n (x_i - \mu)^2 + \sum_{i=1}^n (\mu - 100)^2 + 2\sum_{i=1}^n (x_i - \mu)(\mu - 100)$$
The expected value of each summand of the first term is the variance, or σ²; in fact, $E[(X - \mu)^2]$ is the definition of Var(X). The second term has a constant summand, so it equals $n(\mu - 100)^2$. The expected value of $x_i - \mu$ is 0 (µ is the expected value of $x_i$), so the expected value of the third term is 0. So
$$E[\hat\sigma^2] = E\left[\frac{\sum_{i=1}^n (x_i - 100)^2}{n}\right] = \sigma^2 + (\mu - 100)^2$$
If µ = 100, the estimator is unbiased. Otherwise it is biased, and the bias is independent of sample size.

Example 21A In an urn, there are four marbles numbered 5, 6, 7, and 8. You draw three marbles from the urn without replacement. Let $\hat\theta$ be the maximum of the three marbles. Calculate the bias of $\hat\theta$ as an estimator for the maximum marble in the urn, θ.

Answer: There are four combinations of 3 marbles out of 4. Three of the combinations include 8, making the maximum 8. The remaining one is {5, 6, 7}, with a maximum of 7. Thus the expected value of $\hat\theta$ is $\frac{3}{4}(8) + \frac{1}{4}(7) = 7\frac{3}{4}$, whereas the true maximum is 8. The bias is $7\frac{3}{4} - 8 = -\frac{1}{4}$.



Example 21B X has a uniform distribution on [0, θ]. A sample {x i } of size n is drawn from X. Let θˆ  max x i . Determine biasθˆ ( θ ) . Answer: To calculate the expected value of the maximum, we need its density function. Let Y be the random variable for the maximum. For a uniform distribution on [0, θ], FX ( x )  Pr ( X ≤ x )  x/θ for 0 ≤ x ≤ θ. Then for 0 ≤ x ≤ θ, FY ( x )  Pr ( X1 ≤ x ) Pr ( X2 ≤ x ) · · · Pr ( X n ≤ x )  fY ( x ) 

nx n−1 θn

xn θn

The expected value of Y is θ

Z E[Y] 

0

y f ( y ) dy 

θ

Z 0

n y n dy nθ  θn n+1

We conclude that the bias of θˆ is biasθˆ ( θ )  The estimator is asymptotically unbiased. C/4 Study Manual—17th edition Copyright ©2014 ASM

θ nθ −θ − n+1 n+1



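A quick simulation makes the bias in Example 21B concrete. This illustrative sketch (not from the manual) draws repeated samples from a uniform on [0, 10] and compares the average sample maximum with the theoretical value nθ/(n + 1):

```python
# Simulation of the bias of the sample maximum (Example 21B).
import numpy as np

rng = np.random.default_rng(1)
theta, n, trials = 10.0, 5, 100_000
maxima = rng.uniform(0, theta, size=(trials, n)).max(axis=1)

print(maxima.mean())            # ~ n*theta/(n+1) = 8.333
print(maxima.mean() - theta)    # ~ -theta/(n+1) = -1.667, the bias
```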

21.1.2 Consistency

An unbiased estimator is good on the average, but may be quite bad. It may be too high half the time and too low half the time, and never close to the true value. Consistency may be a better measure of quality. An estimator is consistent if it is, with probability 1, arbitrarily close to the true value if the sample is large enough. In symbols, an estimator is consistent if for all δ > 0, $\lim_{n\to\infty} \Pr(|\hat\theta_n - \theta| < \delta) = 1$. Sometimes this is called weak consistency.

A sufficient but not necessary condition for consistency is that the estimator be asymptotically unbiased and that its variance go to zero as the sample size goes to infinity. Let's use this condition to analyze the consistency of the sample mean as an estimator for the true mean. We already mentioned when discussing the bias of the four-statistical-method example that the sample mean is unbiased, so it's certainly asymptotically unbiased. The variance of the sample mean is Var(X)/n. If Var(X) is finite, then Var(X)/n → 0. So if Var(X) is finite, the sample mean is consistent. If Var(X) is not finite, such as for a Pareto distribution with α ≤ 2, the sample mean may not be consistent. In our four-statistical-method example,

1. The estimators of µ and σ² are consistent if and only if µ = 100 and σ² = 25.

2. The estimators of µ and σ² are consistent if and only if µ = 100.

3. Since σ² is finite, the estimator for µ is consistent, as we just discussed. $\hat\sigma^2$ is asymptotically unbiased. The variance of $\hat\sigma^2$ is a function of the fourth and lower moments of a normal distribution, which are all finite, so $\hat\sigma^2$ is consistent.

4. For the same reasons as the third estimator, the estimators of µ and σ² are consistent.

21.1.3 Variance and mean square error

Mean square error is the average square difference between the estimator and the true value of the parameter:
$$\mathrm{MSE}_{\hat\theta}(\theta) = E\big[(\hat\theta - \theta)^2 \mid \theta\big]$$
The lower the MSE, the better the estimator. In some textbooks, an estimator with low variance is called "efficient", but the textbook on the syllabus, Loss Models, avoids the use of this vague term, so you are not responsible for knowing the meaning of the word "efficient". An estimator is called a uniformly minimum variance unbiased estimator (UMVUE) if it is unbiased and if there is no other unbiased estimator with a smaller variance for any true value θ. It would make no sense to make a similar definition for biased estimators (i.e., a uniformly minimum MSE estimator), since estimators like the first method of the four-statistical-method example have a mean square error of 0 when the constant equals the parameter, so no estimator can have the smallest error for all values of the parameter.

There is an important relationship between MSE, bias, and variance:
$$\mathrm{MSE}_{\hat\theta}(\theta) = \mathrm{Var}(\hat\theta) + \big(\mathrm{bias}_{\hat\theta}(\theta)\big)^2 \tag{21.1}$$

Example 21C [4B-F96:21] (2 points) You are given the following:
• The expectation of a given estimator is 0.50.
• The variance of this estimator is 1.00.
• The bias of this estimator is 0.50.
Determine the mean square error of this estimator.
(A) 0.75 (B) 1.00 (C) 1.25 (D) 1.50 (E) 1.75

Answer: $\mathrm{MSE}(\hat\theta) = 1.00 + 0.50^2 = 1.25$. (C)



Example 21D In an urn, there are four marbles numbered 5, 6, 7, and 8. You draw 3 marbles from the urn without replacement. Let $\hat\theta$ be the maximum of the 3 marbles. Calculate the mean square error of $\hat\theta$ as an estimator for the maximum marble in the urn, θ.

Answer: There are four combinations of 3 marbles out of 4. Three of the combinations have 8. The remaining one is {5, 6, 7}, with a maximum of 7. The true maximum is 8. Thus, the error is 7 − 8 = −1 one-fourth of the time, 0 otherwise, so the mean square error is $\frac{1}{4}(1^2) = \frac{1}{4}$.

The variance of the estimator (using the Bernoulli shortcut; see Section 3.3 on page 54) is
$$(0.25)(0.75)(1^2) = \frac{3}{16}$$
and indeed $\left(-\frac{1}{4}\right)^2 + \frac{3}{16} = \frac{1}{4}$; the bias squared plus the variance equals the mean square error.

Example 21E X has a uniform distribution on [0, θ]. A sample $\{x_i\}$ of size n is drawn from X. Let $\hat\theta = \max x_i$. Determine $\mathrm{MSE}_{\hat\theta}(\theta)$.

Answer: We calculated the density of $Y = \max x_i$ above in Example 21B. Now let's calculate the second moment of Y.
$$E[Y^2] = \int_0^\theta y^2 f(y)\,dy = \int_0^\theta \frac{n y^{n+1}\,dy}{\theta^n} = \frac{n\theta^2}{n+2}$$
$$\mathrm{Var}(Y) = \frac{n\theta^2}{n+2} - \left(\frac{n\theta}{n+1}\right)^2 = \frac{n\theta^2}{(n+1)^2(n+2)} \tag{21.2}$$
Combining this with our calculation of $\mathrm{bias}_{\hat\theta}(\theta)$ in Example 21B, we conclude that the mean square error is
$$\mathrm{MSE}_{\hat\theta}(\theta) = \left(\frac{\theta}{n+1}\right)^2 + \frac{n\theta^2}{(n+1)^2(n+2)} = \frac{2\theta^2}{(n+1)(n+2)}$$
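Equation (21.1) and the closed form just derived are easy to verify by simulation. An illustrative sketch (not from the manual):

```python
# Simulation check of MSE = bias^2 + variance (equation 21.1) for Example 21E.
import numpy as np

rng = np.random.default_rng(7)
theta, n, trials = 10.0, 5, 200_000
est = rng.uniform(0, theta, size=(trials, n)).max(axis=1)

bias = est.mean() - theta
mse = ((est - theta) ** 2).mean()
print(mse, bias**2 + est.var())              # the two sides agree
print(2 * theta**2 / ((n + 1) * (n + 2)))    # closed form ~ 4.76
```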

Quiz 21-1 To estimate the parameter µ of a random variable following a lognormal distribution with parameters µ and σ = 4:
(i) An observation x is made of the random variable.
(ii) µ is estimated by ln x.
Calculate the mean square error of this estimator.

21.2 Hypothesis testing

While there will probably be no exam questions directly on the material in this section, it will be used heavily in this part of the course. We often have various ideas of how the world operates. We may come up with a hypothesis quantifying natural occurrences. Examples of hypotheses are:

• A certain medicine is effective in preventing Alzheimer's.


Table 21.1: Summary of Estimator Quality Concepts

In the following, $\hat\theta$ is an estimator for θ and $\hat\theta_n$ is an estimator for θ based on a sample of size n.

Bias: $\mathrm{bias}_{\hat\theta}(\theta) = E[\hat\theta \mid \theta] - \theta$.
If $\mathrm{bias}_{\hat\theta}(\theta) = 0$, the estimator is unbiased. If $\lim_{n\to\infty} \mathrm{bias}_{\hat\theta_n}(\theta) = 0$, the estimator is asymptotically unbiased. The sample mean is unbiased. The sample variance (with division by n − 1) is unbiased.

Consistency: $\lim_{n\to\infty} \Pr(|\hat\theta_n - \theta| > \epsilon) = 0$.
A sufficient condition for consistency is that the estimator is asymptotically unbiased and the variance of the estimator goes to 0 as n → ∞.

Mean square error: $\mathrm{MSE}_{\hat\theta}(\theta) = E[(\hat\theta - \theta)^2 \mid \theta] = \mathrm{bias}_{\hat\theta}^2(\theta) + \mathrm{Var}(\hat\theta)$

• Smokers have a higher probability of getting lung cancer.
• Losses follow a Pareto distribution with α = 4, θ = 10,000.

Statistics helps decide whether to accept a hypothesis. To decide on a hypothesis, we set up two hypotheses: a null hypothesis, one that we will believe unless proved otherwise, and an alternative hypothesis. Usually, the null hypothesis is a fully specified hypothesis: it has a probability distribution associated with it. The alternative hypothesis may also be fully specified, or it may allow for a range of possibilities.

For example, suppose a new drug might prevent Alzheimer's disease. It is known that for a healthy person age 75, there is a 0.1 probability of contracting Alzheimer's disease within 5 years. To statistically test the drug, you would set up the following two hypotheses:

• H0 (the null hypothesis): The probability of contracting Alzheimer's disease if you use this drug is 0.1.
• H1 (the alternative hypothesis): The probability of contracting Alzheimer's disease if you use this drug is less than 0.1.

Notice that H0 is fully specified: a Bernoulli distribution with parameter q = 0.1. You can calculate probabilities assuming H0. On the other hand, H1 is not fully specified, and allows any value of q < 0.1.

The next thing you would do is specify a test. For example, we'll give 100 people age 75 this drug, and observe them for 5 years. The test statistic will be the number who get Alzheimer's disease. Let's call this number X. If X is low, we'll reject H0 and accept H1, while if it is high we won't reject H0. When we don't reject H0, we say that we accept H0. This does not mean we really believe H0; it just means we don't have enough evidence to reject it.

Now we have to decide on a boundary point for the test. This boundary point is called the critical value. Let's say the boundary point is c. Then we reject H0 if X < c, or possibly if X ≤ c. The set of values for which we reject H0 is called the critical region.

How do we go about setting c? The lower we make c, the more likely we'll accept H0 when it is false. The higher we make c, the more likely we'll reject H0 when it is true. Rejecting H0 when it is true is called a Type I error. The probability of a Type I error, assuming H0 is true, is the significance level of the test. Thus, the lower the significance, the greater the probability of accepting H0. The letter α is often used for the significance level. The probability of getting the observed statistic, or a more extreme one, given that the null hypothesis is true is called the p-value; the lower the p-value, the greater the tendency to reject H0.


If we reject H1 when it is true, we've made a Type II error. The power of a test is the probability of rejecting H0 when it's false. We'd like as much power as possible, but increasing the power may also raise the significance level (which is no good). A uniformly most powerful test gives us the most power for a fixed significance level. When the alternative hypothesis is not fully specified, we cannot calculate the power for the entire range, but we can calculate the power for specific values of the alternative hypothesis.

Example 21F In the drug example discussed above, we will reject the null hypothesis if X ≤ 6. Determine the significance level of the test. Also determine the power of the test at q = 0.08.

Answer: The significance level of the test is the probability of 6 or fewer people out of 100 getting Alzheimer's if q = 0.1. The number of people getting Alzheimer's is binomial with m = 100, q = 0.1, and the probability of such a binomial variable being 6 or less is
$$\sum_{i=0}^{6} \binom{100}{i} (0.1^i)(0.9^{100-i}) = 0.1171556$$
This is the exact significance level. Since this can be difficult to compute, the normal approximation is usually used. The mean of the binomial variable is 100(0.1) = 10 and the variance is 100(0.1)(0.9) = 9. Making a continuity correction, we calculate the approximate probability that a normal random variable with µ = 10, σ² = 9 is no greater than 6.5.
$$\Pr(X \le 6.5 \mid H_0) = \Phi\left(\frac{6.5 - 10}{3}\right) = \Phi(-1.17) = 0.1210$$
This is the significance level using a normal approximation. The power of the test is the probability of X ≤ 6 if q = 0.08. Using the normal approximation, the mean is 100(0.08) = 8 and the variance is 100(0.08)(0.92) = 7.36, so
$$\Pr(X \le 6.5 \mid q = 0.08) = \Phi\left(\frac{6.5 - 8}{\sqrt{7.36}}\right) = \Phi(-0.55) = 0.2912$$

Example 21G X is a normally distributed random variable with variance 100. You are to test the null hypothesis H0: µ = 50 against the alternative hypothesis H1: µ = 55. The test consists of a random sample of 25 observations of X. If the sample mean is less than 54, H0 is accepted; otherwise H1 is accepted. Determine the significance and power of the test.

Answer: The variance of the sample mean of 25 observations is 100/25 = 4. The probability that $\bar{x} \ge 54$ given that H0 is true is
$$1 - \Phi\big((54 - 50)/\sqrt{4}\big) = 1 - \Phi(2) = 0.0228$$
so the significance of the test is 2.28%. If H1 is true, the probability that $\bar{x} \ge 54$ is
$$1 - \Phi\big((54 - 55)/2\big) = \Phi(0.5) = 0.6915$$
so the power of the test is 69.15%.

Table 21.2 summarizes the concepts of this section.
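The significance and power arithmetic in Examples 21F and 21G is easy to reproduce. A short illustrative sketch (not from the manual; scipy supplies Φ), which also computes the exact binomial significance level:

```python
# Exact and normal-approximation significance and power for Example 21F.
from math import comb, sqrt
from scipy.stats import norm

def binom_cdf(k, m, q):
    return sum(comb(m, i) * q**i * (1 - q)**(m - i) for i in range(k + 1))

print(binom_cdf(6, 100, 0.10))             # exact significance, 0.1171556
print(norm.cdf((6.5 - 10) / 3))            # ~0.1216 with continuity correction
print(norm.cdf((6.5 - 8) / sqrt(7.36)))    # approximate power, ~0.2902
# The manual rounds z to two places, giving 0.1210 and 0.2912.
```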

21.3 Confidence intervals

Rather than making a point estimate of a parameter, we may recognize that our estimates are not so precise, and instead provide an interval which we believe has a high probability of containing the true value. A 100(1 − α)% confidence interval for a parameter θ is an interval (L, U) such that the probability of L being less than θ and U being greater than θ is 1 − α. Keep in mind that θ is a parameter, not a random variable; it's L and U, the statistics, that are random.


Table 21.2: Summary of Hypothesis Testing Concepts

Test: A procedure and an accompanying rule that determine whether or not to accept the null hypothesis.
Critical region: Set of values for which, if the test statistic is in the set, the null hypothesis is rejected. Usually an interval of numbers going to infinity, or two intervals of numbers going to positive and negative infinity.
Critical value(s): Boundary/boundaries of the critical region.
Type I error: Rejecting the null hypothesis when it is true.
Type II error: Accepting the null hypothesis when it is false.
Significance (level): Probability that the observation is in the critical region given that the null hypothesis is true. This is set before performing the test: the statistician selects a significance level and then sets the critical region accordingly.
p-value: Probability of the observation (or a more extreme observation) if the null hypothesis is true. This is calculated after the test is performed. Thus, at significance level α, the null hypothesis is rejected if the p-value of the test is less than the significance level of the test.
Power: Probability of rejecting the null hypothesis when it is false.
Uniformly most powerful test: A test with maximal power for a given significance level.

Typically, to construct a confidence interval for an estimate, we add to and subtract from the estimate the square root of the estimated variance times a standard normal coefficient appropriate for the level of confidence we're interested in. If we let $z_p$ be the 100p percentile of a standard normal distribution, this means adding and subtracting $z_{1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat\theta)}$ to the estimate of the parameter θ, where $\widehat{\mathrm{Var}}(\hat\theta)$ is the estimate of the variance of the estimator of θ.

How do we estimate the variance? We often make no assumption about the distribution of the underlying data. Instead, we use the unbiased sample variance as an estimate of the variance of the underlying distribution. When using the sample mean as an estimate of the underlying mean, remember that the variance of the sample mean is the variance of the underlying distribution divided by the size of the sample. Thus when estimating the variance, the sum of square differences from the sample mean is divided by n − 1 (to obtain the unbiased sample variance) and also by n (to obtain the variance of the sample mean).

Example 21H For a sample of 100 loss sizes $x_i$, you have the following summary statistics:

$$\sum x_i = 325{,}890 \qquad \sum x_i^2 = 1{,}860{,}942{,}085$$
Construct a 95% confidence interval for mean loss size.

Answer: The sample mean is
$$\bar{x} = \frac{325{,}890}{100} = 3258.90$$
The raw second moment is
$$\hat\mu_2' = \frac{1{,}860{,}942{,}085}{100} = 18{,}609{,}420.85$$
The biased sample variance is $18{,}609{,}420.85 - 3258.90^2 = 7{,}988{,}991.64$. To make it unbiased, we multiply by n/(n − 1):
$$7{,}988{,}991.64\left(\frac{100}{99}\right) = 8{,}069{,}688.53$$
That is the estimate of the underlying variance of the distribution. The estimate of the variance of the sample mean is
$$\frac{8{,}069{,}688.53}{100} = 80{,}696.89$$
The confidence interval is $3258.90 \pm 1.96\sqrt{80{,}696.89} = (2702, 3816)$.

The rest of this lesson is unlikely to appear on an exam and may be skipped. If the sample size is small, we should use t distribution values instead of $z_{1-\alpha/2}$. However, a t distribution table is not provided at the exam, so I doubt you would be expected to use t coefficients on an exam question.

If a specific distribution is assumed for the data, the variance may be a function of the mean. Then we can approximate the variance as a function of the sample mean instead of using the unbiased sample variance. For example, if the data can only have one of two possible values, then it follows a Bernoulli distribution. If q is the mean, then the variance is q(1 − q). We can approximate the variance of the underlying distribution as $\bar{x}(1 - \bar{x})$. This would be divided by n, the sample size, if we are approximating the variance of the sample mean.

Sometimes a better confidence interval can be constructed by using the true variance of the estimator of the parameter instead of the approximated variance. In the Bernoulli example of the last paragraph, this would mean using q(1 − q) instead of $\bar{x}(1 - \bar{x})$. When a single parameter is being estimated, the variance of the estimator of the parameter can be expressed in terms of the parameter: $\mathrm{Var}(\hat\theta) = v(\theta)$ for some function v. We can then use the following equation, which is equation (10.3) of the fourth edition of Loss Models:
$$1 - \alpha = \Pr\left(-z_{1-\alpha/2} \le \frac{\hat\theta - \theta}{\sqrt{v(\theta)}} \le z_{1-\alpha/2}\right) \tag{21.3}$$
where 1 − α is the confidence level and $z_p$ is the 100p percentile of the standard normal distribution. This leads to a quadratic equation in θ when we assume a Poisson or a Bernoulli distribution for the data.

Example 21I A sample of 200 policies has 20 claims. It is assumed claim frequency for each policy has a Poisson distribution with mean λ. Construct symmetric 95% confidence intervals for λ using (1) the approximated variance and (2) the true variance.

Answer: The estimate for λ is the sample mean, $\hat\lambda = 20/200 = 0.10$. The variance of the sample mean is the distribution variance divided by the size of the sample. The distribution variance is λ, since the variance of a Poisson equals its mean, so the variance of the sample mean is λ/200.

In the approximated variance method, we substitute $\hat\lambda$ for λ in the formula for the variance, so the variance of $\hat\lambda$ is estimated as 0.10/200 = 0.0005. The 95% confidence interval is $0.1 \pm 1.96\sqrt{0.0005} = (0.0562, 0.1438)$.

The true variance of $\hat\lambda$ is λ/200. For the confidence interval using the true variance, we want:
$$-1.96 \le \frac{0.10 - \lambda}{\sqrt{\lambda/200}} \le 1.96$$

Square this inequality:
$$\frac{(0.10 - \lambda)^2}{\lambda/200} \le 1.96^2$$
$$0.01 - 0.20\lambda + \lambda^2 \le \frac{1.96^2\lambda}{200} = 0.019208\lambda$$
$$\lambda^2 - 0.219208\lambda + 0.01 \le 0$$
$$\lambda = \frac{0.219208 \pm \sqrt{0.219208^2 - 0.04}}{2} = 0.0647,\ 0.1545$$

Therefore the confidence interval is (0.0647, 0.1545). Although it doesn't look symmetric around 0.10, it is! This is because higher values of λ lead to higher variances, so more room is needed on the right to cover the same probability range.

Example 21J In a mortality study on 10 lives with complete data, 2 lives die before time 5. Construct a 95% symmetric confidence interval for S(5) using the true variance.

Answer: To make things simpler, let's work with the number of deaths, 2. Let the expected number of deaths before time 5 be θ. This is a binomial variable with m = 10, q = θ/10. The variance of this binomial variable is θ(10 − θ)/10. We then need to solve:
$$\frac{|2 - \theta|}{\sqrt{\theta(10-\theta)/10}} = 1.96$$
We square both sides and solve.
$$(2 - \theta)^2 = 3.8416\theta - 0.38416\theta^2$$
$$0 = 1.38416\theta^2 - 7.8416\theta + 4$$
$$\theta = \frac{7.8416 \pm \sqrt{7.8416^2 - 4(1.38416)(4)}}{2(1.38416)} = 0.5668,\ 5.0984$$
S(5) is 1 minus the number of deaths divided by the number in the study (10), so the interval for S(5) is obtained by dividing these two bounds by n = 10 and subtracting them from 1:
$$1 - \frac{0.5668}{10} = 0.94332 \qquad 1 - \frac{5.0984}{10} = 0.49016$$
The interval is (0.49016, 0.94332).

For comparison, the approximate variance is $\hat\theta(10 - \hat\theta)/10 = (2)(8)/10 = 1.6$, so the confidence interval for the number of deaths would be $2 \pm 1.96\sqrt{1.6} = (-0.47923, 4.47923)$. Since the number of deaths can't be less than 0, the confidence interval for S(5) would be (0.55208, 1), where 0.55208 = 1 − 4.47923/10.
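The true-variance intervals of Examples 21I and 21J both come from the same inequality, $(\hat\theta - \theta)^2 \le z^2 v(\theta)$. An illustrative sketch (not from the manual) that solves it numerically rather than by hand:

```python
# Numerical solution of the true-variance confidence intervals (Examples 21I, 21J).
import numpy as np

def true_variance_ci(est, v, z=1.96):
    """Endpoints of {t : (est - t)^2 <= z^2 v(t)}, scanned on a fine grid."""
    t = np.linspace(1e-9, 10, 1_000_001)
    ok = (est - t) ** 2 <= z**2 * v(t)
    return t[ok].min(), t[ok].max()

print(true_variance_ci(0.10, lambda t: t / 200))            # ~(0.0647, 0.1545)
print(true_variance_ci(2.0, lambda t: t * (10 - t) / 10))   # ~(0.5668, 5.0984)
```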

In summary, we've mentioned three methods for obtaining the variance needed for constructing a normal confidence interval, from least to most refined:

1. Use the unbiased sample variance.

2. Assume the data comes from a distribution. Express the variance as a function of the mean, then set the variance equal to that function of the sample mean.

3. Assume the data comes from a distribution. Express the variance as a function of the mean, and solve for the confidence interval.

It would be hard for an exam question to specify that you should calculate a confidence interval in any manner other than the first one, adding and subtracting $z_{1-\alpha/2}\,s$. They would have to say something like "First calculate the variance, then apply a normal approximation."¹ On the other hand, the first method is standard and comes up frequently in this course.

Exercises

Estimator quality

The following three exercises are from old exams and use the term "efficient", which is no longer on the syllabus. However, they are easy exercises and are included as a review of the definitions of this lesson. If you wish to do them, just keep in mind that an efficient estimator is one with minimum variance.

21.1. [4B-S92:2] (1 point) Which of the following are true?

1. The expected value of an unbiased estimator of a parameter is equal to the true value of the parameter.

2. If an estimator is efficient, the probability that an estimate based on n observations differs from the true parameter by more than some fixed amount converges to zero as n grows large.

3. A consistent estimator is one with a minimal variance.

(A) 1 only (B) 3 only (C) 1 and 2 only (D) 1, 2 and 3
(E) The correct answer is not given by (A), (B), (C), or (D).

21.2. [4B-S91:28] (1 point) $\hat\alpha$ is an estimator of α. Match each of these properties with the correct mathematical description.

a. Consistent
b. Unbiased
c. Efficient

1. $E[\hat\alpha] = \alpha$
2. $\mathrm{Var}[\hat\alpha] \le \mathrm{Var}[\tilde\alpha]$, where $\tilde\alpha$ is any other estimator of α
3. For any ε > 0, $\Pr\{|\hat\alpha - \alpha| < \epsilon\} \to 1$ as n → ∞, where n is the sample size.

(A) a = 1, b = 2, c = 3
(B) a = 2, b = 1, c = 3
(C) a = 1, b = 3, c = 2
(D) a = 3, b = 2, c = 1
(E) a = 3, b = 1, c = 2

¹There was an exam question long ago, in the CAS 4B days, when they made you use the refined method by specifying that you should construct the confidence interval using the method specified in textbook X.

21.3. [4B-F92:8] (1 point) You are given the following information: X is a random variable whose distribution function has parameter α = 2.00. Based on n random observations of X you have determined:

E[α 1 ]  2.05, where α1 is an estimator of α having variance equal to 1.025.



E[α 2 ]  2.05, where α2 is an estimator of α having variance equal to 1.050.



As n increases to ∞, P ( |α 1 − α| >  ) approaches 0 for any  > 0.

Which of the following are true? 1.

α 1 is an unbiased estimator of α.

2.

α 2 is an efficient estimator of α.

3.

α 1 is a consistent estimator of α.

(A) 1 only

(C) 3 only

(D) 1,3 only

(E) 2,3 only

Which of the following statements are true?

21.4. I.

(B) 2 only

An estimator that is asymptotically unbiased and whose variance approaches 0 as the sample size goes to infinity is weakly consistent.

II.

For an unbiased estimator, minimizing variance is equivalent to minimizing mean square error. P III. The estimator S2  n1 nj1 ( X j − X¯ ) 2 for the variance σ2 is asymptotically unbiased. (A) I and II (B) I and III (C) II and III (E) The correct answer is not given by (A) , (B) , (C) , or (D) .

(D) I, II, and III

[4B-S96:12] (1 point) Which of the following must be true of a consistent estimator?

21.5. 1.

It is unbiased.

2.

For a small quantity , the probability that the absolute value of the deviation of the estimator from the true parameter value is less than  tends to 1 as the number of observations tends to infinity.

3.

It has minimal variance.

(A) 1 21.6. (A) (B) (C) (D) (E)

(B) 2

(C) 3

(D) 2,3

(E) 1,2,3

Which of the following statements is false? If two estimators are unbiased, a weighted average of them is unbiased. The sample mean is an unbiased estimator of the population mean. The sample mean is a consistent estimator of the population mean. For a uniform distribution on [0, θ], the sample maximum is a consistent estimator of the population maximum. The mean square error of an estimator cannot be less than the estimator’s variance.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 21

21.7. (A) (B) (C) (D) (E)

365

[4-F04:40] Which of the following statements is true? A uniformly minimum variance unbiased estimator is an estimator such that no other estimator has a smaller variance. An estimator is consistent whenever the variance of the estimator approaches zero as the sample size increases to infinity. A consistent estimator is also unbiased. For an unbiased estimator, the mean squared error is always equal to the variance. One computational advantage of using mean squared error is that it is not a function of the true value of the parameter. ˆ  3 and E θˆ 2  13. θˆ is an estimator for θ. E[θ]

f

21.8.

g

ˆ If θ  4, what is the mean square error of θ? 21.9. [4B-S95:27] (2 points) Two different estimators, ψ and φ, are available for estimating the parameter, β, of a given loss distribution. To test their performance, you have conducted 75 simulated trials of each estimator, using β  2, with the following results: 75 X

ψ i  165,

i1

75 X i1

ψ 2i  375,

75 X i1

φ i  147,

75 X i1

φ 2i  312.

Calculate MSEψ ( β ) / MSEφ ( β ) . (A) (B) (C) (D) (E)

Less than 0.50 At least 0.50, but less than 0.65 At least 0.65, but less than 0.80 At least 0.80, but less than 0.95 At least 0.95, but less than 1.00

21.10. [4B-S92:17] (2 points) You are given that the underlying size of loss distribution for disability claims is a Pareto distribution with parameters α and θ  6000. You have used 10 random observations, maximum likelihood estimation, and simulation to determine ˆ the maximum likelihood estimator of α: the following for α, ˆ  2.20 E[ α] MSE ( αˆ )  1.00 Determine the variance of αˆ if α  2. (A) (B) (C) (D) (E)

Less than 0.70 At least 0.70, but less than 0.85 At least 0.85, but less than 1.00 At least 1.00, but less than 1.15 At least 1.15

21.11. A population contains the values 1, 2, 4, 9. A sample of 3 without replacement is drawn from this population. Let Y be the median of this sample. Calculate the mean square error of Y as an estimator of the population mean. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

21. REVIEW OF MATHEMATICAL STATISTICS

366

21.12. A sample of n elements, x1 , . . . , x n , is selected from a random variable having a uniform distribu¯ tion on [0, θ]. You wish to estimate θ with an estimator of the form θˆ  k x. Determine the k that minimizes the mean square error of the estimator as a function of the sample size n. 21.13. A sample of n elements, x 1 , . . . , x n , is selected from a random variable having a uniform distribution on [0, θ]. Let Y  max ( x i ) . You wish to estimate the parameter θ with an estimator of the form kY. You may use the following facts: (i) (ii)

E[Y]  nθ/ ( n + 1) .

Var ( Y )  nθ 2

.

 ( n + 2)( n + 1) 2 .

Determine the k that minimizes the mean square error of the estimator. 21.14. [4B-F93:13] (3 points) You are given the following: •

Two instruments are available for measuring a particular (non-zero) distance.



X is the random variable representing the measurement using the first instrument and Y is the random variable representing the measurement using the second instrument.



X and Y are independent.



E[X]  0.8m; E[Y]  m; Var ( X )  m 2 ; and Var ( Y )  1.5m 2 where m is the true distance. Consider the class of estimators of m which are of the form Z  αX + βY.

Within this class of estimators of m, determine the value of α that makes Z an unbiased estimator of minimum variance. (A) (B) (C) (D) (E)

Less than 0.45 At least 0.45, but less than 0.50 At least 0.50, but less than 0.55 At least 0.55, but less than 0.60 At least 0.60

21.15. [4-S00:18] You are given two independent estimates of an unknown quantity µ: (i) Estimate A: E ( µA )  1000 and σ ( µA )  400. (ii) Estimate B: E ( µ B )  1200 and σ ( µ B )  200. Estimate C is a weighted average of the two estimates A and B, such that µ C  w · µA + (1 − w ) · µ B Determine the value of w that minimizes σ ( µ C ) . (A) 0

(B) 1/5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1/4

(D) 1/3

(E) 1/2

Exercises continue on the next page . . .

EXERCISES FOR LESSON 21

367

21.16. For two estimators X and Y of λ: (i) E[X]  λ and Var ( X )  3λ 2 . (ii) E[Y]  λ and Var ( Y )  4λ 2 . (iii) Cov ( X, Y )  −λ 2 . Let Z  aX + bY.

Determine the a and b that make Z an unbiased estimator of λ and minimize its variance. 21.17. [4-F02:31] You are given: x Pr ( X  x )

0 0.5

1 0.3

2 0.1

3 0.1

Using a sample of size n, the population mean is estimated by the sample mean X¯ and the variance is P ( X i − X¯ ) 2 2 estimated by S n  . n Calculate the bias of S2n when n  4. (A) −0.72

(B) −0.49

(C) −0.24

(D) −0.08

(E) 0.00

21.18. Losses follow a Pareto distribution with parameters α  3, θ  600. A sample of 100 is available. Determine the mean square error of the sample mean as an estimator for the mean. 21.19. A random variable follows an exponential distribution with mean θ. X1 is an observation of this random variable. Express the bias of X12 as an estimator for θ 2 as a function of θ. (A) −2θ 2

(B) −θ 2

(C) 0

(D) θ2

(E) 2θ 2

21.20. A random variable follows an exponential distribution with mean θ. X1 is an observation of this random variable. Express the mean square error of X12 as an estimator for θ 2 as a function of θ. (A) 20θ 4

(B) 21θ 4

(C) 22θ 4

(D) 23θ 4

(E) 24θ4

21.21. A random variable follows an exponential distribution with mean θ. A sample of n items, ¯ {x1 , . . . , x n }, is drawn from the random variable. The sample mean is x. 2 2 ¯ Express the bias of x as an estimator for θ in terms of n and θ. 21.22. You are given a sample of n items, x1 , . . . , x n , from a uniform distribution on [0, θ]. As an estimator for θ, you use θ˘  ( n + 1) min x i ˘ Calculate the mean square error of θ.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

21. REVIEW OF MATHEMATICAL STATISTICS

368

21.23. [C-S05:16] For the random variable X, you are given: (i) E[X]  θ, θ>0 2 (ii) Var ( X )  θ /25 k X, k>0 (iii) θˆ  k+1 MSEθˆ ( θ )  2 biasθˆ ( θ )

(iv)



2

Determine k. (A) 0.2

(B) 0.5

(C) 2

(D) 5

(E) 25

Confidence intervals 21.24. A sample of 200 policies yields the following information for claim counts x i :

X

x¯  0.15

( x i − x¯ ) 2  46

Construct a 90% normal confidence interval for mean claim counts per policy. 21.25. [4B-S91:29] (2 points) (This exercise is on a topic unlikely to appear on the exam and may be skipped.) A sample of 1000 policies yields an estimated claim frequency of 0.210. The number of claims for each policy is assumed to have a Poisson distribution. A 95% confidence interval for λ is constructed using the true variance of the parameter. Determine the confidence interval. (A) (0.198, 0.225)

(B) (0.191, 0.232)

(C) (0.183, 0.240)

(D) (0.173, 0.251)

(E) (0.161, 0.264)

21.26. [4B-F97:1] (2 points) You are given the following: •

A portfolio consists of 10,000 identical and independent risks.



The number of claims per year for each risk follows a Poisson distribution with mean λ.



During the latest year, 1,000 claims have been observed for the entire portfolio. Determine the lower bound of a symmetric 95% confidence interval for λ.

(A) (B) (C) (D) (E)

Less than 0.0825 At least 0.0825, but less than 0.0875 At least 0.0875, but less than 0.0925 At least 0.0925, but less than 0.0975 At least 0.0975

Additional released exam questions: C-F05:28, C-F06:26

Solutions 21.1. (A) 21.2.

Only 1 is true. The other two statements have interchanged definitions of consistency and efficiency. a  3, b  1, c  2. (E)

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 21

21.3.

369

Only 3 is true. α 2 has higher variance than α 1 and the same bias, so it is less efficient. (C)

2

21.4. I is True . In II, MSEθˆ ( θ )  Var ( θˆ ) + biasθˆ ( θ ) and biasθˆ ( θ )  0, so it is True . III is True ; although this estimator is biased, asymptotically (as n → ∞), dividing by n − 1 and dividing by n doesn’t make a difference. (D)



21.5.

(B)

21.6. (A)

ˆ  θ and E[θ] ˜  θ. It follows that If θˆ and θ˜ are the two estimators, we are given E[θ] ˆ + (1 − w ) E[θ] ˜  wθ + (1 − w ) θ  θ w E[ θ]

!

(B)

This was discussed in the lesson. See item 3 on page 354. !

(C)

The sample mean is not necessarily consistent, unless the variance of the underlying distribution is finite. See Subsection 21.1.2. #

(D)

The sample maximum is asymptotically unbiased (Example 21B) and the variance approaches zero as n → ∞ (Example 21E), hence consistent. !

(E)

The mean square error is the variance plus the bias squared. !(C)

21.7. A correct version of (A) is “A uniformly minimum variance unbiased estimator is an unbiased estimator such than no other unbiased estimator has a smaller variance.” An estimator which is a constant has no variance, but if it is not equal to the true parameter it must be inconsistent, so (B) is false. Consistency is an asymptotic property, so a biased estimator which is asymptotically unbiased could be consistent, making (C) false. (D) is true, since mean square error is bias squared plus variance. Mean square error is a function of the true value of the parameter; in fact, it is the expected value of the square of the difference between the estimator and the true parameter, so (E) is false. 21.8. bias ˆ ( θ )  3 − 4  −1. Var ( θˆ )  13 − 32  4. MSE ˆ ( θ )  4 + (−1) 2  5 . θ

θ

21.9. We must estimate the variance of each estimator. The question is vague on whether to use the population variance (divide by 75) or the sample variance (divide by 74). The original exam question said to work out the answer according to a specific textbook, which used the population variance. We then get:

!2 375 165 + * + −  0.04 + 0.16  0.2 75 75 , !2  2 147 147 + 312 * MSEφ ( β )  −2 + −  0.0016 + 0.3184  0.32 75 75 75 , -

165 MSEψ ( β )  −2 75



0.2  0.625 0.32

2

(B)

If the sample variance were used, we would multiply 0.16 and 0.3184 by 75/74 to get 0.1622 and 0.3227 0.04+0.1622 respectively. The resulting quotient, 0.0016+0.3227  0.6235, which still leads to answer B. 21.10. The bias of αˆ is biasαˆ ( α )  2.20 − 2  0.2 Since the mean square error is the variance plus the bias squared, Var ( αˆ )  MSE ( αˆ ) − biasαˆ ( α )



C/4 Study Manual—17th edition Copyright ©2014 ASM

2

 1 − 0.22  0.96

(C)

21. REVIEW OF MATHEMATICAL STATISTICS

370

21.11. Half the time (if 4 or 9 is omitted from the sample) the sample median is 2 and the other half the time (if 1 or 2 is omitted from the sample) it is 4. The mean is 1+2+4+9  4. So the MSE is 12 (2 − 4) 2  2 . 4 21.12. The mean of the uniform distribution is θ/2, so the expected value of x¯ is θ/2, and the bias of the estimator is k ( θ/2) − θ  θ ( k/2 − 1) . The variance of x¯ is the variance of the uniform distribution over n, or θ 2 / (12n ) , and multiplying x¯ by k multiplies the variance by k 2 . We minimize the mean square error of ˆ the square of the bias plus the variance, or θ, k −1 2

!2

θ2 +

k 2 θ2 12n

as a function of k. Divide this expression by θ 2 . g (k ) 

k −1 2

!2

+

k2 12n

Differentiate k k −1 + 0 2 6n

!

g0 ( k )  k



1 1 + 1 2 6n



k 21.13. The bias of kY is

nθ n ( k − 1) − 1 k −θθ . n+1 n+1

!

The variance of kY is

The MSE is then

6n 3n + 1

k 2 nθ 2 . ( n + 2)( n + 1) 2

 2 n ( k − 1) − 1 + * // . + θ 2 .. ( n + 2)( n + 1) 2 ( n + 1) 2 , k 2 nθ 2

We shall minimize this by differentiating with respect to k. To simplify matters, divide the entire expression by θ 2 and multiply it by ( n + 1) 2 ; this has no effect on the minimizing k:

 2 k2 n + n ( k − 1) − 1 n+2   2kn 0 f (k )  + 2n n ( k − 1) − 1  0 n+2 k + n ( k − 1) − 1  0 n+2   1 k +n  n+1 n+2 f (k ) 

k n ( n + 2) + 1  ( n + 1)( n + 2)





k ( n + 1) 2  ( n + 1)( n + 2) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 21

k

371

n+2 n+1

21.14. From the unbiased condition: E[αX + βY]  m 0.8α + β  1 From the minimum variance condition: Minimize g ( α )  Var ( αX + βY )  α 2 Var ( X ) + β 2 Var ( Y )  α 2 m 2 + (1 − 0.8α ) 2 (1.5m 2 )

or

g (α)  α 2 + 1.5 − 2.4α + 0.96α 2 m2  1.96α2 − 2.4α + 1.5

A quadratic ax 2 + bx + c is minimized at −b/2a, so g ( α ) is minimized at α  2.4/3.92  0.6122 . (E)

21.15. The variance of the weighted average is

2 σC2  w 2 σA + (1 − w ) 2 σB2

 160,000w 2 + 40,000 (1 − w ) 2

Differentiating,

2 (160,000) w − 2 (40,000)(1 − w )  0 200,000w  40,000 w  1/5 21.16.

(B)

Z will be unbiased if and only if a + b  1. The variance of Z is Var ( Z )  a 2 Var ( X ) + b 2 Var ( Y ) + 2ab Cov ( X, Y )  λ 2 3a 2 + 4b 2 − 2ab





 λ 2 3a 2 + 4 (1 − a ) 2 − 2a (1 − a )





We’ll minimize the parenthetical expression.

3a 2 + 4 (1 − a ) 2 − 2a (1 − a )  3a 2 + 4 − 8a + 4a 2 − 2a + 2a 2  9a 2 − 10a + 4

The minimum of a quadratic px 2 + qx + r is −q/2p, so the minimum of this expression is a  21.17. We know that S2 

P

( X i −X¯ ) 2 n−1

n − 1 f 2g n − 1 2 E S  σ n n

n−1 σ2 − 1 σ2  − n n In this case, the true mean µ  0.5 (0) + 0.3 (1) + 0.1 (2) + 0.1 (3)  0.8 and the true variance is E[S2n ] − σ2 





σ2  0.5 (0 − 0.8) 2 + 0.3 (1 − 0.8) 2 + 0.1 (2 − 0.8) 2 + 0.1 (3 − 0.8) 2  0.96

So the bias is −0.96/4  −0.24 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

, so b 

is an unbiased estimator; in other words, E[S2 ]  σ2 . But then E[S2n ] 

and the bias is

5 9

4 9

.

21. REVIEW OF MATHEMATICAL STATISTICS

372

21.18. The estimator is unbiased because the sample mean is an unbiased estimator of the population mean. Therefore the mean square error equals the variance. The variance of the estimator is: Var ( X )  Var ( X¯ )  100

2 (600) 2 2·1



100

600 2 2

 2700 .

21.19. From the tables, the second moment of the exponential E[X12 ]  2θ 2 . Therefore, the bias is biasX1 ( θ 2 )  E[X12 ] − θ 2  2θ 2 − θ 2  θ 2

(D)

21.20. In the previous exercise, we calculated the bias as θ 2 . The variance of X12 is Var ( X12 )  E[X14 ] − E[X12 ]2  24θ 4 − (2θ 2 ) 2

using the tables for 4th moment and 2nd moment

 20θ 4

so the mean square error is 20θ 4 + ( θ 2 ) 2  21θ 4 . (B) 21.21. Y  ni1 x i is a gamma random variable with parameters n and θ. Our estimator is ( Y/n ) 2 . The expected value of Y 2 is, using the second moment of a gamma from the table,

P

E[Y 2 ]  n ( n + 1) θ 2 So the bias is biasx¯ 2 ( θ 2 ) 

n ( n + 1) 2 θ − θ 2  θ 2 /n n2

˘ we need the distribution function of Y  min x i 21.22. In order to calculate the expected value of θ, FY ( x )  Pr ( Y ≤ x )  1 − Pr ( Y > x )  1 −

n Y i1

θ−x Pr ( x i > x )  1 − θ

!n

Notice that Y follows a beta distribution with the same parameter θ, a  1, and b  n. Therefore, its mean and variance are E[Y]  E[Y 2 ] 

θ n+1

2θ 2 ( n + 1)( n + 2)

Var ( Y )  E[Y 2 ] − E[Y]2 

nθ 2 ( n + 1) 2 ( n + 2)

˘  θ, making the estimator unbiased. The mean square error is the variance of θ, ˘ or Therefore, E[ θ] nθ 2 Var ( θ˘ )  ( n + 1) 2 Var ( Y )  n+2 Note that the variance does not approach 0 as n → ∞. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 21 21.23. Since

373

MSEθˆ ( θ )  biasθˆ ( θ )



2

+ Var ( θˆ )

and by (iv) MSEθˆ ( θ )  2 biasθˆ ( θ )



it follows that



biasθˆ ( θ )

2

2

 Var ( θˆ )

(*)

so we calculate biasθˆ ( θ ) and Var ( θˆ ) . ˆ −θ biasθˆ ( θ )  E[θ]

k E X −θ k+1

"

#

θ kθ −θ− k+1 k+1

!



Var ( θˆ )  Var

k X k+1

k  k+1



biasθˆ ( θ )



θ − k+1

2 2

!2

θ2 25

 Var ( θˆ ) k  k+1

k2 1 25 k 5

!

!

by (*)

!2

θ2 25

!

(D)

Since k > 0, we reject k  −5.

21.24. The unbiased sample variance is 46/199. The variance of the sample mean is estimated as the estimated variance of the distribution divided by the size of the sample, or 46/199  0.0011558 200

√ The confidence interval is 0.15 ± 1.645 0.0011558  (0.094, 0.206) . 21.25. 0.21 − λ −1.96 ≤ √ ≤ 1.96 λ/1000 ( λ − 0.21) 2 ≤ 1.962  3.8416 λ/1000 1000λ 2 − 423.8416λ + 44.1 ≤ 0 √ √ 423.8416 − 3241.70 423.8416 + 3241.70 ≤λ≤ 2000 2000 0.183 ≤ λ ≤ 0.240 (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

21. REVIEW OF MATHEMATICAL STATISTICS

374

21.26. Using the true variance. λ − 0.1 ≤ 1.96 −1.96 ≤ √ λ/10000 1.962 λ ( λ − 0.1) 2 ≤ 10000 λ 2 − 0.20038416λ + 0.01 ≤ 0 √ √ 0.20038416 − 0.0001538 0.20038416 + 0.0001538 ≤λ≤ 2 2 0.0940 ≤ λ ≤ 0.1064 The lower bound is 0.0940 . (D) The multiple choice ranges are so wide that using the cruder approxi√ mation with the approximate variance 0.1 − 1.96 0.1/10000  0.0938 results in the same answer.

Quiz Solutions 21-1. ln x is normally distributed with parameters µ and σ  4, so E[ln x]  µ, making the estimator unbiased. Also, Var (ln x )  σ2  16, so the mean square error is 16 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 22

The Empirical Distribution for Complete Data Reading: Loss Models Fourth Edition 11 Complete data for a study means that every relevant observation is available and the exact value of every observation is known. Examples of data that are not complete are: • Observations below a certain number are not available. You only obtain a data point if it is higher than that number. • For observations above a certain number, you are only told that the observation is above that number. For example, in a mortality study, the data points may be amount of time until death. For some individuals, you may only be told that the person survived 5 years, but not told exactly how long he survived. This lesson discusses an estimator for the underlying distribution when you are provided with complete data.

22.1

Individual data

If individual data, by which we mean the exact observation points, are provided, the empirical distribution, as defined in Section 1.5, may be used as the underlying distribution. The textbook uses a subscript n on a probability function to indicate the empirical distribution based on n observations. Thus Fn ( x ) is the empirical cumulative distribution function, f n ( x ) is the empirical probability or probability density function, and so on. Since the empirical distribution for individual data is discrete, f n ( x ) would be the probability of x, and would equal k/n, where k is the number of x i in the sample equal to x. The empirical cumulative hazard function is Hn ( x )  − ln S n ( x ) . As an alternative, if for some reason you don’t want to use the empirical distribution as the underlying distribution, the cumulative hazard function can be estimated using the Nelson-Åalen estimator, which we’ll study in Section 24.2. Note that when complete individual data is available, the Nelson-Åalen estimate of H ( x ) is different from the empirical distribution estimate, whereas the product limit estimate of S ( x ) (which we’ll study in Section 24.1) is the same as the empirical distribution estimate. Example 22A In a mortality study on 10 lives, times at death are 22, 35, 78, 101, 125, 237, 350, 350, 484, 600. The empirical distribution is used as a model for the underlying distribution of time to death for the population. Calculate F10 (100) , f10 (350) , and H10 (100) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

375

22. THE EMPIRICAL DISTRIBUTION FOR COMPLETE DATA

376

Answer: #{x i ≤ 100} 3   0.3 10 10 #{x i  350} 2 f10 (350)    0.2 10  10 H10 (100)  − ln 1 − F10 (100)  − ln 0.7  0.3567 F10 (100)  Pr ( X ≤ 100) 



The empirical distribution is discrete, so it is a step function. In the above example, Fn (101)  0.4 but Fn (101 −  )  0.3 for any small  > 0. The empirical distribution may be used as an estimator for discrete distributions as well. The procedure is the same as for continuous distributions—the probability of each observed value is the proportion of the observed values in the sample, and the probability of any other value is 0.

22.2

Grouped data

Grouped data has a set of intervals and the number of losses in each interval, but does not give the exact value of each loss. This means that we know the empirical cumulative distribution function only at endpoints of intervals. Strictly speaking, grouped data is not complete data, but we will consider a modification to the empirical distribution to handle it. To generate the cumulative distribution function for all points, we “connect the dots”. We interpolate linearly between endpoints of intervals. The resulting distribution function is denoted by Fn ( x ) and is called the ogive. The derivative of the ogive is denoted by f n ( x ) . It is the density function corresponding to the ogive, and is called the histogram. It is constant between endpoints of intervals. At each point, it is equal to the number of points in the interval of that point divided by the product of the length of the interval and the total number of points (in all intervals). In other words: fn (x ) 

nj n ( c j − c j−1 )

(22.1)

where x is in the interval1 [c j−1 , c j ) , there are n j points in the interval, and n points altogether. Example 22B [4B-S95:1] (1 point) 50 observed losses have been recorded in millions and grouped by size of loss as follows: Size of Loss (X) Number of Observed Losses ( 0.5, 2.5] ( 2.5, 10.5] ( 10.5, 100.5] (100.5, 1000.5]

25 10 10 5 50

What is the height of the relative frequency histogram, f n ( x ) , at x  50? (A) Less than 0.05 (B) At least 0.05, but less than 0.10 (C) At least 0.10, but less than 0.15 (D) At least 0.15, but less than 0.20 (E) At least 0.20 1 f n ( x ) is defined to be right-continuous, so the interval is closed on the left and open on the right. This is an arbitrary decision. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 22

377

Answer: The length of the interval around 50 is 100.5 − 10.5  90. n  50. So fn (x ) 

10

(90)(50)

 0.0022 .

(A)

The choices appear to be designed for somebody who forgot to divide by 90.



Exercises Use the following information for questions 22.1 and 22.2: You are given the following exact times to death in a study of a population of 20: Number of years

Number surviving this long

3 4 5 7 9 10 11 12 15

4 1 3 6 2 1 1 1 1

Let T be survival time. 22.1. Using the empirical distribution as a model, calculate the probability of surviving more than 8 but no more than 12, or Pr (8 < T ≤ 12) .

22.2. Using the empirical distribution as a model, calculate the probability of surviving at least 5 but no more than 12, or Pr (5 ≤ T ≤ 12) .

22.3.

For an insurance coverage, you observe the following 15 losses: 12, 12, 15, 20, 30, 35, 43, 50, 50, 70, 85, 90, 100, 120, 150

Calculate the empirical estimate of H (50) . 22.4.

For a nursing home population, you observe the following survival times: Number of years

Number surviving this long

0–1 1–2 2–5 5–10 10+

8 15 42 29 6 100

Calculate the empirical density function at 6 years, f100 (6) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

22. THE EMPIRICAL DISTRIBUTION FOR COMPLETE DATA

378

[4B-S93:31] (2 points)

22.5.

The following 20 wind losses, recorded in millions of dollars, occurred in 1992: 1, 6,

1, 6,

1, 8,

1, 10,

1, 13,

2, 14,

2, 15,

3, 18,

3, 22,

4, 25

To construct an ogive Fn ( x ) , the losses were segregated into four ranges: (0.5, 2.5), (2.5, 8.5), (8.5, 15.5), (15.5, 29.5) Determine the values of the probability density function f n ( x ) corresponding to Fn ( x ) for the values x1  4 and x2  10. (A) (B) (C) (D) (E)

f n ( x1 ) f n ( x1 ) f n ( x1 ) f n ( x1 ) f n ( x1 )

 0.300,  0.050,  0.175,  0.500,  0.050,

f n ( x2 ) f n ( x2 ) f n ( x2 ) f n ( x2 ) f n ( x2 )

 0.200  0.050  0.050  0.700  0.029

22.6. [4B-F94:12] (1 point) Nine observed losses have been recorded in thousands of dollars and are grouped as follows: Interval Number of claims

[0,2) 2

[2,5) 4

[5, ∞) 3

Determine the value of the relative frequency histogram for these losses at x  3. (A) (B) (C) (D) (E) 22.7. •

Less than 0.15 At least 0.15, but less than 0.25 At least 0.25, but less than 0.35 At least 0.35, but less than 0.45 At least 0.45 [4B-F96:11] (2 points) You are given the following:

Ten losses (X) have been recorded as follows: 1000, 1000, 1000, 1000, 2000, 2000, 2000, 3000, 3000, 4000



An ogive, Fn ( x ) , has been fitted to this data using endpoints for the connecting line segments with x-values as follows: x  c 0  500,

x  c 1  1500,

x  c 2  2500,

x  c 3  4500.

Determine the height of the corresponding relative frequency histogram, f n ( x ) , at x  3000. (A) 0.00010

(B) 0.00015

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.00020

(D) 0.00025

(E) 0.00030

Exercises continue on the next page . . .

EXERCISES FOR LESSON 22

22.8.

379

You are given the following information for claim sizes: Claim Size

Number of Claims

0–1000 1000–5000 5000–10000

35 45 20

Use the ogive to estimate the probability that a randomly chosen claim is between 4000 and 5000. 22.9.

[C-S05:26] You are given the following information regarding claim sizes for 100 claims: Claim Size 0 – 1,000 1,000 – 3,000 3,000 – 5,000 5,000 – 10,000 10,000 – 25,000 25,000 – 50,000 50,000 – 100,000 over 100,000

Number of Claims 16 22 25 18 10 5 3 1

Use the ogive to estimate the probability that a randomly chosen claim is between 2,000 and 6,000. (A) 0.36

(B) 0.40

(C) 0.45

(D) 0.47

(E) 0.50

22.10. You are given the following data for amount of time for 100 hospital stays: Number of days

Number of people

(0, 1] (1, 2] (2, 4] (4, 7] (7, 10] (10, ∞)

14 19 16 12 14 25

Using the ogive, estimate h (3) , the hazard rate function at 3 days. Additional released exam questions: C-F05:1,33, C-F06:35

Solutions 22.1. F20 (8)  (4 + 1 + 3 + 6) /20  0.7 and F20 (12)  (4 + 1 + 3 + 6 + 2 + 1 + 1 + 1) /20  0.95, so Pr (8 < T ≤ 12)  F20 (12) −F20 (8)  0.25 . You could do this faster by counting the times between 8 and 12 (2 at 9, etc.). There are five times greater than 8 and not greater than 12, so Pr20 (8 < T ≤ 12)  5/20  0.25 . 22.2. We have to be careful since Fn is discrete, and Pr20 (T < 5) , Pr20 (T ≤ 5) . We must count all observations in the interval [5, 12], including the ones at 5, so Pr20 (5 ≤ T ≤ 12)  (3 + 6 + 2 + 1 + 1 + 1) /20  0.7 . 22.3. There are 9 losses less than or equal to 50, so F15 (50)  9/15  0.6 and H15 (50)  − ln (1 − 0.6)  0.9163 . C/4 Study Manual—17th edition Copyright ©2014 ASM

22. THE EMPIRICAL DISTRIBUTION FOR COMPLETE DATA

380

22.4. There are 29 observations in the interval 5–10 which contains 6, and the width of the interval 5–10 is 5. By equation (22.1): 29 f (6)   0.058 . (5)(100) 22.5. Note that the exercise gives you a list of 20 wind losses, in order from lowest to highest, as the comma after the 4 on the first line indicates. The 20 numbers are not a two-line unlabeled table in which the first line is a frequency and the second line is a number! There are 6 observations in (2.5, 8.5) and 4 observations in (8.5, 15.5). 6  0.050 (20)(6) 4 f n (10)   0.029 (20)(7) f n (4) 

(E)

22.6. 3 is in the interval [2, 5). There are 4 observed losses in this interval, a total of 9 observed losses, and the width of the interval is 3. By equation (22.1): 4

(9)(3)

 0.148 .

(A)

22.7. 3000 is in the interval [2500, 4500]. There are 3 observed losses in this interval, a total of 10 observed losses, and the width of the interval is 2000. By equation (22.1): 3  0.00015 . (10)(2000)

(B)

22.8. Since linear interpolation is used, 1/4 of the claims in the 1000–5000 range will be between 4000 and 5000, or 45/4=11.25 claims out of 100, and 11.25/100  0.1125 . 22.9.

Let X be claim size. Since there are a total of 100 claims, 22  0.22 100 25 Pr (3000 < X < 5000)   0.25 100 18 Pr (5000 < X < 10,000)   0.18 100 Pr (1000 < X < 3000) 

Using the ogive, Pr (2000 < X < 3000)  21 Pr (1000 < X < 3000)  0.11 and Pr (5000 < X < 6000)  1 5 Pr (5000 < X < 10,000)  0.036. The answer is therefore 0.11 + 0.25 + 0.036  0.396 . (B)

22.10. As we learned in Lesson 1, h ( x )  f ( x ) /S ( x ) . Here, f100 (3) 

16

(2)(100)

 0.08

14 + 19 + 0.5 (16)  0.59 100 0.08 h 100 (3)   0.1356 0.59

S100 (3)  1 −

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 23

Variance of Empirical Estimators with Complete Data Reading: Loss Models Fourth Edition 12.2 Very few released exam questions relate to the material in this lesson. The textbook has several complicated looking formulas for the variance of the estimators we discussed in the last lesson. Rather than memorizing these formulas, you are better off understanding the principles behind them, so you can derive them yourself as needed. The two distributions you need to understand in order to derive any formula you need for variance of empirical estimators with complete data are: 1. The binomial distribution. If there are m items which are being placed in two categories, and the probability that it is placed in the first category is q, then the variance of the number of items, X, in the first category (or in the second category for that matter) is mq (1 − q ) . A variation of this is the random variable representing the proportion of items in the first category, or Y  X/m. The variance of Y is the variance of X divided by m 2 , or q (1 − q ) /m. This random variable is called a binomial proportion variable.

2. The multinomial distribution. If there are m items which are being placed in k categories, with probabilities q1 , . . . , q k respectively (the probabilities must sum up to 1), then the variance of the number of items in category i is mq i (1−q i ) . The covariance of the number of items in categories i and j is −mq i q j . As with the binomial distribution, you may also consider the proportion of items in the categories, in which case you would divide these expressions by m 2 ; the variance of the multinomial qi q j q (1−q ) proportion is i m i and the covariance is − m .

23.1

Individual data

If the empirical distribution is being used as the model with individual data, then S n ( x ) is the proportion of observations above x. Since the probability of an observation being above x is S ( x ) , S n ( x ) is a binomial proportion random variable with parameters m  n and q  S ( x ) ; its variance is therefore: S (x ) 1 − S (x )



Var S n ( x ) 







n

Since we don’t know S ( x ) , we estimate the variance using S n ( x ) instead of S ( x ) : Sn ( x ) 1 − Sn ( x )



L Sn ( x )  Var 



n

 (23.1)

There are a couple of pre-2000 syllabus exam questions (which are provided below in the exercises) where S ( x ) is explicitly specified. In such a case, you would use the real S ( x ) in this formula, not S n ( x ) . But I don’t expect such questions on current exams. C/4 Study Manual—17th edition Copyright ©2014 ASM

381

23. VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA

382

If n x is the observed number of survivors past time x, then S n ( x )  n x /n. Plugging this into equation (23.1), the estimated variance becomes

( n x /n )(1 − n x /n )

L Sn ( x )  Var 



n nx (n − nx )  n3

(23.2)

We can also empirically estimate the probability of survival past time y given survival past time x, Pr ( X > y | X > x ) . We shall use the notation y−x p x to denote this probability, the probability of survival to time y, or to y − x time units past x, given survival to time x, and also the notation y−x q x to denote the complement, the probability of failure within y − x additional time units after x given survival to time x. (This notation is identical to the notation used in Exam MLC/LC.) The estimator for y−x p x is Prn ( X > y | X > x )  n y /n x , where n x and n y are the observed number of survivors past times x and y respectively. The variance of this conditional estimator, a quotient of two random variables, cannot be estimated unconditionally, since the estimator may not even exist (if everyone dies before time x). The best we can do is estimate the variance conditional on having the observed number of lives at time x, in effect making the denominator a constant. The estimator of the variance is then essentially the same as the  L unconditional estimator of Var S n ( x ) (equation (23.2)), but replacing n with n x and n x with n y . Thus, we have ( n − n y )( n y ) L ( y−x pˆ x | n x )  Var L ( y−x qˆ x | n x )  x Var (23.3) n 3x Notice that the variances of y−x pˆ x and y−x qˆ x are identical, since one is the complement of the other and Var (1 − X )  Var ( X ) . The same applies to conditional versions of these variances. f n ( x ) is the proportion equal to x. The right hand sides of formulas (23.2) and (23.3)  of observations  are used to estimate Var f n ( x ) , with n x redefined to be the number of observations equal to x. The empirical estimators of S ( x ) and f ( x ) with individual data are unbiased. Example 23A (Same data as Example 22A) In a mortality study on 10 lives, times at death are 22, 35, 78, 101, 125, 237, 350, 350, 484, 600. The empirical distribution is used as a model for the underlying distribution of time to death for the population.     Estimate Var F10 (100) and Var f10 (350) . Answer: From formula (23.2), with n x  3 since there are 3 observations below 100: Var F10 (100) 





3 (7)  0.021 103

From formula (23.2), with n x  2 since there are 2 observations equal to 350: Var f10 (350) 



23.2



2 (8)  0.016 103



Grouped data

For estimating the variance of the S ( x ) estimator for grouped data, the same formulas as for individual data can be used at boundaries of intervals. C/4 Study Manual—17th edition Copyright ©2014 ASM

23.2. GROUPED DATA

383

Otherwise, S n ( x ) is linearly interpolated. The textbook derives the following formulas using the multinomial distribution. For a point x in the interval ( c j−1 , c j ) , if Y is the number of observations less than or equal to c j−1 and Z is the number of observations in the interval ( c j−1 , c j ], then

L Sn ( x )  Var 



L (Y )( c j − c j−1 ) 2 + Var L ( Z )( x − c j−1 ) 2 + 2Cov M (Y, Z )( c j − c j−1 )( x − c j−1 ) Var

L fn (x )  and Var 



n 2 ( c j − c j−1 ) 2

L (Z ) Var n 2 ( c j − c j−1 ) 2

(23.4) (23.5)

where Y (n − Y ) n Z ( n − Z) L (Z )  Var n M (Y, Z )  − YZ Cov n

L (Y )  Var

The ogive is a biased estimator (except at boundaries), since we have no reason to believe the true distribution is linear between boundaries. Rather than memorizing the above, I recommend deriving whatever you need from the multinomial formula. The following example illustrates how to do this. Example 23B You are given the following data on 50 loss sizes: Interval

Number of losses in interval

0– 500 500– 1000 1000– 5000 5000–10000 > 10000

25 10 10 3 2

An ogive is used to estimate the distribution of losses. 1. Estimate the variance of the estimator for the probability of a loss greater than 2500. 2. Estimate the variance of the estimator for the density function at 2500. Answer: Let’s first calculate the variance of the estimator for the probability of a loss greater than 2500. Step 1— What’s the estimator? What exactly, in mathematical form, are you calculating the variance of? The ogive estimator uses linear interpolation between endpoints. This means that we count up the number of observations above the next endpoint, add on a proportionate number of observations between 2500 and the next endpoint, and then divide by the total number of observations. In other words, 3 + 2 (the observations above 5000), plus 58 of 10 (the proportionate number of observations between 1000 and 5000), all divided by 50 (the total number of observations):

D ( X > 2500)  Pr

3 + 2 + (5/8) 10 50

Step 2— In this estimator, what is random and what is not? The only thing that is random is the number of observations in each interval. The total number of observations is not random—it is assumed we C/4 Study Manual—17th edition Copyright ©2014 ASM

23. VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA

384

decided in advance how many cases to study. The endpoints aren’t random—it is assumed that we designed the study this way. 2500 is not random—we decided what question to ask.

D ( X > 2500) , the 3, 2, and 10 are random, but the So in the above expression for Pr Items which are not random have no variance.

5 8

and 50 are not.

Step 3— Write down an expression for the estimator with mathematical symbols for the random variables. Let Y be the number of observations above 5000, and Z the number of observations between 1000 and 5000. (The number of observations between 2500 and 5000 is random, but we don’t know what it is; we’re estimating it!) Then the estimator is

D ( X > 2500)  Pr

1 1 Y + (5/8) Z  Y+ Z 50 50 80

Step 4— Calculate the variance of the random variables. Here’s where the binomial and multinomial distributions come into play. Whenever we need a probability (and we don’t have it, since we’re estimating it), use the empirical, estimated probability. Y, the number of observations above 5000, is a binomial random variable; either an observation is above 5000 or it isn’t. The probability of an observation above 5000 is estimated as 3+2 50  0.1. So Y is binomial with m  50 and q  0.1. Z, the number of observations between 1000 and 5000, is a binomial random variable; either an observation is between 1000 and 5000 or it isn’t. The probability of an observation between 1000 and 10 5000 is estimated as 50  0.2. So Z is binomial with m  50 and q  0.2. Y and Z form a trinomial distribution; either an observation is less than 1000, between 1000 and 5000, or over 5000. The parameters are q y  0.1, q z  0.2, and m  50. Step 5— Calculate the variance. Use the usual formula for Var ( aY + bZ ) : Var ( aY + bZ )  a 2 Var ( Y ) + 2ab Cov ( Y, Z ) + b 2 Var ( Z ) From step 3, a 

1 50

and b 

1 80 .

From the binomial and trinomial, Var ( Y )  mq y (1 − q y )  50 (0.1)(0.9)  4.5

Var ( Z )  mq z (1 − q z )  50 (0.2)(0.8)  8

Cov ( Y, Z )  −mq y q z  −50 (0.1)(0.2)  −1

L Pr D (loss > 2500)  Var 



1 1 (4.5) + 2 2500 50

!

1 1 (−1) + (8) 80 6400

!

 0.0018 − 0.0005 + 0.00125  0.00255 Now let’s estimate the variance of the estimate of the density function at 2500 in the same way. Step 1 The estimator, by equation (22.1), is 10

.

 (50)(4000) .

Step 2 Y, the number of observations in the interval [1000, 5000], is random; 50 and 4000 are not. Step 3 The expression is Y/200,000. Step 4 Y is binomial(m  50, q  0.2). Its variance is 50 (0.2)(0.8)  8. Step 5

L Var

C/4 Study Manual—17th edition Copyright ©2014 ASM

Y 1 (8)  2 × 10−10  200,000 200,0002

!



EXERCISES FOR LESSON 23

385

Exercises Use the following information for questions 23.1 and 23.2: You are given the following data on loss sizes for an insurance coverage: 2 23.1.

3

5

8

8

10

12

15

18

25

Estimate the variance of the empirical estimator of F (11) .

23.2. Assume that there is an ordinary deductible of 5, and losses 5 and lower are not submitted as claims. Estimate the variance of the probability of a claim resulting in a payment of more than 6. 23.3. In a mortality study on 50 lives, one death occurs at each of times 2, 4, and 8. There are no withdrawals. Estimate the variance of the empirical estimator of 4 q 2 . Use the following information for questions 23.4 through 23.6: You are given the following information for loss sizes: Loss sizes

Number of losses

0–1000 1000–2000 2000–5000 5000–10,000 Over 10,000

32 20 35 31 12 130

23.4. The probability of a loss greater than 500 is estimated empirically assuming uniform distribution of losses within each interval. Estimate the variance of this estimate. 23.5. The probability of a loss between 1000 and 2000 is estimated empirically assuming uniform distribution of losses within each interval. Estimate the variance of this estimate. 23.6. The probability of a loss greater than 1500 is estimated empirically assuming uniform distribution of losses within each interval. Estimate the variance of this estimate.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

23. VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA

386

23.7.

You are given the following data from a mortality study on 100 individuals: Survival time

Number of individuals

0– 5 5–10 10–15 15+

4 3 5 88

The density function at 7, f (7) is estimated empirically. Estimate the variance of this estimate. 23.8.

You are given the following loss data: Size of loss

Number of losses

0– 1,000 1,000– 5,000 5,000–10,000 10,000–20,000

10 8 4 3

S (1500) is estimated empirically using the ogive. Estimate the variance of the estimator. 23.9.

You are given the following loss data: Size of loss

Number of losses

0– 500 500–2000 Over 2000

15 12 8

The probability of a loss between 800 and 1000 is estimated empirically, assuming uniform distribution of losses within each interval. Estimate the variance of the estimator. 23.10. The following is data on claim frequency for one year: Number of claims

Number of policies

0 1 2 3 or more

502 34 12 2 550

The probability of at least 1 claim is estimated empirically using the ogive. Estimate the variance of this estimator. 23.11. [160-S87:7] Two mortality studies with complete data have produced independent estimators of S (10) . Study 1 began with 300 lives and produced S300 (10)  0.60. Study 2 began with 100 lives and produced S100 (10)  0.50. Calculate the estimated variance of the difference between these estimators. (A) 0.0008 (B) 0.0017 (C) 0.0025 (E) The value cannot be determined from the information given. C/4 Study Manual—17th edition Copyright ©2014 ASM

(D) 0.0033

Exercises continue on the next page . . .

EXERCISES FOR LESSON 23

387

23.12. [160-S88:7] You are given: (i) A cohort of 12 individuals is observed from t  0 until t  9. (ii) The observed times of death are 1, 2, 2, 3, 4, 4, 6, 6, 7, 8, 8, 9. (iii) The cohort group is assumed to be subject to the uniform survival distribution over (0,9]. Calculate the conditional variance of pˆ 2 . (A) 0.009

(B) 0.010

(C) 0.011

(D) 0.012

(E) 0.014

23.13. [160-F90:10] From a mortality study with complete data, you are given: (i) The number of deaths during the third year is 9. (ii) The number of lives beginning the fourth year is 18. (iii) S n ( y j ) is the empirical survival function at the end of year y j . (iv) S n (1)  0.65 and S n (3)  0.30. (v) (vi)

Var S n ( y j ) is the variance of S n ( y j ) if the underlying distribution is known to be exponential with mean − ln10.7 .





L S n ( y j ) is the estimated variance of S n ( y j ) if the underlying distribution is unknown. Var 



L S n (2) . Calculate the absolute difference between Var S n (2) and Var 

(A) 0.00001 23.14.

(B) 0.00002

(C) 0.00003







(D) 0.00004

(E) 0.00005

[160-S91:10] A cohort of 10 lives is observed over the interval (0, 6].

You are given: (i)

The observed times of death are 1, 2, 3, 4, 4, 5, 5, 6, 6, 6.

(ii)

VarU S10 (3) is the variance of S10 (3) when the survival distribution is assumed to be uniform on (0, 6].





L S10 (3) is the estimated variance of S10 (3) when no assumption about the underlying survival (iii) Var distribution is made. 



L S10 (3) . Calculate VarU S10 (3) − Var 

(A) −0.004



(B) −0.002

C/4 Study Manual—17th edition Copyright ©2014 ASM





(C) 0.000

(D) 0.002

(E) 0.004

Exercises continue on the next page . . .

23. VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA

388

23.15. [160-83-94:5] A cohort of terminally ill patients are studied beginning at time t  0 until all have died at t  5. You are given: (i) Time t

Deaths at Time t

1 2 3 4 5

6 9 5 d4 d5

(ii)

L S n (1)  Var L S n (3) based on the actual data. Var

(iii)

The average remaining lifetime for those who survive to t  3 is 76 .









Calculate the number of deaths at t  4. (A) 1

(B) 3

(C) 5

(D) 10

(E) 15

Additional released exam questions: C-S07:40

Solutions 23.1.

Using formula (23.1) (the variance of F and S is the same), we have

L F10 (11)  Var 

23.2.



10

 0.024

Using formula (23.3), we have

L (6 q 5 )  Var 23.3.

(0.6)(0.4)

(3)(4) 73



12 343

Using formula (23.3), we have

L (4 q 2 )  Var

(1)(48) 493

 0.0004080

23.4. The number of losses less than 1000 is a binomial random variable with m  130 and q  Pr ( X ≤ 32 1000) which is estimated by F130 (1000)  130 . The proportion of losses less than 1000 is a binomial proportion random variable. Its variance is estimated as

q (1−q ) m



(32)(98) 1303

. The probability of a loss greater

than 500 is estimated as 1 minus half the losses less than 1000 divided by 130. The variance is then times the variance of the binomial proportion random variable, or

L S130 (500)  Var 

C/4 Study Manual—17th edition Copyright ©2014 ASM



1 2

!2 

(32)(98) 1303



 0.0003569

1 2 2

EXERCISE SOLUTIONS FOR LESSON 23

389

23.5. This is the easy case. The estimator is the number of losses in this range divided by 130. The number 20 and variance mq (1 − q ) . So of losses is a binomial variable with estimated parameters m  130, q  130 2 the variance of the probability of a loss is mq (1 − q ) divided by 130 , or:

  L Pr D (1000 < X < 2000)  Var

20  110 130 130

130

 0.0010014

23.6. The ogive estimator of the probability of a loss greater than 1500 is ( X + 0.5Y ) /130, where X is the observed number of losses above 2000 and Y is the observed number of losses between 1000 and 2000. The variance is therefore Var ( X ) + 0.52 Var ( Y ) + 2 (0.5) Cov ( X, Y ) 1302 X and Y are multinomial with m  130 and X having estimated probability 78/130 and Y having estimated probability 20/130. Therefore, the variance of the ogive estimator of S130 (1500) is

L S130 (1500)  Var 





Var ( X ) + 0.52 Var ( Y ) + 2 (0.5) Cov ( X, Y ) 1302 78  52  20  110 130 130 130 + (0.52 )(130) 130 130 − (2)(0.5)(130)

1302 (78)(20) + −  3 3 130 130 1303  0.001846 + 0.000250 − 0.000710  0.001386

(78)(52)

78  20  130 130

(0.25)(20)(110)

23.7. The density function estimator is the number of losses in the interval, a multinomial random variable with m  4 + 3 + 5 + 88  100 and q  3/100  0.03 (estimated), divided by the length of the interval, 5, and the total number of losses, 100. The estimated variance of the multinomial random variable is (100)(0.03)(0.97)  2.91. So the variance of the estimator is

L f100 (7)  Var 



2.91  0.00001164 (1002 )(52 )

23.8. The random variable for losses in (1000, 5000) is multinomial with m  10 + 8 + 4 + 3  25 and 8 q 2  25  0.32 (estimated). The random variable for losses greater than 5000 is multinomial with m  25, 4+3 5000−1500 q 3  25  0.28 (estimated). S25 (1500) is estimated as 5000−1000  78 times the first, divided by 25, plus the second divided by 25. So the variance is:

!2  !     7 (0.32)(0.68) (0.28)(0.72) 7 (0.32)(0.28) L Var S25 (1500)  + −2 8

25

25

8

25

 0.006664 + 0.008064 − 2 (0.003136)  0.008456 23.9. The estimate of the number of losses Ni in the subinterval (800, 1000) is the number of losses times the length of the subinterval, 1000−800  200, divided by the length of the total interval, 2000−500  1500, 2 or 15 of the number of losses in the interval. The number of losses in the interval is a multinomial random variable with m  15 + 12 + 8  35 and q  12 35 (estimated). So the variance of the number of losses in (800, 1000) is !2 ! ! 2 12 23 L Var ( Ni )  (35)  0.140190 15 35 35 C/4 Study Manual—17th edition Copyright ©2014 ASM

390

23. VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA

The probability of a loss in the subinterval is the number of losses in the subinterval divided by the total number of losses, 35, so we divide 0.140190 by 352 to obtain the variance of the probability of a loss: 0.140190  0.00011444 352 23.10. By equation (23.2) with n x  34 + 12 + 2  48, the variance is

(48)(502) 5503

 0.0001448

23.11. Since the estimators are independent, there is no covariance between them, so the variance of the difference of the estimators is the sum of their variances. By equation (23.1), the variance of the first estimator is (0.60)(0.40) /300  0.0008 and the variance of the second estimator is (0.5)(0.5) /100  0.0025, so the variance of the difference is 0.0025 + 0.0008  0.0033 . (D) 23.12. We use equation (23.1), but with S instead of S n , as indicated in the paragraph after the equation. The conditional density of the survival function given survival to time 2 is uniform (since it is the unconditional density, which is uniform, divided by S (2) , which is constant), and since it is defined between 2 and 9, the density is 17 . Therefore the number of lives surviving the third year given that 9 lives begin the year is a binomial variable with m  9, q  67 and variance 9 (6/7)(1/7) . pˆ 2 is this binomial variable divided by 9, so its variance is 1/81 of the variance of the binomial variable or:

L ( pˆ 2 | n2  9)  Var

(1/7)(6/7) 9

 0.01361

(E)

23.13. If the underlying distribution is unknown, we use equation (23.1) with S n (2) . Since S n (3)  0.30 and 9 deaths occurred during the third year resulting in 18 survivors, the probability of death in the third year is q3  9/27  1/3, and q 3  1 − S n (3) /S n (2) , so S n (2)  0.30/ (2/3)  0.45. Then

L S n (2)  Var 



(0.45)(0.55) 60

 0.004125

If the underlying distribution is exponential, we use equation (23.1) with S (2) . Under an exponential, S (2)  e −2/θ  e 2 ln 0.7  0.49, so Var S n (2) 





(0.49)(0.51) 60

 0.004165

The difference is 0.004165 − 0.004125  0.00004 . (D) 23.14. U

Var

C/4 Study Manual—17th edition Copyright ©2014 ASM

S10 (3) 

1 1 2 2

 0.025 10   L S10 (3)  (0.3)(0.7)  0.021 Var 10 0.025 − 0.021  0.004 (E)





EXERCISE SOLUTIONS FOR LESSON 23

391

23.15. The average remaining lifetime equation is: d4 + 2d5 7  d4 + d5 6 while the equality of the variances at times 1 and 3 equation is:

(20)( d4 + d5 ) (6)(14 + d4 + d5 )  3 (20 + d4 + d5 ) (20 + d4 + d5 ) 3 or 84 + 6 ( d4 + d5 )  20 ( d4 + d5 ) d4 + d5  6 d4 + 2d5  7 d5  1 d4  5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

392

C/4 Study Manual—17th edition Copyright ©2014 ASM

23. VARIANCE OF EMPIRICAL ESTIMATORS WITH COMPLETE DATA

Lesson 24

Kaplan-Meier and Nelson-Åalen Estimators Reading: Loss Models Fourth Edition 12.1 Exams routinely feature questions based on the material in this lesson. When conducting a study, we often do not have complete data, and therefore cannot use raw empirical estimators. Data may be incomplete in two ways: 1. No information at all is provided for certain ranges of data. Examples would be: • An insurance policy has a deductible d. If a loss is for an amount d or less, it is not submitted. Any data you have regarding losses is conditional on the loss being greater than d. • You are measuring amount of time from disablement to recovery, but the disability policy has a six-month elimination period. Your data only includes cases for which disability payments were made. If time from disablement to recovery is less than six months, there is no record in your data. When data are not provided for a range, the data is said to be truncated. In the two examples just given, the data are left truncated, or truncated from below. It is also possible for data to be truncated from above, or right truncated. An example would be a study on time from disablement to recovery conducted on June 30, 2009 that considers only disabled people who recovered by June 30, 2009. For a group of people disabled on June 30, 2006, this study would truncate the data at time 3, since people who did not recover within 3 years would be excluded from the study. 2. The exact data point is not provided; instead, a range is provided. Examples would be: • An insurance policy has a policy limit u. If a loss is for an amount greater than u, the only information you have is that the loss is greater than u, but you are not given the exact amount of the loss. • In a mortality study on life insurance policyholders, some policyholders surrender their policy. For these policyholders, you know that they died (or will die) some time after they surrender their policy, but don’t know the exact time of death. When a range of values rather than an exact value is provided, the data is said to be censored. In the two examples just given, the data are right censored, or censored from above. It is also possible for data to be censored from below, or left censored. An example would be a study of smokers to determine the age at which they started smoking in which for smokers who started below age 18 the exact age is not provided. We will discuss techniques for constructing data-dependent estimators in the presence of left truncation and right censoring. Data-dependent estimators in the presence of right truncation or left censoring are beyond the scope of the syllabus.1 1However, parametric estimators in the presence of right truncation or left censoring are not excluded from the syllabus. We will study parametric estimators in Lessons 30–33. C/4 Study Manual—17th edition Copyright ©2014 ASM

393

24. KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

394

S (t ) = a



s1 S (t ) = a 1 − r1

y1





y2

s1 S (t ) = a 1 − r1



s2 1− r2



t s1 deaths out of r1 lives

s2 deaths out of r2 lives

Figure 24.1: Illustration of the Kaplan-Meier product limit estimator. The survival function is initially a . After each event time, it is reduced in the same proportion as the proportion of deaths in the group.

24.1

Kaplan-Meier Product Limit Estimator

The first technique we will study is the Kaplan-Meier product limit estimator. We shall discuss its use for estimating survival distributions for mortality studies, but it may be used just as easily to estimate S ( x ) , and therefore F ( x ) , for loss data. To motivate it, consider a mortality study starting with n lives. Suppose that right before time y1 , we have somehow determined that the survival function S ( y1− ) is equal to a. Now suppose that there are r1 lives in the study at time y1 . Note that r1 may differ from n, since lives may have entered or left the study between inception and time y1 . Now suppose that at time y1 , s 1 lives died. See Figure 24.1 for a schematic. The proportion of deaths at time y1 is s 1 /r1 . Therefore, it is reasonable to conclude that the conditional survival rate past time y1 , given survival to time y1 , is 1 − s1 /r1 . Then the survival function at time y1 should be multiplied by this proportion, making it a (1 − s1 /r1 ) . The same logic is repeated at the second event time y2 in Figure 24.1, so that the survival function at time y2 is a (1 − s1 /r1 )(1 − s 2 /r2 ) . Suppose we have a study where the event of interest, say death, occurs at times y j , j ≥ 1. At each time y j , there are r j individuals in the study, out of which s j die. Then the Kaplan-Meier estimator of S ( t ) sets S n ( t )  1 for t < y1 . Then recursively, at the j th event time y j , S n ( y j ) is set equal to S n ( y j−1 )(1 − s j /r j ) , with y0  0. For t in between event times, S n ( t )  S n ( y j ) , where y j is the latest event time no later than t. The Kaplan Meier product limit formula is Kaplan-Meier Product Limit Estimator Sn ( t ) 

j−1  Y i1

1−

si , ri



y j−1 ≤ t < y j

(24.1)

r i is called the risk set at time y i . It is the set of all individuals subject to the risk being studied at the event time. If entries or withdrawals occur at the same time as a death—for example, if 2 lives enter at time 5, 3 lives leave, and 1 life dies—the lives that leave are in the risk set, while the lives that enter are not. Example 24A In a mortality study, 10 lives are under observation. One death apiece occurs at times 3, 4, and 7, and two deaths occur at time 11. One withdrawal apiece occurs at times 5 and 10. The study concludes at time 12. Calculate the product limit estimate of the survival function. Answer: In this example, the event of interest is death. The event times are the times of death: 3, 4, 7, and 11. We label these events y i . The number of deaths at the four event times are 1, 1, 1, and 2 respectively. We label these numbers s i . That leaves us with calculating the risk set at each event time. At time 3, there are 10 lives under observation. Therefore, the first risk set, the risk set for time 3, is r1  10. At time 4, there are 9 lives under observation. The life that died at time 3 doesn’t count. Therefore, r2  9. C/4 Study Manual—17th edition Copyright ©2014 ASM

24.1. KAPLAN-MEIER PRODUCT LIMIT ESTIMATOR

395

b

1.0

r

b r

0.8

b r

b

0.6 r

0.4 0.2 r

1

2

3

5

4

6

7

8

9

10

11

12

Figure 24.2: Graph of y  S10 ( x ) computed in Example 24A

At time 7, there are 7 lives under observation. The lives that died at times 3 and 4, and the life that withdrew at time 5, don’t count. Therefore, r3  7. At time 11, the lives that died at times 3, 4, and 7 aren’t in the risk set. Nor are the lives that withdrew at times 5 and 10. That leave 5 lives in the risk set. r4  5. We now calculate the survival function S10 ( t ) for 0 ≤ t ≤ 12 recursively in the following table, using formula (24.1). j

Time yj

Risk Set rj

Deaths sj

1 2 3 4

3 4 7 11

10 9 7 5

1 1 1 2

Survival Function S10 ( t ) for y j ≤ t < y j+1

(10 − 1) /10  0.9000 S10 (4− ) × (9 − 1) /9  0.8000 S10 (7− ) × (7 − 1) /7  0.6857 S10 (11− ) × (5 − 2) /5  0.4114

S10 ( t )  1 for t < 3. In the above table, y5 should be construed to equal 12.



We plot the survival function of Example 24A in Figure 24.2. Note that the estimated survival function is constant between event times, and for this purpose, only the event we are interested in—death—counts, not withdrawals. This means, for example, that whereas S10 (7)  0.6857, S10 (6.999)  0.8000, the same as S10 (4) . The function is discontinuous. By definition, if X is the survival time random variable, S ( x )  Pr ( X > x ) . This means that if you want to calculate Pr ( X ≥ x ) , this is S ( x − ) , which may not be the same as S ( x ) . Example 24B Assume that you are given the same data as in Example 24A. Using the product limit estimator, estimate: 1. the probability of a death occurring at any time greater than 3 and less than 7. 2. the probability of a death occurring at any time greater than or equal to 3 and less than or equal to 7. Answer:

1. This is Pr (3 < X < 7)  Pr ( X > 3) − Pr ( X ≥ 7)  S (3) − S (7− )  0.9 − 0.8  0.1 .

2. This is Pr (3 ≤ X ≤ 7)  Pr ( X ≥ 3) − Pr ( X > 7)  S (3− ) − S (7)  1 − 0.6857  0.3143 . C/4 Study Manual—17th edition Copyright ©2014 ASM



24. KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

396

Example 24A had withdrawals but did not have new entries. New entries are treated as part of the risk set after they enter. The next example illustrates this, and also illustrates another notation system used in the textbook. In this notation system, each individual is listed separately. d i indicates the entry time, u i indicates the withdrawal time, and x i indicates the death time. Only one of u i and x i is listed. Example 24C You are given the following data from a mortality study: i

di

xi

ui

1 2 3 4

0 0 2 5

— 5 — 7

7 — 8 —

Estimate the survival function using the product-limit estimator. Answer: There are two event times, 5 and 7. At time 5, the risk set includes individuals 1, 2, and 3, but not individual 4. New entries tied with the event time do not count. So S4 (5)  2/3. At time 7, the risk set includes individuals 1, 3, and 4, since withdrawals tied with the event time do count. So S4 (7)  (2/3)(2/3)  4/9. The following table summarizes the results: j

yj

rj

sj

1 2

5 7

3 3

1 1

S4 ( t ) for y j ≤ t < y j+1 2/3 4/9



In any time interval with no withdrawals or new entries, if you are not interested in the survival function within the interval, you may merge all event times into one event time. The risk set for this event time is the number of individuals at the start of the interval, and the number of deaths is the total number of deaths in the interval. For example, in Example 24A, to calculate S10 (4) , rather than multiplying two factors for times 3 and 4, you could group the deaths at 3 and 4 together, treat the risk set at time 4 as 10 and the number of deaths as 2, and calculate S10 (4)  8/10. These principles apply equally well to estimating severity with incomplete data. Example 24D An insurance company sells two types of auto comprehensive coverage. Coverage A has no deductible and a maximum covered loss of 1000. Coverage B has a deductible of 500 and a maximum covered loss of 10,000. The company experiences the following loss sizes: Coverage A: 300, 500, 700, and three claims above 1000 Coverage B: 700, 900, 1200, 1300, 1400 Let X be the loss size. Calculate the Kaplan-Meier estimate of the probability that a loss will be greater than 1200 but less than 1400, Pr (1200 < X < 1400) . Answer: We treat the loss sizes as if they’re times! And the “members” of Coverage B enter at “time” 500. The inability to observe a loss below 500 for Coverage B is analogous to a mortality study in which members enter the study at time 500. The loss sizes above 1000 for Coverage A are treated as withdrawals; they are censored observations, since we know those losses are greater than 1000 but don’t know exactly what they are. The Kaplan-Meier table is shown in Table 24.1. We will explain below how we filled it in. At 300, only coverage A claims are in the risk set; coverage B claims are truncated from below. Thus, the risk set at 300 is 6. Similarly, the risk set at 500 is 5; remember, new entrants are not counted at the C/4 Study Manual—17th edition Copyright ©2014 ASM

24.1. KAPLAN-MEIER PRODUCT LIMIT ESTIMATOR

397

Table 24.1: Survival function calculation for Example 24D

j

Loss Size yj

Risk Set rj

Losses sj

1 2 3 4 5 6 7

300 500 700 900 1200 1300 1400

6 5 9 7 3 2 1

1 1 2 1 1 1 1

Survival Function S11 ( t ) for y j ≤ t < y j+1 5/6 2/3 14/27 4/9 8/27 4/27 0

time they enter, only after the time, so though the deductible is 500, coverage B losses do not count  even  at 500. So we have that S11 (500)  65 45  23 . At 700, 4 claims from coverage A (the one for 700 and the 3 censored ones) and all 5 claims from coverage B are in the risk set, making the risk set 9. Similarly, at 900, the risk set is 7. So S11 (900)  2 7 6 4  3 9 7 9.   8 At 1200, only the 3 claims 1200 and above on coverage B are in the risk set. So S11 (1200)  94 23  27 .   4 8 1 Similarly, S11 (1300)  27 2  27 . 8 The answer to the question is Pr11 ( X > 1200) −Pr11 ( X ≥ 1400)  S11 (1200) −S11 (1400− ) . S11 (1200)  27 . 4 − − But S11 (1400 ) is not the same as S11 (1400) . In fact, S11 (1400 )  S11 (1300)  27 , while S11 (1400)  0. The final answer is then Pr11 (1200 < X < 1400) 

8 27



4 27



4 27

.



If all lives remaining in the study die at the last event time of the study, then S can be estimated as 0 past this time. It is less clear what to do if the last observation is censored. The two extreme possibilities are 1. to treat it as if it were a death, so that S ( t )  0 for t ≥ y k , where y k is the last observation time of the study. 2. to treat it as if it lives forever, so that S ( t )  S ( y k ) for t ≥ y k . A third option is to use an exponential whose value is equal to S ( y k ) at time y k . Example 24E In example 24A, you are to use the Kaplan-Meier estimator, with an exponential to extrapolate past the end of the study. Determine S10 (15) . Answer: S10 (12)  S10 (11)  0.4114, as determined above. We extend exponentially from the end 12 of the study at time 12. In other words, we want e −12/θ  0.4114, or θ  − ln 0.4114 . Then S10 (15) 

exp



15 ln 0.4114 12



 0.411415/12  0.3295 .



Notice in the above example that using an exponential to go from year 12 to year 15 is equivalent to raising the year 12 value to the 15/12 power. In general, if u is the ending time of the study, then exponential extrapolation sets S n ( t )  S n ( u ) t/u for t > u. If a study has no members before a certain time—in other words, the study starts out with 0 individuals and the first new entries are at time y0 —then the estimated survival function is conditional on the estimated variable being greater than y0 . There is simply no estimate for values less than y0 . For example, if Example 24D is changed so that Coverage A has a deductible of 250, then the estimates are C/4 Study Manual—17th edition Copyright ©2014 ASM

24. KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

398

H (t ) = b

H (t ) = b +

y1

s1 r1

y2

H (t ) = b +

s1 s2 + r1 r2

t s1 deaths out of r1 lives

s2 deaths out of r2 lives

Figure 24.3: Illustration of the Nelson-Åalen estimator of cumulative hazard function. The cumulative hazard function is initially b . After each event time, it is incremented by the proportion of deaths in the group.

for S11 ( x | X > 250) , and Pr11 (1200 < X < 1400 | X > 250)  4/27. It is not possible to estimate the unconditional survival function in this case. Note that the letter k is used to indicate the number of unique event times. There is a released exam question in which they expected you to know that that is the meaning of k.

?

Quiz 24-1 You are given the following information regarding six individuals in a study: dj

uj

xj

0 0 0 1 2 3

5 4 — 3 — 5

— — 3 — 4 —

Calculate the Kaplan-Meier product-limit estimate of S (4.5) . Now we will discuss another estimator for survival time.

24.2

Nelson-Åalen Estimator

The Nelson-Åalen estimator estimates the cumulative hazard function. The idea is simple. Suppose the cumulative hazard rate before time y1 is known to be b. If at that time s1 lives out of a risk set of r1 die, that means that the hazard at that time y1 is s 1 /r1 . Therefore the cumulative hazard function is increased by that amount, s j /r j , and becomes b + s1 /r1 . See Figure 24.3. The Nelson-Åalen estimator sets Hˆ (0)  0 and then at each time y j at which an event occurs, Hˆ ( y j )  Hˆ ( y j−1 ) + s j /r j . The formula is: Nelson-Åalen Estimator Hˆ ( t ) 

j−1 X si i1

ri

,

y j−1 ≤ t < y j

Example 24F In a mortality study on 98 lives, you are given that (i) 1 death occurs at time 5 (ii) 2 lives withdraw at time 5 (iii) 3 lives enter the study at time 5 (iv) 1 death occurs at time 8 Calculate the Nelson-Åalen estimate of H (8) . C/4 Study Manual—17th edition Copyright ©2014 ASM

(24.2)

24.2. NELSON-ÅALEN ESTIMATOR

399

Table 24.2: Summary of Formulas in this Lesson

Kaplan-Meier Product Limit Estimator Sˆ ( t ) 

j−1  Y i1

1−

si , ri



y j−1 ≤ t < y j

(24.1)

Nelson-Åalen Estimator Hˆ ( t ) 

j−1 X si i1

Exponential extrapolation

ri

y j−1 ≤ t < y j

,

Sˆ ( t )  Sˆ ( t0 ) t/t0

(24.2)

t ≥ t0

Answer: The table of risk sets and deaths is j

Time yj

Risk Set rj

Deaths sj

NA estimate Hˆ ( y j )

1

5

98

1

1 98

2

8

98

1

1 1 + 98 98

At time 5, the original 98 lives count, but we don’t remove the 2 withdrawals or count the 3 new entrants. At time 8, we have the original 98 lives minus 2 withdrawals minus 1 death at time 5 plus 3 new entrants, or 98 − 2 − 1 + 3  98 in the risk set. 1 1 1 Hˆ (8)  +  98 98 49



To estimate the survival function using Nelson-Åalen, exponentiate the Nelson-Åalen estimate; Sˆ ( x )  In the above example, the estimate would be Sˆ (8)  e −1/49  0.9798. This will always be higher than the Kaplan-Meier estimate, except when Hˆ ( x )  0 (and then both estimates of S will be 1). In the 97 2  0.9797. above example, the Kaplan-Meier estimate would be 98 Everything we said about extrapolating past the last time, or conditioning when there are no observations before a certain time, applies equally well to Sˆ ( t ) estimated using Nelson-Åalen. ˆ e −H ( x ) .

?

Quiz 24-2 In a mortality study on 10 lives, 2 individuals die at time 4 and 1 individual at time 6. The others survive to time 10. Using the Nelson-Åalen estimator, estimate the probability of survival to time 10.

C/4 Study Manual—17th edition Copyright ©2014 ASM

24. KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

400

Calculator Tip Usually it is easy enough to calculate the Kaplan-Meier product limit estimator by directly multiplying 1 − s j /r j . If you need to calculate several functions of s j and r j at once, such as both the Kaplan-Meier and the Nelson-Åalen estimator, it may be faster to enter s j /r j into a column of the TI-30XS/B Multiview’s data table, and the function ln (1 − L1) . The Kaplan-Meier estimator is a product, whereas the statistics registers only include sums, so it is necessary to log each factor, and then exponentiate the sum in the statistics register. Also, the sum is always of the entire column, so you must not have extraneous rows. If you need to calculate the estimator at two times, enter the rows needed for the earlier time, calculate the estimate, then add the additional rows for the second time. Example 24G Seven times of death were observed: 5

6

6

8

10

12

15

In addition, there was one censored observation apiece at times 6, 7, and 11. Calculate the absolute difference between the product-limit and Nelson-Åalen estimates of S (10) . Answer: Only times up to 10 are relevant; the rest should be omitted. The r i ’s and s i ’s are yi

ri

si

5 6 8 10

10 9 5 4

1 2 1 1

Here is the sequence of steps on the calculator: Clear table

data data 4

Enter s i /r i in column 1

Enter formula Kaplan-Meier in umn 2

for col-

Calculate statistics registers Clear display

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 ÷ 10

s% 2 ÷ 9 s% 1 ÷ 5 s% 1 ÷ 4 enter

t% data t% 1

ln 1- data 1 ) enter

2nd [stat]2 (Select L1 as first variable and L2 as enter

second)

s% s%

clear clear

L1

L2

L3

L1 L2 0.2222 0.2 0.25

L3

L1(1)=

L1(5)= L1 L2 L3 −0.105 0.1 0.2222 −0.251 −0.223 0.2 −0.288 0.25 L2(1)=−0.10536051.. 2-Var:L1,L2 1:n=4 ¯ 2:x=0.1930555556 3↓Sx=0.065322089

EXERCISES FOR LESSON 24

401

Calculator Tip Extract sum x (statistic 8) and sum y (statistic 10) from table Calculate difference of estimates

t%

2nd [ e x ] (−) 2nd [stat]38 − 2nd [ e x ] 2nd [stat]3 (Press 9 times to get to A)

s%

enter

2-Var:L1,L2 P 8↑ x=0.77222222 P 2 9: Px =0.1618827 A↓ y=−0.86750056 P P e− x − e y 0.041985293

The answer is 0.041985293 . Notice that the negative of the Nelson-Åalen estimator was exponentiated, but no negative sign is used for the sum of the logs of the factors of the product-limit estimator. 

Exercises 24.1. [160-F86:2] The results of using the product-limit data set are: 1.0,       49    ,   50   Sˆ ( x )   1,911   ,   2,000      36,309    ,  40,000

(Kaplan-Meier) estimator of S ( x ) for a certain 0≤x 100)  e −0.509524  0.6008 . and Pr

24.26. We now have 10 observations plus the censored observation of 100, so we calculate the cumulative hazard rate using risk sets of 11 at 74, 10 at 89, and 9 at 95. The risk sets at 102, 106, and 122 are the same as in the previous exercise, so we’ll add the sum computed there, 0.509524, to the sum of the quotients from the lowest three observations. 1 1 1 Hˆ (125)  + + + 0.509524  0.811544 11 10 9

D ( X > 125)  e −0.811544  0.4442 . and Pr 24.27. The Nelson-Åalen estimate of Hˆ (12) is 2 1 1 2 Hˆ (12)  + + +  0.65 15 12 10 6 Then Sˆ (12)  e −0.65  0.5220 . (B) 24.28. Since there is no censoring, we have yi 1 2 3 5

C/4 Study Manual—17th edition Copyright ©2014 ASM

ri 50 49 48 46

si 1 1 2 2

Hˆ ( y i ) 1/50  0.02 0.02 + 1/49  0.04041 0.04041 + 2/48  0.08207 0.08207 + 2/46  0.12555

EXERCISE SOLUTIONS FOR LESSON 24

417

24.29. To go from time 3.75 to time 4, since only one agent resigned in between, we multiply Sˆ (3.75) by r i −s i r i , where s i  1 for the one agent who resigned and r i is the risk set at the time that agent resigned. Since 11 agents were employed longer, the risk set is r i  11 + 1  12 (counting the agent who resigned and the 11 who were employed longer). If we let y i be the time of resignation, since nothing happens between y i and 4, ! 11  0.2292 Sˆ (4)  Sˆ ( y i )  0.25 12 The fact 2 agents were employed for 6 years is extraneous. 24.30. The product-limit estimator up to time 5, taking the 2 censored observations at 4 and 6 into account, is: yi

ri

si

Sˆ ( y i )

1 3 5

9 8 5

1 2 1

8/9 6/9 (6/9)(4/5)  24/45

D (3 ≤ T ≤ 5)  Sˆ (3− ) − Sˆ (5)  8 − 24  16  0.3556 Pr 9 45 45

(C)

24.31. You can calculate all five possibilities, but let’s reason it out. If the lapse occurred at time 5, 4 claims occurred; otherwise, only 3 claims occurred, so one would expect the answer to be 5 , (E). 24.32. Since there is no censoring (in every case, r i+1  r i − s i ), the products telescope, and the productlimit estimator becomes the empirical estimator. S T (4) 

(112 − 22) + (45 − 10)

200 + 100 45 − 10 S B (4)   0.35 100 ST (4) − S B (4)  0.067 (B) 1 24.33. We have n1 + n−1  mate the equation as

23 132



125  0.417 300

which is a quadratic, but since n must be an integer, it is easier to approxi2 23 ≈ n − 1/2 132 1 264 n− ≈  11.48 2 23

so n  12. Then

23 132

+

1 10

+

1 9

 0.3854 . (C)

6 24.34. Through time 5 there is no censoring, so Sˆ (5)  10 (6 survivors out of 10 original lives). Then   6 3 Sˆ (7)  10 5 (three survivors from 5 lives past 5), so Sˆ (7)  0.36. There are no further claims between 7 and 8, so the answer is 0.36 . (D)

24.35. Hˆ ( n )  − ln 1 − Fˆ ( n )  0.78



n X i i1 C/4 Study Manual—17th edition Copyright ©2014 ASM

100

 0.78



24. KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

418

n ( n + 1)  78 2 This quadratic can be solved directly, or by trial and error; approximate the equation with making n + 0.5 around 12.5, and we verify that 12 works. (E)

( n+0.5) 2 2

 78

24.36. The x i ’s are the events. d i ’s are entry times into the study, and u i ’s are withdrawal, or censoring, times. Every member of the study is counted in the risk set for times in the interval ( d i , u i ]. Before time 1.6, there are 2 event times, 0.9 and 1.5. (The other x i ’s are 1.7 and 2.1, which are past 1.6.) At time 0.9, the risk set consists of all entrants before 0.9, namely i  1 through 7, or 7 entries. There are no withdrawals or deaths before 0.9, so the risk set is 7. At time 1.5, the risk set consists of all entrants before 1.5, or i  1 through 8, minus deaths or withdrawals before time 1.5: the death at 0.9 and the withdrawal at 1.2, leaving 6 in the risk set. Note that entrants at time 1.5 are not counted in the risk set and withdrawals at time 1.5 are counted. The standard table with y j ’s, r j ’s, and s j ’s looks like this:

The Kaplan-Meier estimate is then

yj

rj

sj

Sˆ ( y j )

0.9 1.5

7 6

1 1

6/7 5/7

6 5 7 6



5 7

 0.7143 , or (E).

24.37. We must calculate n. Either you observe that the denominator 380 has divisors 19 and 20, or you estimate 2 39 ≈ n − 0.5 380

1 1 39 and you conclude that n  20, which you verify by calculating 20 + 19  380 . The Kaplan-Meier estimate is the empirical complete data estimate since no one is censored; after 9 deaths, the survival function is (20 − 9) /20  0.55 . (A)

Quiz Solutions 24-1. The risk set is 5 at time 3, since the entry at 3 doesn’t count. The risk set is 4 at time 4, after removing the third and fourth individuals, who left at time 3. The estimate of S (4.5) is (4/5)(3/4)  0.6 . 24-2.

The risk sets are 10 at time 4 and 8 at time 6. Therefore 2 1 Hˆ (10)  +  0.325 10 8 Sˆ (10)  e −0.325  0.7225

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 25

Estimation of Related Quantities Reading: Loss Models Fourth Edition 11,12.1–12.2 If the empirical distribution, whether based on complete or incomplete data, is used as the model, questions for the distribution are answered based on it. It is treated as a genuine, full-fledged distribution, even though it’s discrete. You can use the methods we learned in the first part of this course to calculate limited moments and similar items.

25.1

Moments

25.1.1

Complete individual data

When the distribution is estimated from complete individual data, the distribution mean is the same as the sample mean. Percentiles are not well-defined for a discrete distribution, so smoothing is desirable; we’ll discuss smoothed empirical percentiles in Lesson 31. The variance of the empirical distribution is σ2 

Pn

i1 ( x i

n

− x¯ ) 2

.

When using the empirical distribution as the model, do not divide by n − 1 when calculating the variance! This is not an unbiased estimator of some underlying variance—it is the variance, since the empirical distribution is the model. Thus, using the empirical distribution as the model results in a biased estimate of the variance.

25.1.2

Grouped data

When the distribution is estimated from grouped data, moments are calculated using the ogive or histogram. The mean can be calculated as the average of the averages—sum up the averages of the groups, weighted by the probability of being in the group. Higher moments, however, may require integration. The following example illustrates this. Example 25A You have the following data for losses on an automobile liability coverage: Loss Size

Number of Losses

0– 1000 1000– 2000 2000– 5000 5000–10,000

23 16 6 5

A policy limit of 8000 is imposed. Estimate the average claim payment and the variance of claim payments, taking the policy limit into account. Answer: Remember that the value of the histogram ( fˆ( x ) ) is the number of observations in the group divided by the total number of observations and by the length of the group’s interval. The histogram is constant in each interval. C/4 Study Manual—17th edition Copyright ©2014 ASM

419

25. ESTIMATION OF RELATED QUANTITIES

420

In this study, there are a total of 50 observations. We therefore calculate: Loss Size

Size of Interval

Number of Losses

0– 1000

1000

23

1000– 2000

1000

16

2000– 5000

3000

6

5000–10,000

5000

5

Histogram fˆ( x ) .  23 (50)(1000)  0.00046 .  16 (50)(1000)  0.00032 .  6 (50)(3000)  0.00004 .  5 (50)(5000)  0.00002

Figure 25.1 graphs the histogram.

fˆ( x ) 0.00050 0.00040 0.00030 0.00020 0.00010 s

1000

2000

3000

4000

5000

6000

7000

8000

9000

10,000

x

Figure 25.1: Histogram for Example 25A

The mean can be calculated using the shortcut mentioned before the example, namely as an average of averages. In each interval below the interval containing the policy limit, the average claim payment is the midpoint. However, we must split the interval (5000, 10,000) into two subintervals, (5000, 8000) and (8000, 10,000) , because of the policy limit. In the interval (5000, 8000) , the average claim payment is 6500, but in the interval (8000, 10,000) , the average claim payment is 8000. The weight given to each interval is the probability of being in the interval. For the first three intervals, the probabilities are the number of claims in the interval divided by the total number of claims in the study. For example, in the interval (2000, 5000) , the probability of a claim being in the interval is 6/50  0.12. The probability of a loss between 5000 and 8000 is determined from the ogive, F50 (8000) − F50 (5000) . However, the ogive is a line, so the probability of a loss between 5000 and 8000 is the probability of a loss in the loss size interval (5000, 10,000) , which is 0.1, times the proportion of the interval consisting of (5000, 8000) , which is (8000 − 5000) / (10,000 − 5000)  0.6. Thus the probability that a loss is between 5000 and 8000 is (0.1)(0.6)  0.06. By similar logic, the probability that a loss is in the interval (8000, 10,000) is (0.1)(0.4)  0.04. We therefore calculate the average claim payment as follows: C/4 Study Manual—17th edition Copyright ©2014 ASM

25.1. MOMENTS

421

Interval

Probability of loss in interval

Average claim payment in interval

0.46 0.32 0.12 0.06 0.04

500 1500 3500 6500 8000

0– 1000 1000– 2000 2000– 5000 5000– 8000 8000–10,000 Total

Product 230 480 420 390 320 1840

The average claim payment is 1840. You could also calculate this using integrals, the way we will calculate the second moment, but that is harder. To calculate the variance, we first calculate the second moment. This means integrating x 2 f50 ( x ) from 0 to 8000, as in this interval the payment is x, and integrating 80002 f50 ( x ) from 8000 to 10,000, as in this interval the payment is 8000. Since f50 is piecewise constant, we integrate it interval by interval, using formula (22.1) to calculate f50 on each interval. E ( X ∧ 8000) 2 

f

g

1000

Z 0

+ 

1 3



Z

0.00046x 2 dx + 8000

5000

Z

2000 1000

0.00002x 2 dx +

0.00032x 2 dx + 10,000

Z

8000

Z

5000 2000

0.00004x 2 dx

0.00002 (80002 ) dx

0.00046 (10003 − 03 ) + 0.00032 (20003 − 10003 ) + 0.00004 (50003 − 20003 )

+ 0.00002 (80003 − 50003 ) + 0.00002 (80002 )(10,000 − 8000)



 31 (460,000 + 2,240,000 + 4,680,000 + 7,740,000) + 2,560,000

 7,600,000

The variance is then 7,600,000 − 18402  4,214,400 . An alternative to integrating is to use equation (2.4) to calculate the second moment of the uniform distribution in each interval, and weight the results by the probabilities of losses in the interval. That formula says that for a uniform distribution on [d, u], the second moment is 13 ( d 2 + du + u 2 ) . In this example, the calculation would be: E[ ( X ∧ 8000) 2 ] 

  1 0.46 (10002 ) + 0.32 10002 + (1000)(2000) + 20002 3   + 0.12 20002 + (2000)(5000) + 50002 + 0.06 50002 + (5000)(8000) + 80002





+ 0.04 (80002 )

An alternative method for calculating variance is to use the conditional variance formula. This avoids integration, since the conditional distribution of claims within each interval is uniform, and the variance of a uniform is the range squared divided by 12. Let I be the variable for the interval. Then Var ( X ∧ 8000)  E[Var ( X ∧ 8000 | I ) ] + Var (E[X ∧ 8000 | I])

10002 10002 30002 30002 E , , , , 0 + Var (500, 1500, 3500, 6500, 8000) 12 12 12 12

"

#

where each variance is based on the length of the interval (1000, 1000, 3000, 3000, and 0 for the five intervals 0–1000, 1000–2000, 2000–5000, 5000–8000, and 8000 respectively) squared and divided by 12, and each expected value is the midpoint of each interval. C/4 Study Manual—17th edition Copyright ©2014 ASM

25. ESTIMATION OF RELATED QUANTITIES

422

The expected value of the five variances is weighted by the probabilities of the intervals, each of which is the number of losses divided by 50, so 10002 10002 30002 30002 23 (10002 /12) + 16 (10002 /12) + 6 (30002 /12) + 3 (30002 /12) E , , ,   200,000 12 12 12 12 50

"

#

The overall expected value is 1840, as computed above. The second moment of the five interval expected values is 23 (5002 ) + 16 (15002 ) + 6 (35002 ) + 3 (65002 ) + 2 (80002 )  7,400,000 50 so the variance of the expected values is 7,400,000 − 18402  4,014,400. The variance of claim payments is then 200,000 + 4,014,400  4,214,400 . 

?

Quiz 25-1 Use the same data as in Example 25A. A policy limit of 1000 is imposed. Estimate the variance of claim payments assuming uniform distribution of loss sizes within each interval.

25.1.3

Incomplete data

When data are incomplete due to censoring or truncation, the product limit estimator or the Nelson-Åalen estimator is used. You can then estimate S ( x ) . To obtain the expected values or limited expected values, P you can use the definition, namely xp x for the expected value where p x is Pr ( X  x ) as estimated and a discrete counterpart of equation (5.3) for limited moments, or you can use formulas (5.2) or (5.6): ∞

Z E[X] 

0 d

Z E[X ∧ d] 

0

S ( x ) dx

S ( x ) dx

Sˆ ( x ) , whether it is a Kaplan-Meier or a Nelson-Åalen estimate, is a step function, so integrating it can be done by summing up the areas under the horizontal lines in the graph of the function. In other words (but please don’t memorize this formula, just understand it): ∞

Z 0

Sˆ ( x ) dx 

∞ X j0

Sˆ ( y j )( y j+1 − y j )

where y0  0 and the other y j ’s are event times. Example 25B [160-F86:6] You are given:

C/4 Study Manual—17th edition Copyright ©2014 ASM

Age at

Individual

Entry

Withdrawal

Death

A B C D E F G H

0 0 0 0 5 10 15 20

6 27 – – – – 50 23

– – 42 42 60 24 – –

25.1. MOMENTS

423

Using the product-limit estimator of S ( x ) , determine the expected future lifetime at birth. (A) 46.0

(B) 46.5

(C) 47.0

(D) 47.5

(E) 48.0

Answer: Event times (deaths) are 24, 42, and 60. We have S ( t )  1 for t < 24 and: yj

rj

sj

24 42 60

6 4 1

1 2 1

S8 ( x ) , y j ≤ x < y j+1 5/6 5/12 0

We therefore have 5 1  6 6 5 5 5 Pr ( X  42)  −  6 12 12 5 Pr ( X  60)  12 Pr ( X  24)  1 −

and expected survival time is E[X] 

1 5 5 (24) + (42) + (60)  46.5 6 12 12

Alternatively, you could calculate the integral of the survival function. Figure 25.2 graphs S8 ( x ) . 1.0

S8 ( x )

0.8 0.6 0.4 0.2 q

x 10

20

30

50

40

60

Figure 25.2: Graph of estimated survival function for Example 25B

The expected value, the integral of S8 ( x ) , is the sum of the areas of the three rectangles under the graph of S8 ( x ) : E[X]  24 (1) + 18

C/4 Study Manual—17th edition Copyright ©2014 ASM

5 6

+ 18

5 12

 46.5

(B)



25. ESTIMATION OF RELATED QUANTITIES

424

?

Quiz 25-2 Using the data of Example 25B, determine the estimated variance of future lifetime at birth. Example 25C In a mortality study, 10 lives are under observation. One death apiece occurs at times 3, 4, and 7, and two deaths occur at time 11. Withdrawals occur at times 5 and 10. The study concludes at time 12. (This is the same data as in Example 24A.) You estimate the survival function using the product limit estimator, and extrapolate past time 12 using an exponential. Estimate expected survival time. Answer: We calculated the survival function in the answer to Example 24A on page 394. In Example 24E on page 397 we determined that the extrapolated function past time 12 is 0.4114t/12 at time t > 12. The graph of the estimated survival function is shown in Figure 25.3. The expected value of a nonnegative 1.0 0.8 0.6

A

0.4

B C

0.2 0 0

5

D

E

F

10

20

15

25

30

Figure 25.3: Graph of y  S10 ( x ) of examples 24E and 25C

random variable is equal to the integral of the survival function, or the shaded area under the graph of the survival function. This area consists of rectangles A, B, C, D, and E, plus area F. The areas of the rectangles are (base times height): Area(A)  (3 − 0)(1)  3

Area(B)  (4 − 3)(0.9)  0.9

Area(C)  (7 − 4)(0.8)  2.4

Area(D)  (11 − 7)(0.6857)  2.7429

Area(E)  (12 − 11)(0.4114)  0.4114

The area of F is the integral of 0.4114t/12 from 12 to ∞: Area(F) 

Z



12

0.4114t/12 dt





 ∞

12 0.4114t/12 ln 0.4114

 5.5583

12

An alternative method for evaluating the area of F is to use the fact that the distribution is exponential, and the integral we wish to calculate equals E[ ( X −12)+ ]. Since the survival function of the exponential (as seen in the integrand) is 0.4114t/12  e t ln 0.4114/12 , it follows that the mean of the exponential is −12/ ln (0.4114)  C/4 Study Manual—17th edition Copyright ©2014 ASM

25.2. RANGE PROBABILITIES

425

13.51063. Then the conditional mean of survival time, given that it is greater than 12, is equal to the mean of the exponential distribution, since an exponential has no memory. Thus, the conditional mean of survival time given that it is greater than 12 is 13.51063. Multiplying this mean by the probability of the variable being greater than 12, or 0.4114, we get that the area of F is 0.4114 (13.51063)  5.5583. Then the expected survival time is E[X]  3 + 0.9 + 2.4 + 2.7429 + 0.4114 + 5.5583  15.0126

25.2



Range probabilities

The probability that x is in the range ( a, b] is F ( b ) − F ( a ) . Since the empirical distribution is discrete, however, you must be careful to include or exclude the boundaries depending upon whether you’re interested in “less than” (or greater than) or “less than or equal” (or greater than or equal). The following two simple examples demonstrate this. Example 25D On an inland marine coverage, you experience loss sizes (in millions) of 2, 2, 3, 4, 6, 8, 9, 12. You use the empirical distribution based on this data as the model for loss size. Let X be the loss size. Determine Pr (3 ≤ X ≤ 6) . Answer: F (6)  0.625 and F (3− )  0.25 (there are only 2 claims below 3), so the answer is 0.625 − 0.25  0.375 . Another way to see this is to note that exactly three of the loss sizes are in the range [3, 6], and 3/8  0.375.  Example 25E For a dental coverage, amount of time to handle 8 claims (in days) was 15, 22, 27, 35, 40, 40, 45, 50. In addition, two claims were withdrawn at times 30 and 35 without being settled. Let X be the amount of time to handle a claim. Use the Nelson-Åalen estimator to model amount of time for handling claims. Determine Pr (27 < X < 45) . Answer: Pr ( X > 27)  S (27) , so we need S (27) . Pr ( X < 45)  1 − S (45− ) , so we need S (45− ) , which here is the same as S (40) . 1 1 1 Hˆ (27)  + +  0.33611 10 9 8 1 2 Hˆ (45− )  0.33611 + +  1.00278 6 4 Pr (27 < X < 45)  e −0.33611 − e −1.00278  0.34768

?



Quiz 25-3 In a study of survival time, the study begins with 10 lives. One death apiece occurs at times 4, 7, 9, 12, 20. One withdrawal apiece occurs at times 6 and 10. All other lives survive to time 50. Estimate the probability of death at exact time 7 using the product limit estimator.

25.3

Deductibles and limits

We use empirical limited expected values to calculate the average payment per loss and per payment in the presence of limits and deductibles. For a deductible of d and a maximum covered claim of u, the average payment per loss is E[X ∧ u] − E[X ∧ d] C/4 Study Manual—17th edition Copyright ©2014 ASM

25. ESTIMATION OF RELATED QUANTITIES

426

and the average payment per payment is E[X ∧ u] − E[X ∧ d] . 1 − F (d )

If the data are grouped, these expected values are computed from the ogive. If an entire group is included in the range between the deductible and the maximum covered claim, this is equivalent to placing all claims in the group at the midpoint. Calculating variance is more cumbersome. Example 25A showed how to handle policy limits. Example 25F You have the following data for losses on an automobile collision coverage: Claim Size

Number of Claims

0– 1000 1000– 2000 2000– 5000 5000–10,000

23 16 6 5

You are pricing another coverage that has a deductible of 500. Estimate the average payment per payment on this coverage. Answer: Split the first interval (0, 1000) into two subintervals, (0, 500] and (500, 1000) . The average payment in the first interval is 0 and in the second interval the average payment is 250. The probability that a loss will be in the second interval is 0.5 (23) /50  0.23. Similarly, the probabilities of the other three intervals are 16/50  0.32, 6/50  0.12, and 5/50  0.1. The average payment per loss in those intervals is the midpoint, minus 500. So the overall average payment per loss is E[ ( X − 500)+ ]  0.23 (250) + 0.32 (1000) + 0.12 (3000) + 0.1 (7000)  1437.5 Also, Pr ( X ≤ 500)  0.5 (23) /50  0.23, so the average payment per payment is 1437.5/ (1 − 0.23)  1866.88 . 

25.4

Inflation

To handle uniform inflation, multiply each observed loss by the inflation amount. Example 25G You have the following data for losses on an automobile liability coverage: Claim Size

Number of Claims

0– 1000 1000– 2000 2000– 5000 5000–10,000

23 16 6 5

You are pricing another coverage that has a policy limit of 10,000. It is expected that loss sizes will increase by 5% due to inflation. Estimate the average payment per loss before and after inflation. 1 Answer: Before inflation, the average claim size is 50 23 (500) + 16 (1500) + 6 (3500) + 5 (7500)  1880 . With inflation, each group is multiplied by 1.05, so the first group’s claims are 0–1050, the second group’s claims are 1050–2100, the third group’s claims are 2100–5250, and the fourth group’s claims are 5250– 10,500. In the fourth group, claims are capped at 10,000. We use the ogive to get the average claim there.



C/4 Study Manual—17th edition Copyright ©2014 ASM



EXERCISES FOR LESSON 25

427

500 of the claims in the range (5250, 10,500) will be Since a uniform distribution is implicit in the ogive, 5250 above 10,000. The remaining claims will have an average of the midpoint of the interval (5250, 10,000). So the average capped claim in this interval will be:

500 4750 5250 + 10,000 + (10,000)  7851.19. 5250 2 5250

!

The average claim after inflation is then 1 50



23 (525) + 16 (1575) + 6 (3675) + 5 (7851.19)  1971.62 .





Exercises 25.1.

A sample has the following observations: 2 observations of 400 7 observations of 800 1 observation of 1600

Calculate the coefficient of skewness for the empirical distribution.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

25. ESTIMATION OF RELATED QUANTITIES

428

Probability Density

25.2. [4B-S91:34] The following relative frequency histogram depicts the expected distribution of policyholder claims. 0.20 0.19 0.18 0.17 0.16 0.15 0.14 0.13 0.12 0.11 0.10 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0.00

0

2

6 8 10 12 Size of Policyholder Loss

4

14

16

You are given that: (i) The policyholder pays the first $1 of each claim. (ii) The insurer pays the next $9 of each claim. (iii) The reinsurer pays the remaining amount if the claim exceeds $10. Determine the average net claim size paid by the insurer. (A) (B) (C) (D) (E) 25.3.

Less than 3.8 At least 3.8, but less than 4.0 At least 4.0, but less than 4.2 At least 4.2, but less than 4.4 At least 4.4 You have the following data for 100 losses: Loss Size

Number of Losses

0– 1000 1000– 2000 2000– 5000 5000–10000

42 21 19 18 100

Assuming that payments are uniformly distributed within each interval, calculate the empirical limited expected value at 1800.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 25

25.4.

429

Claim sizes have the following distribution: Claim Size

Number of Claims

0–1000 1000–5000 5000– ∞

25 15 10

Let X be claim size. Assume a uniform distribution of claim sizes within each interval. Calculate E[X ∧ 3000], the limited expected value of payment at 3000. 25.5.

Claim sizes have the following distribution: Claim Size

Number of Claims

0– 500 500–1000 1000–2000

60 35 5

Let X be claim size. Assume a uniform distribution of claim sizes within each interval. Calculate E ( X ∧ 1000) 2 .

f

25.6.

g

Claim sizes have the following distribution: Claim Size

Number of Claims

0–1000 1000–2000 2000– ∞

10 7 3

Let X be claim size. Assume a uniform distribution of claim sizes within each interval. Calculate Var ( X ∧ 800) . 25.7. [4B-S96:22] (2 points) Forty (40) observed losses have been recorded in thousands of dollars and are grouped as follows: Interval ($000)

Number of Losses

(1,4/3) [4/3,2) [2,4) [4, ∞)

16 10 10 4

Let X be the size of loss random variable.

Estimate the average payment for a coverage having a limit of 2 (thousand). (A) (B) (C) (D) (E)

Less than 0.50 At least 0.50, but less than 1.0 At least 1.0, but less than 1.5 At least 1.5, but less than 2.0 At least 2.0

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

25. ESTIMATION OF RELATED QUANTITIES

430

25.8.

Eleven observed losses for a coverage with no deductible or limit have been recorded as follows: 200

210

220

300

410

460

520

610

900

2000

2700

Estimate the average payment per payment for a coverage with a 250 deductible and a 1000 maximum covered claim using the empirical distribution. 25.9.

Nine losses from a coverage with no deductible or limit have been recorded as follows: 2000

3500

4000

4500

6000

9600

10,000

10,000

16,000

Estimate the average payment per payment after 5% inflation if a maximum covered claim of 10,000 is imposed, using the empirical distribution. 25.10. Six losses have been recorded as follows: 500

1000

4500

6000

8000

10,000

Using the empirical distribution, estimate the variance of claim size. 25.11. A study is conducted on 5 year term policies. For 10 such policies, number of years until lapse is as follows: Years

Number of lapses

1 2 3 4 5

3 2 1 0 1

All lapses occur at the end of the year. In addition, there is one death apiece in the middle of years 2, 3, and 4. Using the product limit estimator, estimate average number of years until lapse on 5-year term policies. 25.12. On a disability coverage, you have data on the length of the disability period, in months, for 15 lives. For 11 lives, the amounts of time are 3, 4, 6, 6, 8, 10, 12, 24, 36, 60, 72. The remaining 4 lives are still on disability, and have been on disability for 6, 12, 18, and 24 months respectively. You are to use the product limit estimator to estimate the amount of time on disability. If a 24 month limit is imposed, what will the average time on disability be? 25.13. An auto liability coverage is sold with two policy limits, 25,000 and 50,000. Your data include nine payments on this coverage for the following amounts: 25,000 limit: 10,000, 20,000, 25,000, 25,000, 25,000 50,000 limit: 20,000, 30,000, 40,000, 50,000 You use the product limit estimator to estimate the loss distribution. Determine the estimate of expected payment per loss for coverage with a policy limit of 45,000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 25

431

25.14. [1999 C4 Sample:7] Summary statistics for a sample of 100 losses are: Interval

Number of Losses

Sum

Sum of Squares

(0, 2,000] (2,000, 4,000] (4,000, 8,000] (8,000, 15,000] (15,000, ∞]

39 22 17 12 10

38,065 63,816 96,447 137,595 331,831

52,170,078 194,241,387 572,753,313 1,628,670,023 17,906,839,238

100

667,754

20,354,674,039

Total

Determine the empirical limited expected value E[Xˆ ∧ 15,000]. 25.15. [4-F01:2] You are given: Claim Size ( X )

Number of Claims

(0, 25] (25, 50] (50, 100] (100, 200]

30 32 20 8

Assume a uniform distribution of claim sizes within each interval. Estimate the second raw moment of the claim size distribution. (A) (B) (C) (D) (E)

Less than 3300 At least 3300, but less than 3500 At least 3500, but less than 3700 At least 3700, but less than 3900 At least 3900

25.16. [4-F03:37] You are given: Claim Size ( X )

Number of Claims

(0, 25] (25, 50] (50, 100] (100, 200]

25 28 15 6

Assume a uniform distribution of claim sizes within each interval. Estimate E[X 2 ] − E[ ( X ∧ 150) 2 ].

(A) (B) (C) (D) (E)

Less than 200 At least 200, but less than 300 At least 300, but less than 400 At least 400, but less than 500 At least 500

Additional released exam questions: C-F06:3, C-S07:7

C/4 Study Manual—17th edition Copyright ©2014 ASM

25. ESTIMATION OF RELATED QUANTITIES

432

Solutions 25.1. The coefficient of skewness is scale-free, so let’s divide all the observations by 100 to make the arithmetic easier. 2 + (16−8) 2 The mean is 2 (4) +710(8) +16  8. The variance is 2 (4−8) 10  9.6. It would make no sense to divide by 9 instead of 10; you are not estimating the underlying variance (or even the underlying skewness). You are using the empirical distribution to calculate skewness! The third central moment is 25.2.

2 (4−8) 3 + (16−8) 3 10

 38.4. The coefficient of skewness is

38.4 9.61.5

 1.290994 .

The question asks for average payment per payment, although the language is somewhat unclear. Pr (claim)  1 − 0.1  0.9 3

Z

E[claim] 

1

0.1 ( x − 1) dx +

5

Z 3

0.2 ( x − 1) dx +

10

Z 5

0.03 ( x − 1) dx + 0.15 (9)

 0.2 (1) + 0.4 (3) + 0.15 (6.5) + 1.35  3.725 Average claim  3.725/0.9  4.14

(C)

Another way to calculate the expected payment per loss is to use the usual trick for uniform distributions: use double expectation and the fact that on any uniform interval the mean is the midpoint. The probability that the loss is in the interval (1,3) is 2 (0.1)  0.2. The probability that the loss is in the interval (3,5) is 2 (0.2)  0.4. The probability that the loss is in the interval (5,10) is 5 (0.03)  0.15. The probability that the loss is greater than 10 is 0.03 (5)  0.15. So the average payment per loss is E[claim]  0.2 (2 − 1) + 0.4 (4 − 1) + 0.15 (7.5 − 1) + 0.15 (10 − 1)  3.725 25.3. The average amount paid in the interval (0, 1000) is 500. Split the second interval into [1000, 1800] and (1800, 2000) . In [1000, 1800] the average payment is 1400. The payment is 1800 for losses of 1800 and above. The empirical probability of (0, 1000) is 0.42. The empirical probability of [1000, 1800] is 0.8 (0.21)  0.168. Therefore, E[X ∧ 1800] 

 1  42 (500) + 16.8 (1400) + 41.2 (1800)  1186.80 100

25.4. Split the interval [1000, 5000) into [1000, 3000] and (3000, 5000) . Average payment in the interval [1000, 3000] is 2000, and 0.5 (15)  7.5 claims out of 50 are expected in that interval. 25 claims are expected in (0, 1000) and the remaining 17.5claims are expected to be 3000 or greater. Therefore E[X ∧ 3000]  1 50 25 (500) + 7.5 (2000) + 17.5 (3000)  1600 . 25.5. See Example 25A on page 419 to see how this type of exercise is solved. This exercise is easier, because the limit (1000) is an endpoint of an interval. Each value of the histogram is calculated by dividing the number of observations in the interval by the length of the interval and the total number of observations (100). In the following equation, the common denominator 100 is pulled outside the brackets. E ( X ∧ 1000) 2 

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

1 60 (5003 ) 35 (10003 − 5003 ) 5 (10002 ) + +  304,166 23 (100)(3) 500 500 100

!

EXERCISE SOLUTIONS FOR LESSON 25

25.6.

433

See Example 25A on page 419 to see how this type of exercise is solved. Using the same technique, E[X ∧ 800]  E ( X ∧ 800)

f

But

Z

2

1 20





∞ 800



800

Z

g

8 (400) + 12 (800)  640

0

2

x f20 ( x ) dx +



Z

800

8002 f20 ( x ) dx

f20 ( x ) dx  1 − F20 (800) 

12 20

and by equation (22.1), f20 ( x ) 

10  0.0005 (20)(1000)

in the interval (0, 1000)

so E ( X ∧ 800) 2 

f

g

0.0005 (8003 ) 12 + (8002 )  469,333 13 3 20

Var ( X ∧ 800)  469333 31 − 6402  59,733 13 However, an easier method is available here, using conditional variance and the Bernoulli shortcut. The variance of X ∧ 800 is conditioned on I, the indicator variable for whether the loss is less than 800: Var ( X ∧ 800)  E[Var ( X ∧ 800 | I ) ] + Var (E[X ∧ 800 | I]) If the loss is greater than 800, then E[X ∧ 800]  800 and Var ( X ∧ 800)  0. If the loss is less than 800, then it is uniformly distributed on (0, 800) with mean 400 and variance 8002 /12, since in general the variance of a uniform random variable on (0, a ) is a 2 /12. The probability of X < 800 is 8/20  0.4. So E[Var ( X ∧ 800 | I ) ]  E[8002 /12, 0]  0.4 (8002 /12)  21,333 31 For the variance of the expectations, use the Bernoulli shortcut. Var (E[X ∧ 800 | I])  Var (400, 800)  (0.4)(0.6)(4002 )  38,400 Therefore, Var ( X ∧ 800)  21,333 31 + 38,400  59,733 13 . 25.7.

16

7 6

+ 10

5 3

40

+ 14 (2)



63 13 40



19 12

(D)

25.8. Sum up the eight losses higher than 250, capping the ones higher than 1000 at 1000, subtract 250 from each one (or subtract 250 at the end), and divide by 8: 50 + 160 + 210 + 270 + 360 + 650 + 2 (750)  400 8 25.9. After inflation, there will be four losses capped at 10,000 (since 1.05 (9600) > 10,000). The other losses are multiplied by 1.05: 1.05 (2000 + 3500 + 4000 + 4500 + 6000) + 4 (10,000)  6777.78 9

C/4 Study Manual—17th edition Copyright ©2014 ASM

25. ESTIMATION OF RELATED QUANTITIES

434

25.10. The usual shortcut of using the second moment minus the square of the mean is available, and we will use it. 500 + 1000 + 4500 + 6000 + 8000 + 10,000  5000 6 1 X 2 5002 + 10002 + 45002 + 60002 + 80002 + 10,0002  36,916,666.67 xi  6 6 Var ( X )  36,916,666.67 − 50002  11,916,666.67 x¯ 

25.11. We estimate S ( x ) . Since the deaths occur in the middle of the year and the lapses occur at the end of the year, the risk set for each year must exclude the deaths occurring in that year. yj

rj

sj

1 2 3 5

10 6 3 1

3 2 1 1

S10 ( x ) , y j ≤ x < y j+1 7

10 7  4 10 6  14 2 30

3

0



7 15 14 45

5

We now calculate E[X]  0 S ( x ) dx by summing up the areas of the rectangles under the graph of the estimated survival function.

R

5

Z E[X] 

0

14 7 + (5 − 3) S ( x ) dx  (1 − 0)(1) + (2 − 1)(0.7) + (3 − 2) 15 45

!

!

 1 + 0.7 + 0.4667 + 0.6222  2.7889 25.12. The varying censoring times in this problem are by no means far-fetched. If four lives began disability on January 1, 2000, July 1, 2000, January 1, 2001, and July 1, 2001, and were still on disability as of year-end 2001, and you were using year-end 2001 data, you would have exactly this pattern of censoring. Unlike the previous exercise, we are estimating the limited expected value at 24 here, E[X ∧ 24]. First we estimate S ( x ) . However, we have no need to estimate S ( x ) for x ≥ 24. yj 3 4 6 8 10 12

rj 15 14 13 10 9 8

sj 1 1 2 1 1 1

S15 ( x ) , y j ≤ x < y j+1 14 15 13 15 11 15

11 9  15 10  0.66  0.66 98  0.5867  0.5867 87  0.5133

24

We now calculate E[X ∧ 24]  0 S15 ( x ) dx by summing up the areas of the rectangles under the graph of the estimated survival function.

R

E[X ∧ 24] 

24

Z 0

S15 ( x ) dx

 (3 − 0)(1) + (4 − 3)

14 15

+ (6 − 4)

13 15

+ (8 − 6)

+ (12 − 10)(0.5867) + (24 − 12)(0.5133)

11 15

+ (10 − 8)(0.66)

 3 + 0.9333 + 1.7333 + 1.4667 + 1.32 + 1.1733 + 6.16  15.7867 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 25

435

25.13. The idea is to use all the data, not just the data for the 50,000 policy limit. Therefore, we treat the three claims for 25,000 as censored, and estimate the survival function S9 ( x ) : yj 10,000 20,000 30,000 40,000

rj 9 8 3 2

sj 1 2 1 1

S9 ( x ) , y j ≤ x < y j+1 6 9

8 9

 4 9 2 9

2 3

To calculate the limited expected value at 45,000, we integrate S9 ( x ) between 0 and 45,000. We sum up the rectangles under the graph: E[X ∧ 45,000] 

45,000

Z 0

S9 ( x ) dx

 (10,000 − 0)(1) + (20,000 − 10,000)

8 9

+ (30,000 − 20,000)

+ (40,000 − 30,000) 49 + (45,000 − 40,000) 1000 (90 + 80 + 60 + 40 + 10)  31,111.11  9



2 9

2 3

25.14. We sum up the losses bounded by 15,000, then divide by the number of losses (100). For the first four intervals, the sum of losses is given. For the interval (15,000, ∞) , since the losses are bounded by 15,000, the sum of the ten bounded losses is 10 (15,000)  150,000. The answer is 38,065 + 63,816 + 96,447 + 137,595 + 150,000  4859.23 100 Since the sum of the losses is given, a faster way to perform the calculation is to start with the sum, subtract the losses above 15,000, and then add 150,000: 667,754 − 331,831 + 150,000  4859.23 100 32 20 25.15. The histogram is (2530 )(90) in (0, 25], (25)(90) in (25, 50], (50)(90) in (50, 100], and In each interval, x 2 integrates to ( c 3i − c 3i−1 ) /3 for interval ( c i−1 , c i ) . So

8

(100)(90)

in (100, 200].

1 (30)(253 ) (32)(503 − 253 ) (20)(1003 − 503 ) (8)(2003 − 1003 ) E[X ]  + + + (3)(90) 25 25 50 100 1  (18,750 + 140,000 + 350,000 + 560,000)  3958 13 (E) 270 2

!

25.16. The computation is shortened by noticing that the summands for calculating the two expected values differ only in the interval (100, 200]. We’ll only calculate the summands in this interval, not the full expected value. For E[X 2 ], the summand in this interval is 1 6 (2003 − 1003 ) 420,000   1891.89, 3 (74) 100 222

!

whereas for E[ ( X ∧ 150) 2 ] the summands in this interval are 6 (74)(100)

Z

150

100

2

x dx +

Z

200

150

6 1503 − 1003 150 dx  + (1502 )(50)  1554.05 7400 3 2

!

The difference is 1891.89 − 1554.05  337.84 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

!

25. ESTIMATION OF RELATED QUANTITIES

436

Quiz Solutions 25-1. Conditional variance will be easy to use, since there are only two possibilities: loss size below 1000 and loss size greater than 1000. Let I be the condition of whether the loss is below or above 1000. Var ( X ∧ 1000)  E[Var ( X ∧ 1000 | I ) ] + Var (E[X ∧ 1000 | I])  E[10002 /12, 0] + Var (500, 1000)

The expected value of the variances is 0.46 (10002 /12)  38,333 31 . The variance of the expected values, by the Bernoulli shortcut, is (0.46)(0.54)(1000 − 500) 2  62,100. The overall variance is Var ( X ∧ 1000)  38,333 31 + 62,100  100,433 13 25-2. We calculated the probabilities of death at 24, 42, and 60, and the expected lifetime, which is 46.5. The second moment of lifetime is E[X 2 ] 

5 5 1 (242 ) + (422 ) + (602 )  2331 6 12 12

The variance of future lifetime is Var ( X )  2331 − 46.52  168.75 . 25-3. The risk set at 4 is 10 and the risk set at 7 is 8, so Sˆ (7− )  0.9 and Sˆ (7)  0.9 (7/8)  0.7875. It follows D ( X  7)  0.9 − 0.7875  0.1125 . that Pr

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 26

Variance of Kaplan-Meier and Nelson-Åalen Estimators Reading: Loss Models Fourth Edition 12.2 Exam questions from this lesson are frequent. The Kaplan-Meier estimator is an unbiased estimator of the survival function. Greenwood’s approximation of the variance is: X   sj L Sˆ ( t )  Sˆ ( t ) 2 (26.1) Var r ( r − sj) y ≤t j j j

You can remember the summand in this formula as the number who died divided by the product of population before and after the deaths. A useful fact to remember is that if there is complete data—no censoring or truncation—the Greenwood approximation is identical to the empirical approximation of the variance, equation (23.1). The latter is easier to calculate. In the following, we will use some notation from MLC and LC: t px

is the probability that someone age x survives for another t years. In other words, it is the conditional probability of survival to x + t, given survival to x.

t qx

is the complement of t p x . It is the probability that someone age x dies in the next t years. In other words, it is the conditional probability of death in the interval ( x, x + t] given survival to age x.

To calculate variances for conditional probabilities like t p x , treat the study as if it started at time x and use the Greenwood approximation. Example 26A In a mortality study on 82 lives, you are given: yj

rj

sj

1 2 3 4

82 78 74 75

2 1 1 2

is estimated using the product limit estimator. Estimate the variance of the estimate. 2 q2

Answer: Treat this as a study of 74 lives starting at duration 2. Then 73 2 pˆ 2  74

73  0.9602. 75   1 2 L (2 qˆ2 )  (0.96022 ) Var +  0.0005075 (74)(73) (75)(73)

!

!

Notice that Var (2 q 2 )  Var (2 p2 ) . In both cases, the formula uses 2 p22 as the first factor. Never use 2 q22 . C/4 Study Manual—17th edition Copyright ©2014 ASM

437

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

438

The approximate variance of the Nelson-Åalen estimator1 is:

L Hˆ ( t )  Var 



X sj y j ≤t

(26.2)

r 2j

This formula is a recursive formula:

L Hˆ ( y j )  Var L Hˆ ( y j−1 ) + Var 







sj r 2j

Example 26B In a mortality study on 82 lives, you are given: yj

rj

sj

1 2 3 4

82 78 74 75

2 1 1 2

H (2) is estimated using the Nelson-Åalen estimator. Estimate the variance of the estimate. Answer:

  L Hˆ (2)  2 + 1  0.0004618 Var 822 782

?



Quiz 26-1 You are given the following information regarding five individuals in a study: dj

uj

xj

0 0 1 2 4

2 — 5 — 5

— 3 — 4 —

Calculate the estimated variance of the Nelson-Åalen estimate of H (4) . Calculator Tip The TI-30XS/B Multiview calculator may be useful for calculating variances. It is probably not worthwhile using the data table for a small number of times or for the variance of Nelson-Åalen, but for the variance of the Kaplan-Meier estimate, for which you must compute both the Kaplan-Meier estimate itself as well as a sum of quotients, the data table is useful. Enter r i in column 1, s i in column 2, and then compute the KaplanMeier estimate, save it, and compute the Greenwood sum. Multiply the sum by the square of the product-limit estimate. Remember that for the Kaplan-Meier estimate you must log the factors and save the sum register; at the end exponentiate twice the saved sum. The solution to exercise 26.9 illustrates the use of the Multiview for the purpose of calculating the 1In the textbook used on the pre-2002 syllabus, this was called the Åalen estimator of the variance, and you will find that old exam questions from 2000 and 2001 call it that. C/4 Study Manual—17th edition Copyright ©2014 ASM

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

439

Calculator Tip Greenwood approximation of the variance. The usual symmetric normal confidence intervals for S n ( t ) and Hˆ ( t ) , which we’ll call linear confidence intervals, can be constructed by adding and subtracting to the estimator the standard deviation times a coefficient from the standard normal distribution based on the confidence level, like 1.96 for 95%. If z p is the p th th quantile of a standard normal distribution, then the linear confidence interval for S ( t ) is



q

q

L S n ( t ) , S n ( t ) + z (1+p )/2 Var L Sn ( t ) S n ( t ) − z (1+p )/2 Var 







However, this confidence interval for S n ( t ) frequently includes points higher than 1, and correspondingly can include points below 0 for Hˆ ( t ) , which would imply that the other bound should be adjusted to truly include an interval with probability p. To avoid this  problem, an alternative formula may be used. Confidence intervals are constructed for ln − ln S n ( t ) and ln Hˆ ( t ) using the delta method, which will be discussed in Section 34.2, and then these confidence intervals are exponentiated back into confidence intervals for S n ( t ) and Hˆ ( t ) . The resulting interval for S n ( t ) is



S n ( t ) 1/U , S n ( t ) U







where z

(26.3)

(1+p ) /2 + U  exp * S n ( t ) ln S n ( t ) ,  L Sn ( t ) vˆ  Var

and the resulting interval for Hˆ ( t ) is

where

Hˆ ( t ) ˆ , H (t )U U

!

(26.4)

L ˆ * z (1+p )/2 Var H ( t ) +/ U  exp .. / Hˆ ( t ) q





-

,

You’re probably best off memorizing these two equations rather than rederiving them using the delta method every time you need them. These alternative confidence intervals are called log-transformed confidence intervals. The usual confidence intervals may be called “linear confidence intervals” to distinguish them from the log-transformed confidence intervals. Example 26C In a mortality study on 82 lives, you are given: yj

rj

sj

1 2 3 4

82 78 74 75

2 1 1 2

H (2) is estimated using the Nelson-Åalen estimator. Construct a 90% linear confidence interval and a 90% log-transformed confidence interval for H (2) . C/4 Study Manual—17th edition Copyright ©2014 ASM

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

440

Answer: In Example 26B, we calculated the approximate variance as 0.0004618. The estimate of H (2) is 2 1 Hˆ (2)  +  0.03721 82 78 √ A 90% linear confidence interval for H (2) is 0.03721 ± 1.645 0.0004618  (0.00186, 0.07256) . A 90% log-transformed confidence interval is √ ! 1.645 0.0004618 U  exp  e 0.9500  2.5857 0.03721     H 0.03721 , HU  , 0.03721 (2.5857)  (0.01439, 0.09622) U 2.5857 Note that the log transformation tends to move the confidence interval for H to the right.



Example 26D The linear 95% confidence interval for S ( t0 ) is given by (0.44, 0.56). Construct the log-transformed 95% confidence interval for S ( t0 ) . √ Answer: The mean for the linear confidence interval, S n ( t ) , is the midpoint, 0.50. Then z 0.975 vˆ is half the length of the linear confidence interval, or 0.06. By formula (26.3), √ ! ! 0.06 z0.975 vˆ  exp  0.8410 U  exp S n ( t ) ln S n ( t ) 0.5 ln 0.5 and

the

log-transformed

confidence

interval



S n ( t ) 1/U , S n ( t ) U



is

(0.51/0.8410 , 0.50.8410 )



(0.4386, 0.5582) .

Note that the log transformation tends to move the confidence interval for S to the left.



Exercises Greenwood formula 26.1. In a mortality study, there are initially 15 lives. Deaths occur at times 1, 3, 5, and 8. Individuals withdraw from the study at times 2 and 6. The remaining 9 lives survive to time 10. Calculate the estimated variance of the product limit estimator of S (10) using Greenwood’s formula. 26.2.

[160-F86:11] In a 3 year mortality study, we have the following data: yj 1 2 3

rj 100 200 200

sj 10 20 20

S (3) is estimated with the Kaplan-Meier estimator. Using Greenwood’s formula, estimate the variance of the estimate. (A) 0.0012

(B) 0.0015

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.0017

(D) 0.0020

(E) 0.0022

Exercises continue on the next page . . .

EXERCISES FOR LESSON 26

441

Table 26.1: Summary of Variance Formulas

Greenwood formula for variance of Kaplan-Meier estimator

L Sn ( t )  Sn ( t ) 2 Var 



X

sj

y j ≤t

r j (r j − s j )

(26.1)

Formula for variance of Nelson-Åalen estimator: j   X si ˆ L Var H ( y j )  2

(26.2)

ri

i1

Log-transformed confidence interval for S ( t ) :



S n ( t ) 1/U , S n ( t ) U

where

q

(26.3)



L *. z (1+p )/2 Var S n ( t ) +/ U  exp . / S n ( t ) ln S n ( t ) 



-

, Log-transformed confidence interval for H ( t ) : Hˆ ( t ) ˆ , H (t )U U where

26.3.

!

(26.4)

  L ˆ *. z (1+p )/2 Var H ( t ) +/ U  exp . / Hˆ ( t ) , q

[160-F87:6] In a 3 year mortality study, we have the following data: yj

rj

sj

1 2 3

100 120 110

10 12 15

S (3) is estimated with the Kaplan-Meier estimator. Using Greenwood’s formula, estimate the standard deviation of the estimate. (A) 0.0412

(B) 0.0415

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.0420

(D) 0.0425

(E) 0.0432

Exercises continue on the next page . . .

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

442

26.4.

[160-S88:16] A mortality study is conducted on 15 individuals of age x. You are given:

(i)

In addition to the 15 individuals of age x, one individual apiece enters the study at ages x + 0.4 and x + 0.7. (ii) One individual apiece leaves the study alive at ages x + 0.2 and x + 0.6. (iii) One death apiece occurs at ages x + 0.1, x + 0.3, x + 0.5, and x + 0.8. Using Greenwood’s formula, calculate the estimated variance of the product limit estimate of q x . (A) 0.01337 26.5. (i) (ii) (iii)

(B) 0.01344

(C) 0.01350

(D) 0.01357

(E) 0.01363

[160-S90:9] From a two-year mortality study of 1000 lives beginning at exact age 40, you are given: Observed deaths are distributed uniformly over the interval (40, 42] Greenwood’s approximation of Var S (2)  0.00016. qˆ40 < 0.2.





Calculate the observed mortality rate qˆ40 . (A) 0.096 26.6.

(B) 0.097

(C) 0.098

(D) 0.099

(E) 0.100

[160-S90:14] A two year mortality study is conducted on 10 individuals of age x. You are given:

(i)

In addition to the 10 individuals of age x, one individual apiece enters the study at ages x + 0.8 and x + 1.0. (ii) One individual leaves the study alive at age x + 1.5. (iii) One death apiece occurs at ages x + 0.2, x + 0.5, x + 1.3, and x + 1.7. Using Greenwood’s formula, calculate the estimated variance of the product limit estimate of 2 q x . 26.7.

[160-F90:13] In a 3 year mortality study, we have the following data: yj

rj

sj

1 2 3

1000 1400 2000

20 14 10

S (3) is estimated with the Kaplan-Meier estimator. Using Greenwood’s formula, estimate the variance of the estimate. (A) 0.000028 26.8.

(B) 0.000029

(C) 0.000030

(D) 0.000031

(E) 0.000032

[160-81-96:8] For a study of 1000 lives over three years, you are given:

(i) There are no new entrants or withdrawals. (ii) Deaths occur at the end of the year of death. (iii) (iv)

r j ( r j − s j )  1.11 × 10−4 for y j  1, 2. The expected value of Sˆ (3) is 0.746. sj

.



Calculate the conditional variance of Sˆ (3) using Greenwood’s approximation. (A) 1.83 × 10−4

(B) 1.85 × 10−4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1.87 × 10−4

(D) 1.89 × 10−4

(E) 1.91 × 10−4

Exercises continue on the next page . . .

EXERCISES FOR LESSON 26

26.9.

443

[160-82-96:8] In a five year mortality study, you are given: yj

sj

rj

1 2 3 4 5

3 24 5 6 3

15 80 25 60 10

Calculate Greenwood’s approximation of the conditional variance of the product limit estimator of S (4) . (A) 0.0055

(B) 0.0056

(C) 0.0058

(D) 0.0061

(E) 0.0063

26.10. [4-S00:38] A mortality study is conducted on 50 lives observed from time zero. You are given: (i)

Time t

Number of Deaths dt

Number Censored ct

15 17 25 30 32 40

2 0 4 0 8 2

0 3 0 c 30 0 0

Sˆ (35) is the Product-Limit estimate of S (35) .   L Sˆ (35) is the estimate of the variance of Sˆ (35) using Greenwood’s formula. (iii) Var (ii)

L Sˆ (35) Var 

(iv)



Sˆ (35)



 2  0.011467

Determine c 30 , the number censored at time 30. (A) 3

(B) 6

(C) 7

(D) 8

(E) 11

Variance of Nelson-Åalen estimator Use the following information for questions 26.11 and 26.12: 7+ ,

92 lives are under observation in a mortality study. The first seven observation times are 2, 2, 3, 6+ , 8, 8, where a plus sign indicates a censored observation.

26.11. Calculate the estimated variance of the product limit estimator S92 (8) using Greenwood’s formula. 26.12. Calculate the estimated variance of the Nelson-Åalen estimator of the cumulative hazard rate, Hˆ (8) . 26.13. In a mortality study, 2 deaths occur at time 3 and 3 at time 5. No other deaths occur before time 5. The estimated variance of the Nelson-Åalen estimator of H (3) is 0.0003125, and the estimated variance of the Nelson-Åalen estimator of H (5) is 0.0008912. Determine the number of withdrawals between times 3 and 5. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

444

26.14. In a mortality study, cumulative hazard rates are calculated using the Nelson-Åalen estimator. You are given: Hˆ ( y2 )  0.16000

L Hˆ ( y2 )  0.0037000 Var

Hˆ ( y3 )  0.31625

L Hˆ ( y3 )  0.0085828 Var

 

 

No withdrawals occur between times y2 and y3 . Determine the number of deaths at time y3 . 26.15. [4-F00:20] Fifteen cancer patients were observed from the time of diagnosis until the earlier of death or 36 months from diagnosis. Deaths occurred during the study as follows: Time In Months Since Diagnosis

Number Of Deaths

15 20 24 30 34 36

2 3 2 d 2 1

The Nelson-Åalen estimate Hˆ (35) is 1.5641.

Calculate the Åalen estimate of the variance of Hˆ (35) .

(A) (B) (C) (D) (E)

Less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20, but less than 0.25 At least 0.25

26.16. [4-S01:14] For a mortality study with right-censored data, you are given: yi

si

ri

1 8 17 25

15 20 13 31

100 65 40 31

Calculate the Åalen estimate of the standard deviation of the Nelson-Åalen estimator of the cumulative hazard function at time 20. (A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 26

445

Linear confidence intervals 26.17. In a mortality study on 12 lives, 2 die at time 3. The Kaplan-Meier estimator is used to estimate S (3) . Determine the upper bound for the linear 95% confidence interval for S (3) . 26.18. In a mortality study on 100 lives, 2 die at time 2 and 1 at time 5. The product-limit estimator is used to estimate S (5) . Determine the width of the linear 90% confidence interval for S (5) . 26.19. [4-S00:19] For a mortality study with right-censored data, the cumulative hazard rate is estimated using the Nelson-Åalen estimator. You are given: (i) No deaths occur between times t j and t j+1 . (ii) A 95% linear confidence interval for H ( t j ) is (0.07125, 0.22875) . (iii) A 95% linear confidence interval for H ( t j+1 ) is (0.15607, 0.38635) . Calculate the number of deaths observed at time t j+1 . (A) 4

(B) 5

(C) 6

(D) 7

(E) 8

26.20. [C-S05:15] Twelve policyholders were monitored from the starting date of the policy to the time of first claim. The observed data are as follows: Time of First Claim Number of Claims

1 2

2 1

3 2

4 2

5 1

6 2

7 2

Using the Nelson-Åalen estimator, calculate the 95% linear confidence interval for the cumulative hazard rate function H (4.5) . (A) (0.189, 1.361)

(B) (0.206, 1.545)

(C) (0.248, 1.402)

(D) (0.283, 1.266)

(E) (0.314, 1.437)

Log-transformed confidence intervals

L S n (365)  0.0019. 26.21. In a mortality study, S n (365)  0.76 and Var 



Determine the lower bound of the log-transformed 95% confidence interval for S (365) .

L S n (23)  0.022. 26.22. In a mortality study, S n (23)  0.55 and Var 



Determine the width of the log-transformed 90% confidence interval for S (23) . 26.23. The log-transformed 95% confidence interval for S ( y j ) is given by (0.400, 0.556). Construct the linear 95% confidence interval for S ( y j ) . 26.24. The log-transformed 90% confidence interval for S ( y j ) is given by (0.81, 0.88). Determine the width of the log-transformed 95% confidence interval.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

446

26.25. [4-F02:8] For a survival study, you are given: (i) The Product-Limit estimator Sˆ ( t0 ) is used to construct confidence intervals for S ( t0 ) . (ii) The 95% log-transformed confidence interval for S ( t0 ) is (0.695, 0.843) . Determine Sˆ ( t0 ) . (A) 0.758

(B) 0.762

(C) 0.765

(D) 0.769

(E) 0.779

26.26. In a study on 100 lives, 2 die at time 2 and 3 at time 8. Determine the lower bound of the log-transformed 99% confidence interval for H (8) . 26.27. The linear 95% confidence interval for H (100) is given by (0.8, 1.0) . Determine the width of the log-transformed 95% confidence interval for H (100) . 26.28. The log-transformed 95% confidence interval for H (80) is given by (0.4, 0.625). Determine the upper bound of the linear 90% confidence interval for H (80) . 26.29. In a mortality study, one death apiece occurred at times y1 and y2 . No other deaths occurred before time y2 . The 95% log-transformed confidence interval for the cumulative hazard rate H ( y2 ) calculated using the Nelson-Åalen estimator is (0.07837, 1.3477). There were no late entrants to the study. Determine the size of the risk set at time y2 . 26.30. [4-F01:37] A survival study gave (1.63, 2.55) as the 95% linear confidence interval for the cumulative hazard function H ( t0 ) . Calculate the 95% log-transformed confidence interval for H ( t0 ) . (A) (0.49, 0.94)

(B) (0.84, 3.34)

(C) (1.58, 2.60)

(D) (1.68, 2.50)

(E) (1.68, 2.60)

26.31. [4-F04:12] The interval (0.357, 0.700) is a 95% log-transformed confidence interval for the cumulative hazard rate function at time t, where the cumulative hazard rate function is estimated using the Nelson-Åalen estimator. Determine the value of the Nelson-Åalen estimate of S ( t ) . (A) 0.50

(B) 0.53

(C) 0.56

(D) 0.59

(E) 0.61

Use the following information for questions 26.32 and 26.33: For a survival study with censored and truncated data, you are given: Time(t) 1 2 3 4 5

Number at Risk at Time t 30 27 32 25 20

Failures at Time t 5 9 6 5 4

26.32. [4-F03:21] The probability of failing at or before Time 4, given survival past Time 1, is 3 q 1 . Calculate Greenwood’s approximation of the variance of 3 qˆ 1 . (A) 0.0067

(B) 0.0073

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.0080

(D) 0.0091

(E) 0.0105 Exercises continue on the next page . . .

EXERCISES FOR LESSON 26

447

26.32–33. (Repeated for convenience) Use the following information for questions 26.32 and 26.33: For a survival study with censored and truncated data, you are given: Time(t) 1 2 3 4 5

Number at Risk at Time t 30 27 32 25 20

Failures at Time t 5 9 6 5 4

26.33. [4-F03:22] Calculate the 95% log-transformed confidence interval for H (3) , based on the NelsonÅalen estimate. (A) (0.30, 0.89)

(B) (0.31, 1.54)

(C) (0.39, 0.99)

(D) (0.44, 1.07)

(E) (0.56, 0.79)

Additional released exam questions: C-F05:17, C-F06:7, C-S07:12,33

Solutions 26.1.

The table of risk sets and deaths is

14 S15 (10)  15

26.2.

!

11 12

rj

sj

1 3 5 8

15 13 12 10

1 1 1 1

9  0.7108 10     1 1 1 1 L Sˆ15 (10)  (0.7108) 2 Var + + +  0.0151 (15)(14) (13)(12) (12)(11) (10)(9)

!

12 13

yj

!

!

Using equation (26.1), S n (3)  (0.9)(0.9)(0.9)  0.729    10 20 20 2 L Var S n (3)  0.729 + +  0.00118098 (100)(90) (200)(180) (200)(180)



26.3.

Using equation (26.1), 95 S n (3)  (0.9)(0.9)  0.6995 110     10 12 15 L S n (3)  0.69952  0.001699 Var + + (100)(90) (120)(108) (110)(95) √ 0.1699  0.04122 (A)

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

26. VARIANCE OF KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS

448

26.4.

The table of risk sets and deaths is: yj

rj

sj

0.1 0.3 0.5 0.8

15 13 13 12

1 1 1 1

Using equation (26.1),

L ( qˆ x )  *. 14 Var 15

!

12 13

!

12 13

!

11 + / 12

,

!

2



1

(14)(15)

+

1

(12)(13)

+

1

(12)(13)

+

1



(11)(12)

-

 0.01337

(A)

26.5. We suspect from condition (iii) that this is going to be a quadratic. Let d be the number of deaths in each year. (It is uniform.) Since there is no truncation or censoring, Greenwood's approximation is the empirical variance, so

\[ \frac{\left(\frac{2d}{1000}\right)\left(\frac{1000-2d}{1000}\right)}{1000} = 0.00016 \]

Multiplying out and solving:

\[ 2d(1000 - 2d) = 160{,}000 \]
\[ 0 = 4d^2 - 2000d + 160{,}000 \]
\[ d = \frac{2000 \pm \sqrt{2000^2 - 16(160{,}000)}}{8} = \frac{2000 \pm 1200}{8} = 400 \text{ or } 100 \]

Only 100 satisfies \(\hat{q}_{40} < 0.2\), so \(\hat{q}_{40} = 100/1000 = \mathbf{0.1}\). (E)

26.6.

Develop the following table of risk sets and deaths:

    y_j    r_j    s_j
    0.2    10      1
    0.5     9      1
    1.3    10      1
    1.7     8      1

\[ S_n(2) = (0.8)(0.9)(0.875) = 0.63 \]
\[ \widehat{\operatorname{Var}}\left(S_n(2)\right) = 0.63^2\left(\frac{1}{(10)(9)} + \frac{1}{(9)(8)} + \frac{1}{(10)(9)} + \frac{1}{(8)(7)}\right) = \mathbf{0.02142} \]

26.7.

\[ S_n(3) = (0.98)(0.99)(0.995) = 0.965349 \]
\[ \widehat{\operatorname{Var}}\left(S_n(3)\right) = 0.965349^2\left(\frac{20}{(1000)(980)} + \frac{14}{(1400)(1386)} + \frac{10}{(2000)(1990)}\right) = \mathbf{0.000028} \quad \textbf{(A)} \]

26.8. Apparently, the exam setters wanted to coax students into wasting their time backing out \(s_3\big/\bigl(r_3(r_3 - s_3)\bigr)\) and then using Greenwood's formula. But you're too smart for that: you know that when you have complete data, Greenwood's formula reduces to the empirical variance, and you don't need any of the individual \(s_j\big/\bigl(r_j(r_j - s_j)\bigr)\)'s. (Believe it or not, the official solution did not use this shortcut, but actually went to the trouble of backing out \(s_3\big/\bigl(r_3(r_3 - s_3)\bigr)\)!)

\[ \widehat{\operatorname{Var}}\left(S_{1000}(3)\right) = \frac{(0.746)(1 - 0.746)}{1000} = \mathbf{0.000189484} \quad \textbf{(D)} \]

26.9. They tried to confuse you a little by giving you five years of data but then asking for \(\widehat{\operatorname{Var}}\left(S_n(4)\right)\). (Did you get answer D? Shame on you!)

\[ S_n(4) = (0.8)(0.7)(0.8)(0.9) = 0.4032 \]
\[ \widehat{\operatorname{Var}}\left(S_n(4)\right) = 0.4032^2\left(\frac{3}{(15)(12)} + \frac{24}{(80)(56)} + \frac{5}{(25)(20)} + \frac{6}{(60)(54)}\right) = \mathbf{0.005507} \]

(A)

Calculator Tip

Here's how the calculation could be done on a Multiview calculator. Clear the data table, enter the \(r_j\) values (15, 80, 25, 60) in column L1, and enter the \(s_j\) values (3, 24, 5, 6) in column L2. Enter the Kaplan-Meier formula \(\ln(1 - \mathrm{L2}/\mathrm{L1})\) in column L3; its entries are −0.223, −0.357, −0.223, −0.105. Calculate the sum statistic \(\sum x = -0.908322562\), which is \(\ln S_n(4)\), and store it in x.

Then enter the formula for the Greenwood sum, \(\mathrm{L2}\big/\bigl(\mathrm{L1}(\mathrm{L1} - \mathrm{L2})\bigr)\), in column L3; its entries are 0.0167, 0.0054, 0.01, 0.0019. Get the sum statistic and finish up the calculation by evaluating \(\sum x \cdot e^{2x} = 0.005507174\).
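The same computation can also be scripted. Below is a minimal Python sketch (my own illustration, not part of the original manual; the function name greenwood_variance is my own choice) that reproduces the Kaplan-Meier estimate and Greenwood variance for the data of exercise 26.9.

```python
def greenwood_variance(risks, deaths):
    """Kaplan-Meier estimate and Greenwood approximation of its variance.

    risks  -- risk set r_j at each death time
    deaths -- number of deaths s_j at each death time
    """
    s = 1.0          # running product for the Kaplan-Meier estimate
    gw_sum = 0.0     # running sum of s_j / (r_j (r_j - s_j))
    for r, d in zip(risks, deaths):
        s *= (r - d) / r
        gw_sum += d / (r * (r - d))
    return s, s * s * gw_sum

# Data from exercise 26.9
s_hat, var_hat = greenwood_variance([15, 80, 25, 60], [3, 24, 5, 6])
print(round(s_hat, 4), round(var_hat, 6))  # 0.4032 0.005507
```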

26.10. The variance divided by \(\hat{S}^2\) is

\[ 0.011467 = \frac{2}{(50)(48)} + \frac{4}{(45)(41)} + \frac{8}{(41 - c_{30})(33 - c_{30})} = 0.0008333 + 0.002168 + \frac{8}{(41 - c_{30})(33 - c_{30})} \]
\[ \frac{8}{(41 - c_{30})(33 - c_{30})} = 0.008466 \]
\[ (41 - c_{30})(33 - c_{30}) = 945 \]

We can solve the quadratic, or we can note that 945 is about \(31^2\), and the two factors differ by 8, so by making the factors 35 and 27, we can verify that they multiply out to 945. Then \(c_{30} = 41 - 35 = \mathbf{6}\). (B)

26.11. Use formula (26.1).

\[ S_{92}(8) = \left(\frac{89}{92}\right)\left(\frac{85}{87}\right) = 0.9452 \]
\[ \widehat{\operatorname{Var}}\left(S_{92}(8)\right) = (0.9452^2)\left(\frac{2}{(92)(90)} + \frac{1}{(90)(89)} + \frac{2}{(87)(85)}\right) = \mathbf{0.0005689} \]

26.12. Use formula (26.2).

\[ \widehat{\operatorname{Var}}\left(\hat{H}(8)\right) = \frac{2}{92^2} + \frac{1}{90^2} + \frac{2}{87^2} = \mathbf{0.0006240} \]

26.13.

\[ \frac{2}{r_1^2} = 0.0003125 \implies r_1 = 80 \]
\[ \frac{3}{r_2^2} = 0.0008912 - 0.0003125 = 0.0005787 \implies r_2 = 72 \]

There were 80 − 72 − 2 = 6 withdrawals.

26.14. As usual, \(s_3\) will denote the number of deaths at time \(y_3\) and \(r_3\) will denote the risk set at time \(y_3\).

\[ \frac{s_3}{r_3} = 0.31625 - 0.16000 = 0.15625 \]
\[ \frac{s_3}{r_3^2} = 0.0085828 - 0.0037000 = 0.0048828 \]

Dividing,

\[ r_3 = \frac{0.15625}{0.0048828} = 32 \qquad s_3 = 32(0.15625) = \mathbf{5} \]

26.15. To back out d, we write

\[ \frac{2}{15} + \frac{3}{13} + \frac{2}{10} + \frac{d}{8} + \frac{2}{8-d} = 1.5641 \]
\[ 0.5641 + \frac{d}{8} + \frac{2}{8-d} = 1.5641 \]
\[ \frac{d}{8} + \frac{2}{8-d} = 1 \]

It's easiest to solve this by plugging in values of d, but if you wish to solve the quadratic:

\[ d(8-d) + 16 = 8(8-d) \]
\[ d^2 - 16d + 48 = 0 \]
\[ d = 4, 12 \]

We reject 12 as being larger than the population, so d = 4. Now we calculate the variance as

\[ \frac{2}{15^2} + \frac{3}{13^2} + \frac{2}{10^2} + \frac{4}{8^2} + \frac{2}{4^2} = \mathbf{0.2341} \quad \textbf{(D)} \]
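The solution suggests plugging in values of d; a tiny Python sketch (mine, not the manual's) does exactly that under the same setup.

```python
# Trial-and-error search for d in exercise 26.15: we need d/8 + 2/(8-d) = 1
# with d an integer, 0 < d < 8.
for d in range(1, 8):
    if abs(d / 8 + 2 / (8 - d) - 1) < 1e-9:
        print(d)  # prints 4
```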

26.16. Straightforward; the only trick is that you must ignore time 25 and must take the square root at the end.

\[ \widehat{\operatorname{Var}}\left(\hat{H}(20)\right) = \frac{15}{100^2} + \frac{20}{65^2} + \frac{13}{40^2} = 0.01439 \]
\[ \sqrt{0.01439} = \mathbf{0.1198} \quad \textbf{(C)} \]

26.17.

\[ S_{12}(3) = \frac{10}{12} = 0.8333 \]
\[ \widehat{\operatorname{Var}}\left(S_{12}(3)\right) = \left(\frac{5}{6}\right)^2 \frac{2}{(12)(10)} = 0.01157 \]

The upper bound of the confidence interval is \(0.8333 + 1.96\sqrt{0.01157} > 1\), so it is \(\mathbf{1}\).

26.18.

\[ S_{100}(5) = \frac{97}{100} = 0.97 \]
\[ \widehat{\operatorname{Var}}\left(S_{100}(5)\right) = 0.97^2\left(\frac{2}{(100)(98)} + \frac{1}{(98)(97)}\right) = 0.000291 \]

The width of the confidence interval is \(2(1.645)\sqrt{0.000291} = \mathbf{0.05612}\).


26.19. The midpoints of the intervals are the estimates for \(H(t_j)\) and \(H(t_{j+1})\); they are \(\frac{0.07125 + 0.22875}{2} = 0.15\) and \(\frac{0.15607 + 0.38635}{2} = 0.27121\) respectively. It follows that \(s_{j+1}/r_{j+1} = 0.27121 - 0.15 = 0.12121\). The difference between the top of each interval and the midpoint is \(1.96\hat{\sigma}\), so \(\hat{\sigma}^2\) (the estimate of the variance) is the difference divided by 1.96, squared, or

\[ \widehat{\operatorname{Var}}\left(H(t_j)\right) = \left(\frac{0.22875 - 0.15}{1.96}\right)^2 = 0.001614 \]
\[ \widehat{\operatorname{Var}}\left(H(t_{j+1})\right) = \left(\frac{0.38635 - 0.27121}{1.96}\right)^2 = 0.003451 \]

The difference, \(0.003451 - 0.001614 = 0.001837\), is \(s_{j+1}/r_{j+1}^2\). Then \(s_{j+1} = 0.12121^2/0.001837 = \mathbf{8}\). (E)

26.20. We have

    y_i    r_i    s_i    s_i/r_i     Ĥ(y_i)      s_i/r_i²    V̂ar(Ĥ(y_i))
     1     12      2     0.166667    0.166667    0.013889     0.013889
     2     10      1     0.100000    0.266667    0.010000     0.023889
     3      9      2     0.222222    0.488889    0.024691     0.048580
     4      7      2     0.285714    0.774603    0.040816     0.089397

where \(\hat{H}(y_i) = \sum_{j \le i} s_j/r_j\) and \(\widehat{\operatorname{Var}}\left(\hat{H}(y_i)\right) = \sum_{j \le i} s_j/r_j^2\). The confidence interval is \(0.774603 \pm 1.96\sqrt{0.089397} = \mathbf{(0.189, 1.361)}\). (A)

26.21.

\[ U = \exp\left(\frac{1.96\sqrt{0.0019}}{0.76 \ln 0.76}\right) = e^{-0.4096} = 0.6639 \]
\[ S^{1/U} = (0.76)^{1/0.6639} = \mathbf{0.6614} \]

26.22.

\[ U = \exp\left(\frac{1.645\sqrt{0.022}}{0.55 \ln 0.55}\right) = \exp(-0.7420) = 0.4761 \]
\[ S^U = 0.55^{0.4761} = 0.7523 \qquad S^{1/U} = 0.55^{1/0.4761} = 0.2849 \]

The width of the interval is \(0.7523 - 0.2849 = \mathbf{0.4674}\).

26.23. Take the logarithms of the lower and upper bounds of the given confidence interval, \(S^{1/U} = 0.400\) and \(S^U = 0.556\).

\[ U \ln S = \ln 0.556 \qquad \frac{1}{U}\ln S = \ln 0.4 \]
\[ U^2 = \frac{\ln 0.556}{\ln 0.4} = 0.64 \qquad U = 0.80 \]
\[ S = \exp(U \ln 0.4) = 0.4^{0.80} = 0.48 \]
\[ \exp\left(\frac{z_{0.975}\sqrt{\hat{v}}}{S \ln S}\right) = 0.8 \implies z_{0.975}\sqrt{\hat{v}} = (\ln 0.8)(S \ln S) = (\ln 0.8)(0.48)(\ln 0.48) = 0.0786 \]

The interval is \(0.48 \pm 0.0786 = \mathbf{(0.4014, 0.5586)}\).

26.24. Take the logarithms of the lower and upper bounds of the given confidence interval, \(S^{1/U} = 0.81\) and \(S^U = 0.88\).

\[ U \ln S = \ln 0.88 \qquad \frac{1}{U}\ln S = \ln 0.81 \]
\[ U^2 = \frac{\ln 0.88}{\ln 0.81} = 0.6066 \qquad U = 0.7789 \]
\[ \ln S = \frac{\ln 0.88}{U} = \frac{\ln 0.88}{0.7789} = -0.1641 \qquad S = e^{-0.1641} = 0.8486 \]
\[ \frac{z_{0.95}\sqrt{\hat{v}}}{S \ln S} = \ln 0.7789 = -0.2500 \]
\[ \frac{z_{0.975}\sqrt{\hat{v}}}{S \ln S} = -0.2500\left(\frac{1.96}{1.645}\right) = -0.2978 \]
\[ U' = \exp\left(\frac{z_{0.975}\sqrt{\hat{v}}}{S \ln S}\right) = e^{-0.2978} = 0.7425 \]
\[ S^{U'} = 0.8486^{0.7425} = 0.8852 \qquad S^{1/U'} = 0.8486^{1/0.7425} = 0.8016 \]

The width is \(0.8852 - 0.8016 = \mathbf{0.0836}\).

26.25. We have \(S^{1/U} = 0.695\) and \(S^U = 0.843\). Therefore

\[ \frac{1}{U}\ln S = \ln 0.695 \qquad U \ln S = \ln 0.843 \]
\[ U = \sqrt{\frac{\ln 0.843}{\ln 0.695}} = 0.6851 \]
\[ S = \left(S^{1/U}\right)^U = 0.695^{0.6851} = \mathbf{0.7794} \quad \textbf{(E)} \]

26.26.

\[ \hat{H}(8) = \frac{2}{100} + \frac{3}{98} = 0.05061 \]
\[ \widehat{\operatorname{Var}}\left(\hat{H}(8)\right) = \frac{2}{100^2} + \frac{3}{98^2} = 0.00051237 \]
\[ \sqrt{\widehat{\operatorname{Var}}\left(\hat{H}(8)\right)} = 0.02264 \]
\[ U = \exp\left(\frac{(2.576)(0.02264)}{0.05061}\right) = 3.1649 \]

The 99% interval is \(\bigl(0.05061/3.1649,\ (3.1649)(0.05061)\bigr) = (0.01599, 0.16018)\). The lower bound is \(\mathbf{0.01599}\).

26.27. The estimate for H(100) is the midpoint of the interval, or \(\hat{H}(100) = 0.9\). Then \(z_{0.975}\sqrt{\widehat{\operatorname{Var}}\left(\hat{H}(100)\right)}\) is the half-width of the linear confidence interval, or 0.1, and \(U = \exp(0.1/0.9) = 1.1175\). The log-transformed 95% confidence interval is

\[ \bigl(0.9/1.1175,\ 0.9(1.1175)\bigr) = (0.8054, 1.0058) \]

The width of the interval is \(1.0058 - 0.8054 = \mathbf{0.2004}\).

26.28.

\[ U^2 = \frac{0.625}{0.4} \qquad U = 1.25 = \exp\left(\frac{z_{0.975}\sqrt{\widehat{\operatorname{Var}}(H)}}{H}\right) \]
\[ H = 0.4(1.25) = 0.5 \]
\[ \sqrt{\widehat{\operatorname{Var}}(H)} = (\ln 1.25)(0.5)/1.96 = 0.05692 \]

The upper bound of the 90% interval for H(80) is \(0.5 + 1.645(0.05692) = \mathbf{0.5936}\).

26.29.

\[ U = \sqrt{1.3477/0.07837} = 4.1469 \]
\[ \hat{H}(y_2) = 4.1469(0.07837) = 0.325 \]
\[ \ln U = \ln 4.1469 = 1.42236 = \frac{1.96\sqrt{\widehat{\operatorname{Var}}\left(\hat{H}(y_2)\right)}}{\hat{H}(y_2)} \]
\[ \widehat{\operatorname{Var}}\left(\hat{H}(y_2)\right) = \left(\frac{(0.325)(1.42236)}{1.96}\right)^2 = 0.055625 \]

Let \(x_1 = 1/r_1\), \(x_2 = 1/r_2\). We have two equations:

\[ x_1 + x_2 = 0.325 \]
\[ x_1^2 + x_2^2 = 0.055625 \]
\[ x_1^2 + (0.325 - x_1)^2 = 0.055625 \]
\[ 2x_1^2 - 0.65x_1 + 0.05 = 0 \]
\[ x_1 = 0.2,\ 0.125 \]

So the two risk sets are 8 and 5. Since there were no late entrants, the second risk set must be the smaller one, making it \(\mathbf{5}\).

26.30. The midpoint is \(\hat{H}(t_0) = \frac{1.63 + 2.55}{2} = 2.09\), and \(z_{0.975}\sqrt{\widehat{\operatorname{Var}}\left(H(t_0)\right)} = 2.55 - 2.09 = 0.46\). Then we calculate

\[ U = \exp\left(\frac{z_{0.975}\sqrt{\widehat{\operatorname{Var}}\left(H(t_0)\right)}}{H}\right) = e^{0.46/2.09} = 1.2462 \]
\[ (H/U,\ HU) = \bigl(2.09/1.2462,\ 2.09(1.2462)\bigr) = \mathbf{(1.6771, 2.6045)} \quad \textbf{(E)} \]

26.31. The confidence interval (0.357, 0.700) is \(\bigl(\hat{H}(t)/U,\ \hat{H}(t)U\bigr)\), so \((0.357)(0.700) = \hat{H}(t)^2 = 0.2499\) and \(\hat{H}(t) = 0.5\), so \(\hat{S}(t) = e^{-0.5} = \mathbf{0.6065}\). (E)


26.32. We start the study at time 1, which means we ignore the events at time 1.

\[ _3\hat{p}_1 = \left(\frac{18}{27}\right)\left(\frac{26}{32}\right)\left(\frac{20}{25}\right) = 0.4333 \]
\[ _3\hat{p}_1^2 = 0.1878 \]
\[ \frac{9}{(27)(18)} + \frac{6}{(32)(26)} + \frac{5}{(25)(20)} = 0.03573 \]
\[ \widehat{\operatorname{Var}}(_3\hat{q}_1) = (0.1878)(0.03573) = \mathbf{0.006709} \quad \textbf{(A)} \]

26.33. We now calculate from time 0, unlike the previous exercise, since H(3) is unconditional.

\[ \hat{H}(3) = \frac{5}{30} + \frac{9}{27} + \frac{6}{32} = 0.6875 \]
\[ \widehat{\operatorname{Var}}\left(\hat{H}(3)\right) = \frac{5}{30^2} + \frac{9}{27^2} + \frac{6}{32^2} = 0.02376 \]
\[ U = \exp\left(\frac{1.96\sqrt{0.02376}}{0.6875}\right) = \exp(0.43945) = 1.5519 \]
\[ (H/U,\ HU) = \bigl(0.6875/1.5519,\ 0.6875(1.5519)\bigr) = \mathbf{(0.4430, 1.0669)} \quad \textbf{(D)} \]
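Log-transformed confidence intervals like the one in solution 26.33 are mechanical enough to script. A short Python sketch follows (my own helper, not from the manual); it reproduces the answer above.

```python
import math

def log_transformed_ci(h_hat, var_hat, z=1.96):
    """Log-transformed confidence interval (H/U, HU) for a cumulative hazard."""
    u = math.exp(z * math.sqrt(var_hat) / h_hat)
    return h_hat / u, h_hat * u

# Nelson-Aalen estimate and variance from solution 26.33
h = 5/30 + 9/27 + 6/32              # 0.6875
v = 5/30**2 + 9/27**2 + 6/32**2     # 0.02376
lo, hi = log_transformed_ci(h, v)
print(round(lo, 4), round(hi, 4))   # 0.443 1.0669
```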

Quiz Solutions

26-1. Risk set is 3 at time 3 (second, third, and fourth individuals) and 2 at time 4 (third and fourth individuals), so

\[ \widehat{\operatorname{Var}}\left(\hat{H}(4)\right) = \frac{1}{3^2} + \frac{1}{2^2} = \frac{13}{36} \]


Lesson 27

Kernel Smoothing

Reading: Loss Models Fourth Edition 12.3

Exams routinely ask one question from kernel smoothing.

27.1 Density and distribution

The empirical distribution is a discrete distribution and therefore not a good model for continuous loss sizes. Since it is a discrete distribution, its distribution function is constant most of the time, except that it jumps at each observation point. Its probability density function is 0, except at observation points, where it is undefined. To correct these drawbacks, the distribution may be smoothed. We will discuss one method for smoothing, known as kernel smoothing. Suppose you build an empirical distribution from a sample of size n. One way to describe this distribution is as an equally-weighted mixture of n distributions, each of which is a degenerate distribution, a constant whose value is the observation point. To make this discussion clearer, let’s use an example: A sample of size 4 from a random variable X is {5, 12, 15, 20}. Then the empirical distribution can be written as an equally-weighted mixture of the following distributions:

• A constant distribution which is always equal to 5, so \(\Pr(X_1 = 5) = 1\) and \(F_{X_1}(x) = 0\) for \(x < 5\), 1 for \(x \ge 5\).

• A constant distribution which is always equal to 12, so \(\Pr(X_2 = 12) = 1\) and \(F_{X_2}(x) = 0\) for \(x < 12\), 1 for \(x \ge 12\).

• A constant distribution which is always equal to 15, so \(\Pr(X_3 = 15) = 1\) and \(F_{X_3}(x) = 0\) for \(x < 15\), 1 for \(x \ge 15\).

• A constant distribution which is always equal to 20, so \(\Pr(X_4 = 20) = 1\) and \(F_{X_4}(x) = 0\) for \(x < 20\), 1 for \(x \ge 20\).

Then \(F_X(x) = 0.25\bigl(F_{X_1}(x) + F_{X_2}(x) + F_{X_3}(x) + F_{X_4}(x)\bigr)\).

Rather than assigning a point mass to each observation \(x_i\), kernel smoothing spreads each observation's probability over an interval. With the uniform kernel with bandwidth b, each observation's probability is spread uniformly over \([x_i - b, x_i + b]\):

\[ k_{x_i}(x) = \begin{cases} \dfrac{1}{2b} & x_i - b \le x \le x_i + b \\ 0 & \text{otherwise} \end{cases} \qquad K_{x_i}(x) = \begin{cases} 0 & x < x_i - b \\ \dfrac{x - x_i + b}{2b} & x_i - b \le x \le x_i + b \\ 1 & x > x_i + b \end{cases} \]

Suppose we set b  1. Then the kernel density function for x i is 1/ (2b )  1/2 for |x − x i | ≤ 1. The kernel-smoothed density function in the example above is the sum of the kernel density functions over the four observation points, divided by 4, or

\[ \hat{f}(x) = \begin{cases} 1/8 & 4 \le x \le 6,\ 11 \le x \le 13,\ 14 \le x \le 16,\ 19 \le x \le 21 \\ 0 & \text{otherwise} \end{cases} \]

¹If not all \(x_i\) are distinct, you can shorten the sum in the formula by summing up over distinct values of \(x_i\) and using \(n_i/n\) instead of \(1/n\), where \(n_i\) is the number of observations equal to \(x_i\).


The kernel-smoothed distribution function will grow with a slope of 1/8 in the ranges where \(\hat{f}(x) \ne 0\). The empirical distribution function and the kernel-smoothed distribution function are shown in Figure 27.1.

[Figure 27.1: Two versions of estimated cumulative distribution function: (a) unsmoothed, and (b) smoothed using uniform kernel with bandwidth 1]

Let's calculate a couple of sample values of the kernel-smoothed density and distribution:

• \(\hat{f}(5) = 0.125\)

• \(\hat{f}(11) = 0.125\). Notice that the density function is non-zero at the boundaries of the bandwidth. However, \(\hat{f}(11^-) = 0\). Thus the smoothed density function is discontinuous.

• \(\hat{F}(3) = 0\)

• \(\hat{F}(4.5) = 0.0625\). This is obtained by linear interpolation between \(\hat{F}(4) = 0\) and \(\hat{F}(6) = 0.25\).

• \(\hat{F}(12.5) = 0.4375\). This is obtained by linear interpolation between \(\hat{F}(11) = 0.25\) and \(\hat{F}(13) = 0.5\).

This example was easy because the kernels for distinct points did not overlap. However, usually the kernels do overlap. As an example, set b = 5. Then the kernel-smoothed density function will be 1/4 of the sum of the individual kernel densities. The kernel density of \(x_i\) is \(1/(2b) = 1/10\) for \(|x - x_i| \le 5\). Therefore, the kernel-smoothed density function at x will be 1/40 times the number of observation points within 5 of x. Here are some examples:

• \(\hat{f}(5) = 1/40 = 0.025\); the only observation within 5 of 5 is 5 itself.

• \(\hat{f}(8) = 2/40 = 0.05\); both 5 and 12 are within 5 of 8.

• \(\hat{f}(10) = 3/40 = 0.075\); 5, 12, and 15 are within 5 of 10. However, \(\hat{f}(10^-)\) and \(\hat{f}(10^+)\) are both 2/40.

• \(\hat{f}(16) = 3/40 = 0.075\); 12, 15, and 20 are within 5 of 16.

[Figure 27.2: Kernel-smoothed density function using uniform kernel with bandwidth 5]

Figure 27.2 graphs the kernel-smoothed density function. To calculate the kernel-smoothed distribution function, we must add the four kernel distribution functions for the four sample points, then divide by 4. As an example of this calculation, let's calculate \(\hat{F}(11)\). Then

• \(K_5(11) = 1\), because \(K_{x_i}(x) = 1\) for \(x \ge x_i + b\). In general, the kernel distribution function is 1 for observation points more than one bandwidth to the left.

• \(K_{20}(11) = 0\), because \(K_{x_i}(x) = 0\) for \(x \le x_i - b\). In general, the kernel distribution function is 0 for observation points more than one bandwidth to the right.

• \(K_{12}(11) = 0.4\). You can use the formula for \(K_{x_i}(x)\), but try to reason it out: linearly interpolate between \(K_{12}(7) = 0\) and \(K_{12}(17) = 1\).

• \(K_{15}(11) = 0.1\), by interpolation between \(K_{15}(10) = 0\) and \(K_{15}(20) = 1\).

We conclude that \(\hat{F}(11) = 0.25(1 + 0.4 + 0.1 + 0) = 0.375\). A graph of \(\hat{F}(x)\) is shown in Figure 27.3. Here's another example.

Example 27A You are studying how long you have to wait for the bus to work every morning. Over a four day period, you waited 6 minutes on 1 day, 10 minutes on 2 days, and 25 minutes on 1 day. You will now construct a distribution function for amount of time until the bus arrives using this data. To smooth the distribution, you will use a uniform kernel with bandwidth 5.

Calculate the kernel smoothed functions at 13, \(\hat{f}(13)\) and \(\hat{F}(13)\).

Answer: The empirical distribution sets \(f_4(6) = 0.25\), \(f_4(10) = 0.5\), and \(f_4(25) = 0.25\). The kernel density functions at each of the three observations have the following graphs:

[Graphs of the kernel density functions (a) \(k_6(13)\), (b) \(k_{10}(13)\), (c) \(k_{25}(13)\)]

[Figure 27.3: Kernel-smoothed distribution function using uniform kernel with bandwidth 5]

Thus \(k_6(13) = 0\), \(k_{10}(13) = 0.1\), and \(k_{25}(13) = 0\). Each observation has probability 1/4 in the empirical distribution. The contribution of each point to the kernel density estimate of f(13) is the product of the probability of the point and the kernel density of the point. The following table computes the answer.

    Point    Probability    Kernel Density    Contribution
      6         1/4              0                0
     10         1/2              0.1              0.05
     25         1/4              0                0
    Total                                         0.05

This table computes the density from first principles, but it is faster, when using the uniform kernel to estimate density, to count up all observations within the bandwidth and divide by 2b, twice the bandwidth, and by n, the number of observations. Here, there are 2 observations within the bandwidth (at 10), so the answer would be computed as \(\hat{f}(13) = 2\big/\bigl((10)(4)\bigr) = 0.05\).

The kernel distribution functions for each of the three observations have the following graphs:

[Graphs of the kernel distribution functions (a) \(K_6(13)\), (b) \(K_{10}(13)\), (c) \(K_{25}(13)\)]

Thus \(K_6(13) = 1\), \(K_{10}(13) = 0.8\), and \(K_{25}(13) = 0\). The following table computes \(\hat{F}(13)\).

    Point    Probability    Kernel Distribution    Contribution
      6         1/4                1                   0.25
     10         1/2                0.8                 0.40
     25         1/4                0                   0
    Total                                              0.65

From a calculation viewpoint, you may be better off pulling out the fraction 1/n (here 1/4) and dividing by n at the end, so you would do the calculation as \(\bigl(1 + 2(0.8)\bigr)/4\). Figure 27.4 graphs the density and distribution functions, which are obtained by adding up the first and third graphs and twice the second graph in each set above.

Figure 27.4: Example of calculating fˆ(13) and Fˆ (13) . Asterisks represent data points.
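Since the uniform-kernel calculations above are purely mechanical, they are easy to check by machine. The following Python sketch (my own illustration; the function names are not from the manual) reproduces \(\hat{f}(13) = 0.05\) and \(\hat{F}(13) = 0.65\) for the data of Example 27A.

```python
def uniform_kernel_estimates(data, x, b):
    """Uniform-kernel estimates of the density and distribution at x.

    Each observation x_i spreads mass 1/n uniformly over [x_i - b, x_i + b].
    """
    n = len(data)
    f_hat = sum(1 for xi in data if abs(x - xi) <= b) / (2 * b * n)
    # Kernel CDF: 0 left of xi - b, 1 right of xi + b, linear in between
    big_k = lambda xi: min(1.0, max(0.0, (x - xi + b) / (2 * b)))
    cdf_hat = sum(big_k(xi) for xi in data) / n
    return f_hat, cdf_hat

print(uniform_kernel_estimates([6, 10, 10, 25], 13, 5))  # (0.05, 0.65)
```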

The above example should be studied carefully. You must be careful about two things:

1. Notice that the kernel distribution function for a fixed estimation point x decreases as the observation point increases:

\[ K_6(13) > K_{10}(13) > K_{25}(13) \]

This is quite clear from the graphs. When calculating the kernel estimate of the distribution function at the point 13, you may be tempted to (incorrectly) increase the kernel distribution function as the observation point increases. After all, the kernel function is a distribution function, and any distribution function is an increasing (or at least non-decreasing) function of its argument. From the perspective of the observation point, the function increases as the estimation point increases. However, our situation reverses the perspective. We are fixing the estimation point, and summing up kernels over all observation points. As the observation point increases, the kernel distribution function will decrease.

In fact, a faster way to calculate \(\hat{F}(13)\) in Example 27A is to express the kernel distribution function


as a function of the observation point. From the equation for \(K_{x_i}(x)\) given earlier,

\[ K_{x_i}(13) = \begin{cases} 1 & 13 > x_i + b \\ \dfrac{13 - (x_i - b)}{2b} & x_i - b \le 13 \le x_i + b \\ 0 & 13 \le x_i - b \end{cases} \]

or in this example, with b = 5,

\[ K_{x_i}(13) = \begin{cases} 1 & x_i < 8 \\ \dfrac{18 - x_i}{10} & 8 \le x_i < 18 \\ 0 & x_i \ge 18 \end{cases} \]

So \(K_{x_i}(13)\) is a linear function of \(x_i\) going from 1 at 8 to 0 at 18, and intermediate values can be calculated with linear interpolation. From Figure 27.5, we see that \(K_{10}(13) = 0.8\), and can read off \(K_{x_i}(13)\) for other values of \(x_i\) as well. In general, when calculating the value of the kernel distribution function at x for a uniform kernel, draw a graph consisting of a straight line from (x − b, 1) to (x + b, 0) and read off the values of \(K_{x_i}(x)\) from the graph.

[Figure 27.5: \(K_{x_i}(13)\) as a function of \(x_i\)]

2. Even though we can ignore observation points outside the bandwidth when calculating the density function, we cannot ignore any points less than the estimation point, even more than a bandwidth away, when calculating the distribution function. For a point that is more than a bandwidth below the estimation point, the kernel distribution function will be equal to 1. This is the same as the situation for any random variable beyond its support: the density will be 0, but the cumulative distribution will be 1.

Quiz 27-1 A sample has 67 observations less than 82, 27 observations greater than 90, and the following observations:

    82  83  85  86  87  89

Kernel-density methods are used to estimate the cumulative distribution function. A uniform kernel with bandwidth 4 is used. Estimate F(86).

27.1.2 Triangular kernel

The triangular kernel has a density function that is an isosceles triangle centered at the observation point and whose base is twice the bandwidth b. Since the area of the triangle must be 1 to make it a proper density function, the height is 1/b. Let's repeat the 4-point example from the beginning of the lesson: A sample of size 4 from a random variable X is {5, 12, 15, 20}.


Consider a triangular kernel with bandwidth 1. Then the kernel's triangle would have a height of 1. Because of the narrow bandwidth, the kernel-smoothed density function at x is 1/4 of the kernel density of the only observation that is within 1 of x, if such an observation exists, otherwise 0. For example:

• To compute \(\hat{f}(5)\): since 5 is the only point within 1 of 5, \(\hat{f}(5)\) is equal to \(0.25k_5(5)\). The height of the kernel is 1, and the height is assumed at the center of the bandwidth, the observation point, so \(k_5(5) = 1\) and \(\hat{f}(5) = 0.25\).

• To compute \(\hat{f}(12)\): since 12 is the only point within 1 of 12, \(\hat{f}(12)\) is equal to \(0.25k_{12}(12) = 0.25\).

• The same logic and results as the previous two paragraphs would apply when calculating \(\hat{f}(15)\) and \(\hat{f}(20)\).

• To compute \(\hat{f}(11.5)\): it is equal to \(0.25k_{12}(11.5)\). The triangle around 12 has \(k_{12}(12) = 1\) and \(k_{12}(11) = 0\), so by linear interpolation, \(k_{12}(11.5) = 0.5\). Therefore \(\hat{f}(11.5) = 0.25(0.5) = 0.125\).

A graph of \(\hat{f}(x)\) is shown in Figure 27.6.

[Figure 27.6: Kernel-smoothed density function using triangular kernel with bandwidth 1]

To calculate \(\hat{F}(x)\), we must integrate \(\hat{f}(x)\). This means calculating areas of portions of the triangle. To show you how such a calculation goes, let's calculate \(\hat{F}(11.6)\). We know that \(K_5(11.6) = 1\) since \(11.6 - 5 \ge 1\), and that \(K_{15}(11.6) = K_{20}(11.6) = 0\) since \(15 - 11.6 \ge 1\) and \(20 - 11.6 \ge 1\). That leaves \(K_{12}(11.6)\). To calculate this, we need the area of the part of the triangle around 12 to the left of 11.6. This part of the triangle is itself a triangle. The height is \(\hat{k}(11.6) = 0.6\), and the base is 0.6, so the triangle's area is \(0.5(0.6)(0.6) = 0.18\). The triangle is shown in Figure 27.7a. We conclude that \(\hat{F}(11.6) = 0.25(1 + 0.18 + 0 + 0) = 0.295\).

To calculate \(K_{12}(12.2)\), we need the area to the left of 12.2. We can calculate this as 1 minus the area to the right of 12.2, as shown in Figure 27.7b. The unshaded triangle has base 0.8 and height 0.8, so its area is \(0.5(0.8)(0.8) = 0.32\), and \(K_{12}(12.2) = 1 - 0.32 = 0.68\). Therefore \(\hat{F}(12.2) = 0.25(1 + 0.68 + 0 + 0) = 0.42\). A graph of \(\hat{F}(x)\) is in Figure 27.8.

[Figure 27.7: Calculation of kernel distribution for triangular kernel with bandwidth 1: (a) \(K_{12}(11.6)\), (b) \(K_{12}(12.2)\)]

Formulas for the triangular kernel density and distribution functions are

\[ k_{x_i}(x) = \begin{cases} \dfrac{b - |x - x_i|}{b^2} & x_i - b \le x \le x_i + b \\ 0 & \text{otherwise} \end{cases} \]

\[ K_{x_i}(x) = \begin{cases} 0 & x \le x_i - b \\ \dfrac{(x - x_i + b)^2}{2b^2} & x_i - b \le x \le x_i \\ 1 - \dfrac{(x_i + b - x)^2}{2b^2} & x_i \le x \le x_i + b \\ 1 & x \ge x_i + b \end{cases} \]

but you shouldn't memorize these; rather you should reason out the calculation as discussed above.

Let's continue with the 4-point example (observations at 5, 12, 15, and 20), but now set the bandwidth of the triangular kernel equal to 5. Let's calculate \(\hat{f}(11)\) and \(\hat{F}(11)\). For \(\hat{f}(11)\) we need \(k_{12}(11)\) and \(k_{15}(11)\). The height of the kernel's triangle is \(k_{12}(12) = 1/b = 1/5 = 0.2\). The height at 11 is linearly interpolated between \(k_{12}(7) = 0\) and \(k_{12}(12) = 0.2\), or \(k_{12}(11) = (0.8)(0.2) = 0.16\). For \(k_{15}(11)\) the height at 11 is linearly interpolated between \(k_{15}(10) = 0\) and \(k_{15}(15) = 0.2\), or \(k_{15}(11) = (0.2)(0.2) = 0.04\). So \(\hat{f}(11) = 0.25(0.16 + 0.04) = 0.05\). A graph of \(\hat{f}(x)\) is shown in Figure 27.9.

For \(\hat{F}(11)\), we know that \(K_5(11) = 1\) and \(K_{20}(11) = 0\). For \(K_{12}(11)\), we need the area of the triangle to the left of 11, which has base 4 and height 0.16, and therefore area \(K_{12}(11) = 0.5(4)(0.16) = 0.32\). For \(K_{15}(11)\), we need the area of the triangle to the left of 11, which has base 1 and height 0.04 and therefore area \(K_{15}(11) = 0.5(1)(0.04) = 0.02\). Therefore, \(\hat{F}(11) = 0.25(1 + 0.32 + 0.02 + 0) = 0.335\). Graphs of the triangles are shown in Figure 27.10.

Here's another example:

Example 27B Ten light bulbs were observed until they failed. Times of failure in hours were 620, 635, 638, 640, 657, 657, 665, 695, 705, 710. The distribution function is estimated using the empirical distribution, smoothed with a uniform kernel of bandwidth 20.


Figure 27.8: Kernel-smoothed distribution function using triangular kernel with bandwidth 1


Figure 27.9: Kernel-smoothed density function using triangular kernel with bandwidth 5

Determine \(\hat{f}(655)\) and \(\hat{F}(655)\).

Answer: The empirical distribution gives probability 1/10 to each observation. The kernel density function is \(1/(2b) = 1/40\). There are six observation points within 20 of 655. Therefore

\[ \hat{f}(655) = \frac{1}{40}\cdot\frac{1}{10}\cdot \#\{x_i : 635 \le x_i \le 675\} = \frac{1}{400}(6) = \mathbf{0.015} \]

For the distribution function, we must use the kernel distribution function instead of the kernel density function. For a uniform kernel of bandwidth 20, the kernel distribution function is a straight line starting at (0, 0) ending at (40, 1). When we fix the point 655, this perspective is reversed; the kernel evaluated at 655 starts at 1 for the observation 635 and decreases in a straight line to 0 at the observation 675. So we

[Figure 27.10: Calculation of kernel distribution for triangular kernel with bandwidth 5: (a) \(K_{12}(11)\), (b) \(K_{15}(11)\)]

have:

\[ \hat{F}(655) = \frac{1}{10}(1)\ \text{(for 620)} + \frac{1}{10}(1)\ \text{(for 635)} + \frac{1}{10}\cdot\frac{37}{40}\ \text{(for 638)} + \frac{1}{10}\cdot\frac{35}{40}\ \text{(for 640)} + \frac{2}{10}\cdot\frac{18}{40}\ \text{(for 657)} + \frac{1}{10}\cdot\frac{10}{40}\ \text{(for 665)} = \frac{198}{400} = \mathbf{0.495} \]

The points to the right of 675 have \(K_{x_i}(x) = 0\).


Example 27C Repeat the previous example with a triangular kernel of bandwidth 20.

Answer: The graphs of the kernel density functions of each of the observations within a bandwidth of 655 are shown in Figure 27.11. For a triangular kernel, the kernel density at the boundaries, the estimation point plus or minus the bandwidth, is zero, so the point 635 can be ignored. A triangle of width 40 and height 1/20 at its center has height

\[ \frac{3}{20}\cdot\frac{1}{20} = \frac{3}{400} \text{ at 638}, \quad \frac{5}{20}\cdot\frac{1}{20} = \frac{5}{400} \text{ at 640}, \quad \frac{18}{20}\cdot\frac{1}{20} = \frac{18}{400} \text{ at 657}, \quad \frac{10}{20}\cdot\frac{1}{20} = \frac{10}{400} \text{ at 665}. \]

Therefore (the common denominator of the above fractions, 400, is pulled out)

\[ \hat{f}(655) = \frac{1}{10}\cdot\frac{1}{400}\bigl(3 + 5 + 2(18) + 10\bigr) = \frac{54}{4000} = \mathbf{0.0135} \]

In the graphs of Figure 27.12, the shaded areas are the kernel distributions of the observations. Observations above 675 have a kernel distribution of 0 at 655. Notice that for the points 638 and 640, the area


[Figure 27.11: Kernel density functions for Example 27C: (a) \(k_{638}(655)\), (b) \(k_{640}(655)\), (c) \(k_{657}(655)\), (d) \(k_{665}(655)\)]

[Figure 27.12: Kernel distribution graphs for Example 27C: (a) \(K_{620}(655)\), (b) \(K_{635}(655)\), (c) \(K_{638}(655)\), (d) \(K_{640}(655)\), (e) \(K_{657}(655)\), (f) \(K_{665}(655)\)]

we are interested in, the shaded area, is the complement of the area of the unshaded subtriangle, whereas for the points 657 and 665 it is the area of the shaded subtriangle.

    Point    Subtriangle Base    Subtriangle Height    Subtriangle Area    Kernel Distribution
     620           40                  1/20                  1                  800/800
     635           40                  1/20                  1                  800/800
     638            3                  3/400                 9/800              791/800
     640            5                  5/400                25/800              775/800
     657           18                 18/400               324/800              324/800
     657           18                 18/400               324/800              324/800
     665           10                 10/400               100/800              100/800
    Total                                                                      3914/800

This sum is divided by n = 10 to obtain

\[ \hat{F}(655) = \frac{3914}{8000} = \mathbf{0.48925} \]
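Example 27C is a good target for a machine check. The following Python sketch (my own; tri_pdf and tri_cdf are hypothetical helper names) implements the triangular kernel formulas from earlier in this lesson and reproduces \(\hat{f}(655) = 0.0135\) and \(\hat{F}(655) = 0.48925\).

```python
def tri_pdf(xi, x, b):
    # Triangle of height 1/b centered at xi on [xi - b, xi + b]
    return max(0.0, (b - abs(x - xi)) / b**2)

def tri_cdf(xi, x, b):
    # Area of the triangle to the left of x (quadratic on each half)
    if x <= xi - b:
        return 0.0
    if x <= xi:
        return (x - xi + b) ** 2 / (2 * b**2)
    if x <= xi + b:
        return 1 - (xi + b - x) ** 2 / (2 * b**2)
    return 1.0

data = [620, 635, 638, 640, 657, 657, 665, 695, 705, 710]
f_hat = sum(tri_pdf(xi, 655, 20) for xi in data) / len(data)
F_hat = sum(tri_cdf(xi, 655, 20) for xi in data) / len(data)
print(f_hat, F_hat)  # 0.0135 0.48925
```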


If you don't want to draw triangles, you can calculate the kernel cumulative distribution function for the triangular kernel density as follows:

• If the observation \(x_i\) is more than the bandwidth to the left of the estimation point x, \(K_{x_i}(x)\) is 1.

• If the observation is within a bandwidth to the left of the estimation point, consider the interval [x − b, x], where b is the bandwidth. Calculate the proportion of this interval that the observation point \(x_i\) is to the right of x − b (0 if \(x_i = x - b\), 1 if \(x_i = x\)). Square this proportion, multiply it by a half, and subtract it from 1. For example, if the estimation point is 500, the bandwidth is 100, and the observation is 425, then it is 1/4 of the way from x − b to x, so subtract \(\frac{1}{2}\left(\frac{1}{4}\right)^2\) from 1 to obtain \(K_{425}(500) = 31/32\). Now calculate \(K_{475}(500)\). (Answer below²)

• If the observation is within a bandwidth to the right of the estimation point, consider the interval [x, x + b], where b is the bandwidth. Calculate the proportion of this interval that the observation point \(x_i\) is to the left of x + b (0 if \(x_i = x + b\), 1 if \(x_i = x\)). Square this proportion and multiply it by a half. For example, if the estimation point is 500, the bandwidth is 100, and the observation point is 525, it is 3/4 of the way from x + b to x, so \(K_{525}(500) = \frac{1}{2}\left(\frac{3}{4}\right)^2 = \frac{9}{32}\). Now calculate \(K_{540}(500)\). (Answer below³)

• If the observation is more than the bandwidth to the right of the estimation point, \(K_{x_i}(x)\) is 0.


²Answer: \(K_{475}(500) = 1 - \frac{1}{2}\left(\frac{3}{4}\right)^2 = \frac{23}{32}\).

³Answer: \(K_{540}(500) = \frac{1}{2}(0.6^2) = 0.18\).


Quiz 27-2 A sample has 67 observations less than 82, 27 observations greater than 90, and the following observations:

    82  83  85  86  87  89

Kernel-density methods are used to estimate the probability density function. A triangular kernel with bandwidth 4 is used. Estimate f(86).


27.1.3 Other symmetric kernels

Occasionally, exams expect you to calculate densities or distributions using other kernels. In all cases, calculate the kernel-smoothed density function using the formula

\[ \hat{f}(x) = \sum_{x_i} k_{x_i}(x)\, f_n(x_i) \]

with the sum taken over all observations \(x_i\), to obtain the kernel-smoothed density, where \(f_n(x_i)\) is the probability of the point \(x_i\) in the empirical distribution (usually 1/n). Compute the kernel-smoothed cumulative distribution function using the formula

\[ \hat{F}(x) = \sum_{x_i} K_{x_i}(x)\, f_n(x_i) \]

to obtain the kernel-smoothed distribution function. Even though in principle both sums are over all observations \(x_i\), usually \(k_{x_i}(x) = 0\) except in a finite interval. Sometimes you won't be given \(K_{x_i}(x)\) and will have to integrate \(k_{x_i}(x)\). Just like it is possible to calculate the kernel-smoothed distribution function by using the kernel distribution function, it is also possible to use kernel survival functions \(K^S_{x_i}(x) = 1 - K_{x_i}(x)\) to calculate the kernel-smoothed survival function.

As an example of another kernel, consider the Epanechnikov kernel, which was featured on the pre-2003 syllabus. It has a quadratic kernel density, defined by

\[ k_{x_i}(x) = \begin{cases} \dfrac{3}{4b}\left(1 - \left(\dfrac{x - x_i}{b}\right)^2\right) & x_i - b \le x \le x_i + b \\ 0 & \text{otherwise} \end{cases} \]

where b is the bandwidth. This kernel is not defined in the current syllabus. If an exam question wanted to use this, it would define \(k_{x_i}(x)\) in the question.

Example 27D You are given a sample with the following five observations:

    24  30  40  56  80

You are estimating the underlying distribution using the empirical distribution with kernel-density methods. The kernel is an Epanechnikov kernel (as defined before the example) with bandwidth 10. Determine the estimate of S(32).

Answer: For variety, we'll compute kernel survival distribution functions instead of kernel cumulative distribution functions. We'll use the non-standard notation \(K^S_{x_i}(x)\) for the kernel survival function. Since the kernel is symmetric, \(K^S_{24}(32) = 1 - K^S_{40}(32)\), so the sum is 1 and we don't have to calculate either one. We therefore only have to calculate \(K^S_{30}(32)\).

Z 

40 32

!2

3 * x − 30 + 1− dx 40 10

,

! 40 ! 32

3 ( x − 30) 3  x− 40 300 

-

3 103 − 23 8−  0.352 40 300

S S K 56 (32)  K80 (32)  1

The estimate is Sˆ (32)  0.2 (0.352 + 1 + 1 + 1)  0.6704 . C/4 Study Manual—17th edition Copyright ©2014 ASM



27.1.4 Kernels using two-parameter distributions

Both the uniform and the triangular kernel are symmetric around each observation. The bandwidth b controls the smoothing; the bigger b is, the bigger the spread and the greater the smoothing. Also, both kernels do not vary with the observation; \(k_{x_1}\) is a translated version of \(k_{x_2}\) for any two points \(x_1\) and \(x_2\). This means that if x < b, these kernels assign positive probability to negative x. This is undesirable.

Another type of kernel avoids this problem by varying with \(x_i\). It assigns a continuous two-parameter distribution defined on 0 to ∞ to each observation. One parameter controls the spread, and the other is adjusted to make the mean equal to the observation. The textbook considers only one example, the gamma kernel. Every point is assigned a gamma distribution with parameters \(\alpha_i\) and \(\theta_i\). The parameter \(\alpha_i\) is adjusted to control the spread; increasing α decreases the spread and decreases smoothing. The parameter \(\theta_i\) is then set equal to \(x_i/\alpha_i\), so that the mean of each kernel equals the observation point. Figures 27.13 and 27.14 show gamma kernel density functions for \(x_i = 70\). Figure 27.13 uses α = 25 and Figure 27.14 uses α = 100. Notice the more concentrated peak for α = 100.

[Figure 27.13: Gamma kernel with α = 25: (a) kernel density function \(k_{70}(x)\), (b) kernel distribution function \(K_{70}(x)\)]

[Figure 27.14: Gamma kernel with α = 100: (a) kernel density function \(k_{70}(x)\), (b) kernel distribution function \(K_{70}(x)\)]

Unlike bounded symmetric kernels, calculating kernel-smoothed \(\hat{f}(x)\) and \(\hat{F}(x)\) for two-parameter distribution kernels requires summing over all observations \(0 \le x < \infty\). Because of the difficulty in doing

this, as well as the fact that the gamma distribution function usually cannot be calculated with a calculator, exam questions requesting the calculation of \(\hat{f}(x)\) or \(\hat{F}(x)\) with this type of kernel are unlikely. On the other hand, a question asking for moments of these kernel-smoothed distributions is doable. We will discuss calculating moments for kernel-smoothed distributions in the next section.

Example 27E You are given the following two observations: 5, 15. You will use the empirical distribution with kernel-density methods to model the underlying random variable generating these observations. For each observation \(x_i\), the kernel will have a Pareto distribution with α = 10 and mean matching the observation. Determine the kernel-smoothed value of F(8).

Answer: For the point 5, in order to have \(\theta/(\alpha - 1) = 5\), we need θ = 45. For the point 15, we need θ = 135. Then

\[ K_5(8) = 1 - \left(\frac{45}{45 + 8}\right)^{10} = 0.8053 \]
\[ K_{15}(8) = 1 - \left(\frac{135}{135 + 8}\right)^{10} = 0.4377 \]

The kernel-smoothed F(8) is \(0.5(0.8053 + 0.4377) = \mathbf{0.6215}\).

Example 27F You are given the following two observations: 6, 30. You will estimate the underlying distribution using kernel-density methods. The kernel is to have a gamma distribution with α = 5 and mean matching the observation. Determine the resulting estimate of f(20).

Answer: For the point 6, the gamma parameters will be α = 5, θ = 1.2, so

\[ k_6(20) = \frac{20^4 e^{-20/1.2}}{\Gamma(5)\, 1.2^5} = 0.000154797 \]

For the point 30, the gamma parameters will be α = 5, θ = 6, so

\[ k_{30}(20) = \frac{20^4 e^{-20/6}}{\Gamma(5)\, 6^5} = 0.030585 \]

The kernel density estimate of f(20) is therefore \(0.5(0.000154797 + 0.030585) = \mathbf{0.015370}\).

Note All of our kernels were unbiased: the mean of the kernel around an observation point equals the observation point. However, it is possible to specify a biased kernel. An example of a biased kernel would be: for each observation y, the kernel density is 1/4 from y − 1 to y + 3 and 0 otherwise.

27.2 Moments of kernel-smoothed distributions

The kernel-smoothed distribution is a legitimate distribution function, and one can calculate moments of the resulting random variable. Students reported a question involving calculation of moments on the unreleased Spring 2006 exam, so let's discuss how to do the calculation efficiently. Of course, you can calculate \(\operatorname{E}[X^n] = \int_{-\infty}^{\infty} x^n f(x)\,dx\) using the kernel-smoothed density function as f(x). However, since a kernel-smoothed random variable is a mixture, it is easier to use the conditional moment or double expectation formulas. The conditional mean formula is equation (1.3) on page 9:

\[ \operatorname{E}[X] = \operatorname{E}\bigl[\operatorname{E}[X \mid Y]\bigr] \]


The conditional variance formula is equation (4.2) on page 64:

\[ \operatorname{Var}(X) = \operatorname{E}\bigl[\operatorname{Var}(X \mid Y)\bigr] + \operatorname{Var}\bigl(\operatorname{E}[X \mid Y]\bigr) \]

Here Y will be the original random variable. X | Y is a random variable distributed with the kernel as its distribution.

Example 27G You are given the following sample data:

    10  15  18  20

The data are smoothed with a uniform kernel with bandwidth 2. Calculate the mean and variance of the kernel-smoothed distribution.

Answer: It is unnecessary to explicitly compute the kernel-smoothed density function at each point. Let X be the smoothed distribution, and Y be the unsmoothed distribution. We are going to condition X on Y. The meaning of this is that if we know Y, then X is in the kernel of the known point of Y. For example, in this case, X | Y = 10 would mean that X is in the interval [8, 12].

We want to calculate E[X] and Var(X). The conditional mean formula says \(\operatorname{E}[X] = \operatorname{E}\bigl[\operatorname{E}[X \mid Y]\bigr]\), but E[X | Y] = Y, since the mean of the kernel-smoothed distribution given the unsmoothed observation is the observation. So E[X] = E[Y]; the expected value of the kernel smoothed distribution is the same as the expected value of the original distribution, or

\[ \operatorname{E}[X] = \operatorname{E}[Y] = \frac{10 + 15 + 18 + 20}{4} = \mathbf{15.75} \]

By the conditional variance formula,

\[ \operatorname{Var}(X) = \operatorname{Var}\bigl(\operatorname{E}[X \mid Y]\bigr) + \operatorname{E}\bigl[\operatorname{Var}(X \mid Y)\bigr] \]

E[X | Y] = Y, as we said in the preceding paragraph, so \(\operatorname{Var}\bigl(\operatorname{E}[X \mid Y]\bigr) = \operatorname{Var}(Y)\), the variance of the original distribution. We compute the variance of the original distribution in the usual way, by subtracting the square of the mean from the second moment.

\[ \operatorname{E}[Y^2] = \frac{10^2 + 15^2 + 18^2 + 20^2}{4} = 262.25 \]
\[ \operatorname{Var}(Y) = 262.25 - 15.75^2 = 14.1875 \]

Var(X | Y), the variance of a uniform distribution on [x − 2, x + 2], is \(\frac{4^2}{12} = \frac{4}{3}\) regardless of x. In general, the variance of a uniform distribution on an interval of length k is \(\frac{k^2}{12}\). The expectation of a constant is the constant, so \(\operatorname{E}\bigl[\operatorname{Var}(X \mid Y)\bigr] = \frac{4}{3}\). Then

\[ \operatorname{Var}(X) = 14.1875 + \frac{4}{3} = \mathbf{15.52083} \]

27. KERNEL SMOOTHING

474

a triangle around 0 starting at (−b, 0) , rising to (0, 1/b ) in a straight line, then dropping to ( b, 0) . The equations for these lines are: x+b    −b ≤ x ≤ 0   b2 f (x )    −x + b    0≤x≤b  b2 The mean is 0, so the variance is the second moment. By symmetry around 0 (substitute −x for x), the integrals of x 2 over these two components are equal, so let’s compute just the second one and then double it. b

Z 0

! b 0 ! 4

1 −x 4 bx 3 x 2 (−x + b ) dx  + 4 3 b2 b2 

1 −b 4 b + 4 3 b2



b2 12

So Var ( X | Y )  b 2 /6, and the formula for the variance of the triangular kernel-smoothed density is Var ( X )  Var ( Y ) +

?

b2 6

(27.2)

Quiz 27-3 For a random sample of 20,

X

x i  780

X

x 2i  35,212

The underlying distribution is estimated using kernel-density methods. A uniform kernel with bandwidth 10 is used. Calculate the variance of the kernel-smoothed distribution.

Exercises 27.1.

From a population having density function f , you are given the following sample: 31,

37,

39,

42,

42,

45,

48,

51

Calculate the kernel density estimate of f (40) , using the uniform kernel with bandwidth 5. 27.2.

[4-F04:20] From a population having distribution function F, you are given the following sample: 2.0,

3.3,

3.3,

4.0,

4.0,

4.7,

4.7,

4.7

Calculate the kernel density estimate of F (4) , using the uniform kernel with bandwidth 1.4. (A) 0.31

(B) 0.41


(C) 0.50

(D) 0.53

(E) 0.63




27.3. Based on a mortality study, all deaths occurred at times 10, 12, 16, 17, 19, 20, or after 20. The empirical distribution function is as follows: t Fn ( t )

10 0.20

12 0.40

16 0.50

17 0.80

19 0.86

20 0.95

Estimate f (15) by using kernel density methods with a uniform kernel of bandwidth 4. 27.4. You are using a kernel-smoothed estimate of the density function. There are 10 observations. You are using a uniform kernel with bandwidth 4. You calculate fˆ(5)  0.0125. You then find out that an observation of 10 should have been recorded as 8. Determine the corrected estimate of f (5) . 27.5.

You are given the following information about the empirical distribution function: 2 0.1

t Fn ( t )

5 0.2

7 x

9 0.7

All deaths between times 2 and 9 occurred at times 5 or 7.

You estimate f ( t ) using a uniform kernel with bandwidth 2.5 and calculate fˆ(6)  0.06. Determine Fn (7) . 27.6.

You are given the following data on time to death: Time tj

Number of Deaths sj

Number of Risks rj

2 4 7 8 11

1 1 1 2 3

100 99 98 97 95

The distribution function F underlying these data is estimated using kernel density methods. Using the uniform kernel with bandwidth 4, estimate F (7) . 27.7.

You are given the following data on time to death: Time yj

Number of Deaths sj

Number of Risks rj

10 25 48 53 82

1 1 2 1 1

30 26 22 20 16

You are to use the Kaplan-Meier estimator to estimate S ( t ) , and then to use kernel density methods to estimate f ( t ) . Kernel smoothing is to be done using the uniform kernel with bandwidth 14. Determine fˆ(39) .


Exercises continue on the next page . . .

27. KERNEL SMOOTHING

476

27.8. [4-F03:4] You study five lives to estimate the time from the onset of a disease to death. The times to death are: 2

3

3

3

7

Using a triangular kernel with bandwidth 2, estimate the density function at 2.5. (A) 8/40

(B) 12/40

(C) 14/40

(D) 16/40

(E) 17/40

27.9. Based on a mortality study, all deaths occurred at times 10, 12, 16, 17, 19, 20, or after 20. The empirical distribution function is as follows: t Fn ( t )

10 0.20

12 0.40

16 0.50

17 0.80

19 0.86

20 0.95

Estimate f (15) by using kernel density methods with a triangular kernel of bandwidth 5. 27.10. Losses on an insurance coverage are estimated using kernel density methods in conjunction with the following sample of eight losses: 15

18

36

40

45

60

77

100

Let X be the random variable for loss size. Using a triangular kernel with bandwidth 20, estimate the probability that a loss is greater than 40 and less than 50, or Pr (40 < X < 50) . Use the following information for questions 27.11 and 27.12: A study of time on disability is done on 10 lives. The amount of time in months each one is on disability is: 8

10

12

13

14

17

30

36

47

60

27.11. You will use a uniform kernel with bandwidth 10 to estimate S ( x ) and f ( x ) . Using these estimates, determine hˆ (20) . 27.12. You will use a triangular kernel with bandwidth 8 to estimate S ( x ) and f ( x ) . Using these estimates, determine hˆ (20) . 27.13. For a mortality study on 72 lives, you have the following information on death times: Time Number of Deaths

35 1

42 1

49 1

52 1

58 2

65 1

There were no deaths before time 35. You estimate S ( x ) using a triangular kernel with bandwidth 10. Determine Sˆ (50) .

C/4 Study Manual—17th edition Copyright ©2014 ASM


EXERCISES FOR LESSON 27

477

27.14. A mortality study is conducted on 10 people. One death apiece occurred at times 8, 12, 13, and 14. A fifth person died at some time x between 7 and 15. No other life died before time 15. A triangular kernel with bandwidth 4 was used to estimate the density function. Based on this estimate, fˆ(11)  0.06875. Determine all possible values for x. 27.15. You perform an agent persistency study. Times to termination for the 4 agents in the study are 1, 2, 4, 4. You use an exponential kernel which preserves the mean at each observation point to smooth the survival function. Determine fˆ(3) . 27.16. [C-S05:22] You are given the kernel:

p 2   1 − (x − y )2 , y − 1 ≤ x ≤ y + 1 π k y (x )    0, otherwise  You are also given the following random sample: 1

3

3

5

Determine which of the following graphs shows the shape of the kernel density estimator:

(A)

(B)

(C)

(D)

(E) 27.17. You are given: (i)

Three observations: 4

(ii)

5

7

The selected kernel, which does not have the same mean as the empirical estimate, has distribution function: 0, x < y−1      x − y +1 K y (x )   , y−1 ≤ x ≤ y+3   4    x > y+3  1,

Calculate the kernel-smoothed fˆ(5.5) .

C/4 Study Manual—17th edition Copyright ©2014 ASM


27. KERNEL SMOOTHING

478

27.18. For a group of 50 policyholders, you have the following information on time until surrender of policy: Time of Surrender

Number of Surrendered Policies

1 2 3 4 5

7 5 4 2 2

To estimate f ( x ) , you use the kernel

 0 x < y−2     !  2  3 * x−y + k y (x )   1− y−2 ≤ x ≤ y+2  8 2   ,    0 x > y+2  Determine fˆ(3) , the estimate of f (3) . 27.19. [Sample:300] You are given: (i)

Three observations:

(ii)

The selected kernel, which does not have the same mean as the empirical estimate, has distribution function: 0, x < y−1       x − y + 1 K y (x )   , y−1 ≤ x ≤ y+2   3    x > y+2  1,

2

5

8

Calculate the coefficient of variation of the kernel density estimator. (A) 0.47

(B) 0.50

(C) 0.52

(D) 0.57

(E) 0.58

27.20. Regarding kernel density methods, which of the following statements is false? (A) (B) (C) (D) (E)

If the expected value of the kernel distribution for each point equals the point, then the variance of the kernel smoothed distribution is at least as high than the variance of the raw distribution. For the gamma kernel, a higher shape parameter leads to less smoothing. Kernel smoothing results in a continuous density function. Kernel smoothing may result in assigning probabilities to negative values even when all the observations are positive. For a kernel with a bandwidth, a larger bandwidth results in more smoothing.

C/4 Study Manual—17th edition Copyright ©2014 ASM


EXERCISES FOR LESSON 27

479

27.21. You are given the following random sample: 10

25

37

52

72

100

The data are smoothed using a uniform kernel with bandwidth 20. Calculate the variance of the smoothed distribution. 27.22. You are given the following random sample: 6

8

8

15

22

The data are smoothed using a uniform kernel with bandwidth 5. Calculate the raw third moment of the smoothed distribution. 27.23. You are given the following random sample: 4

6

8

10

12

16

20

26

40

50

The data are smoothed using a uniform kernel with bandwidth 10. Let X be the smoothed random variable. Calculate E[X ∧ 20]. 27.24. You are given a sample x1 , . . . x 20 , with the following summary statistics: 20 X

20 X

x i  244

i1

i1

x 2i  3604

The underlying distribution is estimated using kernel density methods, using a uniform kernel with bandwidth 3. Calculate the variance of the kernel-smoothed distribution. 27.25. You are given the following random sample: 3

4

6

9

15

18

19

24

25

25

The data are smoothed using a triangular kernel with bandwidth 5. Calculate the mean of the smoothed distribution. 27.26. You are given the following random sample: 23

28

29

35

50

The data are smoothed using a triangular kernel with bandwidth 10. Calculate the variance of the smoothed distribution.

C/4 Study Manual—17th edition Copyright ©2014 ASM


27. KERNEL SMOOTHING

480

27.27. You are given the following random sample: 10

10

13

18

21

24

26

30

40

50

The data are smoothed using a triangular kernel with bandwidth 6. Let X be the smoothed random variable. Calculate E[X 2 ]. 27.28. You are given the following random sample: 15

20

25

25

The data are smoothed using a gamma kernel with α  5. Calculate the variance of the smoothed distribution. 27.29. You are given the following random sample: 2

3

5

6

10

The data are smoothed using kernel density methods. The kernel-smoothed distribution is a mixture of two-parameter Pareto distributions, with the kernel at each point having a Pareto distribution with α  3 and θ selected so that the mean of the kernel distribution equals the point. Let X be the kernel-smoothed random variable. Calculate Var ( X ) . Additional released exam questions: C-F05:9, C-F06:24, C-S07:16

Solutions 27.1. The kernel density is 1/ (2b )  1/10, and the five points 37, 39, 42, 42, and 45 are within 5 of 40 (since points on the boundaries count). There are 8 points, so each point gets weight 1/8. The estimate is fˆ(40)  5 (1/8)(1/10)  0.0625 . 27.2. Notice that even though a kernel density estimate is used (that’s the name of the method), we’re estimating the distribution function, F. The kernel distribution is 1 for 2, which is more than 1.4 below the estimation point 4. Since 3.3 is half way to the center of the kernel range, K 3.3 (4)  0.75. Since 4 is right in the center, K 4 (4)  0.5. Since 4.7 is half way past the center, K 4.7 (4)  0.25. We add up the kernels and divide by 8, the number of points: 1 + 2 (0.75) + 2 (0.5) + 3 (0.25) 4.25 Fˆ (4)    0.53125 8 8

(D)

27.3. We need to sum up the empirical probabilities of 12, 16, 17, and 19 multiplied by the k yi (15) for y i  12, 16, 17, and 19. For all other y, k y (15)  0. The empirical probabilities of those four points can be deduced from the increase in Fn ( y i ) at those points. Letting f n ( y i ) be the probability of y i , f n (12)  Fn (12) − Fn (10)  0.40 − 0.20  0.20

f n (16)  Fn (16) − Fn (12)  0.50 − 0.40  0.10




f n (17)  Fn (17)  Fn (16)  0.80 − 0.50  0.30 f n (19)  Fn (19) − Fn (17)  0.86 − 0.80  0.06

The kernel density function at all four points is 1/2b  1/8. Since the kernel density function is constant, we can multiply it by the sum of the f n ’s at the four points, which is Fn (19) − Fn (10)  0.66 (so we could skip the detailed calculations of f n ( y i ) ). Then fˆ(15)  81 (0.66)  0.0825 .

1 27.4. The effect on the estimate is to increase it by the probability weight of the observation ( 10 ) times 1 1 the kernel ( 8 ), or by 80  0.0125, so 0.0125 + 0.0125  0.025 . 27.5. The key observation is that for a uniform kernel, fˆ( t ) is the number of deaths within a bandwidth, divided by twice the bandwidth and the sample size. But the empirical distribution function Fn ( x ) is the number of deaths less than or equal to x divided by the sample size. Thus,

Fn ( t + b ) − Fn ( t − b − ) fˆ( t )  2b Note that Fn (3.5)  Fn (2)  0.1 and Fn (8.5)  Fn (7)  x, since the empirical function is constant between event times. Thus Fn (8.5) − Fn (3.5) x − 0.1 fˆ(6)  0.06   2 (2.5) 5 0.3  x − 0.1 x  0.4

27.6. Note that the data are complete at least through 11. K 2 (7)  1, K4 (7)  78 , K 7 (7)  12 , K 8 (7)  83 , K 11 (7)  0.  1  25 Fˆ (7)  1 + 78 + 12 + 38 (2)   0.03125 . 100 800 27.7. In this solution, we will use hats for the Kaplan-Meier estimate and tildes for the kernel-smoothed estimate. This exercise goes beyond the textbook in that you aren’t given complete data. Therefore, the weights on the points are not 1/n. Since we are using a uniform kernel, we will be multiplying (1/2b )  1/28 by the sum of the probabilities of all observation points within 14 of 39, so all we need to know is Fˆ (53) − Fˆ (25− ) , or Fˆ (53) − Fˆ (10) ; this follows the convention that the boundary points are included in the kernel. 29  0.9667 Sˆ (10)  30 ! ! ! ! 29 25 20 19 Sˆ (53)   0.8027 30 26 22 20 Sˆ (10) − Sˆ (53)  0.1639 (This is the same as Fˆ (53) − Fˆ (10) .) 1 fˆ(39)  (0.1639)  0.005855 28

27.8. The observation points at 2 and 3 are both 1/4 of the way down the triangle, since they’re half a unit away from the estimation point 2.5 whereas the bandwidth is 2. The height of the triangle is 21 (to   make the area 1; the base is 4), so the kernel density is 34 21  38 . The other observation points are more than 2 away from 2.5, so they contribute nothing. Since the sample has 5 observations, each observation has density 0.2. Putting everything together, we have 4 points times 38 kernel density times 0.2 empirical density, or 12/40 . (B)


27. KERNEL SMOOTHING

482

27.9. We can deduce f n ( y i ) for y i  12, 16, 17, and 19, the points within 5 of 15, as the increase in Fn ( y i ) at those points. (The empirical distribution is discrete, so f n ( y i ) is the probability of y i in the empirical distribution, not a probability density function.) Therefore: f n (12)  Fn (12) − Fn (10)  0.40 − 0.20  0.20

f n (16)  Fn (16) − Fn (12)  0.50 − 0.40  0.10

f n (17)  Fn (17) − Fn (16)  0.80 − 0.50  0.30

f n (19)  Fn (19) − Fn (17)  0.86 − 0.80  0.06

Then the kernel smoothed density function at 15 will be the sum of f n ( y i ) times the kernel density k yi (15) at these points. As a function of y, k y (15) is 0 for y  10, 1/b  1/5 for y  15, and 0 for y  20, with linear interpolation between 10 and 15 and between 15 and 20. Therefore k12 (15)  0.08 k16 (15)  0.16 k17 (15)  0.12 k19 (15)  0.04 The kernel-smoothed density is fˆ(15)  (0.2)(0.08) + (0.1)(0.16) + (0.3)(0.12) + (0.06)(0.04)  0.0704 27.10. We will need to calculate Fˆ (50) − Fˆ (40) . For Fˆ (40) , we have K15 (40)  K 18 (40)  1 and K x i (40)  0 for x i ≥ 60. Also, K 40 (40)  0.5, since the distribution function is 0.5 at the observation point. That leaves K 36 (40) and K 45 (40) . For K 36 (40) , 36 is 16 to the right of 20, or 8/10 of the way to 40, so K 36 (40)  1−0.5 (0.82 )  0.68. 45 is 15 to the right of 60, or 3/4 of the way from 60, so K 45 (40)  0.5 (0.752 )  0.28125. We conclude Fˆ (40)  (2 + 0.68 + 0.5 + 0.28125) /8  0.43265625. For Fˆ (50) , we have K 40 (50) + K 60 (50)  1 by symmetry. Since 36 is 6 to the right of 30, or 0.3 of the way to 50, K 36 (50)  1 − 0.5 (0.32 )  0.955. Since 45 is 15 to the right of 30, or 0.75 of the way to 50, K 45 (50)  1−0.5 (0.752 )  0.71875. We conclude Fˆ (50)  (2+1+0.955+0.71875) /8  0.58421875. Therefore, Pr (40 < X < 50)  0.58421875 − 0.43265625  0.1515625 . 6 27.11. There are 6 observations in the range 10–30, so fˆ(20)   0.03. K 8 (20)  K 10 (20)  1, K 12 (20) 

18 20 ,

K 13 (20) 

17 20 ,

K 14 (20) 

16 20 ,

K 17 (20) 

13 20 .

(10)(20)

104 1 1 Fˆ (20)  (20 + 20 + 18 + 17 + 16 + 13)   0.52 20 10 200 Sˆ (20)  1 − 0.52  0.48 0.03 hˆ (20)   0.0625 0.48 27.12.

k 13 (20) 

1 64 ,

k14 (20) 

2 64 ,

k17 (20) 

5 64 ,

so

1 1 fˆ(20)  (1 + 2 + 5)  0.0125. 10 64 The derivation of the non-trivial kernel distribution functions is in the following figure: C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 27

0.12 0.1 0.08 0.06 0.04 0.02 0

483

1 64

5

20 21

(a) K 13 (20) 0.12 0.1 0.08 0.06 0.04 0.02 0

0.12 0.1 0.08 0.06 0.04 0.02 0

2 64

6

20

(b) K14 (20)

22

5 64

9

20

(c) K17 (20)

K 8 (20)  K 10 (20)  K 12 (20)  1. K 13 (20) 

127 128 ,

K 14 (20) 

25

124 128 ,

K 17 (20) 

103 128 ,

so

 1 1  Fˆ (20)  3 (128) + 127 + 124 + 103  0.5765625 128 10 Sˆ (20)  1 − 0.5765625  0.4234375 0.0125 hˆ (20)   0.02952 0.4234375 27.13. The derivation of the non-trivial kernel distribution functions is in the following figure:

0.1 0.08 0.06 0.04 0.02 0

0.1 0.08 0.06 0.04 0.02 0

2 100

32

50

(a) K 42 (50)

52

8 100

42

50 (c) K52 (50)

K 35 (50)  1, K 42 (50) 

196 200 ,

62

K 49 (50) 

119 200 ,

0.1 0.08 0.06 0.04 0.02 0

0.1 0.08 0.06 0.04 0.02 0

K 52 (50) 

9 100

39

50 (b) K49 (50)

2 100

48

64 200 ,

50

K 58 (50) 

(d) K 58 (50) 4 200 .

 1 1  Fˆ (50)  200 + 196 + 119 + 64 + 4 (2)  0.04076 72 200 Sˆ (50)  1 − 0.04076  0.95924 C/4 Study Manual—17th edition Copyright ©2014 ASM

59

68

27. KERNEL SMOOTHING

484

3 2 1 16 , 16 , 16 ,

27.14. The weights on 8, 12, 13, and 14 are 1 weight 10 . We thus have 1 1 10 16 (1

and

1 16

respectively. Every point has probability

+ 3 + 2 + 1 + y )  0.06875 7 + y  11 y  4,

where y is 16 times the kernel density assigned to the unknown point x. So the kernel density was 14 . But this is the maximum density for this kernel, and is only assigned at the center of the triangle, 11. We therefore conclude that x can only be 11 . 27.15. We have: k 1 (3)  e −3  0.04979 1 k 2 (3)  e −3/2  0.11157 2 1 k 4 (3)  e −3/4  0.11809 4 and since each point has probability weight fˆ(3) 

1 4



1 4

0.04979 + 0.11157 + 2 (0.11809)  0.0994



27.16. The kernel density function is a semi-circle with radius 1, scaled to have area 1. Thus from 0 to 2, only the observation 1 in involved; from 2 to 4, only the observations at 3; from 4 to 6, only the observation at 5. Thus it will look like (D), with a dome twice as high in the middle because there are 2 observations at 3. 27.17. K y ( x ) is a linear function of x, so the kernel is uniform on the interval [y − 1, y + 3]. For the point x  5.5, k y ( x )  1/4 for y  4 and 5 and 0 for y  7. Divide the sum of the k y s by 3. The result is fˆ(5.5)  1/3 (1/4 + 1/4)  1/6 . 27.18. Notice that k y ( x ) is 0 if | y − x| ≥ 2, so we only need to consider observation points y  2, 3, 4. The empirical densities and kernel densities for those points are 4 50 3 k 3 (3)  8

5 50 9 k 2 (3)  32

f50 (2) 

2 50 9 k 4 (3)  32

f50 (3) 

f50 (4) 

The kernel-smoothed density is fˆ(3) 

X y

k y ( x ) f50 ( x ) 

9 32

!

5 3 + 50 8

!

!

4 9 + 50 32

!

!

2  0.069375 50

!

This kernel is the Epanechnikov kernel from the pre-Spring 2003 syllabus, with bandwidth 2. 27.19. The kernel distribution function K y ( x ) is a linear function of x, so it is uniform. However, it begins at y − 1 and ends at y + 2, so its mean is y + 21 instead of the more usual y. Thus the mean of each kernel is 1/2 higher than the observation. Each kernel is a uniform distribution over an interval of length 3, so the variance of each kernel is 32 /12  0.75.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 27

485

The mean of the kernel smoothed distribution Y in terms of the empirical distribution X, by double expectation, is f g E E[Y | x i ]  E[X + 21 ]  12 + E[X]  5.5

where x i is each observation. The variance of Y is calculated using the conditional variance formula: Var ( Y )  E[Var ( Y | x i ) ] + Var (E[Y | x i ])

As we mentioned above, the variance of each kernel is 0.75, so Var ( Y | x i )  0.75. Var ( Y )  E[0.75] + Var ( X + 0.5)

But Var ( X + 0.5)  Var ( X ) 

(2 − 5) 2 + (8 − 5) 2 3

6

So Var ( Y )  0.75 + 6  6.75, and the coefficient of variation is 6.75/5.52  0.4724 . (A) For this particular setup, there is a shortcut. The kernel is uniform starting 1 to the left of each observation and ending 2 to the right, but the observations are 3 apart, so the kernel smoothed distribution is uniform from 1 to 10. The coefficient of variation of a uniform distribution on [1, 10] is the square root of the variance, 92 /12, divided by the mean which is the midpoint, 5.5. That works out to 0.4724. 27.20. For statement (A), the conditional variance formula says that the variance of the kernel smoothed distribution equals the variance of the expected values of the kernels, which is the variance of the raw distribution, plus the expected value of the variances of the kernels. Since the expected value of the variances must be nonnegative, (A) is true. Most of the other statements can be found in the lesson. The exception is (C). For example, if you use a uniform kernel with a bandwidth smaller than half the distance between two observations, the kernel smoothed function will abruptly fall to zero in between the two points, and thus is not continuous. 27.21. The empirical variance of the observations is

p

10 + 25 + 37 + 52 + 72 + 100  49 13 6 P 2 xi 102 + 252 + 372 + 522 + 722 + 1002   3330 13 6 6 x¯ 

2

σˆ 2  3330 31 − 49 13  896 59

By equation (27.1), for X the smoothed random variable Var ( X )  896 59 +

202  1029 98 3

27.22. We will use double If X is the kernel-smoothed random variable, then its third raw f expectation. g moment is EX [X 3 ]  Ex i EX [X 3 | x i ] , where x i are the observations. The third raw moment of a uniform distribution centered at x i and extending b in each direction is 1 2b

Z

x i +b x i −b

x 3 dx 

(xi + b )4 − (xi − b )4 8b

This formula is good enough for our purposes, but it can be rearranged, since the even powers of x i cancel in the numerator, as 3 3 ( x i + b ) 4 − ( x i − b ) 4 2 (4bx i + 4b x i )   x 3i + b 2 x i 8b 8b C/4 Study Manual—17th edition Copyright ©2014 ASM

27. KERNEL SMOOTHING

486

So the raw third moment of the smoothed distribution is the raw third moment of the original distribution plus b 2 times the mean of the original distribution. In our case, b  5, and letting Y be the original random variable Y¯  11.8

P

x 3i

5



63 + 2 (83 ) + 153 + 223  3052.6 5

and the third raw moment of X is 3052.6 + 25 (11.8)  3347.6 . 27.23. We will use double expectation: EX [X ∧ 20]  Ex i EX [X ∧ 20 | x i ] , where x i are the observations. For x i ≤ 10, the expected value of X ∧ 20 | x i is x i under the uniform distribution centered at x i , while for x i ≥ 30, X ∧ 20 | x i  20 so its expected value is 20. For the x i ’s in between, the easiest way to calculate the expected value is to split the kernel’s range into two pieces: the piece below 20 and the piece above 20. For the piece below 20, which has probability equal to the proportionate part of the range below 20, the mean is the midpoint, while for the piece above 20, the mean is 20. For example, for x i  12, the uniform distribution given x i is uniform on [2, 22]. If X is a uniform random variable on [2, 22], the mean is computed as the probability of X ≤ 20, or 0.9, times the midpoint of [2, 20], or 11, plus the probability of X > 20, or 0.1, times 20, so

f

g

E[X ∧ 20 | x i  12]  0.9 (11) + 0.1 (20)  11.9 Similarly for the other three points between 10 and 30: E[X ∧ 20 | x i  16]  0.7 (13) + 0.3 (20)  15.1

E[X ∧ 20 | x i  20]  0.5 (15) + 0.5 (20)  17.5

E[X ∧ 20 | x i  26]  0.2 (18) + 0.8 (20)  19.6

Averaging E[X ∧ 20 | x i ] over the ten x i ’s, E[X ∧ 20] 

4 + 6 + 8 + 10 + 11.9 + 15.1 + 17.5 + 19.6 + 2 (20)  13.21 10

27.24. The empirical variance is 244  12.2 20 3604 σˆ 2  − 12.22  31.36 20 x¯ 

From formula (27.1), the variance of the kernel-smoothed distribution is 31.36 + 32 /3  34.36 . 27.25. As we discussed, the mean of the smoothed distribution is the same as the original mean: E[X] 

3 + 4 + 6 + 9 + 15 + 18 + 19 + 24 + 25 + 25  14.8 10

This kernel would put some weight on x’s below 0.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 27

487

27.26. Let Y be the original distribution and X the kernel-smoothed distribution. The mean of X is the mean of Y, which is 23 + 28 + 29 + 35 + 50 E[Y]   33 5 The second moment of the original distribution Y is E[Y 2 ] 

232 + 282 + 292 + 352 + 502  1175.8 5

So the variance of the original distribution is Var ( Y )  1175.8 − 332  86.8. As we discussed at the end of the lesson (formula (27.2)), the variance of a triangular kernel-smoothed distribution is the original variance plus b 2 /6, so Var ( X )  86.8 + 100/6  103.46667 . 27.27. This can be done directly with double expectation, but since we have formulas for the mean and variance, we’ll calculate the second moment as the mean squared plus the variance. Let Y be the original random variable. Then, using formula (27.2) for the variance, E[X]  E[Y] 62  Var ( Y ) + 6 6 E[X 2 ]  Var ( X ) + E[X]2  Var ( Y ) + E[Y]2 + 6  E[Y 2 ] + 6

Var ( X )  Var ( Y ) +

The second moment of Y is

P

x 2i /10  738.6, so E[X 2 ]  738.6 + 6  744.6 .

27.28. The mean of the original data Y and therefore of the smoothed distribution X is (15 + 20 + 25 + . 2 2 2 25) /4  21.25. The second moment of the original data is 15 + 20 + 2 (25 ) 4  468.75. The variance of the original data is 468.75 − 21.252  17.1875. The kernels are gamma distributions with parameters α  5 and θ  3 for 15, 4 for 20, and 5 for 25. The variance of each distribution is αθ 2 , or 45 for 15, 80 for 20, and 125 for 25. The expected value of the variances is f g 45 + 80 + 2 (125) E Var ( X | Y )   93.75 4 So the variance of X is 17.1875 + 93.75  110.9375 . 27.29. Let Y be the original variable. By conditional variance, Var ( X ) is Var ( Y ) plus the expected value of the variances of X | x i . The variance of X | x i , using the formula for first and second moments of a Pareto, is !2 !2 2θ 2 2θ 2 3θ 2 θ θ Var ( X | x i )  −  −  ( α − 1)( α − 2) α−1 2 2 4 θ is selected so that θ/ ( α − 1)  x i , so θ  2x i and the formula becomes Var ( X | x i )  3x 2i

We need to calculate 3 times the second moment of x i :

22 + 32 + 52 + 62 + 102  34.8 5 The variance of the original distribution is 2 + 3 + 5 + 6 + 10 Y¯   5.2 5 Var ( Y )  34.8 − 5.22  7.76

So Var ( X )  7.76 + 3 (34.8)  112.16 . C/4 Study Manual—17th edition Copyright ©2014 ASM

27. KERNEL SMOOTHING

488

Quiz Solutions 27-1. The kernel distribution of 86 is 1 for x i ≤ 82, 0 for x i ≥ 90, and a straight line in between 82 and 90, so it is 7/8 at 83, 5/8 at 85, 4/8 at 86, 3/8 at 87, and 1/8 at 89. Summing up over the 100 observations: 68 (1) + 7/8 + 5/8 + 4/8 + 3/8 + 1/8 Fˆ (86)   0.705 100 27-2. The kernel density is 0 for x i ≤ 82 and x i ≥ 90. It rises to 1/b  1/4 at 86 and down to 0 at 90 linearly, so k83 (86)  k89 (86)  1/16, k 85 (86)  k87 (86)  3/16, and k86 (86)  1/4. Summing up and dividing by the number of observations (100): 2/16 + 6/16 + 1/4 fˆ(86)   0.0075 100 27-3.

The empirical variance is 35,212 780 − 20 20

!2

 239.6

The variance of the kernel-smoothed distribution, by the formula we developed, is b 2 /3 higher, or 239.6 + 102 /3  272.93 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 28

Mortality Table Construction Reading: Loss Models Fourth Edition 12.4 While the techniques in this lesson can be used to estimate any random variable, they are usually used for mortality table construction, or for modeling other random variables that are a function of time. Mortality tables are constructed by specifying the probability that a person age x will die in the next year, denoted by q x , for every integral age x. These tables do not specify probabilities of surviving fractional years, or survival functions for a person at a fractional age. They are built off large data sets with almost all data censored and truncated. Parametric functions are usually inadequate for fitting mortality data. We will consider two categories of techniques for estimating q x . The first category uses individual data. The second category uses data grouped into intervals. In the methods we discuss in this lesson, the mortality rate q x or the hazard rate is estimated by dividing number of deaths by number of exposures. The definition of exposure depends on the method. If you took Exam MLC or Exam LC, you have encountered the following notation: nqx

is the probability of death within n years for a person age x.

npx

is the probability of survival for n years for a person age x.

You are not responsible for this notation for Exam C/4, but we will occasionally use it since it is convenient. But usually, consistent with the textbook, we will use the (ambiguous) notation q j and p j for the probability of death and survival (respectively) during period j, where the period may be several years long. So when we use the subscript j on q or p, the period may be more than one year and j is not necessarily an age, whereas when we use the subscript x and there is no left subscript, it will indicate a one-year period starting at age x.

28.1

Individual data based methods

In the methods of this section, exposure is measured policy by policy. We discuss two methods for estimating q x . The two methods differ in the definition of exposure and in what is estimated (mortality rate or hazard rate). The first method uses exact exposure. The exact exposure for each individual is the amount of time from entry into the study until leaving the study. Entry into the study may be at the start of the study or may be late. Leaving the study may be by death, withdrawal, or termination of the study. Exposure is tabulated by age. Here’s an example. Example 28A A 3-year mortality study uses records starting on Jan. 1, 2009 and ending on Dec. 31, 2011. For simplicity, assume all events including births, policy issue, withdrawals, and deaths occur on the first day of the month and that a month is 1/12 of a year. The ending day of the study will be treated as if it is 1/1/2012. The following data is recorded for 6 individuals in the study: C/4 Study Manual—17th edition Copyright ©2014 ASM

489

28. MORTALITY TABLE CONSTRUCTION

490

Number

Birth Date

Policy Issue

Date of Withdrawal

Date of Death

1 2 3 4 5 6

3/1944 10/1944 12/1944 1/1945 4/1945 7/1945

5/2004 6/2010 2/2009 12/2008 1/2010 11/2010

5/2011 — — 4/2010 — —

— — 4/2011 — 8/2011 —

Calculate the total exact exposure for every age that has any exposure. Answer: The first person enters the study when it starts on 1/2009 and is age 64 10 12 at that time. He leaves 2 at age 67 12 . Therefore there are 2 months of exposure for age 64, 12 months apiece for ages 65 and 66, and 2 months for age 67. 8 and leaves at the end of the study on The second person enters the study on 6/2010 at age 65 12 3 12/31/2011 at age 67 12 . Therefore there are 4 months of exposure for age 65, 12 months for age 66, and 3 months for age 67. 2 4 The third person enters the study at age 64 12 and leaves at age 66 12 . There are 10 months of exposure at age 64, 12 months at age 65, and 4 months at age 66. 3 The fourth person enters the study when it starts on 1/2009 at age 64 and leaves at age 65 12 . There are 12 months of exposure for age 64 and 3 months for age 65. 9 4 The fifth person enters the study at age 64 12 and leaves at age 66 12 . There are 3 months of exposure for age 64, 12 for age 65, and 4 for age 66. 4 6 The sixth person enters the study at age 65 12 and stays to the end at 12/31/2011 and is age 66 12 at that time. There are 8 months of exposure at age 65 and 6 months at age 66. Total exposure in months is: Number

Age 64

Age 65

Age 66

Age 67

1 2 3 4 5 6

2 0 10 12 3 0

12 4 12 3 12 8

12 12 4 0 4 6

2 3 0 0 0 0

Total

27

51

38

5

We assume the hazard rate is constant for each age. If e j is exact exposure for age j and d j is the number of deaths, the estimate of the hazard rate at that age is h j  d j /e j . As we will learn in Subsection 33.1.1, this is the maximum likelihood estimate of the hazard rate. If we are estimating a mortality rate for one year of age, then the integral of the hazard rate over the year is H j  d j /e j , and q j  1 − e −d j /e j . More generally, to estimate a mortality rate over n years of age, for example for ages 65–69 (that would be n  5 years of age), if we assume the hazard rate is constant over the entire period, H j  nd j /e j and q j  1 − e −nd j /e j

(28.1)

where d j and e j are the number of deaths and the exposure over the entire period. Example 28B In the previous example, calculate the mortality rates using the exact exposure method. Answer: There are two deaths at age 66 (#3,#5) and no other deaths, so qˆ 64  qˆ 65  qˆ 67  0 and qˆ66   1 − e −2/(38/12)  0.468248 . C/4 Study Manual—17th edition Copyright ©2014 ASM

28.1. INDIVIDUAL DATA BASED METHODS

491

The second method uses actuarial exposure. Actuarial exposure is the same as exact exposure except for those who die. For those who die, exposure is counted until the end of the year of death, even if that is after the termination of the study. Then setting e j equal to actuarial exposure, the estimate of the one-year mortality rate is qˆ j  d j /e j . If we wanted to estimate a mortality rate over n years, the estimate would be qˆ j 

nd j

(28.2)

ej

where d j and e j are the number of deaths and the exposure rate over the entire period. The actuarial estimator is an inconsistent estimator, but it is not much different from the exact exposure estimator for a large data set with a low mortality rate. On the other hand, it replicates the empirical estimator if data is complete. (The exact exposure method does not replicate the empirical estimator.) Example 28C In Example 28A, calculate actuarial exposure and the estimated mortality rates. Answer: Actuarial exposure is the same as exact exposure except for #3 and #5. For #3, entry is at age 2 and leaving is on 12/2011 at age 67, so there are 8 additional months of exposure at age 66. For #5, 64 12 leaving is on 4/2012, so there are 8 additional months of exposure at age 66. Thus there are 38 + 16  54 months of exposure at age 66 and qˆ 66  2/ (54/12)  0.444444 . Mortality at the other ages is still 0. 

?

Quiz 28-1 For a mortality study from 1/1/2011 to 12/31/2012, there are 3 individuals born on 2/1/1981. The first purchases a policy on 2/1/2010 and neither dies nor withdraws. The second purchases a policy on 2/1/2011 and withdraws on 8/1/2011. The third purchases a policy on 7/1/2011 and dies on 1/1/2012. Calculate the total actuarial exposure for q 30 of these three individuals. Notice that actuarial exposure for those who die is not necessarily an integer, because entry may occur at a fractional age. However, in practice, insureds are assigned an integral insuring age. When a person buys a policy, his birthday is set equal to the date the policy is issued. The age assigned to the person may be the age last birthday or the age nearest birthday. Since this age controls the premiums and policy values, mortality is estimated based on the insuring age rather than the real age. Example 28D Redo Example 28A using age last birthday as the insuring age. For convenience, the data are repeated here: Number

Birth Date

Policy Issue

Date of Withdrawal

Date of Death

1 2 3 4 5 6

3/1944 10/1944 12/1944 1/1945 4/1945 7/1945

5/2004 6/2010 2/2009 12/2008 1/2010 11/2010

5/2011 — — 4/2010 — —

— — 4/2011 — 8/2011 —

Compute mortality rates using the exact exposure method and the actuarial exposure method. Answer: First we use exact exposure. In each case, the birth date is advanced to the next policy anniversary. The first person gets a birthday of 5/1944. The second person gets a birthday of 6/1945. The third person gets a birthday of 2/1945. The fourth person gets a birthday of 12/1945. The fifth person gets a birthday of 1/1946. The sixth person gets a birthday of 11/1945. The resulting exposure in months is: C/4 Study Manual—17th edition Copyright ©2014 ASM

28. MORTALITY TABLE CONSTRUCTION

492

Number

Age 63

Age 64

Age 65

Age 66

1 2 3 4 5 6

0 0 0 11 0 0

4 0 12 4 12 0

12 12 12 0 7 12

12 7 2 0 0 2

Total

11

32

55

23

#3 dies at age 66 but #5 dies at age 65 based on the insuring age. So the mortality estimates are qˆ 63  qˆ 64  0, qˆ 65  1 − e −1/(55/12)  0.196021 , qˆ66  1 − e −1/(23/12)  0.406513 . Now let’s use actuarial exposure. For #3, actuarial exposure terminates at the next birthday in February, or at 2/2012, so actuarial exposure adds 10 months of exposure at age 66 for #3. For #5, actuarial exposure terminates at the next birthday in January, or 1/2012, so actuarial exposure adds 5 months of exposure at  age 65 for #5. Thus qˆ65  1/ (60/12)  0.2 and qˆ66  1/ (33/12)  0.363636 . Examples like the previous one can be confusing. If you’re trying to calculate exposure for just one age, it may be helpful to create a table specifying exactly when the exposure for that age starts and when it ends. For example, if you wanted the actuarial exposure at age 64 in the previous example, your table would look like this: Number

Insuring Birth Date

Age 64 Exposure Begins

Age 64 Exposure Ends

Months of Exposure

1 2 3 4 5 6

5/1944 6/1945 2/1945 12/1945 1/1946 11/1945

1/2009 6/2010 2/2009 12/2009 1/2010 11/2010

5/2009 6/2010 2/2010 4/2010 1/2011 11/2010

4 0 12 4 12 0

and for age 65, it would look like this: Number

Insuring Birth Date

Age 65 Exposure Begins

Age 65 Exposure Ends

Months of Exposure

1 2 3 4 5 6

5/1944 6/1945 2/1945 12/1945 1/1946 11/1945

5/2009 6/2010 2/2010 — 1/2011 11/2010

5/2010 6/2011 2/2011 — 1/2012 11/2011

12 12 12 0 12 12

The above study is calendar based, or date-to-date. To avoid fractional years of exposure due to the study ending, some studies are anniversary based, or anniversary-to-anniversary. This means that the study starts and ends on the policy anniversary. This method, however, throws away data before the anniversary in the first calendar year and after the anniversary in the last calendar year. Example 28E Redo Example 28A using age nearest birthday as the insuring age as an anniversary-toanniversary study, starting with anniversaries in 2009 and ending at anniversaries in 2011. Calculate exposure using the actuarial exposure method. C/4 Study Manual—17th edition Copyright ©2014 ASM

28.1. INDIVIDUAL DATA BASED METHODS

493

Answer: The first person gets a birthday of 5/1944. The second person gets a birthday of 6/1944, since that is nearer to the actual birthday than 6/1945. The third person gets a birthday of 2/1945. The fourth person gets a birthday of 12/1944. The fifth person gets a birthday of 1/1945. The sixth person gets a birthday of 11/1945. For the first person, exposure begins on 5/2009 and ends on 5/2011, for 12 months apiece at age 65 and 66. For the second person, exposure begins on 6/2010 and ends on 6/2011, for 12 months at age 66. For the third person, exposure begins on 2/2009 and ends on 2/2011. Therefore, the death is ignored. There are 12 months apiece of exposure at ages 64 and 65. For the fourth person, exposure begins on 12/2009 and ends on 4/2010. Therefore, there are 4 months of exposure at age 65. For the fifth person, exposure begins on 1/2010 and ends on 1/2011. Once again the death is ignored, and there is 12 months of exposure at age 65. For the sixth person, exposure begins on 11/2010 and ends at 11/2011, for 12 months of exposure at age 65. Exposure is summed in the following table:

28.1.1

Number

Age 64

Age 65

Age 66

1 2 3 4 5 6

0 0 12 0 0 0

12 0 12 4 12 12

12 12 0 0 0 0

Total

12

52

24

Variance of estimators

The estimate of the variance of the exact exposure estimator for mortality over n years is

L ( qˆ j )  (1 − qˆ j ) 2 n 2 Var

dj e 2j

(28.3)

We will derive this estimate using the delta method in Section 34.4. To remember it, it’s like a combination of the Greenwood approximation (the first factor, S ( x ) 2 ) with the variance of the Nelson-Åalen estimator (the last factor, d j /e 2j ). To estimate the variance of the actuarial estimator for mortality over n years, treat the estimate as the empirical estimate with complete data. The number of members of the population is set equal to exposure divided by the length of the period, e j /n, since each member would contribute n to exposure if they were present for the entire n-year period. Then use a version of equation (23.1). The resulting formula is

L ( qˆ j )  Var Since qˆ j  nd j /e j , this can also be written as

L ( qˆ j )  the most common case), then Var C/4 Study Manual—17th edition Copyright ©2014 ASM

qˆ j (1 − qˆ j ) e j /n

d j ( e j /n − d j )

( e j /n ) 3 d j (e j − d j ) e 3j

.

(28.4)

. In particular, if n  1 (one-year mortality rate,

28. MORTALITY TABLE CONSTRUCTION

494

Example 28F In Example 28A, estimate the standard deviation of the estimates of qˆ 66 using exact exposure and using actuarial exposure. Answer: With exact exposure, e66 

38 12

and qˆ 66  0.468248. The standard deviation of qˆ 66 is

r (1 − 0.468248) 2

2  0.23748 (38/12) 2

With actuarial exposure, e66  54/12, as derived in Example 28C, and in that exercise we obtain qˆ 66  4/9, so the standard deviation of qˆ 66 is

r

(4/9)(5/9) 54/12

28.2

 0.23424



Interval-based methods

In an interval based method, data is summarized by age or age group. Assume data are grouped into intervals, each one starting at c j and ending at c j+1 , where c 0 < c 1 < · · · < c k . The following notation is taken from the textbook. On past exams, they have occasionally used textbook notation from this topic with no explanation, expecting you to know what it means. Note that previous editions of the textbook used different notation. P j is the population at time c j . This is the number of members of the population after all deaths and withdrawals from the previous interval and after any new entries at the beginning of the current interval. n j is the number of new entrants in the interval [c j , c j+1 ) . n stands for new! Initially, the textbook uses this notation only for new entrants during the interval, but not at the beginning of the interval. Later on, the textbook introduces the notation n bj , the number of entrants at the beginning of the interval , the number of entrants at any time other than the very beginning of the at exact time c j , and n m j interval (m for middle). We include the left endpoint of the interval [c j , c j+1 ) but not the right endpoint. If someone enters the study at exact age 45, for example, he’s included when calculating the mortality rate for 45-year olds. This is consistent with the rules for ties mentioned in the paragraph after equation (24.1). w j is the number of withdrawals in the interval ( c j , c j+1 ]. Initially, the textbook uses this notation only for withdrawals during the interval, but not at the end of the interval. Later on, the textbook introduces the notation w ej , the number of withdrawals at the end of the interval at exact time c j+1 , and w m , the j number of withdrawals not at the very end of the interval. We include the right endpoint of the interval ( c j , c j+1 ], but not the left one. If someone leaves the study at exact age 46, for example, he’s included when calculating the mortality rate for 45-year olds and not considered as having withdrawn at age 45; withdrawals occurring at the same time as a death count in the risk set for those deaths. This is consistent with the rules for ties mentioned in the paragraph after equation (24.1). d j is the number of deaths in the interval ( c j , c j+1 ]. Do not confuse this with d j defined on page 396 for individual data. There you are given d j for each individual and it means entry time. Here in interval-based estimation, d j is the count of number of deaths in an interval. C/4 Study Manual—17th edition Copyright ©2014 ASM

28.2. INTERVAL-BASED METHODS

495

The populations are computed recursively starting with P0  n0b . Then m e b P j  P j−1 + n m j−1 − d j−1 − w j−1 − w j−1 + n j

(28.5)

Using interval data, we can estimate exposure by assuming that entries and withdrawals during the interval (not at the beginning or end of the interval) contribute 1/2 of a year (or whatever the interval length is) of exposure. Thus the exact exposure method would use an exposure of m e j  P j + 0.5 ( n m j − wj − dj)

(28.6)

in its calculation of qˆ j  1 − e −d j /e j . The actuarial exposure method would use m e j  P j + 0.5 ( n m j − wj )

(28.7)

when calculating qˆ j  d j /e j . Another method is Kaplan-Meier style. Assuming entries and withdrawals during the interval occur in the middle of the interval and deaths are uniform, entries and withdrawals contribute half a period ) . This results in the actuarial estimator of the previous − wm to the risk set, so the risk set is P j + 0.5 ( n m j j paragraph. Notice that unlike in individual data methods, when we talk about exposure in grouped data methods, it refers to the number of individuals in the study, rather than the number of individuals times the size of the interval. Thus qˆ j is the estimated proportion of deaths throughout the interval, not the annual mortality rate. If the interval length is 5 years, for example, qˆ j is the estimated probability of dying in five years. So “exposure” and “risk set” are one and the same in this context. Example 28G (Data from Example 28A) A 3-year mortality study reviews the company’s records starting on Jan. 1, 2009 and ending on Dec. 31, 2011. For simplicity, assume all events occur on the first of each month and that a month is 1/12 of a year; however, the ending day of the study will be treated as if it is 1/1/2012. The following data is recorded for 6 individuals in the study: Number

Birth Date

Policy Issue

Date of Withdrawal

Date of Death

1 2 3 4 5 6

3/1944 10/1944 12/1944 1/1945 4/1945 7/1945

5/2004 6/2010 2/2009 12/2008 1/2010 11/2010

5/2011 — — 4/2010 — —

— — 4/2011 — 8/2011 —

Insuring age is age last birthday. Data are grouped by age. Calculate the exposure both on an exact basis and on an actuarial basis. Then calculate the mortality rates on an actuarial basis. Answer: Let’s calculate each individual’s entry age and withdrawal/death age.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Number

Entry age

Withdrawal age

Death age

1 2 3 4 5 6

8 64 12 65 64 1 63 12 64 65

67 7 66 12

— — 2 66 12 — 7 65 12 —

4 64 12 — 2 66 12

28. MORTALITY TABLE CONSTRUCTION

496

m b We’ll use ages as subscripts of our symbols. Then looking down the entry age column, n63  1, n64  2, m b m  1, n65  2, and all other n j ’s are 0. Looking down the withdrawal age column, w 64  1, w 66  2, e . Looking down the death age  1, Notice that #1 leaves at exact age 67 and so is counted as w 66 column, d65  1 and d66  1. The populations are m n64 e w 66

P63  0 m b P64  P63 + n63 + n64 0+1+23 m b m P65  P64 + n64 + n65 − w 64 3+1+2−15 P66  P65 − d65  5 − 1  4

m e P67  P66 − d66 − w 66 − w 66 4−1−2−10

Exact exposure is m e63  P63 + 0.5n63  0 + 0.5  0.5

m m e64  P64 + 0.5 ( n64 − w 64 )  3 + 0.5 (1 − 1)  3

e65  P65 − 0.5 ( d65 )  5 − 0.5  4.5

m e66  P66 − 0.5 ( w 66 + d66 )  4 − 0.5 (2 + 1)  2.5

Actuarial exposure doesn’t subtract 0.5d j at j  65 and 66, so e65  5 and e66  3. The resulting mortality  rates are qˆ 63  qˆ 64  0, qˆ 65  1/5  0.2 , qˆ66  1/3  0.333333 . The next example is one of the few examples in this lesson not involving mortality tables. Example 28H You offer two types of automobile comprehensive policies: one with no deductible and a policy limit of 5000, and another with a deductible of 500 and a maximum covered loss of 10,000. You have the loss experience shown in the following table. Loss Range

(0, 500] (500, 1000] (1000, 2000] (2000, 5000] (5000, 7500] (7500, 10000] At limit Total

Number of Losses in Range 0 deductible, 5000 limit 500 deductible, 10,000 limit 20 18 16 19 27 100

32 24 21 18 10 5 110

The ground-up loss distribution for both types of policy is assumed to be the same. Estimate the probability that a loss will be no more than 7500 using the actuarial method. Answer: Since the deductibles and limits are at interval endpoints, the only reasonable assumption is that all of entries and withdrawals occur at integer endpoints. Exposure in each interval (or the risk set) equals the population at the beginning of each interval. The policies with 500 deductible enter the study at 500, so we have the following table: C/4 Study Manual—17th edition Copyright ©2014 ASM

28.2. INTERVAL-BASED METHODS

497

j

cj

Pj

n bj

w ej

dj

ej

Sˆ ( c j+1 )

0 1 2 3 4 5

0 500 1000 2000 5000 7500

100 190 140 100 33

100 110 0 0 0

0 0 0 27 0

20 50 40 40 18

100 190 140 100 33

0.8 0.8 (14/19) 0.8 (10/19) 0.8 (6/19) 0.8 (6/19)(15/33)

As hinted in the last column of the above table, the intervals (500, 1000], (1000, 2000], and (2000, 5000] may be combined into a single interval with risk set 190 and events 50 + 40 + 40  130, since there is no truncation or censoring between 500 and 5000. The probability of survival to 7500 is 0.8 (6/19)(15/33)  0.114833, so the probability that a loss is no more than 7500 is 1 − 0.114833  0.885167 .  Example 28I You perform a mortality study on 950 lives age 45. Additional lives enter the study at ages 45 and later. You are given the following information from the mortality study:

j

Age cj

Number of lives entering at this age nj

0 1 2 3 4

45 46 47 48 49

50 30 40 30 10

Number of lives withdrawing before next age wj

Number of deaths dj

100 90 80 70 60

3 5 6 4 8

All entries, other than the original 950 lives, and all withdrawals are assumed to occur uniformly throughout each year. Estimate the probability of survival for an individual age 45 to each of ages 46, 47, 48, 49, 50 using the actuarial method. − d j . The number of terminations − wm Answer: Calculate the population recursively using P j+1  P j + n m j j e at the end of the study, w 5 , is the number of lives remaining after death and withdrawal, or 684, and w i  0 for i , 5. Calculate the exposure as e j  P j +0.5 ( n m − wm ) . Note that the original 950 lives are part of the starting j j population and count fully, unlike the late entrants (including the 50 entrants at age 45), which count as 0.5 lives in the exposure. Similarly, the 684 lives surviving at the end count fully, unlike withdrawals, which count as 0.5 lives apiece. j

cj

Pj

nm j

wm j

w ej

dj

ej

pˆ 0j

0 j+1 pˆ 45

0 1 2 3 4 5

45 46 47 48 49 50

950 897 832 786 742 0

50 30 40 30 10

100 90 80 70 60

0 0 0 0 684

3 5 6 4 8

925 867 812 766 717

922/925 862/867 806/812 762/766 709/717

922/925  0.996757 0.996757 (862/867)  0.991008 0.991008 (806/812)  0.983686 0.983686 (762/766)  0.978549 0.978549 (709/717)  0.967631

C/4 Study Manual—17th edition Copyright ©2014 ASM

28. MORTALITY TABLE CONSTRUCTION

498

Comment In all of the above examples involving mortality studies, we assumed data by age, but the methods we have discussed apply equally well to data arranged into larger intervals, such as quinquennial ages.

?

Quiz 28-2 You are given the following information regarding five individuals in a study: dj

uj

xj

0 0 1 2 4

2 — 6 — 6

— 3 — 7 —

The data are grouped into intervals of 5 time units. Entries and withdrawals are assumed to be uniformly distributed within each group. Calculate the probability of death in the period (0, 5] using this approximation and the actuarial estimator.

Multiple Decrement The estimators using the Kaplan-Meier method are all single-decrement rate estimators. You may multiply survival probabilities for several decrements to obtain an estimate of the probability of survival from all decrements combined. Example 28J In a study of 100 life insurance policies, you are given that there are 2 deaths and 10 withdrawals in the first year, and 3 deaths and 10 withdrawals in the second year. Use the actuarial estimator to estimate the death and withdrawal rates, assuming that deaths and withdrawals are uniformly distributed throughout each year. Then determine the probability that a policy on a newly issued life will not terminate by death or withdrawal within two years. Answer: We will use MLC/LC notation p (j τ ) to indicate the probability of surviving all decrements.

There are no late entrants. The populations are P0  100 and P1  100 − 2 − 10  88. Let qˆ 0j( d ) and qˆ 0j( w ) be the estimated rates of decrement from death and withdrawal respectively for j  0, 1. From the perspective of the mortality study, there are 10 withdrawals apiece in years 1 and 2. The actuarial exposures are e0  100 − 0.5 (10)  95 and e1  88 − 0.5 (10)  83, so 93  0.978947 95 80   0.963855 83

1 − qˆ 00( d )  1 − qˆ 10( d )

using the actuarial estimator. From the perspective of the withdrawal study, there are 2 censored observations in the first year and 3 censored observations in the second year. The actuarial exposures are e0  100 − 0.5 (2)  99 and e1  88 − 0.5 (3)  86.5, so 89  0.898990 99 76.5   0.884393 86.5

1 − qˆ00( w )  1 − qˆ10( w ) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 28

499

using the actuarial estimator. Then pˆ 0( τ )  (0.978947)(0.898990)  0.880064 pˆ 1( τ )  (0.963855)(0.884393)  0.852427 and the probability of survival for 2 years is 2 pˆ 0( τ )  (0.880064)(0.852427)  0.750190 . Notice that this is different from naively estimating that since 25 lives either died or withdrew, we divide the 75 survivors  by the 100 original lives to obtain 2 pˆ 0( τ )  (100 − 25) /100  0.75.

Exercises 28.1. For a calendar year mortality study for 1/1/2009–12/31/2011, a person born on 3/1/1984 buys an insurance policy on 9/1/2006 and withdraws on 8/1/2011. Insuring age is age last birthday. Calculate the exact exposure for age 27. 28.2.

For a calendar year mortality study for 2012, you are given the following data: Person

Date of Birth

Policy Issue Date

Date of Termination

Cause of Termination

1 2 3 4

6/1/1971 1/1/1972 4/1/1972 6/1/1972

5/1/2009 5/1/2009 6/1/2012 3/1/2000

— 10/1/2012 10/1/2012 —

— surrender death —

Calculate the absolute difference between the Kaplan-Meier product limit estimator of q40 and the actuarial estimator of q 40 . 28.3.

For an anniversary-to-anniversary study for years 2007–2012, you are given the following data: Person

Date of Birth

Policy Issue Date

Date of Termination

Cause of Termination

1 2 3 4

8/1/1962 11/1/1962 2/1/1963 6/1/1963

12/1/2004 6/1/2007 10/1/2008 1/1/2000

6/1/2011 10/1/2011 — —

death surrender — —

Insuring age is age last birthday. Calculate qˆ48 using the exact exposure method.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

28. MORTALITY TABLE CONSTRUCTION

500

Table 28.1: Summary of Mortality Table Construction Formulas

Individual data Exact exposure: For each insured, time from entry to leaving, whether leaving is by death, withdrawal, or termination of study. Estimate of q j (mortality rate over n-year period) using exact exposure: qˆ j  1 − e −nd j /e j Variance of estimate:

L ( qˆ j )  (1 − qˆ j ) 2 n 2 Var

(28.1) dj e 2j

(28.3)

Actuarial exposure: For each insured, time from entry to leaving if leaving is by withdrawal or termination by study. If leaving is by death, time from entry to end of year (of age) of death. Estimate of q j (mortality rate over n-year period) using actuarial exposure: qˆ j  Variance of estimate:

L ( qˆ j )  Var

nd j ej qˆ j (1 − qˆ j ) e j /n

(28.2)

(28.4)

Insuring age: Age based on setting age to integer at policy issue. The integer may be age nearest birthday or age last birthday. Interval-based data Population: Exact exposure:

m e b P j+1  P j + n m j − w j − w j − d j + n j+1

(28.5)

m e j  P j + 0.5 ( n m j − wj − dj)

(28.6)

Actuarial exposure, or risk set for Kaplan-Meier approximation for large data sets: m e j  P j + 0.5 ( n m j − wj )

C/4 Study Manual—17th edition Copyright ©2014 ASM

(28.7)

Exercises continue on the next page . . .

EXERCISES FOR LESSON 28

501

For an anniversary-to-anniversary study for years 2007–2010, you are given the following data:

28.4.

Person

Date of Birth

Policy Issue Date

Date of Termination

Cause of Termination

1 2 3 4 5

8/1/1967 11/1/1967 1/1/1968 2/1/1968 6/1/1970

1/1/2004 6/1/2007 6/1/2005 10/1/2009 2/1/2005

2/1/2008 5/1/2008 1/1/2011 — —

death surrender surrender — —

Insuring age is age nearest birthday. Calculate qˆ40 using the actuarial estimator. 28.5. In a mortality study, you have the following individual data. For each triplet, the first number is the age in months of entry to the study, the second number is the age in months of exit from the study, and the letter is s for withdrawal or termination of study and x for death. (32-0,34-0,s)

(33-0,35-0,s)

(33-2,33-9,x)

(33-5,33-11,s)

(33-8,35-5,s)

You estimate q 33 in two ways: 1.

The actuarial estimator based on individual data.

2.

The actuarial estimator based on data grouped by age. Entries and withdrawals are assumed to occur uniformly between integral ages, unless they occur at an exact integer. Calculate the absolute difference between the two estimates.

28.6.

For a calendar year mortality study from 1/1/2009 to 12/31/2011:

(i)

25 individuals born on 7/1/1949 were issued policies on 3/1/2010. One of them withdrew on 5/1/2010. (ii) 5 individuals born on 4/1/1950 were issued policies before 1/1/2009. One of them withdrew on 10/1/2010 and one died on 5/1/2011. (iii) 1 individual born on 10/1/1950 was issued a policy on 2/1/2009 and died on 1/1/2011. Calculate the variance of the exact exposure estimate of qˆ60 , 28.7.

For a calendar year mortality study from 1/1/2007 to 12/31/2011:

(i)

25 individuals born on 5/1/1967 were issued policies on 3/1/2004. One of them died on 11/1/2009, and two withdrew on 1/1/2010. (ii) 20 individuals born on 7/1/1967 were issued policies on 8/1/2009. One of them died on 6/1/2010, and one withdrew on 2/1/2010. Estimate the variance of the actuarial estimator for qˆ42 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

28. MORTALITY TABLE CONSTRUCTION

502

28.8.

In a calendar year mortality study from 1/1/2006 to 12/31/2008:

(i) Insuring age is used, and is equal to age nearest birthday. (ii) On Jan. 1, 2006, the following policies with insuring age 25 at their anniversaries in 2006 are in force: Policy anniversary

Number

1/1 3/1 8/1 12/1

24 42 15 22

(iii) One of the insureds with policy anniversary 3/1 dies on 9/20/2007. (iv) Three of the policies with anniversary 8/1 withdraw on 2/1/2006, 4/1/2007, and 11/1/2008. Actuarial exposure is used to estimate 3 qˆ25 , the probability that a person age 25 dies within 3 years. Estimate the standard deviation of this estimate. 28.9. In a mortality study, you have the following individual data. For each triplet, the first number is the age in months of entry to the study, the second number is the age in months of exit from the study, and the letter is s for withdrawal or termination of study and x for death. (32-9,37-9,s)

(33-2,36-7,s)

(33-2,35-3,x)

(33-5,38-5,s)

(33-7,35-1,s)

You estimate the probability that a person exact age 33 dies within 3 years using the exact exposure method. Estimate the variance of the estimator. 28.10. In a study of population mortality, you have the following data:

Age

Lives at this age

Lives leaving the study at the next age

Deaths before the next age

5 10 20 50 65

10,000 9,450 8,645 6,827

525 780 1768 1400

25 25 50 70

All lives leaving the study are assumed to leave the study at the end of the age interval. Mortality is estimated using the actuarial estimator. Linear interpolation is used to estimate mortality between interval endpoints. Determine the estimate of the probability that a life age 5 survives to age 45, or 40 pˆ 5 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 28

503

28.11. You are given the following information about time to remission for a cohort of 70 cancer patients: Weeks (lower–upper)

Number Who Experienced Remission

Number Who Withdrew from Study

0–1 1–2 2–3 3–4

3 6 12 8

2 10 2 6

To estimate time to remission, you use the actuarial estimator with an assumption of uniform withdrawals within each week. Determine the resulting estimate for the proportion experiencing remission within 3 weeks. 28.12. You are given the following data from a mortality study on individuals ages 40–41. All individuals entering the study at age 40 and not withdrawing continue in the study at age 41.

Age

Entrants Beginning of Year

Entrants During Year

Withdrawals During Year

Withdrawals End of Year

Deaths

40 41

125 150

10 8

5 10

25 20

1 2

Estimate q41 using the exact exposure method. 28.13. [Sample:301] For a mortality study, you are given: (i)

(ii) (iii) (iv)

The following table of observations: Interval j

Left End cj

Right End c j+1

New Entrants nj

Withdrawals wj

Deaths dj

0 1 2 3

0 6 12 18

6 12 18 24

90 80 70 40

30 20 30 30

10 8 20 16

n j  new entrants in the interval [c j , c j+1 ) of which 70% arrive in the first half. w j  withdrawals in the interval ( c j , c j+1 ], of which 30% occur in first half. d j  deaths in the interval ( c j , c j+1 ], all occurring at the mid-point.

Calculate the Kaplan-Meier estimate of the probability that a subject observed at time 0 dies before time 24. (A) 0.41

(B) 0.43

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.50

(D) 0.57

(E) 0.59

Exercises continue on the next page . . .

28. MORTALITY TABLE CONSTRUCTION

504

Use the following information for questions 28.14 through 28.17: Four of XYZ’s policyholders have the following history. Policy number 1 2 3 4

Date of birth

Date policy issued

4-1-1930 7-1-1930 8-1-1930 9-1-1930

3-1-1980 10-1-1990 1-1-2000 8-1-2000

Status on 12-31-2012 Died 10-1-2012 Insured Insured Surrendered 11-1-2012

Status on policy anniversary in 2013 Died 10-1-2012 Surrendered 5-1-2013 Insured Surrendered 11-1-2012

28.14. [Sample:292] XYZ conducts a mortality study from 1-1-2012 through 12-31-2012 using actual ages. Calculate the absolute value of the difference between the estimated mortality probabilities at age 82, using the exact and actuarial methods. (A) 0.012

(B) 0.015

(C) 0.018

(D) 0.021

(E) 0.024

28.15. [Sample:293] XYZ conducts a mortality study from 1-1-2012 through 12-31-2012 using insuring ages. XYZ assigns insuring ages based on the age at the nearest birthday when the policy is issued. Calculate the absolute value of the difference between the estimated mortality probabilities at age 82, using the exact and actuarial methods. (A) 0.046

(B) 0.051

(C) 0.055

(D) 0.060

(E) 0.064

28.16. [Sample:294] XYZ conducts a mortality study from anniversaries in 2012 through anniversaries in 2013 using insuring ages. XYZ assigns insuring ages based on the age at the nearest birthday when the policy is issued. Calculate the absolute value of the difference between the estimated mortality probabilities at age 82, using the exact and actuarial methods. (A) 0.031

(B) 0.035

(C) 0.039

(D) 0.042

(E) 0.045

28.17. [Sample:295] Estimate the variance of the estimator based on exact exposures from exercise 28.15 using the delta method approach. (A) 0.13

(B) 0.26

(C) 0.41

(D) 0.56

(E) 0.71

28.18. From a study on lives ages 35–37, you are given: Age

Lives

Deaths

Withdrawals

35 36 37

1000 1100 1000

3 4 5

100 99 100

You use the actuarial estimator with the assumption of uniform withdrawals between integral ages. 0 Estimate the 3-year single-decrement mortality rate for a person age 35, 3 qˆ35 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 28

505

28.19. A three-year mortality study starts out with 800 lives age 40. You are given the following statistics: Year

New entries

Withdrawals

Deaths

1 2 3

40 38 35

80 90 85

2 3 4

You use the actuarial estimator with the assumption of uniform entries and withdrawals within each year. 0 Estimate the 3-year mortality rate for a person age 40, 3 qˆ 40 . 28.20. A study of agent persistency for their first four years under contract starts with 60 agents. Statistics for this population are: (i) 15 agents persist for four years. (ii) 2 agents die at times 1.5 and 3.5. (iii) 5 agents become disabled at times 0.3, 1.3, 2.5, 3.3, and 3.9. (iv) The other 38 agents withdraw from their contract, with 22 withdrawing in the first year, 10 in the second year, and 3 apiece in the third and fourth years. The actuarial estimator is used. All decrements are assumed to be uniformly distributed within each contract year. Determine the single-decrement probability of an agent persisting for four years in a population of agents not subject to death or disability. 28.21. An insurance policy is available at deductibles of 0 and 500, and at maximum covered losses of 5000 and 10,000. The underlying loss distribution is assumed to be independent of deductible and maximum covered losses. You have the following data on loss sizes. Losses at 5000 and 10000 shown in the table are censored observations at the limits. Deductible Loss Range

0

500

(0, 500] (500, 1000] (1000, 5000] (5000, 10000] At 5000 At 10,000

35 20 25 10 8 2

15 18 9 7 1

Total

100

50

Using the actuarial estimator, estimate the probability that loss size will be no more than 10,000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

28. MORTALITY TABLE CONSTRUCTION

506

28.22. An insurance policy is available with deductibles of 250 and 500, and may or may not have a maximum covered losses of 1000. The underlying loss distribution is assumed to be independent of deductible and maximum covered losses. You have the following data on loss sizes. Losses at 1000 are censored observations at the limit. Deductible Loss Range

250

500

(250, 500] (500, 1000] (1000, 5000] (5000, 10000] At 1000

8 12 14 10 6

15 18 17 0

Total

50

50

Using the actuarial estimator, estimate the probability that a loss upon which a positive payment is made from a policy with a 500 deductible will exceed 5000. 28.23. You have the following information from a study starting with 100 policies on insureds age 20: j 0 1 2

( c j , c j+1 ] (20, 40] (40, 60] (60, 80]

dj

uj

xj

4 10 5

24 20 8

3 5 6

Superscripts ( d ) and ( w ) indicate decrements due to death and withdrawal respectively. 0( d ) 60 q 20

is estimated from this study using the actuarial estimator, with all new entries assumed to occur at the beginning of each interval and withdrawals assumed to occur at the end of each interval. 0( w ) For another population assumed to have the same mortality, 60 q 20  0.6.

(τ) Estimate 60 p20 for that population.

28.24. [4-F01:27] You are given the following information about a group of 10 claims: Claim Size Interval (0, 15000] (15000, 30000] (30000, 45000]

Number of Claims in Interval 1 1 4

Number of Claims Censored in Interval 2 2 0

Assume that claim sizes and censorship points are uniformly distributed within each interval. Estimate, using large data set methodology and exact exposures, the probability that a claim exceeds 30,000. (A) 0.67

(B) 0.70

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.74

(D) 0.77

(E) 0.80

Exercises continue on the next page . . .

EXERCISES FOR LESSON 28

507

28.25. [Sample:73A] You are given the following information about a group of 10 claims: Claim Size Interval (0, 15000] (15000, 30000] (30000, 45000]

Number of Claims in Interval 1 1 4

Number of Claims Censored in Interval 2 2 0

Assume that claim sizes and censorship points are uniformly distributed within each interval. Estimate, using large data set methodology and actuarial exposures, the probability that a claim exceeds 30,000. (A) 0.67

(B) 0.70

(C) 0.74

(D) 0.77

(E) 0.80

28.26. [C-S05:7] Loss data for 925 policies with deductibles of 300 and 500 and maximum covered losses of 5,000 and 10,000 were collected. The results are given below: Range (300, 500] (500, 1,000] (1,000, 5,000) (5,000, 10,000) At 5,000 At 10,000 Total

Deductible 300 500 50 — 50 75 150 150 100 200 40 80 10 20 400 525

Total 50 125 300 300 120 30 925

Let X be the random variable for the ground-up loss size. Using the actuarial estimator, estimate F (5000 | X > 300) . (A) 0.25

(B) 0.32

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.40

(D) 0.51

(E) 0.55

Exercises continue on the next page . . .

28. MORTALITY TABLE CONSTRUCTION

508

28.27. [Sample:305] An insurance policy is available with deductibles of 500 or 750, and policy limits of 1500 or 20,000. The underlying loss distribution is assumed to be independent of deductible and policy limit. You are given the following claim counts: Loss Range 500–750 750–1000 1000–1500 1500–5000 5000–10,000 10,000–20,000 At 1500 limit At 20,000 limit Total

Deductible 500 750 8 10 10 12 15 14 18 10 17 0 0 6 0 0 0 60 60

Total 8 20 27 32 27 0 6 0 120

Calculate the probability that a loss from a policy with a 750 deductible will exceed 5000, using the Kaplan-Meier approximation with all the given data. (A) 0.27

(B) 0.29

(C) 0.33

(D) 0.36

(E) 0.37

Use the following information for questions 28.28 and 28.29: A mortality study of insured lives is being conducted. Insuring ages are employed and the study runs from anniversaries in 2010 through anniversaries in 2012. Assume that events that can occur between anniversaries do so uniformly. The following was observed: There are 10,000 lives at age 60 when first observed in 2010. Between then and their next anniversary 100 die and 800 surrender their policy. In 2011, 1000 new policies are sold at age 61. For those insured at their age 61 anniversary in 2011, 120 die and 700 surrender before their 2012 anniversary. There are 8000 lives at age 61 when first observed in 2010. Between then and their next anniversary 80 die and 700 surrender their policy. In 2011, 800 new policies are sold at age 62. For those insured at their age 62 anniversary in 2011, 90 die and 600 surrender before their 2012 anniversary. 28.28. [Sample:296] Determine the actuarial estimate of the mortality probability at age 61. (A) 0.0112

(B) 0.0113

(C) 0.0114

(D) 0.0115

(E) 0.0116

28.29. [Sample:297] Determine the estimate based on exact exposures of the mortality probability at age 61. (A) 0.0112

(B) 0.0113

(C) 0.0114

(D) 0.0115

(E) 0.0116

Additional released exam questions: C-F06:17, C-S07:26

Solutions 28.1. The insuring birthday is 9/1/1984 and the person withdraws on 8/1/2011, so there is 0 exposure at age 27.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 28

509

28.2. For the Kaplan-Meier product limit estimator, the risk set on 10/1/2012 is 3 since all 3 policies are in force and age 40, and ties for surrender count. The first policy, however is age 41. Thus qˆ40  1/3. For the actuarial estimator, age 40 exposure in 2012 is 5 months for the first policy, 9 months for the second, 10 months for the third since deaths count up to the next birthday, and 7 months for the fourth. Total exposure is 31 months, and qˆ 40  12/31. The absolute difference is 12/31 − 1/3  0.053763 .

28.3. For the first life, insured birthday is 12/1/1962 and age 48 is attained at 12/1/2010, so there are 6 months of exposure. For the second life, insured birthday is 6/1/1963 and age 48 is attained at 6/1/2011, so there are 4 months of exposure. For the third life, insured birthday is 10/1/1963 and age 48 is attained at 10/1/2011, so there are 12 months of exposure. For the fourth life, insured birthday is 1/1/1964 and age 48 is attained at 1/1/2012. Since it is an anniversary-to-anniversary study, the study ends at the anniversary in 2012 and there is no exposure. Total exposure is 6 + 4 + 12  22 months. The exact exposure estimator is 1 − e −12/22  0.420422 .

28.4. Insuring birthdays are 1/1/1968, 6/1/1967, 6/1/1968, 10/1/1967, and 2/1/1970 for the five people respectively. Actuarial exposure at age 40 is 12 months, 11 months, 12 months, 0 months, and 0 months respectively. The actuarial estimator is qˆ40  1/ (35/12)  0.3429 .

28.5. For individual data, actuarial exposure at age 33 is 12 months, 12 months, 10 months, 6 months, and 4 months respectively, for a total of 44 months. The actuarial estimate is qˆ33  1/ (44/12)  3/11. For grouped data, the population P33 is 2, since the first two people are in the study at exact age 33. There are 3 new entrants (the other 3 people). The fourth person withdraws in the middle of the year. Exposure is 2 + 0.5 (3 − 1)  3. The actuarial estimate is qˆ33  1/3. The absolute difference is |3/11 − 1/3|  2/33 .

28.6. Exact exposure at age 60 for the first group of 25 is 4 months apiece, except for the one who withdrew for whom it is 2 months. For the group of 5 individuals, there are 12 months of exposure, including from the one who died, except for the one who withdrew who provides 6 months of exposure. For the third group of 1 individual, exact exposure is 3 months. The total exposure in months is 24 (4) + 2 + 4 (12) + 6 + 3  155. There is one death at age 60, namely the individual from the third group. The death from the first group is at age 61. The estimate of q 60 is qˆ60  1 − e −12/155  0.074498. The variance is ! 1 2  0.00513397 Var ( qˆ60 )  (1 − 0.074498) (155/12) 2 28.7. Exposure for age 42 is 23 years from those who persisted or died from the first group, plus 2 (2/3) for the two who withdrew from the first group. Exposure for age 42 is 19 (11/12) for those who persisted or died from the second group, plus 1/2 from the one who withdrew from the second group. Total exposure is 23 + 4/3 + 19 (11/12) + 1/2  42.25. Then qˆ 42  2/42.25. The estimated variance is

L ( qˆ j )  Var

qˆ j (1 − qˆ j ) ej



(2/42.25)(1 − 2/42.25) 42.25

 0.00106737

28.8. Actuarial exposure for the 1/1 policies is 36 months apiece. Actuarial exposure for the 3/1 policies is 34 months apiece, except for the one who dies who contributes 24 months of exposure. Actuarial exposure for the 8/1 policies is 29 months apiece, except for those who withdraw who contribute 0 months, 8 months, and 27 months respectively. Actuarial exposure for the 12/1 policies is 25 months apiece. C/4 Study Manual—17th edition Copyright ©2014 ASM

28. MORTALITY TABLE CONSTRUCTION

510

Total exposure in months is 24 (36) + 41 (34) + 24 + 12 (29) + 8 + 27 + 22 (25)  3215. The estimate of 3q 25 , by formula (28.2), is nd j 3 (1)   3/ (3215/12)  0.011198 3q 25  ej 3215/12 The standard deviation, by formula (28.4), is

q

s L (3 qˆ25 )  Var

q j (1 − q j ) e j /n

r 

(0.011198)(1 − 0.011198)  0.011135 (3215/12) /3

Exposure is: (32-9,37-9,s) 36 months (33-2,36-7,s) 34 months (33-2,35-3,x) 25 months (33-5,38-5,x) 31 months (33-7,35-1,s) 18 months When computing exposure, treat the ages as exact, so that someone entering who is exactly 33 years and 2 months old will have 10 more months until age 34 and then another 24 months, for a total of 34 months. The other lines can be computed in a similar way. Total exposure is 36 + 34 + 25 + 31 + 18  144 months or 12 years. Then 3 qˆ33  1 − e −3/12)  0.221199. The variance of the estimator is

28.9.

L (3 qˆ33 )  (1 − 0.221199) 2 (32 ) Var

1  0.037908 122

28.10. Since there are no new entrants (the question does not mention them and the lives at each age are the previous age’s lives minus deaths and withdrawals), e j  P j , so 10000 − 25  0.9975 10000 ! 9450 − 25 ˆ  0.994861 p  0.9975 15 5 9450 5 pˆ 5



8645 − 50  0.989107 45 pˆ 5  0.994861 8645

!

Using linear interpolation, 40 pˆ 5 

5 6

!

45 pˆ 5 +

1 6

!

15 pˆ 5 

5 1 (0.989107) + (0.994861)  0.990066 6 6

!

!

28.11. There are no new entrants. We are told to assume uniform withdrawals, or e j  P j − 0.5w j in the absence of new entrants, where w j is the number of withdrawals. Week j

Lives Pj

Remissions dj

Withdrawals wj

Exposure ej

0 1 2

70 65 49

3 6 12

2 10 2

69 60 48

0 j+1 pˆ 0

66/69  0.956522 0.956522 (54/60)  0.860870 0.860870 (36/48)  0.645652

Since we want the proportion experiencing remission, the answer is the complement of 3 pˆ 00 , or 3 qˆ00  1 − 0.645652  0.354348 . C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 28

511

28.12. P41  125+10−5−25−1+150  254. We assume uniform entry and withdrawal, so exact exposure is 254 + 0.5 (8 − 10 − 2)  252. The mortality estimate is 1 − e −2/252  0.0079051 .

28.13. Although this question doesn’t ask for an approximation and could have been placed in Lesson 24, the style is similar to Kaplan-Meier approximations with grouped data. The population is accumulated starting with P0  0 at time 0 and adding new entrants, subtracting withdrawals and deaths: P1  90 − 30 − 10  50

P2  50 + 80 − 20 − 8  102

P3  102 + 70 − 30 − 20  122

The risk set r j equals P j + 0.7n j − 0.3w j : r0  0.7 (90) − 0.3 (30)  54

r1  50 + 0.7 (80) − 0.3 (20)  100

r2  102 + 0.7 (70) − 0.3 (30)  142

r3  122 + 0.7 (40) − 0.3 (30)  141

The Kaplan-Meier estimate of survival to time 24 is 3 Y j0

1−

dj rj

!



 1−

10 54



1−

8 100



1−

20 142



1−

16  0.5710 141



The estimate of the probability of not surviving to time 24 is 1 − 0.5710  0.4290 . (B) The official solution also calculates the Nelson-Åalen estimate, but that is not asked for in the question. 28.14. Exact exposure for each policy begins at birthday in 2012, when the person turns 82, and ends at the earliest of end of year, death, or surrender. Actuarial exposure is 12 months for every death and is otherwise equal to exact exposure. Policy

Beginning of

End of

Exact

Actuarial

1 2 3 4

4-1-2012 7-1-2012 8-1-2012 9-1-2012

10-1-2012 12-31-2012 12-1-2012 11-1-2012

6 months 6 months 5 months 2 months

12 months 6 months 5 months 2 months

There is one death. Total exact exposure is 19 months, and estimated mortality is 1− e −12/19  0.46825. Total actuarial exposure is 25 months, and estimated mortality is 12/25  0.48. The difference is 0.48 − 0.46825  0.01175 . (A) 28.15. Insuring birthday is 3-1-1930 for policy 1, 10-1-1930 for policy 2, 1-1-1931 for policy 3, and 8-1-1930 for policy 4. For policy 3, insuring age is 81 throughout 2012, so that policy is not in the study. Policy

Beginning of

End of

Exact

Actuarial

1 2 3 4

3-1-2012 10-1-2012 — 8-1-2012

10-1-2012 12-31-2012 — 11-1-2012

7 months 3 months — 3 months

12 months 3 months — 3 months

C/4 Study Manual—17th edition Copyright ©2014 ASM

28. MORTALITY TABLE CONSTRUCTION

512

There is one death. Total exact exposure is 13 months, and estimated mortality is 1 − e −12/13  0.6027. Total actuarial exposure is 18 months, and estimated mortality is 12/18  0.6667. The difference is 0.6667− 0.6027  0.0640 . (E) 28.16. The beginning of exposure is on the policy anniversary in 2012, the same as in the previous exercise. The end of exact exposure is the earliest of the policy anniversary in 2013, date of death, and date of surrender. Actuarial exposure is the same except that deaths get 12 months of exposure. Policy

Beginning of

End of

Exact

Actuarial

1 2 3 4

3-1-2012 10-1-2012 — 8-1-2012

10-1-2012 5-1-2013 — 11-1-2012

7 months 7 months — 3 months

12 months 7 months — 3 months

There is one death. Total exact exposure is 17 months, and estimated mortality is 1 − e −12/17  0.5063. Total actuarial exposure is 22 months, and estimated mortality is 12/22  0.5455. The difference is 0.5455− 0.5063  0.0392 . (C) 28.17. Use formula (28.3).

L ( qˆ82 )  (1 − 0.6027) Var

2

1  0.1345 (13/12) 2

!

(A)

28.18. The exposures for the three ages are then 950, 1050.5, and 950. The survival rate for three years is



0 3 pˆ 35  1 −

3 950



4 1− 1050.5



5 1−  (0.9968)(0.9962)(0.9947)  0.9878 950



0 The three-year mortality rate is 3 qˆ35  1 − 0.9878  0.0122 . 28.19. Note that all 800 lives begin at age 40, not half of them. Since we assume uniform entries and withdrawals, e j  P j + 0.5 ( n j − w j ) . Since there are no entries at the beginning of ages after 40, population is calculated recursively as P j+1  P j + n j − w j − d j . The resulting calculation is shown in the following table:

j

Pj

nj

wj

dj

ej

0 1 2

800 758 703

40 38 35

80 90 85

2 3 4

780 732 678

0 0 Therefore 3 pˆ 40  (778/780)(729/732)(674/678)  0.9875 and 3 qˆ 40  1 − 0.9875  0.0125 .

28.20. Withdrawal is the decrement of interest, while death and disability are censored observations. The risk sets are e j  P j + 0.5 ( n j − w j ) . There are no new entrants. There is 1 censored observation in the first year (the disability at time 0.30), 2 in the second year (death at 1.5, disability at 1.3), 1 in the third year (disability at 2.5), and 3 in the fourth year (death at 3.5, disability at 3.3 and 3.9). The following table computes the risk sets. j

Pj

nj

wj

dj

ej

0 1 2 3

60 37 25 21

0 0 0 0

1 2 1 3

22 10 3 3

59.5 36.0 24.5 19.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

0 j+1 p 0

37.5/59.5  0.630252 0.630252 (26/36)  0.455182 0.455182 (21.5/24.5)  0.399445 0.399445 (16.5/19.5)  0.337992

EXERCISE SOLUTIONS FOR LESSON 28

513

The probability of persisting four years in the absence of other decrements is 0.337992 . 28.21. We estimate S (10,000) . For the interval (0, 500], the risk set is the 100 policies with deductible 0 and there are 35 losses, so Sˆ (500)  0.65. We can group together the two intervals (500, 1000] and (1000, 5000], since there is no censoring or new entrants between 500 and 5000. In the interval (500, 5000], the risk set is 115 (all losses except the 35 observations in (0, 500]) and the  number of losses in this combined interval is 20 + 25 + 15 + 18  78, so Sˆ (5000)  0.65 (115 − 78) /115  0.2091. In the interval (5000, 10000], the risk set is, adding up the count of observed losses in the interval plus the 2 + 1 observations censored at the 10,000 limit, 10 + 9 + 2 + 1  22 (or if you prefer, start with the population of 37 at 5000 and remove the 15 observationscensored at 5000). There are 19 losses in the interval (5000, 10000], so Sˆ (10,000)  0.2091 (22 − 19) /22  0.02852. The following table summarizes our calculations: cj

Pj − n j

0 500 5000 10000

0 65 22

nj

wj

dj

ej

Sˆ ( c j )

100 50 0

0 15 3

35 78 19

100 115 22

0.65 0.2091 0.02852

The probability that loss size will be no more than 10,000 is therefore 1 − 0.02852  0.97148 .

28.22. Even though we are estimating the probability for a 500 deductible, all information from the study for both deductibles must be used. We are assuming that the underlying loss distribution is independent of deductible, so the information from policies with a 250 deductible is relevant to our estimate. Since we are assuming a 500 deductible, we just have to calculate a conditional survival function, the condition being that the loss is over 500. So we begin the estimation procedure with the interval (500, 1000]. The only logical assumption for the risk set is that all entries and withdrawals occur at endpoints. j

cj

Pj

nj

wj

dj

ej

0 1

500 1000

92 59

0 0

6 0

27 32

92 59

In the above table, the population at 500 is all losses except for the 8 in the range (250, 500]. The population at 1000 is 92 − 6 − 27  59. Based on this  table, theprobability that a loss is greater than 5000 given that it is greater than 500 is (92 − 27) /92 (59 − 32) /59  0.3233 . 28.23. The mortality estimate is obtained by setting e j  P j + n j . The following table shows the results. j

( c j , c j+1 ]

0 1 2

(20, 40] (40, 60] (60, 80]

Pj − n j 100 77 62

nj

wj

dj

ej

4 10 5

24 20 8

3 5 6

104 87 67

0( d ) 20 ( j+1) pˆ 20

101/104  0.971154 0.971154 (82/87)  0.915340 0.915340 (61/67)  0.833370

0( d ) So we have 60 pˆ 20  0.833370. In the other population,

(τ) 60 p 20

 (0.833370)(1 − 0.6)  0.3333

28.24. In each of the first two intervals, because of uniform distribution, both the claims and the censored claims average one-half exposure for the period. Therefore, the number of exposures in the first period is 10 − 0.5 (3)  8.5 and the number of exposures in the second period is 7 − 0.5 (3)  5.5. Then the estimated probability that a claim exceeds 30,000 is e −1/8.5 e −1/5.5  0.7412 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

514

28. MORTALITY TABLE CONSTRUCTION

28.25. In each of the first two intervals, because of uniform distribution, half the censored claims are censored below the claim in the interval. The risk set in the first interval is therefore 10 − 0.5 (2)  9, and in the second interval, 10 − 1 − 2 − 0.5 (2)  6. Then the estimated probability that a claim exceeds 30,000   is 89 65  0.7407 . (C) 28.26. The risk set at 500 is the 400 policies with deductible 300, so Sˆ (500)  (400 − 50) /400  0.875. The risk set between 500 and 5000 are the 875 remaining policies (925 minus the 50 with claims below 500), and 125 + 300  425 claims are between  500 and 5000(no censoring occurs below 5000, so we can combine ˆ the 2 intervals), so S (5000)  0.875 (875 − 425) /875  0.45. Thus Fˆ (5000 | X > 300)  1 − 0.45  0.55 . (E) 28.27. It suffices to calculate the products starting at 750, since we only want the probability of exceeding 5000 conditioned on exceeding 750. Since censoring occurs only at 1500, we can group the data from 750 to 1500 into one group. This group has 20 + 27  47 events and the risk set is all 120 losses except for the 8 losses below 750, or 112. The 1500–5000 group has 32 events. The risk set is 112 minus the 47 events between 750 and 1500, minus the 6 policies at the 1500 limit, or 112 − 6 − 47  59. The Kaplan-Meier approximation is    47 32 1− (A) 1−  0.2656 112 59 28.28. The first set of 10,000 lives is age 61 in the second year. There are 10,000 − 100 − 800 + 1000  10,100 such lives at the beginning of the second year. Exposure consists of those 10,100 lives minus half the surrenders, or 10,100 − 0.5 (700)  9,750. The second set of 8000 lives is age 61 in the first year. Exposure is 8000 − 0.5 (700)  7650. Total exposure from the two groups is 9750 + 7650  17,400. There are 120 + 80  200 deaths. The actuarial estimate of the mortality rate is 200/17,400  0.01149 . (D) 28.29. For exact exposures, subtract half the deaths from actuarial exposure. The resulting exposure, referring to the previous exercise, is 17,400 − 0.5 (200)  17,300. The estimate of q61 is 1 − e −200/17,300  0.01149 . (D)

Quiz Solutions 28-1. The first one has a full 12 months in the study at age 30. the second one turns 30 on 2/1/2011 and stays in the study for 6 months, providing 6 months of exposure. The third one purchases a policy at age 30 and 5 months and dies during the year, so exposure is until the next birthday, or 7 months. Total exposure for all three in months is 12 + 6 + 7  25 . 28-2. Notice that d j is used in its individual-data sense to mean the time of a new entry. There are 5 entering lives and 1 leaving life, so the risk set is 0.5 (5 − 1)  2, the number of deaths is 1, and the mortality probability is 0.5 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 29

Supplementary Questions: Empirical Models Use the following information for questions 29.1 and 29.2: You are given the following data for a mortality study on a population of 1000 age 20. Ages

Deaths

Censored

21 22 23 24

2 3 4 5

12 13 14 15

All deaths and censoring occur at the exact ages indicated. 29.1.

Calculate the Kaplan-Meier estimate of 2 q 21 .

(A) 0.005

(B) 0.006

(C) 0.007

(D) 0.008

(E) 0.009

29.2. Calculate the Greenwood approximation for the standard deviation of the Kaplan-Meier estimate of 2 q 21 . (A) 0.0020 29.3.

(B) 0.0022

(C) 0.0025

(D) 0.0027

(E) 0.0031

On a disability policy, the following is the distribution of observed time on disability: Years on Disability

Number of Lives

(0, 1] (1, 2] (2, 3] (3, ∞)

65 20 12 3

Time on disability is uniformly distributed within each year. The amount of time on disability is represented by X. Calculate Var ( X ∧ 3) . (A) 0.58

(B) 0.68

(C) 0.78

(D) 0.88

(E) 0.98

29.4. In a mortality study on 10 lives, times of death are 4, 5, 5, 6, 7, 8, 10, 12, 13, 15. The empirical distribution is smoothed using kernel distribution methods with a uniform kernel of bandwidth 4. Determine the smoothed value of h (7.5) . (A) 0.088

(B) 0.126

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.157

515

(D) 0.176

(E) 0.197

Exercises continue on the next page . . .

29. SUPPLEMENTARY QUESTIONS: EMPIRICAL MODELS

516

29.5.

A linear 90% confidence interval for H (100) is (0.82, 0.98) .

Determine the upper bound of a log-transformed 98% confidence interval for H (100) . (A) 0.98 29.6.

(B) 0.99

(C) 1.00

(D) 1.01

(E) 1.02

For a family of estimators Ya , you are given: bias ( Ya )  4 − a

Var ( Ya )  a 2 + a + 1

For the value of a minimizing MSE ( Ya ) , calculate MSE ( Ya ) . (A) 10 21 29.7.

(B) 10 58

(C) 10 43

(D) 10 78

(E) 11

In a mortality study, you are given Year

Entries at Beginning of Year

Withdrawals at End of Year

Deaths

1 2

400 300

100 60

2 4

Using the actuarial estimator, estimate the two-year survival rate in the absence of surrenders. (A) 0.984

(B) 0.985

(C) 0.986

(D) 0.987

(E) 0.988

29.8. An auto collision coverage is sold with deductibles of 500 and 1000. Your records have 14 claims of the following sizes, before application of the deductible: (i) 4 claims of sizes 650, 900, 1200, 1600 from policies with a deductible of 500. (ii) 4 claims for under 600. (iii) 3 claims for over 2000 from policies with a deductible of 500. (iv) 3 claims for over 2000 from policies with a deductible of 1000. You use the Kaplan-Meier product limit estimator to estimate the claim size distribution. You assume that the claim size distribution does not vary with deductible. Estimate the probability that claim size is greater than 1500, before application of the deductible, given that it is greater than 500. (A) 0.34 29.9.

(B) 0.36

(C) 0.40

(D) 0.45

(E) 0.50

You are given the following data on loss sizes: Loss Amount

Number of Losses

0– 1000 1000– 5000 5000–10000

5 4 3

An ogive is used as a model for loss sizes. Calculate the probability that a loss is greater than 150% of the mean. (A) 0.25

(B) 0.28

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.33

(D) 0.41

(E) 0.42

Exercises continue on the next page . . .

29. SUPPLEMENTARY QUESTIONS: EMPIRICAL MODELS

517

29.10. You observe the following claim sizes: 100,

200,

500,

800,

1000,

1300,

2000,

2000

Let p  Pr ( X < 1000 | X > 500) . p is estimated empirically. Estimate the variance of the empirical estimator for p. (A) 0.01367 29.11. (i)

(B) 0.03125

(C) 0.032

(D) 0.04

(E) 0.048

(D) 2

(E) 4

θˆ is an estimator of θ. You are given: E ( θˆ − θ ) 2  38

f

g

(ii) E θˆ 2  50 (iii) θ  6

f

g

Determine biasθˆ ( θ ) . (A) −4

(B) −2

(C) 0

29.12. A study is conducted of survival time of automobiles. Five observations of the survival time in years are 4, 5, 7, 9, 12. Survival time is smoothed using kernel density methods, with a uniform kernel with bandwidth 4. Determine the variance of estimated survival time of automobiles. (A) (B) (C) (D) (E)

Less than 11 At least 11, but less than 12 At least 12, but less than 13 At least 13, but less than 14 At least 14

29.13. In a mortality study, there are 8 lives that survive three or more years. 3 deaths occur in the fourth year. Each death occurs at a different duration within the fourth year. No lives drop out of the study during the fourth year. The estimate of S (3) , using the Nelson-Åalen estimator, is 0.8630. Calculate the Nelson-Åalen estimate of S (4) . (A) 0.52

(B) 0.54

(C) 0.56

(D) 0.58

(E) 0.59

29.14. On an insurance coverage, you have claims with sizes of 500

600

650

700

720

760

900

950

You estimate the claim distribution using the empirical distribution smoothed with a triangular kernel having bandwidth 100. Estimate Pr (700 < X < 800) . (A) 0.21

(B) 0.22

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.23

(D) 0.24

(E) 0.25

Exercises continue on the next page . . .

29. SUPPLEMENTARY QUESTIONS: EMPIRICAL MODELS

518

29.15. For a mortality study, you are given (i) One death apiece occurred at times y1 and y2 . No other deaths occurred in the study. (ii) n lives withdrew from the study between times y1 and y2 . (iii) There were no entries to the study past the starting time. (iv) The Nelson-Åalen estimate of H ( y2 ) is 0.035625. (v) The estimated variance of the Nelson-Åalen estimator of H ( y2 ) is 0.00064414. Determine n. (A) 12

(B) 13

(C) 14

(D) 15

(E) 16

29.16. You have the following data for a group of 1000 policyholders: Years

Deaths

Surrenders

(0,1] (1,2] (2,3]

2 6 13

100 60 40

The actuarial estimator is used to estimate the mortality rate. All surrenders occur at the end of each year. For another group of 500 policyholders, no surrenders occur. Mortality for this group is assumed to be the same as for the group of 1000 policyholders. Estimate the number of survivors at the end of 3 years from the group of 500 policyholders. (A) 488

(B) 489

(C) 490

(D) 491

(E) 492

Solutions 29.1. [Lesson 24] The risk set for age 22 is 1000 − 2 − 12  986, since we must subtract those who left the study at age 21. The risk set for age 23 is 986 − 3 − 13  970. The Kaplan-Meier estimate is 2 qˆ 21  1 −

983 986

966  1 − (0.996957)(0.995876)  1 − 0.992846  0.007154 970

!

!

(C)

29.2. [Lesson 26] The variance of 2 qˆ21 is the same as the variance of 2 pˆ 21 since Var (1 − X )  Var ( X ) . We use the result of the previous problem for 2 pˆ 21 .

L Var

2 2 qˆ 21  0.992846





 0.992846



2



3

(986)(983) 

+

4



(970)(966)

(0.00000309522 + 0.00000426885)  0.00000725908

√ Then 0.00000725908  0.002694 . (D) 29.3.

[Lesson 25] E[X ∧ 3]  0.65 (0.5) + 0.2 (1.5) + 0.12 (2.5) + 0.03 (3)  1.015 E ( X ∧ 3) 2 

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

0.65 (1) + 0.20 (23 − 13 ) + 0.12 (33 − 23 ) + 0.03 (32 )  1.71333 3

EXERCISE SOLUTIONS FOR LESSON 29

519

Var ( X ∧ 3)  1.71333 − 1.0152  0.68311

(B)

29.4. [Lesson 27] We estimate f (7.5) and F (7.5) . The kernel density function is 81 , with 7 points (all points from 4 to 10) in the bandwidth, so dividing by the number of points, 10: f (7.5) 

7

(8)(10)



7 80

The kernel distribution function for observation y is 7.5 − ( y − 4) 8 11.5 − y  8

K y (7.5) 

which is

15 16

at 4,

13 16

at 5,

11 16

at 6,

9 16

at 7,

Fˆ (7.5) 

7 16

1

at 8,

(10)(16)



3 16

at 10. Each observation has weight

15 + 2 (13) + 11 + 9 + 7 + 3 



89 71  Sˆ (7.5)  1 − 160 160 So

fˆ(7.5) 7/80 14 hˆ (7.5)     0.1573 ˆ S (7.5) 89/160 89

1 10 .

This adds up to:

71 160

(C)

q

L ( H )  0.08, 29.5. [Lesson 26] Hˆ (100)  0.9, the midpoint of the linear confidence interval, and Z0.9 Var from which it follows that q L ( H )  2.326 (0.08)  0.113 Z0.98 Var 1.645 and then 0.113 U  exp  1.134 0.9

!

HU  (0.9)(1.134)  1.02

(E)

29.6. [Lesson 21] By equation (21.1) on page 356, the MSE is the sum of the bias squared and the variance. Thus we must minimize

(4 − a ) 2 + a 2 + a + 1  16 − 8a + 2a 2 + a + 1  2a 2 − 7a + 17 The minimum of a quadratic ax 2 + bx + c is assumed at −b 2a where b is the coefficient of x and a (nothing to do with this problem’s a) is the coefficient of x 2 . Here, the minimum is therefore assumed at 74  1.75. We calculate the value of the expression at 1.75. 2 (1.752 ) − 7 (1.75) + 17  10.875

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

29. SUPPLEMENTARY QUESTIONS: EMPIRICAL MODELS

520

29.7. [Lesson 28] Exposure for the first year consists of the 400 entrants. At the end of the year, 100 withdraw and 2 die, leaving 298 members. Exposure for the second year consists of those 298 plus 300 new entrants, or 598. The two-year survival rate is 398 Sˆ (2)  400

!

594  0.988 598

!

(E)

29.8. [Lesson 24] The claims under 600 must come from policies with deductible 500. The risk sets for amounts below 1000 do not include policies with a deductible of 1000. The risk set for the claims below 650 is the 11 (4 + 4 + 3) claims from policies with deductibles of 500. Since no censoring occurs below 650, there is no need to know the exact amounts of these claims; the product limit estimator would telescope the ratios anyway. The risk set for the claim of 900 is the 6 claims remaining after subtracting the 5 claims for 650 from the risk set of 11. The risk set for the claim of 1200 is the 2 claims for 1200 and 1600 plus the 6 claims for amounts above 2000, or 8 claims. We do not have to compute the risk set for the claim of 1600 to estimate S (1500) . So we have: yi 650 900 1200

ri 11 6 8

6 S14 (1500)  S14 (1200)  11 29.9.

si 5 1 1 5 6

!

7  0.3977 8

!

!

(C)

[Subsection 25.1.2] We calculate the mean as the average of the averages for each interval: 5 0 + 1000 4 1000 + 5000 3 5000 + 10,000 E[X]  + +  3083 13 12 2 12 2 12 2

!

!

!

150% of 3083 13 is 4625. Then, using the ogive to calculate the distribution function, 5 4625 − 1000 F12 (4625)  + 12 5000 − 1000

!

9 5 −  0.71875 12 12



The probability of a loss greater than 4625 is 1 − F (4625)  1 − 0.71875  0.28125 . (B)

29.10. [Subsection 23.1] We use equation (23.3) on page 382. There are five observations above 500, with only one below 1000. The number of such observations   has a binomial distribution, with estimated parameters m  5 and q  1/5, so the variance is 5 51 54  4 5 . p is estimated as the number of observations between 500 and 1000 divided by the total number of 4 observations above 500, which is 5, so the variance of p is 54 /52  125  0.032 . (C)

29.11. [Lesson 21] Note that θ is a constant and thus may be pulled out of any expectation. We expand f g 2 ˆ E (θ − θ) . ˆ + E θ2 E ( θˆ − θ ) 2  E θˆ 2 − 2 E θθ

f

g

f

g

f

g

f

g

ˆ + 62 38  50 − 12 E[θ] ˆ −48  −12 E[ θ]

ˆ 4 E[ θ] The bias is therefore

C/4 Study Manual—17th edition Copyright ©2014 ASM

ˆ − θ  4 − 6  −2 biasθˆ ( θ )  E[θ]

(B)

EXERCISE SOLUTIONS FOR LESSON 29

521

29.12. [Lesson 27] Since each kernel’s mean is the observation, the mean of the kernel smoothed dis 7.4. The variance is easiest tribution is the mean of the original distribution, which here is 4+5+7+9+12 5 to calculate using the conditional variance formula. Let X be kernel-smoothed survival time and I be the observation (4, 5, 7, 9, or 12). By X | I, we mean the kernel-smoothed survival time given that the unsmoothed time is I. The conditional variance formula says Var ( X )  Var E[X | I] + E Var ( X | I )





f

g

E[X | I]  I, since the mean of each kernel is the observation point. Var E[X | I] is the variance of the unsmoothed points, or



Var E[X | I] 







(4 − 7.4) 2 + (5 − 7.4) 2 + (7 − 7.4) 2 + (9 − 7.4) 2 + (12 − 7.4) 2 5

 8.24

Var ( X | I ) is the variance of the kernel for each point. The kernel for each point is a uniform distribution k2 over an interval of 8, and the variance of a uniform distribution on an interval of size k is 12 , so the variance f g 82 of our kernel is Var ( X | I )  12  5 13 . Since it is the same for each I, E Var ( X | I )  5 13 . Therefore Var ( X )  8.24 + 5 31  13.5733 29.13.

(D)

[Lesson 24] We back out Hˆ (3) : Hˆ (3)  − ln Sˆ (3)  − ln 0.8630

Then we sum up

si ri



1 ri

for the 3 deaths that occur between time 3 and 4: 1 1 1 Hˆ (4)  − ln 0.8630 + + +  0.58186 8 7 6 Sˆ (4)  e −0.58186  0.55886 (C)

29.14. [Lesson 27] We calculate Fˆ (700) and Fˆ (800) .  2 For Fˆ (700) , 500 and 600 have weight 1 apiece. 650 is half-way to 700, so its weight is 1− 12 12  0.875.

700 has weight 12 . 720 is 4/5 of the way from 800 to 700, and so has weight 1 2 2 2 5

1 4 2 2 5

 0.32. 760 is 2/5 of the

way from 800 to 700 and so has weight  0.08. The weights add up to 1+1+0.875+0.5+0.32+0.08   0.471875. 3.775 so Fˆ (700)  3.775 8 For Fˆ (800) , the 4 points through 700 have weight 1 apiece. 720 is 1/5 of the way from 700 to 800, so it  2  2 has weight 1 − 12 51  0.98. 760 has weight 1 − 21 35  0.82. The weights add up to 4 + 0.98 + 0.82  5.8, so Fˆ (800)  5.8 8  0.725. Pr (700 < X < 800)  0.725 − 0.471875  0.253125 . (E)

29.15. [Section 24.2 and Lesson 26] We use equation (24.2) on page 398 and equation (26.2) on page 438. 1 Let x  r11 and y  r12  r1 −1−n . Then x + y  0.035625 2

x + y 2  0.00064414 x 2 + (0.035625 − x ) 2  0.00064414

2x 2 − 2 (0.035625) x + (0.0356252 − 0.00064414)  0

C/4 Study Manual—17th edition Copyright ©2014 ASM

29. SUPPLEMENTARY QUESTIONS: EMPIRICAL MODELS

522

2x 2 − 0.07125x + 0.000625  0

0.07125 ± 0.071252 − 8 (0.000625) x  0.02, 0.015625 4

p

The reciprocals are 50 and 64, and these are r1 and r2 . Since r1 is the higher, there must have been 13 withdrawals (64 − 50 − one death). n  13 . (B)

29.16. [Lesson 28] Since all withdrawals occur at the end of the year, the exposure equals the starting population. Thus e1  1000, e2  1000 − 2 − 100  898, and e3  898 − 6 − 60  832. Sˆ (3) 

998 1000

!

892 898

!

819  0.975842 832

!

The estimated number of survivors is 0.975842 (500)  487.92 . (A)

C/4 Study Manual—17th edition Copyright ©2014 ASM

Part III

Parametric Models

524

PART III. PARAMETRIC MODELS

This part of the course deals with building models for loss sizes and counts using well-known distribution functions. It also discusses evaluating the fit of these models. Parametric estimation has been on CAS 4B exams since time immemorial. This material, and the material on testing the fit with chi-square, should be familiar to someone who took a mathematical statistics course. But graphic tests, Kolmogorov-Smirnov tests, and Anderson-Darling tests will be new to most. Expect lots of exam questions on all estimation methods—method of moments, percentile matching, and especially maximum likelihood. This is the heart of the course. Bayesian estimation is so similar to Bayesian credibility that I postpone discussing it until the credibility part of the manual.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 30

Method of Moments Reading: Loss Models Fourth Edition 13.1

30.1

Introductory remarks

Note about parametric estimation Before beginning the main topic of this lesson, let’s discuss a general aspect of parametric estimation that students occasionally get confused about. Parametric estimation involves hypothesizing that the underlying distribution follows a probability distribution that can be expressed as a function with a small number of parameters. Examples of probability distributions abound; any of the distributions in the tables you get at the exam are examples. An exponential with parameter θ, a gamma with parameters α and θ, a lognormal distribution with parameters µ and σ, are all examples of probability distributions which may be used as hypotheses. Once you decide on which function you’d like to fit to the data, you then use one of the methods we’ll discuss in the upcoming lessons: method of moments, percentile matching, or maximum likelihood. (There are other methods, but they are not on the syllabus.) This method tells you which values of the parameters are the best ones in some sense. After all that is done, you have a probability distribution which you assume is the underlying distribution for the data. (I’m simplifying; as we learn in the later lessons, you actually fit many functions to the data, then choose the function which is the best fit in some sense.) At this point, you deal with the completely independent question you started with. This question may be “What is the overall mean?” or “What is average payment per loss with a deductible of 1000?” or “What is the TVaR?”. The question you are trying to answer does not affect the parameter fitting in any way. Suppose you have data from a coverage with a 10,000 policy limit. Your question may concern the average payment per loss on a 1000 deductible. The fact that your question deals with deductibles doesn’t affect the estimate of the parameters. As long as you’re willing to assume that the underlying distribution for both coverages (the policy limit and the deductible) is the same, you estimate the parameters based on the 10,000 policy limit, completely ignoring the fact that the resulting distribution will be used to estimate a 1000 deductible. Once you have the parameters, you use the methods we learned earlier in this course to answer your question about the 1000 deductible.

The method of moments There is approximately one method of moments question on each exam. In the method of moments, to fit a k parameter distribution, you equate the k sample moments with the same k moments of the distribution you’re fitting. When not told otherwise, equate the first k sample moments with the same k moments of the fitted distribution. If you are equating the second moments, instead of matching the second moment, you may match the variance, but use the empirical distribution P 1 P variance ( n1 ( x i − x¯ ) 2 ), which uses division by n, not the sample variance ( n−1 ( x i − x¯ ) 2 ), which uses division by n − 1. C/4 Study Manual—17th edition Copyright ©2014 ASM

525

30. METHOD OF MOMENTS

526

In the following, m is the first sample moment and t is the second sample moment:

Pn m

i1

Pn

xi

t

n

i1

x 2i

n

This notation is used in the textbook’s tables for various method of moments estimators, but you do not get these formulas at the exam.

30.2

The method of moments for various distributions

30.2.1

Exponential

The exponential distribution has one parameter θ, which is its mean. To fit an exponential distribution using the method of moments, set θ equal to the sample mean, or θˆ  m. Example 30A You are given a sample of 50 claims whose sum is 920. It is assumed that the underlying distribution of these claims is exponential. Determine the method of moments estimator of the loss elimination ratio at 10. Answer: The method of moments estimator of θ is x¯  920 50  18.4. For an exponential distribution, the loss elimination ratio is E[X ∧ d] LER ( d )   1 − e −d/θ E[X] In our case, LER (10)  1 − e −10/18.4  0.4193 .

30.2.2



Gamma

A gamma distribution has two parameters, α and θ. Let X be the random variable having a gamma distribution. Then E[X]  αθ Var ( X )  αθ2 Equating these moments to the sample mean x¯ and the biased sample variance σˆ 2 , αθ  x¯ αθ2  σˆ 2 Dividing the first line into the second, σˆ 2 t − m 2 θˆ   x¯ m

αˆ 

x¯ 2 m2  σˆ 2 t − m 2

Example 30B You have the following sample of loss sizes: 3,

5,

X

C/4 Study Manual—17th edition Copyright ©2014 ASM

8,

10,

x i  166

15,

27,

X

38,

x 2i  6196

60

(30.1)

30.2. THE METHOD OF MOMENTS FOR VARIOUS DISTRIBUTIONS

527

You are to fit the losses to a gamma distribution using the method of moments, matching the first two moments. Determine the estimate of θ. 166 8

Answer: The sample mean is x¯  equation (30.1),

30.2.3

 20.75. The empirical variance is

6196 8

343.9375 θˆ   16.5753 . 20.75

− 20.752  343.9375. By



Pareto

The following example shows how to calculate the method of moments estimators for a two-parameter Pareto. Example 30C You have the following sample of loss sizes: 3,

5,

X

8,

10,

20,

50,

X

x i  396

100,

200

x 2i  53,098

You are to fit the losses to a Pareto distribution using the method of moments, matching the first 2 moments. Estimate e (20) using the fitted distribution. Answer: Let’s develop a general formula. Matching first and second moments, θ m α−1

2θ 2 t ( α − 1)( α − 2)

By dividing the square of the first equation into the second we eliminate θ. 2 ( α − 1) t  2 α−2 m 2αm 2 − 2m 2  tα − 2t

α (2m 2 − t )  2m 2 − 2t αˆ 

2(m2 − t ) 2(t − m2 )  2m 2 − t t − 2m 2

Then we can calculate θˆ by plugging αˆ into the equation of means. αˆ − 1 

t t − 2m 2

θ m αˆ − 1 mt θˆ  t − 2m 2 C/4 Study Manual—17th edition Copyright ©2014 ASM

30. METHOD OF MOMENTS

528

Plugging in the given values m  396/8  49.5 and t  53,098/8  6637.25, we get 2 (6637.25 − 49.52 )  4.82165 6637.25 − 2 (49.52 ) θˆ  m ( αˆ − 1)  49.5 (4.82165 − 1)  189.17 αˆ 

20+189.17 Therefore, e (20)  4.82165−1  54.733 . (Actually, we did not have to calculate θˆ in order to calculate e (20) , 20 since e (20)  E[X] + α−1 and the method of moments sets E[X] equal to the sample mean automatically, 20 so we just needed 49.5 + 3.82165 .) 

The method of moments fails when t ≤ 2m 2 .

30.2.4

Lognormal

For the lognormal, the method of moments equates the first two sample moments to the first two moments of the lognormal distribution. We discussed this in Example 15C on page 259. This is not the same as equating the first two moments of the logarithm of the sample with lognormal parameters µ and σ. Example 30D Claim sizes are as follows: 1000,

1000,

1000,

1500,

2500,

5000

A lognormal distribution is fitted to the claim sizes using the method of moments, matching the first 2 moments. Determine the estimate of µ. Answer: Let m be the sample mean and t the sample second moment. This m and t notation is used in the tables in the appendix of Loss Models, but the formulas involving m and t are not given to you in the tables you get on the exam. m  x¯  2000 t

1 6



10002 + 10002 + 10002 + 15002 + 25002 + 50002



 6,083,333 31 We equate these moments to the moments of a lognormal. 2

e µ+0.5σ  2000 e

2µ+2σ 2

 6,083,333 13

µ + 0.5σ2  ln 2000 2µ + 2σ 2  ln 6,083,333 31

Then σˆ 2  ln 6,083,333 13 − 2 ln 2000  15.6211 − 2 (7.6009)  0.4193 µˆ  7.6009 − 0.5 (0.4193)  7.3913



A general formula is µˆ  2 ln m − 0.5 ln t

σˆ 2  −2 ln m + ln t

In Appendix A of Loss Models, an equivalent formula is given, in which you first estimate σˆ 2 : σˆ 2  ln t − 2 ln m C/4 Study Manual—17th edition Copyright ©2014 ASM

µˆ  ln m − 0.5σ2

30.2. THE METHOD OF MOMENTS FOR VARIOUS DISTRIBUTIONS

30.2.5

529

Uniform

The mean of a uniform distribution on [0, θ] is θ/2. The method of moments estimator equates θ/2 and ¯ so θˆ  2x. ¯ x, Example 30E You are given the following sample from a uniform distribution on [0, θ]: 1

2

4

6

10

Estimate θ using the method of moments. Answer: The sample mean is x¯  (1 + 2 + 4 + 6 + 10) /5  4.6, so θˆ  2 (4.6)  9.2 . Notice that the sample observation of 10 is impossible (probability is zero) for a uniform distribution on [0, 9.2], making this estimate implausible. 

30.2.6

Other distributions

It is possible to calculate method of moments estimators for the inverse Pareto, the inverse gamma, and the inverse Gaussian. An exam can also ask questions on other distributions, such as mixtures of exponentials. 10 2 Example 30F A sample of loss sizes has 10 i1 x i  100 and i1 x i  30,000. You are to fit this sample to a mixture of 2 exponential distributions with density function:

P

P

f (x ) 

w −x/5 1 − w −x/θ e + e 5 θ

Determine w and θ. Answer: x¯  10 and the second raw moment µ02  3000. The first and second moments of an exponential are θ and 2θ2 , and the moments of a mixture are the mixtures of the moments, so we have

Solve for w in the first equation.

Plug into the second equation.

2

w (5) + (1 − w ) θ  10

w 2 (5 ) + (1 − w )(2θ 2 )  3000





w (5 − θ ) + θ  10 10 − θ w 5−θ 10 − θ −5 50 + 2θ 2  3000 5−θ 5−θ

!

!

500 − 50θ − 10θ 2  15,000 − 3000θ 10θ 2 − 2950θ + 14,500  0 θ 2 − 295θ + 1450  0

295 ± 2952 − 4 (1450) θ 2  5 or 290

p

5 is rejected since w 

10−θ 5−θ .

C/4 Study Manual—17th edition Copyright ©2014 ASM

So θˆ  290 and wˆ 

10−290 5−290

 0.982456



30. METHOD OF MOMENTS

530

?

Quiz 30-1 A sample has mean 0.7 and raw second moment 0.6. It is fitted to a distribution with the following probability density function: f (x ) 

Γ ( a + b ) a−1 x (1 − x ) b−1 Γ(a )Γ(b )

0≤x≤1

Determine a and b using the method of moments.

30.3

Fitting other moments, and incomplete data

The textbook mentions that you can fit any k moments, not necessarily the first k. You can even use negative moments, and those may be preferable for inverse distributions, which may not even have positive moments. However, on an exam, unless you are told otherwise, when you have to fit one parameter, match the means; when you have to fit two parameters, match the first 2 moments. On the fall 2004 exam, there was a question specifically asking you to fit the first two negative moments. It is necessary to use negative moments for distributions which don’t have positive moments. Example 30G A sample of loss sizes has sample mean 22 and sample biased variance 361. You are to fit the sample to an exponential matching the second moment. Determine the estimate of the mean. Answer: The sample second raw moment is 361 + 222  845. We match this to the second moment of an exponential, 2θ 2 , to obtain 2θ 2  845 θˆ 

p

845/2  20.555

The answer is different from what you would get if you matched the variance (19), or if you matched the first moment (22).  Example 30H You observe a sample of 4 loss sizes: 25,

100,

200,

1000

You are to fit these to an inverse exponential distribution, matching the −1 moment. Determine the estimated mode. Answer: The sample −1 moment is the average of the reciprocals: µ0−1 

1/25 + 1/100 + 1/200 + 1/1000 0.04 + 0.01 + 0.005 + 0.001   0.014 4 4

ˆ  35.71 . The tables have E[X −1 ]  θ −1 Γ (2)  θ1 , so θˆ  1/0.014  71.4286. The mode is θ/2 You may be interested in what this inverse exponential looks like. The probability density function is plotted in Figure 30.1.  To use the method of moments with incomplete data, you must compute the moments of the truncated or censored distribution. Example 30I An automobile collision coverage has a deductible of 500. Claims for losses less than 500 are not submitted. You observe a sample of 20 losses totalling 18,000, including the deductible. The loss distribution is fitted to an exponential distribution using the method of moments, matching the mean. Estimate the average size of all losses, including losses below the deductible. C/4 Study Manual—17th edition Copyright ©2014 ASM

30.3. FITTING OTHER MOMENTS, AND INCOMPLETE DATA

531

f (x ) 0.01 0.009 0.008 0.007 0.006 0.005 0.004 0.003 0.002 0.001 0

0

20

40

60

80

100

120

140

160

180

200

x

Figure 30.1: Probability density function of inverse exponential with θ  71.4286. See example 30H.

Answer: For an exponential, e ( d )  θ for any d, so in particular e (500)  θ. e (500) is the average payment, not the average loss. We equate θ to the mean payment size. Since 18,000 includes deductibles, the average ˆ payment is 18,000 20 − 500  400. Therefore, θ  400. The average size of losses is estimated as 400 . If you find this answer remarkable, repeat to yourself 100 times “The exponential distribution is memoryless.”  Example 30J An automobile liability coverage has a policy limit of 10,000. You observe a sample of 20 claims on which total payments of 150,000 were made. The loss distribution is fitted to a Pareto distribution with α  2 using method of moments, matching the mean. Determine the estimate of mean loss size. Answer: To make the numbers easier to handle, scale everything by 0.0001, so that the policy limit is 1 and total payments are 15. We’ll multiply the answer by 10,000. The censored sample mean is 15 20  0.75. The tables tell us that θ * θ .1 − E[X ∧ 1]  α−1 θ+1

,!

1 θ θ+1

! α−1 +/ -

We equate this to 0.75. θ  0.75 θ+1 θ  0.75θ + 0.75 0.25θ  0.75 θˆ  3 Multiplying by 10,000, θˆ  30,000 and the mean is

C/4 Study Manual—17th edition Copyright ©2014 ASM

θˆ α−1



30,000 2−1

 30,000 .



30. METHOD OF MOMENTS

532

Calculator Tip Any statistical calculator can calculate means and variances using its statistical registers. As usual, we’ll discuss the TI-30XS/B Multiview calculator in particular. On this calculator, you enter the values in a column of the table, and then check the 1-variable statistics. The mean is statistic 2 and the standard deviation with division by n is statistic 4, labeled σx. Don’t use statistic 3, Sx, which is the standard deviation with division by n − 1, since our convention is not to use that for method of moments. P In many of our estimators, we used t, the raw second sample moment. You can use statistic 6, x2 , dividing it by n, to obtain t. For example, Example 30D, the procedure would be as follows: Clear table

data data 4

L1

s% 1000 s% 1000 s% 1500 s% 2500 s% 5000

Enter the 6 sample numbers1 in column 1

1000 enter

Calculate statistics

2nd [stat]1 (Set DATA=L1, FRQ=ONE) enter

Clear display of data table Calculate µˆ  2 ln m − 0.5 ln t Calculate σˆ 2  −2 ln m + ln t

clear clear

s% s%

2 ln 2nd [stat]32 ) -0.5 ln 2nd [stat]36 div 6 ) enter (−) 2 ln 2nd [stat]32 ) + ln 2nd [stat]36 div 6 )

L2

L3

L2

L3

L1(1)= L1 1500 2500 5000 L1(7)= 1-Var:L1,1 1:n=6 ¯ 2:x=2000 3↓Sx=1581.13883

Blank screen 2ln(¯x)−.5ln(

d

x2 7.391273244

P

d d

x2 7.391273244 −2ln(¯x)+ln(P x2 ÷ 0.41925843 2ln(¯x)−.5ln(

P

When moments other than first and second moments are needed, enter an appropriate formula in the second column to calculate the needed power of the numbers. For example, in Example 30H, you would enter the formula 1/L1 in column 2, and calculate statistics for column 2, using the mean. If you need to match two moments in which one is not the square of the other, you may use two columns of the data table for the two formulas.

1Instead of entering the six sample numbers separately, one could enter just the four unique numbers 1000, 1500, 2500, and 5000, and enter a second column with frequencies of 3, 1, 1, and 1 respectively. This is useful when there are a lot of repeats, but probably no faster for this short list. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 30

533

Table 30.1: Formula Summary for Method of Moments

Distribution

Formulas

Exponential

θˆ  m

Gamma

αˆ 

m2 t − m2 2(t − m2 ) αˆ  t − 2m 2

Pareto

t − m2 θˆ  m mt ˆ θ t − 2m 2

µˆ  2 ln m − 0.5 ln t

σˆ 2  −2 ln m + ln t

Lognormal (alt)

µˆ  ln m − 0.5 σˆ 2

σˆ 2  ln t − 2 ln m

Uniform on [0, θ]

θˆ  2m

Lognormal

Exercises 30.1.

Claim sizes are as follows: 100

200

500

1000

1500

2000

3700

An exponential is fitted to claim sizes using the method of moments. Let X be the claim size. Determine the estimate of Pr ( X > 500) . 30.2.

Claim sizes are as follows: 100

100

200

200

300

500

1000

2000

A two parameter Pareto distribution is fitted to the claim sizes using the method of moments. Let X be the claim size. Determine the estimate of Pr ( X > 500) . 30.3.

[1999 C4 Sample:8] Summary statistics for a sample of 100 losses are: Interval (0, 2,000] (2,000, 4,000] (4,000, 8,000] (8,000, 15,000] (15,000, ∞] Total

Number of Losses 39 22 17 12 10 100

Sum 38,065 63,816 96,447 137,595 331,831 667,754

Sum of Squares 52,170,078 194,241,387 572,753,313 1,628,670,023 17,906,839,238 20,354,674,039

A Pareto distribution is fit to this data using the method of moments. Determine the parameter estimates.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

30. METHOD OF MOMENTS

534

30.4.

[4-S00:36] You are given the following sample of five claims: 4

5

21

99

421

You fit a Pareto distribution using the method of moments. Determine the 95th percentile of the fitted distribution. (A) (B) (C) (D) (E)

Less than 380 At least 380, but less than 395 At least 395, but less than 410 At least 410, but less than 425 At least 425

30.5. You fit losses to a Pareto using the method of moments, matching on the first 2 moments. The sample mean is 500. P For what values of the empirical variance ( x i − x¯ ) 2 /n will a fit to a Pareto be possible? 30.6.

[C-S05:24] The following claim data were generated from a Pareto distribution: 130

20

350

218

1822

Using the method of moments to estimate the parameters of a Pareto distribution, calculate the limited expected value at 500. (A) (B) (C) (D) (E) 30.7.

Less than 250 At least 250, but less than 280 At least 280, but less than 310 At least 310, but less than 340 At least 340 [4-F00:2] The following data have been collected for a large insured: Year 1 2

Number Of Claims 100 200

Average Claim Size 10,000 12,500

Inflation increases the size of all claims by 10% per year. A Pareto distribution with parameters α  3 and θ is used to model the claim size distribution. Estimate θ for Year 3 using the method of moments. (A) 22,500

(B) 23,333

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 24,000

(D) 25,850

(E) 26,400

Exercises continue on the next page . . .

EXERCISES FOR LESSON 30 [4B-S98:5] (2 points) You are given the following:

30.8. •

535

The random variable X has the density function f ( x )  α ( x + 1) −α−1 , 0 < x < ∞, α > 0.



A random sample of size n is taken of the random variable X. ˜ the method of moments estimator of α. Assuming α > 1, determine α,

(A) X¯

(B)

X¯ ¯ X−1

(C)

X¯ ¯ X+1

(D)

¯ X−1 X¯

(E)

¯ X+1 X¯

[4B-F95:17] (3 points) You are given the following:

30.9. •

Losses follow a Pareto distribution with parameters θ (unknown) and α  3.



300 losses have been observed. ˜ the method of moments estimator of θ. Determine the variance of θ,

(A) 0.0025θ 2

(B) 0.0033θ 2

(C) 0.0050θ 2

(D) 0.0100θ 2

(E) 0.0133θ 2

30.10. Claim sizes are as follows: 10

20

30

40

50

A single-parameter Pareto distribution is fitted to the claim size using the method of moments. Both parameters, α and θ, are determined this way. Determine the estimate of α. 30.11. [4B-F97:18] (2 points) You are given the following: •

The random variable X has the density function f ( x )  αx −α−1 , 1 < x < ∞, α > 1.



A random sample is taken of the random variable X. ˜ the method of moments estimator of α, as the sample mean goes to infinity. Determine the limit of α,

(A) 0

(B) 1/2

(C) 1

(D) 2

(E) ∞

30.12. A study is performed of time from submission of a claim on a life insurance policy until payment. For the 15 claims in the study, the time in days is as follows: 11 15

11 17

12 17

13 19

13 19

13 25

14 131

14

Using this data, the cumulative hazard rate at time 15 days, H (15) , is estimated with two estimators: • •

Hˆ 1 (15) is the Nelson-Åalen estimator. Hˆ 2 (15) is the method of moments estimator if the data are fit to a single parameter Pareto with θ  10. Determine the absolute value of the difference between the estimates, | Hˆ 1 (15) − Hˆ 2 (15) |.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

30. METHOD OF MOMENTS

536

30.13. [Prior exam] A random sample of death records yields the following exact ages at death: 30, 50, 60, 60, 70, 90. The age at death of the population from which the sample is drawn follows a gamma distribution. The parameters of the gamma distribution are estimated using the method of moments. Determine the estimate of α. (A) 6.0

(B) 7.2

(C) 9.0

(D) 10.8

(E) 12.2

30.14. [Prior exam] You are given: (i) Five lives are observed from time t  0 until death. (ii) Deaths occur at t  3, 4, 4, 11, and 18. Assume the lives are subject to the probability function te −t/c , c2

f (t ) 

t > 0.

Determine c by the method of moments. (A)

1 4

(B)

1 2

(C) 1

(D) 2

(E) 4

30.15. Claim sizes are as follows: 10

20

40

80

100

200

200

500

A gamma distribution is fitted to the claim sizes using the method of moments. Determine the estimate of θ. 30.16. [4B-S90:34] (2 points) The observations 1000, 850, 750, 1100, 1250, and 900 are a random sample taken from a gamma distribution with unknown parameters α and θ. Let α˜ and θ˜ denote the method of moments estimators of α and θ, respectively. In what range does α˜ fall? (A) (B) (C) (D) (E)

α˜ < 30 30 ≤ α˜ < 40 40 ≤ α˜ < 50 50 ≤ α˜ < 60 60 ≤ α˜

30.17. [4B-S91:46] (2 points) The following is a sample of 10 claims: 1500 6000

3500 3800

1800 5500

4800 4200

3900 3000

The underlying distribution is assumed to be gamma, with parameters α and θ unknown. ˆ of θ fall? In what range does the method of moments estimate, θ, (A) (B) (C) (D) (E)

θˆ < 250 250 ≤ θˆ < 300 300 ≤ θˆ < 350 350 ≤ θˆ < 400 400 ≤ θˆ

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 30

537

30.18. [4B-S92:26] (1 point) The random variable X has the density function with parameter β given by f ( x; β )  where E[X] 

β√ 2 2π

2 1 xe − ( x/β ) /2 , β2

and the variance of X is 2β 2 −

x > 0, β > 0,

π 2 2β .

You are given the following observations of X: 4.9, 1.8, 3.4, 6.9, 4.0. Determine the method of moments estimate of β. (A) (B) (C) (D) (E)

Less than 3.00 At least 3.00, but less than 3.15 At least 3.15, but less than 3.30 At least 3.30, but less than 3.45 At least 3.45

30.19. [4B-S95:5] (2 points) You are given the following: •

The random variable X has the density function f ( x )  αx α−1 , 0 < x < 1, α > 0.



A random sample of three observations of X yields the values 0.40, 0.70, 0.90. ˜ the method of moments estimator of α. Determine the value of α,

(A) (B) (C) (D) (E)

Less than 0.5 At least 0.5, but less than 1.5 At least 1.5, but less than 2.5 At least 2.5, but less than 3.5 At least 3.5

30.20. [4B-F99:21] (2 points) You are given the following: •

The random variable X has the density function f ( x )  w f1 ( x ) + (1 − w ) f2 ( x ) ,

0 < x < ∞, 0 ≤ w ≤ 1.



A single observation of the random variable X yields the value 1.



∞ 0 R∞ 0

• •

R

x f1 ( x ) dx  1 x f2 ( x ) dx  2

f2 (1)  2 f1 (1) , 0

Determine the method of moments estimate of w. (A) 0

(B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 3

(C)

1 2

(D)

2 3

(E) 1

Exercises continue on the next page . . .

30. METHOD OF MOMENTS

538

30.21. [4B-S96:4] (2 points) You are given the following: 2 (θ θ2



The random variable X has the density function f ( x ) 



A random sample of two observations of X yields values 0.50 and 0.90.

− x ) , 0 < x < θ.

˜ the method of moments estimator of θ. Determine θ, (A) (B) (C) (D) (E)

Less than 0.45 At least 0.45, but less than 0.95 At least 0.95, but less than 1.45 At least 1.45, but less than 1.95 At least 1.95

30.22. A group of 3 individuals is hypothesized to have the following survival function: ω−x S (x )  ω

!2

0≤x≤ω

where ω is a parameter. Ages at death for the 3 individuals are 70, 60, and 80. Estimate ω using the method of moments. 30.23. You observe the following ages at death: 60

70

75

80

86

87

You fit the parameter c of the survival function S ( x )  1 − method of moments.



88

 x c 100 ,

0 ≤ x ≤ 100 to this data using the

Determine the fitted probability of survival past 80. 30.24. On an insurance coverage, you observe the following claim sizes: 5

8

11

15

18

You fit an inverse exponential to this data using the method of moments, matching the first negative moment. Determine the fitted probability of a claim higher than 10. 30.25. Losses have a mean of 5 and a variance of 15. You fit them to a mixture of 2 uniform distributions, one of them on [0, θ1 ] with weight 0.6, and the other on [0, θ2 ], θ2 > θ1 , with weight 0.4, using the method of moments. Determine θ1 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 30

539

Use the following information for questions 30.26 and 30.27: You are given the following: •

The random variable X has the density function f ( x )  0.5





1 −x/θ1 θ1 e



+ 0.5



1 −x/θ2 θ2 e



, 0 < x < ∞, 0 < θ1 ≤ θ2

A random sample taken of the random variable X has mean 1 and variance k.

30.26. [4B-F98:25] (3 points) If k is 3/2, determine the method of moments estimate of θ1 . (A) (B) (C) (D) (E)

Less than 1/5 At least 1/5, but less than 2/5 At least 2/5, but less than 3/5 At least 3/5, but less than 4/5 At least 4/5

30.27. [4B-F98:26] (2 points) Determine the values of k for which the method of moments estimates of θ1 and θ2 exist. (A) (B) (C) (D) (E)

0< 0< 0< 1≤ 1≤

k k m 2  5002  250,000 30.6.

The first two raw sample moments are 130 + 20 + 350 + 218 + 1822 m  508 5

!

1302 + 202 + 3502 + 2182 + 18222 t  701,401.6 5

!

Then t 2 ( α − 1) 701,401.6  2   2.717937 α−2 m 5082 2 2+  2.717937 α−2 1 2.717937 − 2   0.358968 α−2 2 1 αˆ  2 +  4.785761 0.358968 θˆ  m ( αˆ − 1)  (508)(3.785761)  1923.167

The limited expected value at 500 is

θ * θ .1 − E[X ∧ 500]  α−1 θ + 500

,

C/4 Study Manual—17th edition Copyright ©2014 ASM

! α−1 +/ -

30. METHOD OF MOMENTS

544

1923.167 *  508 .1 − 2423.167 ,

! 3.785761 +/  296.2123 -

(C)

30.7. To bring this up with inflation, we multiply average claim size in year 1 by 1.12 and in year 2 by 1.1. Then, average inflated claim size is 100 (12,100) + 200 (13,750)  13,200 300 Setting the Pareto mean equal to this, θ  13,200 2 θˆ  26,400 30.8.

(E)

This is a Pareto with θ  1. Equating sample and distribution means, 1 α−1 1 X¯ + 1 α˜  1 +  X¯ X¯

X¯  E[X] 

(E)

30.9. Only one parameter is estimated, so only one moment is matched: the mean. We will express the estimator in terms of the data. 300 θˆ 1 X  Xi 2 300 i1

1 θˆ  150

300 X

Xi

i1

So we must calculate the variance of the right hand side. Var ( θ˜ ) 

300 1 X 300 Var ( X i )  Var ( X i ) 1502 1502 i1

X i has a Pareto distribution, and its variance is the can be read from the tables in the appendix. 2θ 2 θ Var ( X i )  − ( α − 1)( α − 2) α−1

!2 

2θ 2 θ2 3 2 −  θ 2 (1) 4 4

Plugging this into the expression for Var ( θ˜ ) ,

300 Var ( θ˜ )  1502

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

3 2 θ  0.01θ 2 4



(D)

EXERCISE SOLUTIONS FOR LESSON 30

545

30.10. We equate the first 2 moments and solve for α and θ. αˆ θˆ X¯   30 αˆ − 1 102 + 202 + 302 + 402 + 502 E[Xˆ 2 ]   1100 5 αˆ θˆ 2  1100 αˆ − 2 ( αˆ − 1) 2 11  9 αˆ ( αˆ − 2) 11αˆ 2 − 22 αˆ  9 αˆ 2 − 18 αˆ + 9

2αˆ 2 − 4αˆ − 9  0 √ 4 ± 16 + 72 ˆ  3.3452 α 4

Notice that from the first moment equation. θˆ  30 (2.3452/3.3452)  21.03, which makes the fit impossible since data for a single-parameter Pareto may not be less than θ, and we are given the points 10 and 20. That’s why θ for a single-parameter Pareto should be given in advance rather than fitted. 30.11. The mean of a single-parameter Pareto is so

αθ α−1 .

Here we have a single-parameter Pareto with θ  1,

α  x¯ α−1 1 1+  x¯ α−1

α˜  1 +

As x¯ → ∞, the fraction goes to 0 and α˜ → 1 . (C)

1 ¯x − 1

30.12. For the Nelson-Åalen estimator, the risk sets and events through time 15 are yi 11 12 13 14 15

ri 15 13 12 9 7

si 2 1 3 2 1

The Nelson-Åalen estimate is then 2 1 3 2 1 Hˆ 1 (15)  + + + +  0.825336 15 13 12 9 7 Notice that the large observation 131 has no effect on this estimate, but will affect the method of moments estimate. (17) +2 (19) +25+131  344 The sample mean is 2 (11) +12+3 (13) +2 (14) +15+2 15 15 . Setting this equal to the mean of a αθ single parameter Pareto, α−1 , 10α 344  α−1 15 C/4 Study Manual—17th edition Copyright ©2014 ASM

30. METHOD OF MOMENTS

546 α 344  α − 1 150 150α  344α − 344

194α  344 344 αˆ   1.773196 194

Then we get S (15) and then H2 (15) .

! αˆ



! 1.773196

2 10 θ    0.48725 Sˆ (15)  15 15 3 Hˆ 2 (15)  − ln 0.487254  0.718970 The absolute difference is |0.825336 − 0.718970|  0.10637 . 30.13. Calculate the first two raw moments.

30 + 50 + 2 (60) + 70 + 90  60 6 302 + 502 + 2 (602 ) + 702 + 902 t  3933 13 6 σˆ 2  3933 13 − 602  333 31 x¯ 

Using formula (30.1), αˆ 

602 333 13

 10.8

(D)

30.14. This is a gamma distribution with α  2 and θ  c, so 2c  x¯  8, c  4 30.15. The first two raw sample moments are 10 + 20 + 40 + 80 + 100 + 2 (200) + 500  143.75 8 102 + 202 + 402 + 802 + 1002 + 2 (2002 ) + 5002  43,562.5 t 8 σˆ 2  43,562.5 − 143.752  22,894.44 x¯ 

By formula (30.1),

σˆ 2 θˆ   159.2935 x¯

30.16. 1000 + 850 + 750 + 1100 + 1250 + 900  975 6 10002 + 8502 + 7502 + 11002 + 12502 + 9002 t  977,916 23 6 σˆ 2  977,916 32 − 9752  27,291 23 x¯ 

α˜ 

C/4 Study Manual—17th edition Copyright ©2014 ASM

x¯ 2 9752   34.8321 2 σˆ 27,291 23

(B)

(E)

EXERCISE SOLUTIONS FOR LESSON 30

547

30.17. 1500 + 3500 + 1800 + 4800 + 3900 + 6000 + 3800 + 5500 + 4200 + 3000  3800 10 15002 + 35002 + 18002 + 48002 + 39002 + 60002 + 38002 + 55002 + 42002 + 30002  16,332,000 t 10 σˆ 2  16,332,000 − 38002  1,892,000 x¯ 

σˆ 2 1,892,000 θˆ    497.89 x¯ 3800

(E)

30.18. Let’s calculate the sample mean. The sample variance is not needed since only one parameter is being estimated. When they don’t specify which moments to use, always use the first k moments, where k is the number of parameters you’re fitting (here k  1). 4.9 + 1.8 + 3.4 + 6.9 + 4.0  4.2 X¯  5 Then set E[X] equal to 4.2. β√ 2π  4.2 2 2 (4.2) βˆ  √  3.3511 2π

(D)

30.19. You may recognize this distribution as a beta distribution with a  α, b  1, and the mean of a a α . If not, it’s not hard to integrate: beta is a+b  α+1 1

Z E[X] 

αx α dx 

0

α α+1

The sample mean is (0.4 + 0.7 + 0.9) /3  23 . Then α 2  α+1 3 α˜  2

(C)

30.20. The sample mean is 1. We have a mixture, and from the integrals, we have E[X]  w + 2 (1 − w )  2 − w. Equating the mean with the sample mean, w  1 . (E)

30.21. This is a scaled beta distribution with a  1, b  2, and we have to estimate θ. The mean of such a distribution is listed in the distribution tables as E[X]  aθ/ ( a + b ) , but if you didn’t notice this, you could integrate. E[X] 

2 θ2

2  2 θ 2 θ2 θ  3 

C/4 Study Manual—17th edition Copyright ©2014 ASM

θ

Z 0

x ( θ − x ) dx

Z

θ

0 θ3

2

θ

Z xθdx −



! 3

θ 3

0

2

x dx

!

30. METHOD OF MOMENTS

548

Equating to the sample mean 0.7, θ  0.7 3 θ˜  2.1

(E)

30.22. This is a scaled beta distribution with parameters θ  ω, a  1, b  2, and therefore has mean ω/3. Similar to exercise 30.21. The sample mean is 70, so the answer is ωˆ  3x¯  3 (70)  210 30.23. You may recognize this distribution as a beta with θ  100, a  1, b  c. Then the mean is θa/ ( a + b )  100/ (1 + c ) . If you didn’t recognize the distribution, integrate the survival function to get the expectation: 100

Z 0



1−

x 100

c

x 100 1− c+1 100 100  c+1



dx  −

 c+1 100 0

The sample mean is 17 (60 + 70 + 75 + 80 + 86 + 87 + 88)  78. So 100  78 c+1 100 − 1  0.28205 cˆ  78   80 0.28205 Pr ( X > 80)  1 −  0.6351 100 30.24. We match the first negative moment to the harmonic sample mean. 1

+ 1  5 θ θˆ  9.2914

E[X −1 ] 

1 8

+

1 11

5

+

1 15

+

1 18

 0.1076

Pr ( X > 10)  1 − F (10)  1 − e −9.2914/10  0.6051 30.25. Let X be the random variable. E[X]  0.3θ1 + 0.2θ2  5 5 − 0.3θ1 50 − 3θ1  θ2  0.2 2 2 2 0.6θ1 0.4θ2 E[X 2 ]  +  15 + 52  40 3 3 2θ12 + 13 (50 − 3θ1 ) 2  400 1300 5θ12 − 100θ1 + 0 3 √ 20 − 400 − 1040/3 θˆ 1   6.3485 2 ) Using the lower solution to the quadratic results in θˆ 2  50−3 (6.3485  15.4772 > 6.3485. The other solution 2 50−3 ( 13.6515 ) ˆ ˆ would lead to θ1  13.6515, θ2   4.5228 < 13.6515, which violates the condition θ2 > θ1 . 2 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 30

549

30.26. The distribution is an evenly weighted mixture of exponentials. For an exponential, the mean is θ and the second moment is 2θ 2 , so E[X]  0.5 ( θ1 + θ2 ) E[X 2 ]  0.5 2θ12 + 2θ22





 θ12 + θ22 Equating the two moments, θ1 + θ2  2 ⇒ θ2  2 − θ1 θ12 + θ22 

3 2

+ 12  2.5

Plug in θ2  2 − θ1 and solve the second equation for θ1 .

θ12 + 4 − 4θ1 + θ12  2.5

2θ12 − 4θ1 + 1.5  0 √ 4 − 16 − 12 2 1 θ1    4 4 2

(C)

The negative sign was selected for the quadratic solution since we are given θ1 ≤ θ2 . 30.27. Generalizing the quadratic of the previous exercise:

θ12 + (2 − θ1 ) 2  1 + k

θ12 + 4 − 4θ1 + θ12  1 + k

2θ12 − 4θ1 + (3 − k )  0

The discriminant must be at least 0 to obtain real solutions, but less than 16, since both solutions are positive. 16 − 4 (2)(3 − k ) ≥ 0 16 − 24 + 8k ≥ 0 k≥1

16 − 4 (2)(3 − k ) < 16

1≤k 0 and m 2 > 0, ( m 1 + m 2 ) 2 > m 12 + m 22 , so the fraction cannot be greater than 1. On the other hand, if m 1  0 or m 2  0, the fraction is 1. So the least upper bound of the quotient is 1. The square of the coefficient of variation is this fraction minus 1, so the least upper bound of the coefficient of variation √ √ is 4 (1) − 1  3 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 30

551

30.32. The sample mean is 2000. The second moment is 5002 + 10002 + 15002 + 25002 + 45002  6,000,000 5 Setting mean and second moment of lognormal equal to these, σ2  ln 2000  7.6009 2 2µ + 2σ2  ln 6,000,000  15.60727 µ+

Double the first equation and subtract from the second. σ2  15.60727 − 2 (7.6009)  0.40547 0.40547 µ  7.6009 −  7.3982 2 ! ln 4500 − 7.3982 Pr ( X > 4500)  1 − Φ √ 0.40547  1 − Φ (1.59)  1 − 0.9441  0.0559

(B)

30.33. Equating moments, µ + 0.5σ2  ln 386  5.95584 2µ + 2σ2  ln 457,480.2  13.03349 σ2  13.03349 − 2 (5.95584)  1.121814 µˆ  5.95584 − 0.5 (1.121814)  5.39493 √ σˆ  1.121814  1.05916 Now we use the formula for E[X ∧ 500]. The method of moments sets E[X]  x¯  386. ln 500 − µ − σ2 ln 500 − µ + * / E[X ∧ 500]  E[X]Φ + x .1 − Φ σ σ

!

!

, ln 500 − 5.39493 −  386Φ 1.05916

 386Φ (−0.29) + 500 1 − Φ (0.77)



-

1.059162

!

! ln 500 − 5.39493 + * / + 500 .1 − Φ 1.05916 , -



 386 (1 − 0.6141) + 500 (1 − 0.7794)  259.3

(D)

30.34. It is unusual to match negative moments, but for an inverse Pareto, there is little choice, since positive moments 1 and higher are not defined. In any case, you must follow the exam question’s instructions, and default to using the first n moments for method of moments only when the question doesn’t specify which moments to use. The sample −1 moment is (We’ll use m and t for negative moments here, even though usually they mean positive moments.) m C/4 Study Manual—17th edition Copyright ©2014 ASM

1 1 1 1 1 1 1 + + + + +  0.017094 6 15 45 140 250 560 1340





30. METHOD OF MOMENTS

552

The sample −2 moment is t

1 1 1 1 1 1 1 + + + + +  0.00083484 6 152 452 1402 2502 5602 13402





Let m be the sample −1 moment and t the sample −2 moment. We equate these with the fitted moments, using the formulas from the Loss Models Appendix: 1 θ ( τ − 1) 2  2 θ ( τ − 1)( τ − 2) 2 ( τ − 1)  τ−2  2m 2 ( τ − 1)

m t t m2 t ( τ − 1) − t

−t  (2m 2 − t )( τ − 1) −t τˆ  1 + 2m 2 − t ! −0.00083484 1+  4.33358 2 (0.0170942 ) − 0.00083484 1 θˆ  m ( τ − 1) 1   17.549 (C) 0.017094 (3.33358)

30.35. The sample mean is 81 8i1 x i  96, where x i are the 8 payments. The mean of a uniform distribution censored at 150 is most easily computed using conditional expectation. Let I be the condition that a loss is below 150, and X the censored loss size. Then

P

E[X ∧ 150]  E E[X ∧ 150 | I]

f

g

 Pr ( X < 150) E[X ∧ 150 | X < 150] + Pr ( X ≥ 150) E[X ∧ 150 | X ≥ 150]

Note that the expected value of X given that it is less than 150, since it is uniform, is 75. 150 150 (75) + 1 − (150) θ θ (150)(75)  150 − θ



E[X ∧ 150] 



We now equate the sample mean to the distribution mean. 96  150 − 54  θˆ 

C/4 Study Manual—17th edition Copyright ©2014 ASM

(150)(75)

(150)(75)

θ

θ

(150)(75) 54

 208 31

(E)

QUIZ SOLUTIONS FOR LESSON 30

553

Quiz Solutions 30-1.

This beta distribution is in the tables. We match the moments: a a+b a ( a + 1) 0.6  ( a + b )( a + b + 1) 0.7 

From the first equation, a  7b/3. Substituting this into the second equation,

(7b/3)(7b + 3) /3 (10b/3)(10b + 3) /3 7b (7b + 3)  10b (10b + 3)

0.6 

60b + 18  49b + 21 3 bˆ  11

C/4 Study Manual—17th edition Copyright ©2014 ASM

aˆ 

7 11

554

C/4 Study Manual—17th edition Copyright ©2014 ASM

30. METHOD OF MOMENTS

Lesson 31

Percentile Matching Reading: Loss Models Fourth Edition 13.1 Questions on percentile matching have appeared on half the exams. In addition, there have been some questions on the smoothed empirical percentile.

31.1

Smoothed empirical percentile

Recall from Section 1.2 that for discrete distributions, percentiles may not be well defined. For individual data, the empirical distribution, which you’re trying to fit, is discrete. Therefore, the smoothed empirical percentile is used for percentile matching. The smoothed empirical percentile is obtained by adding 1 to the sample size and multiplying by the percent. If n is the sample size, x ( k ) is the k th order statistic of the sample1 πˆ p is the smoothed 100p th percentile then

πˆ p  x ( n+1) p

if ( n + 1) p is an integer. If not, interpolate between two order statistics: Smoothed Empirical Percentile πˆ p  ( n + 1) p − a x ( a+1) + a + 1 − ( n + 1) p x ( a )







(31.1)



where a  b ( n + 1) pc, the greatest integer less than or equal to ( n + 1) p. The smoothed empirical percentile is not defined if the product is less than 1 or greater than n. Example 31A You are given the following losses: 100,

150,

150,

175,

200,

250,

300,

400,

500,

800

Determine the smoothed empirical estimate of the 75th percentile. Answer: 0.75 (11)  8.25. Interpolate between the 8th and 9th losses: 0.75 (400) + 0.25 (500)  425 .



1The order statistics of a sample x1 ,. . . ,x n are the points put in order: x (1) ≤ x (2) ≤ · · · ≤ x ( n ) . There are two common notations for order statistics: one of them uses parenthesized subscripts, like we do here, and the other uses the letter y, either small or capitalized, as in y1 ≤ y2 ≤ · · · ≤ y n . Loss Models uses parenthesized subscripts in the definition of smoothed empirical percentiles, but uses Yk when discussing estimating percentiles using simulation. C/4 Study Manual—17th edition Copyright ©2014 ASM

555

31. PERCENTILE MATCHING

556

?

Quiz 31-1 You are given the following sample: 20

150

70

80

70

Determine the 30th smoothed empirical percentile of the sample. To use percentile matching for a k-parameter distribution, arbitrarily select k percentiles and match the percentiles of the sample to the parametric distribution’s percentiles. For individual data, use the smoothed empirical percentiles of the sample. For grouped data, the percentile should be picked at a group endpoint, but if not, the ogive may be used. Needless to say, exam questions will specify the percentiles to use.

31.2

Percentile matching for various distributions

Percentile matching requires matching the distribution function F ( x ) to a percentile from the sample. This method cannot be easily used if F ( x ) is a difficult integral. For example, do not expect to be asked to match percentiles for a gamma distribution, except for the special case of an exponential. If the distribution is in the tables, an alternative method for matching percentiles is to set V aR p ( X ) equal to the empirical 100p th percentile. The tables provide VaRp ( X ) for almost any distribution for which a closed form expression exists.

31.2.1

Exponential

Let’s solve the distribution function of an exponential with mean θ for x, the 100p th percentile of an exponential distribution. F ( x; θ )  1 − e −x/θ  p e −x/θ  1 − p x  − ln (1 − p ) θ x  −θ ln (1 − p )

There is no need to derive this on an exam, since it is in your tables, which say that VaRp ( X )  −θ ln (1− p ) . Remember, VaRp ( X ) is the 100p th percentile of X. .  Therefore, if π p is the observed percentile, θˆ  −π p ln (1 − p ) . Example 31B You are given the following information about 100 claim sizes: Interval

Number of claims

[ 0, 1000) [1000, 2000) [2000, 5000) [5000, ∞ )

45 32 17 6

You are to fit an exponential distribution to this sample matching the 45th percentile. Using the fitted distribution, estimate the probability that a claim will be greater than 5000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

31.2. PERCENTILE MATCHING FOR VARIOUS DISTRIBUTIONS

557

Answer: The observed 45th percentile, the point at which we have 45/100 claims, is 1000.2 By the above   formula, the estimate is −1000/ ln (1 − 0.45)  1672.70. Then Pr ( X > 5000)  e −5000/1672.70  0.0503 . 

31.2.2

Weibull

Suppose you’re given two percentiles, π p and π q . For a Weibull, the 100p th percentile is determined from F ( x; θ, τ )  1 − e − ( x/θ )  p τ

or

x θ



 − ln (1 − p ) .

So πp



θ ! πq τ θ

 − ln (1 − p )

(*)

 − ln (1 − q )

Dividing the second equation into the first, πp πq

!τ 

ln (1 − p ) ln (1 − q )

ln ln (1 − p ) / ln (1 − q )



τˆ  You can then use (*) to solve for θ:

(**)

θˆ  pˆ τ



ln ( π p /π q ) πp

− ln (1 − p )

The tables you get on the exam have VaRp ( X )  θ − ln (1 − p ) such equation by another and start the derivation at (**).



 1/τ

, so you can immediately divide one

Example 31C You observe a sample of four losses: 15, 20, 30, 60. Fit these losses to a Weibull distribution, matching the 40th and 80th smoothed empirical percentiles. Determine the estimates of θ and τ. Answer: The smoothed empirical 40th percentile is order statistic 0.4 (5)  2, or the second observation 20, and the smoothed empirical 80th percentile is order statistic 0.8 (5)  4, or the fourth observation 60. Using our formulas with p  0.4 and q  0.8, ln (ln 0.6/ ln 0.2) ln 0.3174   1.0446 ln (20/60) −1.0986 20 20 θˆ  1.0446   38.0453 √ 0.525689 − ln 0.6 τˆ 



2Technically, the 45th percentile is 1000− , since there may be some claims equal to exactly 1000, in which case 1000 could be any percentile between the 45th and the 77th . However, we have no information for the number of claims exactly equal to 1000, and for an exponential or any continuous distribution, F (1000− )  F (1000) . C/4 Study Manual—17th edition Copyright ©2014 ASM

31. PERCENTILE MATCHING

558

31.2.3

Lognormal

To match a lognormal’s percentiles, log the statistics and match the corresponding percentiles of a normal distribution, using the normal distribution table. In other words, if z p is the 100p th percentile of a standard normal distribution, and π p , π q are the observed percentiles, then set µ + z p σ  ln π p µ + z q σ  ln π q and solve for µ and σ. Subtracting the first equation from the second: σ ( z q − z p )  ln π q − ln π p ln π q − ln π p σˆ  zq − zp

(31.2)

µˆ  ln π p − z p σˆ

(31.3)

Example 31D You observe a sample of five losses: 15, 20, 30, 60, 200. Fit these losses to a lognormal distribution, matching the 20th and 80th percentiles. Estimate the mean of the distribution. Answer: Φ−1 (0.8)  0.842, and Φ−1 (0.2)  −0.842. The smoothed empirical 80th percentile is order statistic 0.8 (6)  4.8, or the weighted average of the 4th and 5th order statistics, 0.8 (200) + 0.2 (60)  172. The smoothed empirical 20th percentile is order statistic 0.2 (6)  1.2, or the weighted average of the first and second order statistics, 0.2 (20) + 0.8 (15)  16. Using the formulas, ln 172 − ln 16 0.842 − (−0.842) 2.3749   1.410 1.684 µˆ  ln 16 − (−0.842)(1.410)  3.960 σˆ 

The mean is then estimated as 2

2

ˆ σˆ e µ+0.5  e 3.960+0.5 (1.410 )  e 4.954  141.7

31.2.4



Other distributions

Some of the other distributions which can be estimated without a numerical method using percentile matching are loglogistic, paralogistic, inverse Weibull, and inverse exponential. For some distributions, if you are given one parameter, you can estimate the other one with percentile matching. Example 31E Claim sizes follow the distribution f (x ) 

τx τ−1 (1 + x ) τ+1

x ≥ 0, τ > 0 unknown

The following sample of claim sizes is available: 10, C/4 Study Manual—17th edition Copyright ©2014 ASM

15,

25,

40,

80,

200,

350

31.3. PERCENTILE MATCHING WITH INCOMPLETE DATA

559

Percentile matching at the 50th percentile is used to estimate τ. Determine the resulting estimate of Pr ( X ≤ 100) . Answer: This is an inverse Pareto with θ  1. Thus F ( x; τ )  have

x τ 1+x .

The sample 50th percentile is 40. We

! τˆ

40  0.5 1 + 40 40 τˆ ln  ln 0.5 41 ln 0.5 τˆ   28.0710 ln 40/41 Pr ( X ≤ 100) 

100 1 + 100

! 28.0710

 0.7563

Alteratively, the tables have VaRp ( X )  θ ( p −1/τ − 1) −1 , so

(0.5−1/τ − 1) −1  40

0.5−1/τ  1.025 1 ln 1.025 −  τ ln 0.5

ˆ resulting in the same τ.

?



Quiz 31-2 You are given a sample from an inverse Weibull distribution in which the 20th smoothed empirical percentile is 20 and the 70th smoothed empirical percentile is 560. Estimate the parameter θ of this distribution by matching these two percentiles. Table 31.1 summarizes the formulas we developed for percentile matching. However, it is probably not worth memorizing. They don’t ask a lot of percentile matching questions, and if they do, they could easily ask one about a distribution not listed in the table.

31.3

Percentile matching with incomplete data

With censored data, you must select percentiles within the range of the uncensored portion of your data, but having done so, no adjustment is needed. Example 31F For an insurance coverage with a policy limit of 10,000, you observe the following claims: 1000,

1700,

2000,

3500,

4200,

5000,

7300,

and 4 claims at the limit

You fit an inverse exponential to the data matching at the median. Determine the estimate for θ. Answer: There are 11 observations. The smoothed empirical median is order statistic 0.5 (12)  6, or 5000. We could match any percentile up to the one corresponding to the 7th observation, or p  7/12, but C/4 Study Manual—17th edition Copyright ©2014 ASM

31. PERCENTILE MATCHING

560

Table 31.1: Percentile matching formulas for some distributions

Distribution

Estimators πp

Exponential

θˆ  −

Inverse exponential

θˆ  −π p ln p

ln (1 − p )

ln ln (1 − p ) / ln (1 − q )



τˆ 

Weibull

ln ( π p /π q )

θˆ  pˆ τ σˆ 

Lognormal



πp − ln (1 − p )

ln π q − ln π p zq − zp

µˆ  ln π p − z p σˆ

nothing higher. The median of an inverse exponential, π0.5 , is e −θ/π0.5  0.5 θ  ln 2 π0.5 θ π0.5  ln 2 so we have θ  5000 ln 2 θˆ  5000 ln 2  3465.74



With truncated data, you must match the percentile of the conditional distribution. Example 31G For an insurance coverage with a deductible of 500, you observe the following loss sizes (including the deductible): 600,

1000,

1200,

2000,

3000

You fit a Pareto with α  1 to the data matching at the median. Determine the estimate for θ. Answer: The median of the truncated data is 1200. We need the median of the truncated Pareto to equal C/4 Study Manual—17th edition Copyright ©2014 ASM

31.4. MATCHING A PERCENTILE AND A MOMENT

561

1200. The median of a Pareto truncated at 500, π0.5 , is computed as follows: S ( π0.5 )  0.5 S (500) θ/ ( θ + π0.5 )  0.5 θ/ ( θ + 500) θ + 500  0.5 θ + π0.5 θ + 500  0.5θ + 0.5π0.5 0.5θ  0.5π0.5 − 500 θ  π0.5 − 1000

Here, the observed median π0.5  1200, so θˆ  1200 − 1000  200 .

31.4



Matching a percentile and a moment

To fit a two-parameter distribution, it is possible to match one moment and one percentile. Example 31H You are given the following five observations: 4

8

18

85

105

Fit a lognormal distribution matching the sample median and mean. Determine µ and σ. Answer: The sample median is 18, so ln 18 is matched to the median of ln X, which is µ, and µˆ  ln 18  2.8904 . 2 The sample mean is (4 + 8 + 18 + 85 + 105) /5  44, which is matched to e µ+0.5σ , so 2

e ln 18+0.5σ  44 ln 18 + 0.5σ2  ln 44 σˆ 

p

2 (ln 44 − ln 18)  1.3370



Exercises 31.1.

You are given the following ten observations: 6

8

9

12

18

25

60

130

800

1000

Determine the smoothed empirical estimate of the 20th percentile. 31.2.

[4B-F92:11] (1 point) A random sample of 20 observations has been ordered as follows: 12, 16, 20, 23, 26, 28, 30, 32, 33, 35, 36, 38, 39, 40, 41, 43, 45, 47, 50, 57

Determine the smoothed empirical estimate of the 60th percentile. (A) 32.4

(B) 36.0

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 38.0

(D) 38.4

(E) 38.6 Exercises continue on the next page . . .

31. PERCENTILE MATCHING

562

31.3.

[4B-S93:30] (1 point)

The following 20 wind losses, recorded in millions of dollars, occurred in 1992: 1, 6,

1, 6,

1, 8,

1, 10,

1, 13,

2, 14,

2, 15,

3, 18,

3, 22,

4, 25

Calculate the 75th smoothed empirical percentile. (A) 12.25

(B) 13.00

(C) 13.25

(D) 13.75

(E) 14.00

31.4. [4B-S96:1] (1 point) A random sample of 8 observations from a continuous distribution yields the following values: 0.20,

0.40,

1.0,

3.0,

4.6,

5.8,

6.2,

9.4

Determine the smoothed empirical estimate of the 40th percentile. (A) 1.0 31.5.

(B) 1.4

(C) 1.8

(D) 2.2

(E) 3.0

[4-S00:2] You are given the following random sample of ten claims: 46 1078

121 1452

493 2054

738 2199

775 3207

Determine the smoothed empirical estimate of the 90th percentile. (A) (B) (C) (D) (E) 31.6.

Less than 2150 At least 2150, but less than 2500 At least 2500, but less than 2850 At least 2850, but less than 3200 At least 3200 [4-F02:2] You are given the following claim data for automobile policies: 200

255

295

320

360

420

440

490

500

520

1020

Calculate the smoothed empirical estimate of the 45th percentile. (A) 358

(B) 371

(C) 384

(D) 390

(E) 396

31.7. [4B-F97:29] (2 points) You wish to calculate the smoothed empirical estimate of the 100p th sample percentile based on a random sample of 4 observations. Determine all values of p for which the 100p th sample percentile is defined. (A) (B) (C) (D) (E) 31.8.

0≤p≤1 0.20 ≤ p ≤ 0.80 0.25 ≤ p ≤ 0.75 0.33 ≤ p ≤ 0.67 p  0.50 For a sample of 10 claims, x1 , . . . , x10 you are given:

(i) The smoothed empirical estimate of the 55th percentile is 380. (ii) The smoothed empirical estimate of the 60th percentile is 402. Determine y6 , the sixth order statistic of the sample. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 31

31.9.

563

[4-F04:2] You are given the following random sample of 13 claim amounts:

99

133

175

216

250

277

651

698

735

745

791

906

947

Determine the smoothed empirical estimate of the 35th percentile. (A) 219.4

(B) 231.3

(C) 234.7

(D) 246.6

(E) 256.8

31.10. [160-83-94:13] From a complete mortality study of five lives, you are given: (i) The underlying survival distribution is exponential. (ii) Deaths occur at times 1, 2, t3 , t4 , 9, where 2 < t3 < t4 < 9. (iii) The parameter of the exponential is estimated as 4.7619 both by the method of moments and by percentile matching at the median. Calculate t4 . (A) 6.5

(B) 7.0

(C) 7.5

(D) 8.0

(E) 8.5

31.11. A sample from an exponential distribution yields 100, 300, 1000, 5000, 10000. Percentile matching is used to estimate the parameter. The smoothed empirical 40th percentile is matched to the corresponding percentile of the fitted distribution. Determine the estimate for the mean. 31.12. [160-S91:20] For a complete study of five lives, you are given: (i) Deaths occur at times t  2, 3, 3, 5, 7. (ii) The underlying survival distribution is S ( t )  4−λt , t ≥ 0. ˆ Using percentile matching at the median, calculate λ.

(A) 0.111

(B) 0.125

(C) 0.143

(D) 0.167

(E) 0.333

31.13. [4-F02:37] You are given: (i) Losses follow an exponential distribution with mean θ. (ii) A random sample of losses is distributed as follows: Loss Range (0–100] (100–200] (200–400] (400–750] (750–1000] (1000–1500] Total

Number of Losses 32 21 27 16 2 2 100

Estimate θ by matching at the 80th percentile. (A) 249

(B) 253

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 257

(D) 260

(E) 263

Exercises continue on the next page . . .

31. PERCENTILE MATCHING

564

31.14. [4-F00:39] You are given the following information about a study of individual claims: (i) 20th percentile = 18.25 (ii) 80th percentile = 35.80 Parameters µ and σ of a lognormal distribution are estimated using percentile matching. Determine the probability that a claim is greater than 30 using the fitted lognormal distribution. (A) 0.34

(B) 0.36

(C) 0.38

(D) 0.40

(E) 0.42

31.15. A sample from a lognormal distribution yields 100, 200, 500, 1000. Parameters µ and σ are estimated by matching the 40th and 80th smoothed empirical percentiles with the lognormal’s percentiles. Determine the estimate for µ. 31.16. [4B-S96:17] (2 points) You are given the following: •

Losses follow a Pareto distribution with parameters θ and α.



The 10th percentile of the distribution is θ − k, where k is a constant.



The 90th percentile of the distribution is 5θ − 3k.

Determine α. (A) (B) (C) (D) (E)

Less than 1.25 At least 1.25, but less than 1.75 At least 1.75, but less than 2.25 At least 2.25, but less than 2.75 At least 2.75

31.17. [4B-F96:3] (2 points) You are given the following: •

Losses follow a Weibull distribution with parameters θ and τ.



The 25th percentile of the distribution is 1,000.



The 75th percentile of the distribution is 100,000. Determine τ.

(A) (B) (C) (D) (E)

Less than 0.4 At least 0.4, but less than 0.6 At least 0.6, but less than 0.8 At least 0.8, but less than 1.0 At least 1.0

31.18. [160-81-96:13] From a sample of 9 lives diagnosed with terminal cancer, you are given: (i) The deaths occurred at times 4, 6, 6, 6, 7, 7, 9, 9, 9 (ii) The underlying distribution was Weibull Calculate τˆ using percentile matching at the 25th and 75th percentiles.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 31

565

31.19. [160-81-99:12] From a laboratory study of 9 lives, death times are 5, 5, 8, 8, 9, 9, 12, 12, 19. A Weibull distribution is fitted to the data by percentile matching at the 20th and 60th percentiles. Calculate Sˆ (8) , the estimated probability of surviving to time 8. (A) 0.45

(B) 0.50

(C) 0.55

(D) 0.60

(E) 0.65

31.20. [4-S00:32] You are given the following information about a sample of data: (i) (ii) (iii) (iv) (v)

Mean = 35,000 Standard deviation = 75,000 Median = 10,000 90th percentile = 100,000 The sample is assumed to be from a Weibull distribution.

Determine the percentile matching estimate of the parameter τ. (A) (B) (C) (D) (E)

Less than 0.25 At least 0.25, but less than 0.35 At least 0.35, but less than 0.45 At least 0.45, but less than 0.55 At least 0.55

31.21. You are given the following sample of ten losses: 1200

1300

1500

1500

1800

2000

5000

10,000

50,000

100,000

Let Fˆ (2000) be the estimate of F (2000) obtained from the Nelson-Åalen estimator, and let F˜ (2000) be the estimate of F (2000) obtained from fitting an exponential distribution to the data matching the 40th smoothed empirical percentile to the fitted 40th percentile. Determine Fˆ (2000) − F˜ (2000) . 31.22. A policy has an ordinary deductible of 500 and a maximum covered loss of 5000. You experience the following 10 payments: 20

100

200

500

1000

1500

2000

2500

4500

4500

You fit a single parameter Pareto distribution with θ  500 to the ground up distribution, using percentile matching so as to match the observed median. Determine the estimated parameter α. 31.23. A policy has an ordinary deductible of 500. You observe the following eight payments: 100

100

200

300

400

600

800

1000

You fit a two-parameter Pareto distribution with θ  1000 and α unknown to the ground up distribution using percentile matching so as to match the observed median. Determine the resulting estimate of the mean of the ground up distribution.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

31. PERCENTILE MATCHING

566

31.24. A group of three individuals is hypothesized to have the following survival function: S (x ) 

ω−x ω

!2

0≤x≤ω

where ω is a parameter. Ages at death for the three individuals are 70, 60, and 80. Estimate ω using percentile matching at the median. 31.25. [160-S89:16] A sample of 9 lives was observed from the time of diagnosis until death. You are given: (i) Times of death were 1, 1, 2, 3, 3, 3, 4, 4, and 5. (ii) The lives were subject to a survival distribution S ( t )  at 2 + bt + 1,

0 ≤ t ≤ k.

Determine the parameter a by matching the smoothed empirical estimate of the 25th and 75th percentiles with the corresponding distribution percentiles. (A) −0.04

(B) −0.03

(C) −0.02

(D) −0.01

(E) 0

31.26. [160-F90:19] From a complete study of 10 laboratory mice, you are given: (i) The times of death, in days, are 2, 3, 4, 5, 5, 6, 8, 10, 11, 11. (ii) The operative survival model is assumed to be uniform. (iii) ωˆ med is the estimate of the uniform parameter ω using percentile matching at the median. (iv) ωˆ mom is the estimate of the uniform parameter ω using the method of moments. Calculate ωˆ med − ωˆ mom . (A) −2

(B) −1

(C) 0

(D) 1

(E) 2

31.27. [4B-S90:44] (2 points) A random sample of claims has been drawn from a distribution with the following cumulative distribution function: F ( x; γ, θ )  1 −

1 . 1 + ( x/θ ) γ

In the sample, 80% of the claim amounts exceed 100 and 20% of the claim amounts exceed 400. ˆ the estimator of θ, by percentile matching. Determine θ, (A) (B) (C) (D) (E)

θˆ < 175 175 ≤ θˆ < 225 225 ≤ θˆ < 275 275 ≤ θˆ < 325 325 ≤ θˆ

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 31

567

31.28. [4-F03:2] You are given: (i)

Losses follow a loglogistic distribution with cumulative distribution function: F (x ) 

(ii)

( x/θ ) γ 1 + ( x/θ ) γ

The sample of losses is 10

35

80

86

90

120

158

Calculate the estimate of θ by percentile matching, using the centile estimates. (A) (B) (C) (D) (E)

180 40th

and

200 80th

210

1500

empirically smoothed per-

Less than 77 At least 77, but less than 87 At least 87, but less than 97 At least 97, but less than 107 At least 107

31.29. [4-F04:30] You are given the following data: 0.49

0.51

0.66

1.82

3.71

5.20

7.62

12.66

35.24

You use the method of percentile matching at the 40th and 80th percentiles to fit an Inverse Weibull distribution to these data. Determine the estimate of θ. (A) (B) (C) (D) (E)

Less than 1.35 At least 1.35, but less than 1.45 At least 1.45, but less than 1.55 At least 1.55, but less than 1.65 At least 1.65

Additional released exam questions: C-F05:3, C-F06:1, C-S07:10,24,28

Solutions 31.1. We want the 0.2 ( n + 1)  0.2 (11)  2.2 observation. We interpolate between the second and third observations: 0.8 (8) + 0.2 (9)  8.2 31.2.

0.6 (21)  12.6. πˆ 0.60  38 (0.4) + 39 (0.6)  38.6 . (E)

31.3.

0.75 (21)  15.75, and 0.75 (14) + 0.25 (13)  13.75 . (D)

31.4.

0.4 (9)  3.6.

31.5.

90th

The

0.6 (3) + 0.4 (1)  2.2 . (D)

percentile is the 0.9 (11)  9.9th order statistic, or 0.9 (3207) + 0.1 (2199)  3106.2 (D)

31.6. There are eleven claims, so the smoothed empirical estimate is the 0.45 (12)  5.4th order statistic. Interpolating between the fifth and sixth order statistics, 0.4 (420) + 0.6 (360)  384 (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

31. PERCENTILE MATCHING

568

31.7. The 100p th percentile is defined only if 1 ≤ 5p ≤ 4, since otherwise we would be unable to extrapolate from the lowest or the highest observation. Dividing through by 5, 0.2 ≤ p ≤ 0.8 . (B) 31.8. Both the 55th and 60th percentiles can be expressed in terms of the 6th and 7th order statistics. We solve the two linear equations in the two unknowns y6 and y7 . 0.55 (11)  6.05

0.05y7 + 0.95y6  380

0.60 (11)  6.60

0.60y7 + 0.40y6  402 −11y6  −4158 y6  378

31.9. We want the 0.35 (14)  4.9th order statistic. We interpolate between the fourth item (216) and the fifth item (250): 0.1 (216) + 0.9 (250)  246.6 . (D) 31.10. We start with the median. The sample median is t3 , so e −t3 /4.7619  0.5 t3  ln 2 4.7619 t3  3.3 Since the exponential’s parameter is its mean, we know the sample mean is 4.7619. Thus 1+2+3.3+ t4 +9  5 (4.7619) , and t4  8.5095 . (E) 31.11. The smoothed empirical percentile is 0.6 (300) + 0.4 (1000)  580. Use the VaR in the tables to get the 40th fitted percentile. VaR0.4 ( X )  −θ ln (1 − 0.4)  580 580 θˆ  −  1135.42 ln 0.6 31.12. The sample median is 3. 4−3λ 

1 2

1  − ln 2 2 ln 2 1 λˆ   3 ln 4 6

−3λ ln 4  ln

(D)

31.13. 400 is the 80th empirical percentile. We set VaR0.8 ( X ) equal to 400. VaR0.8 ( X )  400 −θ ln (1 − 0.8)  400 400 θˆ  −  248.534 ln 0.2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

EXERCISE SOLUTIONS FOR LESSON 31

569

31.14. The key to percentile matching for a lognormal distribution is that the log of the values has a normal distribution. Logging is a monotonic transformation and therefore does not affect the percentiles, so if x is the 100p th percentile of a lognormal distribution with parameters µ and σ, then ln x is the 100p th percentile of a normal distribution with parameters µ and σ. Accordingly, we log 18.25 and 35.80. ln 18.25  2.9042 and ln 35.80  3.5779. Now we standardize the distribution by subtracting µ and dividing by σ, so that we can look it up on our tables. This means 3.5779 − µ  Φ−1 (0.80)  0.842 σ 3.5779  µ + 0.842σ

2.9042 − µ  Φ−1 (0.20)  −0.842 σ 2.9042  µ − 0.842σ

Adding these together, we get µˆ 

2.9042+3.5779 2

 3.241. Then we solve for σ from the first equation,

µˆ − 2.9042 3.241 − 2.9042   0.4 0.842 0.842 If you prefer to memorize formulas, you could use formulas (31.2) and (31.3): σˆ 

ln 35.80 − ln 18.25  0.4 0.842 − (−0.842) µˆ  ln 18.25 − (−0.842)(0.4)  3.241 σˆ 

Then

D ( X > 30)  Pr ln X > ln 30 Pr 



! ln X − µ ln 30 − 3.241 + * / >  Pr . σ 0.4 , ! * ln X − µ > 0.4+/  Pr . σ ,  1 − Φ (0.4)  1 − 0.6554  0.3446

(A)

31.15. Match logarithms. Using smoothed empirical percentiles, 0.4 (5)  2 and 0.8 (5)  4, so the 40th ln 200− µˆ  ln 1000−µˆ  percentile is the second element and the 80th percentile is the fourth, so Φ  0.4 and Φ  σˆ σˆ 0.8. We have Φ−1 (0.4)  −0.25 and Φ−1 (0.8)  0.842. Continuing ln 200 − µˆ ln 1000 − µˆ  −0.25 and  0.842 σˆ σˆ ln 1000 − µˆ 0.842 − ln 200 − µˆ 0.25 6.9078 − µˆ 0.842 − 5.2983 − µˆ 0.25

−0.842 (5.2983) + 0.842µˆ  0.25 (6.9078) − 0.25µˆ 0.842 (5.2983) + 0.25 (6.9078)  5.667 µˆ  0.842 + 0.25 If you prefer to memorize formulas, use formulas (31.2) and (31.3): σˆ 

ln 1000 − ln 200  1.4738 0.842 − (−0.25)

µˆ  ln 200 − (−0.25)(1.4738)  5.667 C/4 Study Manual—17th edition Copyright ©2014 ASM

31. PERCENTILE MATCHING

570

31.16. We want to match F ( x )  1 −

θ α θ+x .

F (θ − k )  1 −

θ 2θ − k



θ F (5θ − 3k )  1 − 6θ − 3k

 0.1



θ 2θ − k

or

 0.9

θ 6θ − 3k

or

!α !α

 0.9  0.1

Dividing the second expression into the first: 3α  9 αˆ  2

(C)

31.17. We set the VaRs equal to 1,000 and 100,000. VaR0.25 ( X )  1000

θ − ln 0.75



VaR0.75 ( X )  100,000 − ln 0.25 − ln 0.75

! 1/τ

θ − ln 0.25



 1/τ

 1/τ

 1000

 100,000

 100

ln 0.25  4.8188 ln 0.75 ln 4.8188  0.3415 τˆ  ln 100

100τ 

(A)

31.18. The 25th smoothed empirical percentile is the 0.25 (10)  2.5 order statistic, the average of the second and third order statistics, or 6. The 75th smoothed empirical percentile is the 0.75 (10)  7.5 order statistic, the average of the seventh and eighth order statistics, or 9. VaR0.25 ( X )  6

θ − ln 0.75

VaR0.75 ( X )  9

θ − ln 0.25

! 1/τ







− ln 0.75 6 6  9 − ln 0.25 9 ln 0.20752 τˆ   3.878 ln 2/3



 1/τ

6

 1/τ

9

ln 0.75  0.20752 ln 0.25

31.19. The 20th smoothed empirical percentile is the 2nd order statistic or 5, and the 60th smoothed empirical percentile is the 6th order statistic or 9.

C/4 Study Manual—17th edition Copyright ©2014 ASM

VaR0.2 ( X )  5

θ − ln 0.8

VaR0.6 ( X )  9

θ − ln 0.4





 1/τ

5

 1/τ

9

EXERCISE SOLUTIONS FOR LESSON 31

9 5

9 − ln 0.8  5 − ln 0.4





τ ln

5 θ

571

! 1/τ

ln 0.4  4.10628 ln 0.8

9  ln 4.10628  1.41252 5 ln 1.41252 τˆ   2.40311 ln 9/5



 − ln 0.8

5  (− ln 0.8) 1/τ θ 5  9.3334 θˆ  (− ln 0.8) 1/τˆ

2.40311 ˆ τˆ Sˆ (8)  e − (8/θ )  e − (8/9.3334)  0.5014

(B)

31.20. The information about the mean and standard deviation is just to confuse you. We match the percentiles: VaR0.5 ( X )  10,000 VaR0.9 ( X )  100,000

! 1/τ

θ − ln 0.5



θ − ln 0.1



 1/τ

 1/τ

 10,000

 100,000

− ln 0.1 − ln 0.5 − ln 0.1 2.30259 10τ    3.32193 − ln 0.5 0.69315 τ ln 10  ln 3.32193 ln 3.32193 1.20055 τˆ    0.52139 ln 10 2.30259 10 

(D)

31.21. With Nelson-Åalen: 1 1 2 1 1 Hˆ (2000)  + + + +  0.827778 10 9 8 6 5 Fˆ (2000)  1 − e −0.827778  0.5630 Since (0.4)(11)  4.4, the 40th smoothed empirical percentile is 0.4x (5) + 0.6x (4)  0.4 (1800) + 0.6 (1500)  1620, which is matched to −θ ln 0.6, so θ˜  −1620/ ln 0.6  3171.34 and F˜ (2000)  1 − e −2000/3171.34  0.4678. The difference is 0.5630 − 0.4678  0.0952 .

31.22. The observed median claim is 1250, equivalent to a 1750 loss.

0.5  Pr ( X ≥ 1750 | X ≥ 500)  Pr ( X ≥ 1750)

This is because a single-parameter Pareto puts no weight on values below θ, which is here 500, so Pr ( X < 500)  0. 0.5  Pr ( X ≥ 1750)  C/4 Study Manual—17th edition Copyright ©2014 ASM

500 1750



31. PERCENTILE MATCHING

572

αˆ 

ln 0.5 ln

500  1750

 0.5533

31.23. In the previous exercise,the probability of a claim being below the deductible, based on the fitted distribution, was zero. We therefore did not have to consider conditional distributions. Unlike the previous exercise, the probability of a claim being below the deductible in this exercise, based on the fitted distribution, is not zero. To match medians, we must therefore match the observed median to the median of a conditional fitted distribution. The observed median claim is 350, equivalent to an 850 loss. Pr ( X > 850 | X > 500)  0.5 1000 α 1850 1000 α 1500 !α

1500 1850

 0.5  0.5

αˆ  The mean is then

1000 ˆ α−1



1000 2.3051

ln 0.5 ln

1500 1850

 3.3051

 433.82 .

31.24. The median is 70.

!2

70 ω − 70 0.5  S (70)   1− ω ω √ 70  1 − 0.5  0.2929 ω ωˆ  239



2

The absurd result shows that the hypothesized survival function is not reasonable for human lives. 31.25. The smoothed 25th percentile is 0.25 (10)  2.5, or the average of the second and third order statistics, or 0.5 (1 + 2)  1.5. Similarly, the smoothed 75th percentile is halfway between the seventh and eighth order statistics, or 0.5 (4 + 4)  4. So we have the two equations (keeping in mind that S ( t )  1 − F ( t ) , so F ( t )  41 ⇒ S ( t )  34 ): 3 4 1 2 4 a + 4b + 1  4

1.52 a + 1.5b + 1 

Eliminate b by multiplying the first equation by 4 and the second by 1.5: 9a + 6b  −1

24a + 6b  −1.125

15a  −0.125

a  −0.008333

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

EXERCISE SOLUTIONS FOR LESSON 31

573

31.26. For a uniform distribution, the mean and the median are both

ω 2.

So

ωˆ mom  2 x¯  13 ωˆ med  2 (5.5)  11 11 − 13  −2

(A)

31.27. If you recognized that the distribution is loglogistic, you could use VaRp ( X ) from the tables. However, we will assume you didn’t recognize the distribution and will work it out from first principles. We are given 1−

1 1+

100 γ θ

1 1+

100 γ θ !γ

100 θ

 0.2  0.8  0.25

(31.4)

and similarly with the other percentile to match to 1−

1 1+

400 γ θ

1 1+

400 γ θ !γ

400 θ

 0.8  0.2 4

(31.5)

Dividing equation (31.5) by (31.4), we get 4γ  16 γ2 100  0.5 θ θˆ  200

(B)

31.28. There are 11 items in the sample, so the smoothed 40th percentile is the 0.4 (12)  4.8th order statistic, or 0.8 (90) + 0.2 (86)  89.2, and the smoothed 80th percentile is the 0.8 (12)  9.6th order statistic, or 0.6 (210) + 0.4 (200)  206. We equate the VaRs. VaR0.4 ( X )  θ (0.4−1 − 1) −1/γ  89.2 VaR0.8 ( X )  θ (0.8−1 − 1) −1/γ  206 206 1/0.8 − 1  89.2 1/0.4 − 1

206 89.2



6

γˆ  C/4 Study Manual—17th edition Copyright ©2014 ASM

! −1/γ

ln 6  2.1407 206 ln 89.2



0.25 1.5

! −1/γ

 6γ

31. PERCENTILE MATCHING

574

θˆ 

206 −1/2.1407  107.801 0.8−1 − 1

(E)

31.29. The smoothed empirical percentiles are the 4th and 8th order statistics, or 1.82 and 12.66. We match these to the distribution’s percentiles: VaR0.4 ( X )  θ (− ln 0.4) −1/τ  1.82

VaR0.8 ( X )  θ (− ln 0.8) −1/τ  12.66 1.82 − ln 0.4  12.66 − ln 0.8

! −1/τ



− ln 0.8  − ln 0.4

! 1/τ

ln 0.8 1.82  12.66 ln 0.4 1.82 ln 0.8 τ ln  ln 12.66 ln 0.4 ln (ln 0.8/ ln 0.4) τˆ  ln (1.82/12.66) −1.41252   0.728248 −1.93961 1.82 θˆ  (− ln 0.4) −1/τ 1.82  1.6141  (− ln 0.4) −1/0.728248

(D)

Quiz Solutions 31-1.

0.3 (6)  1.8. Interpolate between the first and second order statistics: 0.2 (20) + 0.8 (70)  60 .

31-2.

Using VaR from the tables, VaR0.2 ( X )  θ (− ln 0.2) −1/τ  20

Var0.7 ( X )  θ (− ln 0.7) −1/τ  560

Dividing the first into the second, − ln 0.7 − ln 0.2

! −1/τ

 28

1 ln 0.7 − ln  ln 28 τ ln 0.2 ln (ln 0.7) / (ln 0.2)



!



1.506815   0.452198 ln 28 ln 28 20 θˆ   57.289 (− ln 0.2) −1/0.452198 τˆ  −

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 32

Maximum Likelihood Estimators Reading: Loss Models Fourth Edition 13.2 On recent released exams, there have been 2–3 maximum likelihood questions per exam, plus questions on related topics such as the delta method. In all likelihood (pun intended), there will be at least as many such questions on future exams. In maximum likelihood estimation, we maximize the probability or likelihood of observing whatever we observed. The likelihood of a point (as in individual data) is the density function (or the probability function if the model is discrete), while the likelihood of an interval (as in grouped data) is the probability of data being in the interval, which can be expressed as a difference of cdf’s. If all observations are independent (as they usually are in exam questions), the likelihood L is the product of the likelihoods of the individual observations. To maximize, we log L, since this makes it easier to work with the function, arriving at the loglikelihood l. We differentiate l with respect to each parameter, getting a vector of partial derivatives. Set these equal to zero and solve. Here is an example of this algorithm: Example 32A [4B-S92:27] (2 points) The random variable X has the density function with parameter β given by f ( x; β ) 

2 1 xe −0.5 ( x/β ) , β2

x > 0, β > 0,

β√ where E[X]  2 2π and the variance of X is 2β 2 − π2 β2 . You are given the following observations of X: 4.9, 1.8, 3.4, 6.9, 4.0. Determine the maximum likelihood estimator of β. (A) Less than 3.00 (B) At least 3.00, but less than 3.15 (C) At least 3.15, but less than 3.30 (D) At least 3.30, but less than 3.45 (E) At least 3.45

Answer: The given density function is that of a Weibull, but with a different parametrization from that in the Loss Models appendix: τ  2, β  √θ . E[X] is unnecessary for this question, but was needed for a 2 companion question (see exercise 30.18 on page 537). As mentioned there, Var ( X ) was provided only to confuse you. Let x i , i  1, . . . , 5, be the 5 observations of X. If we knew the parameter β, then the probability of each 2 x i would be the density function f ( x | β )  β12 xe −0.5 ( x/β ) . The probability of all five observations would be the product of this expression over all five x 0i s, or L (β) 

5 P 1 *Y + − 2β12 5i1 x 2i x e i β 10 , i1 -

L ( β ) is called the likelihood function. The maximum likelihood method maximizes this expression over β. To make the maximization easier, we log both sides; this has no effect on maximization. ln L ( β ) is called C/4 Study Manual—17th edition Copyright ©2014 ASM

575

32. MAXIMUM LIKELIHOOD ESTIMATORS

576

the loglikelihood function and is denoted by l ( β ) . l ( β )  −10 ln β + ln

5 Y i1

xi −

5 1 X 2 xi . 2β2 i1

This is maximized by differentiating with respect to β. 5 dl −10 1 X 2 xi  0  + 3 dβ β β i1

−10β 2 +

s

P5

β

i1

10

x 2i

5 X i1

x 2i  0

 3.2003

(C)



Let’s go over the steps of fitting the parameters θ to observations using maximum likelihood. 1. Write down a formula for the likelihood of the observations in terms of θ. Generally this will be the probability or the density of the observations. 2. Log the formula. 3. Maximize the formula. Usually this means differentiating the formula and setting the derivative equal to zero. Although most practical examples require a numerical algorithm to maximize l, there are an ample number of cases that can be done directly, making this estimator a good candidate for exam questions. In order to prepare you for true-false questions, I’ll list all the reasons you should use maximum likelihood rather than method of moments or percentile matching. (However, to my knowledge, no true-false questions of this nature have ever been asked on exams.) This list is taken from the beginning of the reading. 1. Method of moments and percentile matching only use a limited number of features from the sample. For many distributions (e.g., Pareto), there’s much more to the sample than mean and variance. (On the other hand, for distributions like the normal, there isn’t anything more than mean or variance. Guess what? In that case, maximum likelihood is the same as method of moments.) 2. Method of moments and percentile matching cannot be used, or are hard to use, with combined data. How do you combine observations from different deductibles, for example? 3. Method of moments and percentile matching cannot always handle truncation and censoring. 4. Method of moments and percentile matching require arbitrary decisions on which moments or percentiles to use. To this list, let’s add the potential pitfalls of maximum likelihood. 1. There is no guarantee that the likelihood can be maximized—it may go to infinity. 2. There may be more than one maximum. 3. There may be local maxima in addition to the global maximum; these must be avoided. 4. It may not be possible to find the maximum by setting the partial derivatives to zero; a numerical algorithm may be necessary. C/4 Study Manual—17th edition Copyright ©2014 ASM

32.1. DEFINING THE LIKELIHOOD

32.1

577

Defining the likelihood

Let’s define the likelihood of an observation.

32.1.1

Individual data

If the data are being fit to a discrete distribution, the likelihood of an observation is its probability. Example 32B Data are being fit to a distribution with the following probability function: n

Probability of n

0 1 2

θ 0.9 − θ 0.1

The only possible values for the distribution are 0, 1, and 2, and 0 ≤ θ ≤ 0.9. You have the following data Value

Number of observations

0 1 2

15 35 10

Fit the value θ using maximum likelihood. Answer: The probability of fifteen 0’s given θ is θ 15 , so θ 15 is the likelihood of fifteen 0’s. Similarly, (0.9 − θ ) 35 is the likelihood of thirty-five 1’s, and (0.1) 10 is the likelihood of ten 2’s. Multiplying them together, we get the likelihood function L ( θ )  θ 15 (0.9 − θ ) 35 (0.1) 10 Technically, since the order of the observations wasn’t specified, this should be multiplied by a trinomial coefficient for all possible orders, namely 60!/ (15!35!10!) . So the real likelihood is L (θ) 

60! θ 15 (0.9 − θ ) 35 (0.1) 10 15!35!10!

However, the only time we will need to be so precise is when doing likelihood ratio tests. If we just want to maximize, positive multiplicative constants do not affect the maximum and may always be dropped. In fact, we’ll even drop (0.1) 10 and write L ( θ )  θ 15 (0.9 − θ ) 35 Notice that anything not a function of the parameters, including functions of the observations, is a constant and may be dropped. Let’s finish up the problem. Logging l ( θ )  15 ln θ + 35 ln (0.9 − θ ) C/4 Study Manual—17th edition Copyright ©2014 ASM

32. MAXIMUM LIKELIHOOD ESTIMATORS

578

Differentiating dl 15 35  − 0 dθ θ 0.9 − θ 35 15  θ 0.9 − θ 13.5 − 15θ  35θ 13.5 θ  0.27 50



If the data are being fit to a continuous distribution, the probability density function of the data is the likelihood. Example 32A is an example of this. As mentioned in Example 32B, when setting up the likelihood function, we can ignore positive multiplicative constants (and certainly additive constants). Multiplying by a positive constant does not affect the point at which the function reaches the maximum. Anything not involving the parameters being estiQ mated is a constant, even if it involves the observations. In Example 32A, we could have ignored 5i1 x i and written the likelihood function as L (β) 

32.1.2

1 −0.5 P5 x 2 /β2 i1 i e β 10

Grouped data

For grouped data, the likelihood that an observation is in the interval ( c j−1 , c j ) is F ( c j ) − F ( c j−1 ) . Example 32C You are given the following data for claim sizes: Claim size

Number of claims

Under 1000 [1000, 2000) 2000 and up

10 5 3

The data are fit to an exponential distribution using maximum likelihood. Determine the fitted mean. Answer: The likelihood of an observation under 1000 given θ of the exponential is F (1000)  1 − e −1000/θ . The likelihood of an observation between 1000 and 2000 is F (2000) − F (1000)  e −1000/θ − e −2000/θ The likelihood of an observation greater than 2000 is 1 − F (2000)  e −2000/θ Thus the likelihood of the data is L ( θ )  1 − e −1000/θ



 1 − e −1000/θ



C/4 Study Manual—17th edition Copyright ©2014 ASM

 10   15 

e −1000/θ − e −2000/θ e −1000/θ

 11

5 

e −2000/θ

3

32.1. DEFINING THE LIKELIHOOD

579

We can let x  e −1000/θ , since if we maximize for x, we maximize for θ. This will make differentiation easier. So L ( x )  (1 − x ) 15 x 11

l ( x )  15 ln (1 − x ) + 11 ln x dl 15 11 − + 0 dx 1−x x 15x  11 − 11x 11 x 26 11 e −1000/θ  26 1000 θ−  1162.52 ln (11/26)

32.1.3



Censoring

For censored data, such as data in the presence of a policy limit, treat it like grouped data: the likelihood function is the probability of being beyond the censoring point. Example 32D All numbers in this example are expressed in thousands. An auto liability coverage has a policy limit of 100. Claim sizes observed are 20

45

50

80

100

where the claim at 100 was for exactly 100. In addition, there are 2 claims above the limit. The data are fitted to an exponential distribution using maximum likelihood. Determine the mean of the fitted distribution. Answer: Later on we will learn a shortcut for this problem. The likelihood of the 5 claims x1 through x5 is f ( x i ; θ )  limit is 1 − F (100)  e −100/θ . So the likelihood of the data is L (θ) 

e −x i /θ θ .

The likelihood of the 2 claims at the

e (−20−45−50−80−100−100−100)/θ e −495/θ  5 θ θ5

Then logging and differentiating 495 − 5 ln θ θ dl 495 5  2 − 0 dθ θ θ 495 θ  99 5

l (θ)  −



If data are left censored, that is, you know an observation is below d but you don’t know its exact value, then the likelihood is F ( d ) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

32. MAXIMUM LIKELIHOOD ESTIMATORS

580

?

Quiz 32-1 An auto collision coverage has a deductible of 100. All losses, including those below the deductible, are reported. Loss sizes observed (including the deductible) are 120

150

270

625

1000

In addition, there were 3 losses below the deductible. The data are fitted to an inverse exponential distribution using maximum likelihood. Determine the median of the fitted distribution.

32.1.4

Truncation

For truncated data, the observation is conditional on being outside the truncated range. If data are left truncated at d, such as for a policy with a deductible of d, so that you only see the observation x if it is greater than d, the likelihood of x is f (x ) f (x )  Pr ( X > d ) 1 − F ( d )

For the more rare case of right truncated data—you do not see an observation x unless it is under u—the likelihood of x is f (x ) f (x )  Pr ( X < u ) F ( u ) Example 32E An auto collision coverage has a deductible of 500. Claim sizes observed are 600

800

1000

1600

3000

The data are fitted to an exponential distribution using maximum likelihood. Determine the mean of the fitted distribution. Answer: Later on we will learn a shortcut for this problem. The likelihood of each observation x i is f (xi ) e −x i /θ e (500−x i )/θ   1 − F (500) θe −500/θ θ

The likelihood function of the data is L (θ) 

e (2500−600−800−1000−1600−3000)/θ e −4500/θ  5 θ θ5

Logging and differentiating, 4500 − 5 ln θ θ dl 4500 5  − 0 dθ θ θ2 4500 θ  900 5 Note that 900 is the mean of the ground up distribution, the distribution ignoring the deductible. For the memoryless exponential, it is also the mean of the payments distribution, but this would not be true for another distribution. If we wanted to fit the payments distribution, we would first subtract the deductible 500 from each claim size before fitting.  l (θ)  −

C/4 Study Manual—17th edition Copyright ©2014 ASM

32.1. DEFINING THE LIKELIHOOD

581

Table 32.1: Likelihood Formulas

Discrete distribution, individual data

px

Continuous distribution, individual data

f (x )

Grouped data

F ( c j ) − F ( c j−1 )

Individual data censored from above at u Individual data censored from below at d

1 − F ( u ) for censored observations F ( d ) for censored observations

Individual data truncated from above at u

f (x ) F (u )

Individual data truncated from below at d

f (x ) 1 − F (d )

32.1.5

Combination of censoring and truncation

These concepts can be combined; data that are both left truncated and right censored would have likelihood SS ((ud )) . Grouped data that are between d and c j in the presence of truncation at d has likelihood F ( c j ) −F ( d ) 1−F ( d ) .

Example 32F An insurance coverage has an ordinary deductible of 1,000, a maximum covered loss of 50,000, and 90% coinsurance. There are 20 payments x i , i  1, . . . , 20 of amounts less than 44,100 and 5 payments of 44,100. Write the likelihood function for the ground-up distribution. Answer: To translate payments into losses, we divide by 0.9 and then add back the 1,000 deductible, the reverse order of application. In other words, if Y P is the payment variable and X is the loss variable, Y P  0.9 ( X − 1000) , so X  Y P /0.9 + 1000. The five censored observations have likelihood 1 − F (50,000) . The truncation at 1000 makes the probability of observation 1 − F (1000) . The likelihood is therefore:

Q20 L (θ) 

i1

f ( x i /0.9 + 1000)





1 − F (1000)

1 − F (50,000)

 25

Table 32.1 summarizes likelihood formulas for the uncombined cases.

C/4 Study Manual—17th edition Copyright ©2014 ASM

5



32. MAXIMUM LIKELIHOOD ESTIMATORS

582

Exercises The following claim experience is observed:

32.1.

Claim Size

Number of Claims

0–1000 1000–2000 2000– ∞

20 10 5

You fit an exponential to the claim size distribution using maximum likelihood. Determine the estimate of mean claim size. [4B-F95:16] (2 points) You are given the following:

32.2. •



Six losses have been recorded in thousands of dollars and are grouped as follows: Interval (0,2) [2,5) [5, ∞)

Number of Losses 2 4 0

The random variable X underlying the losses, in thousands, has the density function f ( x )  λe −λx ,

x > 0, λ > 0.

Which of the following functions must be maximized to find the maximum likelihood estimate of λ? (A) (B) (C) (D) (E)

(1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 ( e −5λ ) 6 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 (1 − e −5λ ) 6 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 ( e −5λ ) −6 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 (1 − e −5λ ) −6

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 32 [4B-S95:15] (2 points) You are given the following:

32.3. •

583

Six observed losses have been recorded in thousands of dollars and are grouped as follows: Number of Interval Losses [0,2) [2,5)

2 4



There is no record of the number of losses at or above 5000.



The random variable underlying the observed losses, in thousands, has the density function f ( x )  λe −λx ,

x > 0, λ > 0.

Which of the following functions must be maximized to find the maximum likelihood estimate of λ? (A) (B) (C) (D) (E)

(1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 ( e −5λ ) 6 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 (1 − e −5λ ) 6 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 ( e −5λ ) −6 (1 − e −2λ ) 2 ( e −2λ − e −5λ ) 4 (1 − e −5λ ) −6

32.4. [1999 C4 Sample:37] Twenty widgets are tested until they fail. The failure times are distributed as follows: Interval

Number Failing

(0, 1] (1, 2] (2, 3] (3, 4] (4, 5] (5, ∞)

2 3 8 6 1 0

The exponential survival function S ( t )  exp (−λt ) is used to model this process.

Determine the maximum likelihood estimate of λ. 32.5.

[4-F02:23] You are given:

(i) Losses follow an exponential distribution with mean θ. (ii) A random sample of 20 losses is distributed as follows: Loss Range [0, 1000] (1000, 2000] (2000, ∞)

Frequency 7 6 7

Calculate the maximum likelihood estimate of θ. (A) (B) (C) (D) (E)

Less than 1950 At least 1950, but less than 2100 At least 2100, but less than 2250 At least 2250, but less than 2400 At least 2400

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

32. MAXIMUM LIKELIHOOD ESTIMATORS

584

You are given the following losses:

32.6.

1000

1200

1600

2100

2200

2400

You fit an inverse exponential to the loss distribution using maximum likelihood. Determine the resulting estimate of the probability of a loss below 1000. 32.7. [4B-S91:36] (2 points) Given the cumulative distribution function F ( x )  x p , for 0 ≤ x ≤ 1, and a sample of n observations, x1 , x 2 , . . . , x n , what is the maximum likelihood estimator of p? −n (A) Pn i1 ln ( x i )

n (B) Pn i1 ln ( x i )

(C)

Q

n i1

xi

 n1

(D)

Pn

i1 ln ( x i )

n

(E)

Pn

i1

xi

n

[4B-F95:4] (3 points) You are given the following:

32.8.

2 ( θ − x ) , 0 < x < θ. θ2 A random sample of two observations of X yields the values 0.50 and 0.90.



The random variable X has the density function f ( x ) 



ˆ the maximum likelihood estimate of θ. Determine θ, (A) (B) (C) (D) (E)

[4B-F96:5] (3 points) You are given the following:

32.9. •

Less than 0.45 At least 0.45, but less than 0.95 At least 0.95, but less than 1.45 At least 1.45, but less than 1.95 At least 1.95

The random variable X has the density function f (x )  √



β 2πx 3

e −β

2 /2x

,

0 < x < ∞, β > 0.

A random sample of three observations of X yields the values 100, 150, and 200. ˆ the maximum likelihood estimate of β. Determine β,

(A) (B) (C) (D) (E)

Less than 11.5 At least 11.5, but less than 12.0 At least 12.0, but less than 12.5 At least 12.5, but less than 13.0 At least 13.0

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 32

585

32.10. [4B-F97:6] (2 points) You are given the following: •

The random variable X has one of the following three density functions: f1 ( x )  1, 0d

using maximum likelihood. Let X be the underlying random variable. Determine the fitted value of Pr ( X > 10) . (A) 0.63

(B) 0.67

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.71

(D) 0.75

(E) 0.79 Exercises continue on the next page . . .

EXERCISES FOR LESSON 32

591

32.24. Insurance covers 2 groups. Losses for the first group follow a two-parameter Pareto distribution with parameters θ  1000 and α. Losses for the second group follow a single-parameter Pareto distribution with the same two parameters. You observe the following losses for the first group: 700,

1000,

1500

1700,

4000

and the following losses for the second group: 1200,

The parameter α is fitted using maximum likelihood. Determine α. (A) 1.08

(B) 1.42

(C) 1.78

(D) 2.25

(E) 2.79

The following exercise is unlikely to appear on an exam. However, it is in the 306 sample questions, so I am including it. 32.25. [4-F03:30] For a sample of 15 losses, you are given: (i)

Observed Number of Losses

Interval

(ii)

(0, 2] (2, 5] (5, ∞)

5 5 5

Losses follow the uniform distribution on (0, θ ) .

Estimate θ by minimizing the function

P

(E j −O j ) 2 Oj

, where E j is the expected number of losses in the j th

interval and O j is the observed number of losses in the j th interval. (A) 6.0

(B) 6.4

(C) 6.8

(D) 7.2

(E) 7.6

Additional released exam questions: C-F05:5, CAS3-F06:2, C-F06:5,33, C-S07:1

Solutions 32.1. L ( θ )  1 − e −1000/θ



 1 − e −1000/θ



 1 − e −1000/θ



 20   20   30

e −1000/θ − e −2000/θ 1 − e −1000/θ

20000 ! θ −1000 20000 + 0 θ2 θ2

dl 30e −1000/θ  dθ 1 − e −1000/θ

C/4 Study Manual—17th edition Copyright ©2014 ASM



e −2000/θ

e −1000/θ

e −20000/θ

l ( θ )  30 ln 1 − e −1000/θ −



 10 

 10 

 10 

5

e −2000/θ

5

32. MAXIMUM LIKELIHOOD ESTIMATORS

592

30000e −1000/θ  20000 − 20000e −1000/θ e −1000/θ  0.4

θ  1091.36 32.2. The random variable has an exponential distribution, which however is parameterized differently from the Loss Models appendix. F ( x )  1 − e −λx (which you can get by integrating the density function on x). The likelihood of the observations in the interval (0, 2) is F (2) − F (0)  F (2)  1 − e −2λ , and there are two of these so we square this. The likelihood of the observations in the interval [2, 5) is F (5) − F (2)  e −2λ − e −5λ , and there are four of these so we raise this to the fourth power. We’re done, since there are no observations above 5. (A) As an additional exercise, you can try to figure out under what conditions the other choices would be correct. Choice (B) is correct if there are 6 observations in [5, ∞) . Choice (C) is correct in the obscure case where you have an additional 6 observations in the interval (0, 5) but don’t know which subinterval ( (0, 2) , [2, 5) ) they are in. Choice (E) is correct if the data is truncated from above (not censored) at 5, since then the observations are conditional on them being below 5. I can’t figure out any case where you’d want to use (D). 32.3. Here, losses are truncated (not censored) at 5000; there is no record of the number of losses above 5000, whereas in a censoring situation there would be a record of the number of losses, although not the amounts. Therefore, the likelihood function for each observation must be divided by F (5) , which is 1 − e −5λ , making the answer (E). See the solution to exercise 32.2 for a discussion of when each of the other choices would be correct. 32.4.

This parametrization of the exponential is different from the one in the Loss Models Appendix.

2

The likelihood for the 2 in (0, 1] is 1 − e −λ . To simplify matters, let x  e −λ ; then the likelihood is (1 − x ) 2 . Continuing:



Interval

Likelihood

(0, 1]

(1 − x ) 2 3 x − x 2  x 3 (1 − x ) 3  8 x 2 − x 3  x 16 (1 − x ) 8 x 18 (1 − x ) 6 x 4 (1 − x )

(1, 2]



(2, 3] (3, 4] (4, 5] Multiplying the likelihoods, we have

L ( x )  (1 − x ) 20 x 41

 20 ln (1 − x ) + 41 ln x −20 41  + 0 1−x x 41  x  20x 41 x 61

l (x ) dl dx 20 1−x 41 − 41x

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 32

593

e −λ 

41 61

41 λˆ  − ln  0.3973 61 After working out many examples, you will know that whenever the likelihood function is L ( θ )  θ a (1 − θ ) b , the maximum is θ  a/ ( a + b ) , and you won’t need to do the manipulation above. 32.5.

The likelihood for the range [0, 1000] is

Pr ( X ≤ 1000)  1 − e −1000/θ The likelihood for the range (1000, 2000] is Pr (1000 < X ≤ 2000)  e −1000/θ − e −2000/θ  (1 − e −1000/θ )( e −1000/θ ) The likelihood for the range (2000, ∞) is Pr ( X > 2000)  e −2000/θ Multiplying these together, raised to the 7th , 6th , and 7th powers respectively, we get L ( θ )  ( e −20000/θ )(1 − e −1000/θ ) 13 You can certainly use this to complete the exercise, but to make the differentiation less messy, let’s substitute u  e −1000/θ , and maximize for u: L ( u )  u 20 (1 − u ) 13

Based on the comment at the end of the previous exercise, we already know that u  20/33 maximizes L ( u ) , but if you prefer to do the work: l ( u )  20 ln u + 13 ln (1 − u ) dl 20 13  − 0 du u 1−u 20 − 20u − 13u  0 20 u 33 Then e −1000/θ  32.6.

20 33 ,

so θˆ 

−1000 ln (20/33)

 1996.90 . (B)

For an inverse exponential, the density function is f (x ) 

θe −θ/x . x2

When we multiply this over all observations x i , we have

Y 1 L (θ)  Q 2 θn e −θ/x i xi !

X 1 1 l ( θ )  ln Q 2 + n ln θ − θ xi xi !

dl n X 1  − 0 dθ θ xi

C/4 Study Manual—17th edition Copyright ©2014 ASM

32. MAXIMUM LIKELIHOOD ESTIMATORS

594 n θP 1

xi

For the given losses 6

θ

1 1000

+

1 1200

Pr ( X < 1000)  F (1000)  e 32.7.

1 1 1600 + 2100 −1576.57/1000

+

+

1 2200

+

1 2400

 1576.57

 0.2067

The first thing you have to do is differentiate F with respect to x to obtain the density function: f ( x )  px p−1 .

Then you calculate the likelihood function. L (p ) 

Y

p−1

px i

l ( p )  n ln p + ( p − 1)

X

ln x i

n dl  + ln x i  0 dp p −n p P (A) ln x i

X

32.8.

Notice that θ > 0.9, or else the likelihood is 0. 4 ( θ − 0.5)( θ − 0.9) (and θ > 0.9) θ4 l ( θ )  −4 ln θ + ln ( θ − 0.5) + ln ( θ − 0.9) dl 4 1 1 − + + 0 dθ θ θ − 0.5 θ − 0.9 −4 ( θ − 0.5)( θ − 0.9) + θ ( θ − 0.9) + θ ( θ − 0.5)  0 L (θ) 

(−4θ2 + 5.6θ − 1.8) + ( θ2 − 0.9θ ) + ( θ2 − 0.5θ )  0 −2θ 2 + 4.2θ − 1.8  0 √ −4.2 ± 4.22 − 14.4 θˆ   1.5 , 0.6 −4

(D)

Don’t you love the way the ranges allow you to answer 0.6, even though with θ ≤ 0.9 the likelihood of 0.9 is zero? √ 32.9. We will drop the constant 1/ 2πx 3 in setting up the likelihood function. L ( β )  β3 e



β2 2

P

l ( β )  3 ln β −

1 xi

β2

(ignoring constants)

X 1

2 xi X 1 dl 3  −β 0 dβ β xi X 1 3  β2 xi

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 32

βˆ 

s

595

3

P

 11.7670

1 xi

(B)

32.10. All you have to do is calculate, for each i from 1 to 3, L i  2j1 f i ( x j ) with x1  0.50 and x2  0.60. With f1 , likelihood L i is 1. With f2 , likelihood L i is 4 (0.5)(0.6)  1.2. With f3 , likelihood L i is 9 (0.25)(0.36)  0.81. The order is f2 , f1 , f3 . (C)

Q

32.11. Since there’s only one observation, the likelihood function is the density. Maximize L ( w )  w ( f1 (1) − f2 (1)) + f2 (1) , or −w f1 (1) + f2 (1) (using the last bullet, f2 (1)  2 f1 (1) ). The smaller w, the larger L ( w ) , but w is not supposed to be less than 0. So w  0 . (A) 32.12. For Type Y policies, losses are censored at k. Therefore • •

the likelihood of a loss y i less than k is f ( y i ; θ ) . the likelihood of a loss that exceeds k is 1 − F ( k; θ )

For Type Z policies, losses are truncated at k. Therefore •

the likelihood of a loss z j that is recorded (not truncated) is

f ( z j ;θ ) 1−F ( k;θ ) .

 75

Censoring on 75 Type Y policy losses places 1 − F ( k; θ ) in the numerator. Truncation on Type Z policy losses places exactly the same term in the denominator. The two cancel, leaving the expression in (A).



32.13. With truncation from above at k2 and below at k1 , each observation must be divided by the probability that the observation is in the range [k1 , k2 ]. For continuous F (which we assume for loss distributions unless told otherwise), Pr ( k1 ≤ X ≤ k2 )  F ( k2 ; θ ) − F ( k1 ; θ ) , making the answer (E).

32.14. An interesting variant of the preceding problem. Now the truncation is in the middle. The principle is the same as before; divide by the probability of being in the untruncated range. The probability that a loss is less than k1 is F ( k 1 ; θ ) and the probability that a loss is greater than k2 is 1 − F ( k2 ; θ ) , and the two are mutually exclusive, so the probability of either one or the other is F ( k1 ; θ ) + 1 − F ( k2 ; θ ) . So the answer is (E). 32.15. Since each loss is truncated at 100, the likelihood of each loss is together for each of the three x 0i s: 200, 300, and 500. (E)

f ( x i ;θ ) 1−F (100;θ ) .

Multiply three of these

32.16. The first two deaths at time 10 are observations, for which the likelihood is the probability density function. The other 8 lives survive and are censored observations, so the likelihood for them is the survival function. So we need L ( k )  f (10; k ) 2 S (10; k ) 8 We differentiate to obtain the probability density function, the likelihood of the first 2 deaths. f ( t; k )  12 (1 − kt ) −1/2 1k

f (10; k )  12 (1 −

10 −1/2 1 k ) k

We square this and multiply by S (10; k ) for the eight survivors. We’ll drop the constant function.

 10 4 1 k k2 10 3 ln (1 − k )

L (k )  1 −

1−

 10 −1 k





 3 ln

k − 10 − 2 ln k k

l (k ) 

− 2 ln k

!

 3 ln ( k − 10) − 3 ln k − 2 ln k  3 ln ( k − 10) − 5 ln k C/4 Study Manual—17th edition Copyright ©2014 ASM

1 2

in the density

32. MAXIMUM LIKELIHOOD ESTIMATORS

596 3 5 dl  − 0 dk k − 10 k 3k − 5k + 50  0 k  25

(A)

32.17. For the payments for losses below the limit, x i  40, 120, 160, 280, the underlying losses are xi y i  0.8 + 250, or 300, 400, 450, and 600, and the likelihood is the conditional density conditioned on being above 250, or f ( yi ) 1 − F (250) For payments at the limit, the likelihood is the probability of a loss above 1000, given that a loss is above 250, or 1 − F (1000) 1 − F (250) The answer is the product of the 6 likelihoods, or



f (300) f (400) f (450) f (600) 1 − F (1000)



1 − F (250)

2

6

32.18. √ 1 is a multiplicative constant which may be ignored. Then the likelihood of the 5 observations 2πx is exp of something, so let’s log it immediately. Let x i be the 5 observations.

X 1

(xi − µ)2 2x i dl X x i − µ  dµ xi Xx X µ i  − xi xi X 1 5−µ 0 xi 5 5  16.743 µ P 1  0.298633 x l (µ)  −

(A)

i

The estimator is the harmonic mean. 32.19. The density of a mixture is the mixture of the densities, or f (x )  p

e −x/100 e −x/10,000 + (1 − p ) 100 10,000

The likelihood function is a product of f (100) and f (2000) . Hence the answer must be (C). 32.20. A policy limit is right censoring, so the two limited policies would have likelihoods 1 − F ( x ) for x the limit. The other policies have exact data, so their likelihoods are f ( x ) . This means that (E) is correct.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 32

597

32.21. Although it is not essential for solving the problem, note that the distribution is a beta, which is often the case when it is defined over a finite interval. L ( p )  ( p + 1) 3 (0.74)(0.81)(0.95)



p

l ( p )  3 ln ( p + 1) + p ln (0.74)(0.81)(0.95)  3 ln ( p + 1) + p (−0.563119)





3 dl  − 0.563119  0 dp p + 1 3 p+1  5.32747 0.563119 (D) p  4.32747

32.22. Here we have truncation from above and censoring from below. The inverse exponential has density function θe −θ/x f (x )  x2 and distribution function F ( x )  e −θ/x The likelihood of the six claims x i is f (xi ; θ) θe −θ/x i  2 F (10,000; θ ) ( x i )( e −θ/10,000 ) and the likelihood of the claims below 500 is F (500; θ ) e −θ/500  −θ/10,000 . F (10,000; θ ) e As usual, we ignore the multiplicative constant

 L (θ) 

θ6 e

−θ

P

1 xi



Q1

x 2i

e −θ/500

 −θ/10,000 10

.

4 

θ 6 e −0.0036θ e −θ/125 e −θ/1000

e l ( θ )  6 ln θ + θ (−0.0036 − 0.008 + 0.001)  6 ln θ − 0.0106θ dl 6  − 0.0106  0 dθ θ 6 θ  566.0377 0.0106

The median is x such that F ( x; θ )  e −θ/x  12 , or − θx  − ln 2 or x  32.23.

The loglikelihood function is

l ( θ, d )  −5 ln θ − Differentiating with respect to d, we have

C/4 Study Manual—17th edition Copyright ©2014 ASM

5 X xi − d i1

∂l 5  ∂d θ

θ

θ 566.0377 ln 2 , so the answer is 0.6931

d ≤ min x i

 816.62 .

32. MAXIMUM LIKELIHOOD ESTIMATORS

598

which is always positive, so the loglikelihood is maximized when d is made as large as possible, namely dˆ  min x i . In our case, dˆ  5. Differentiating with respect to θ, 5

5 X xi − d ∂l − + 0 θ ∂θ θ2 i1

θˆ 

5 X xi − d i1

5

Substituting the x i ’s and d  5, we get θˆ  (0 + 7 + 15 + 29 + 34) /5  17. Then

D ( X > 10)  e −(10−5)/17  0.7452 Pr 32.24.

(D)

The likelihood function for the first group is α3 (10003α )



(1700)(2000)(2500)

 α+1

The likelihood function for the second group is α3 (10003α )



(1200)(1700)(4000) α+1



Multiply these together and log to obtain the loglikelihood function. l ( α )  6 ln α + 6α ln 1000 − ( α + 1) ln (1700 · 2000 · 2500 · 1200 · 1700 · 4000) 6 dl  + 6 ln 1000 − ln (1700 · 2000 · 2500 · 1200 · 1700 · 4000) dα α 6 αˆ  (B)  1.4153 4.23931 32.25. This method of fitting parameters is known as “minimum modified chi-square”. Modified, because the denominator is O j rather than E j . (If you haven’t encountered the chi-square goodness-of-fit test in a statistics course, you will encounter it later in this course, Lesson 39, and will then understand why this method is called minimum modified chi-square.) It was discussed in the first edition of Loss Models, which was already off the syllabus in Fall 2003, yet they asked this question anyway. We minimize the function by differentiating and setting the derivative to zero. Let’s assume θ > 5. 45 5 Then E j , the expected number of observations in interval j, is 15 θ2  30 θ in (0, 2], θ in (2, 5], and 15−15 θ in (5, ∞) . Let g be the function we are minimizing. We’ll ignore the denominators since they are all 5, and multiplying by 5 doesn’t affect the minimum. 2 2 2 30 45 75 −5 + − 5 + 15 − −5 θ θ θ       1 dg 30 30 45 45 75 75 − 2 −5 − 2 − 5 + 2 10 − 0 2 dθ θ θ θ θ θ θ

g (θ) 













Multiply through θ 3 and divide through by 5. −6 (30 − 5θ ) − 9 (45 − 5θ ) + 15 (10θ − 75)  0 C/4 Study Manual—17th edition Copyright ©2014 ASM

QUIZ SOLUTIONS FOR LESSON 32

599

225θ  1710 θˆ  7.6

(E)

Technically, we must also consider the cases 2 ≤ θ < 5 and θ < 2, but since all the answer choices are greater than 5 we know that those won’t minimize the function. Compare the answer to the maximum likelihood estimate, which is 7.5.

Quiz Solutions 32-1. The likelihoods of the five observed claims are θe −θ/x i /x 2i , and we can ignore the denominator. The likelihoods of the three censored claims are e −θ/100 . Therefore L ( θ )  θ 5 exp . − θ

* ,



1 1 1 1 3 + 1 /  θ5 exp (−0.051304θ ) + + + + + 120 150 270 625 1000 100

l ( θ )  5 ln θ − 0.051304θ dl 5  − 0.051304  0 dθ θ 5  97.4583 θˆ  0.051304



-

The tables have VaR0.5 ( X )  θ (− ln 0.5) −1 , so the median is 97.4583 (− ln 0.5) −1  140.60 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

600

C/4 Study Manual—17th edition Copyright ©2014 ASM

32. MAXIMUM LIKELIHOOD ESTIMATORS

Lesson 33

Maximum Likelihood Estimators—Special Techniques Reading: Loss Models Fourth Edition 13.2 Most of this lesson is devoted to shortcuts for calculating the maximum likelihood estimator. These may help save you time on exams. However, students are reporting that questions for which these shortcuts are useful are not as frequent as they used to be.

33.1

Cases for which the Maximum Likelihood Estimator equals the Method of Moments Estimator

For many distributions, the MLE is equal to the methods of moments estimator, which is easier to compute. The following list applies for estimation with complete individual data. It is hard to define the method of moments estimator with grouped data! 1. For the exponential distribution, they are equal. The MLE of θ is the sample mean. 2. For a gamma distribution with a fixed α (where you are only estimating θ), they are equal. 3. For a normal distribution, µ is estimated as the sample mean, the same as for the method of moP ments. σˆ is the square root of n1 ( x i − µˆ ) 2 . Note the division by n rather than by n − 1, making the estimator biased, but exactly equal to the method of moments estimator the way we defined it. 4. For a Poisson distribution, they are equal. The MLE of λ is the sample mean. 5. For a negative binomial distribution, the maximum likelihood estimator of rβ is the sample mean, but getting the parameters individually requires a computer.

33.1.1

Exponential distribution

For the exponential distribution, even when truncated and censored, the MLE of the parameter θ, the mean, is the quotient of exact exposure over number of uncensored observations. Exact exposure is defined as the difference between the observation (which may be censored) and the truncation point (which is 0 if the observation is not truncated). We have encountered this concept in Section 28.1. Example 33A An insurance coverage has an ordinary deductible of 500 and a maximum covered loss of 10,000. Reported losses (including the deductible) were 1000, 2000, 4000, 8000, and two losses above 10,000. The loss distribution is fitted to an exponential. Determine the maximum likelihood estimator of the mean. Answer: The likelihood function is



L (θ)  C/4 Study Manual—17th edition Copyright ©2014 ASM



− 1000+2000+4000+8000+2 (10,000) /θ (1/θ4 ) e

e −6 (500)/θ 601



e −32,000/θ θ4

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

602

Going through the usual steps, 32,000 − 4 ln θ θ dl 32,000 4 − 0  dθ θ θ2 32,000 θˆ   8000 4

l (θ)  −

Let’s express the answer in terms of the shortcut. There are four uncensored observed events (the two above 10,000 are censored). The exact exposure is x i − 500 for the four observed claims (the first 500 is not part of the exposure, because of the deductible), and 9500 for the losses above 10,000. The maximum likelihood estimator of the mean is therefore: 500 + 1500 + 3500 + 7500 + 2 (9500)  8000 . θˆ  4



Notice the likelihood function in this example, since it comes up frequently. In general, if the likelihood function is L ( γ )  γ−a e −b/γ then you immediately know that γˆ  b/a. In fact, this likelihood function is a constant times the probability density function of an inverse gamma with θ  b and α  a − 1, and the mode of an inverse gamma is listed in the tables as θ/ ( α + 1) . Example 33B There are three classes of policyholders. (i) In Class A, losses follow an exponential distribution with mean θ. (ii) In Class B, losses follow an exponential distribution with mean 2θ. (iii) In Class C, losses follow a gamma distribution with parameters α  4 and θ. You observe the following losses: A: 200, 300, 700 B: 100, 800 C: 400, 1000, 1200 Estimate θ using maximum likelihood. Answer: Dropping constants, the likelihood function is e − (200+300+700)/θ L (θ)  θ3 

!

e − (100+800)/2θ θ2

!

e − (400+1000+1200)/θ θ 12

!

e −4250/θ θ 17

Therefore, θˆ  4250/17  250 .

33.2

Parametrization and Shifting

33.2.1

Parametrization



MLE’s are independent of parametrization. This means that if the parameters are transformed with a oneto-one transformation, the value of the transformed parameter that maximizes the likelihood function is C/4 Study Manual—17th edition Copyright ©2014 ASM

33.3. TRANSFORMATIONS

603

the transformation of the value of the original parameter maximizing the likelihood function. Sometimes, an alternative parametrization is easier to differentiate and solve. As an example, let’s redo Example 32A. By using the parametrization γ  1/β, we can avoid some of the fractions with β. The density function is then 1

f ( x; γ )  γ2 xe − 2 ( γx )

2

and we proceed as before: L ( γ )  γ10 *

5 Y

xi + e −

, i1

5 Y

l ( γ )  10 ln γ + ln

i1

dl 10  −γ dγ γ

s γˆ 

5 X i1

10

P5

i1

γ2 2

x 2i

P5

x 2i

i1

xi −

5 γ2 X 2 xi 2 i1

x 2i  0  0.31247

The estimated value of γ is the reciprocal of the estimated value of β. An exponential can be parametrized with λ  1/θ and density function f ( x )  λe −λx Building on the shortcut mentioned after Example 33A, whenever the likelihood function is L ( λ )  λ a e −λb then you immediately know that the maximum likelihood estimate λˆ  a/b. In fact, this likelihood function is a constant times the probability density function of a gamma with α  a + 1 and θ  1/b, and the distribution tables list the mode of a gamma as θ ( α − 1) .

33.2.2

Shifting

In the presence of truncation, you may either model the ground-up loss distribution using the above rules, or you may model the payments. These two approaches will usually not give you the same answer. They will give you the same answer for the memoryless exponential distribution. We have already discussed the modifications needed to handle truncation if you are estimating the ground-up loss distribution. Modeling payments is known as shifting, since you are fitting losses minus deductible, or shifting the numbers d to the left.

33.3

Transformations

MLE’s are invariant under one-to-one transformations. If you have a transformed variable, you can untransform it and calculate the MLE of the untransformed variable. Depending on how the parameters were modified when performing the transformation, you may have to modify the results. Three examples of this are the lognormal distribution, the inverse exponential distribution, and the Weibull distribution. C/4 Study Manual—17th edition Copyright ©2014 ASM

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

604

33.3.1

Lognormal distribution

The lognormal distribution is a transformed normal distribution. If X has a normal distribution with parameters µ and σ, then e X has a lognormal distribution with the same parameters µ and σ. To calculate the maximum likelihood estimators of the parameters of a lognormal distribution, undo the transformation— ˆ Notice the contrast log each of the observations—and then use shortcut #3 on page 601 to obtain µˆ and σ. between MLE and the method of moments, which is not invariant under transformation. The MLE for lognormal parameters is not the same as the method of moments estimators for those parameters. Example 33C You are given the following observations of X: 4.9, 1.8, 3.4, 6.9, 4.0. You fit a lognormal distribution to these observations using the maximum likelihood method. Determine µ and σ. Answer: µˆ is the average of the logs of the observations: µˆ 

ln 4.9 + ln 1.8 + ln 3.4 + ln 6.9 + ln 4.0  1.343722 5

On a calculator, it is faster to calculate the logarithm of the product: ln (4.9)(1.8)(3.4)(6.9)(4.0)  ln 827.6688  6.71861 and then divide by 5. The sample variance of the logs is: ln2 4.9 + ln2 1.8 + ln2 3.4 + ln2 6.9 + ln2 4.0 − µˆ 2  0.19869 5 σˆ is the square root of the sample variance. σˆ 

33.3.2

√ 0.19869  0.4457



Inverse exponential distribution

If X has an exponential distribution with mean θ, then Y  1/X has an inverse exponential distribution with parameter 1/θ. Therefore, the MLE of θ for an inverse exponential is obtained by calculating the MLE of an exponential on the reciprocals of the data, and then inverting this MLE: n θˆ  Pn ( i1 1/x i ) or the harmonic mean. Example 33D In a study of insurance claims, you are given the following sample of four losses: 1280

2000

5000

6400

You are to fit an inverse exponential distribution with mean θ to this data using maximum likelihood methods. Calculate the resulting estimate of the probability that a loss is less than 5000. Answer: Using the formula for the MLE given above, θˆ 

4  2442.75 1/1280 + 1/2000 + 1/5000 + 1/6400

Therefore, the probability that a loss X is less than 5000 is F (5000)  e −2442.75/5000  0.6135 . C/4 Study Manual—17th edition Copyright ©2014 ASM



33.3. TRANSFORMATIONS

605

The estimator can be generalized to right-truncated and left-censored data (the opposite of what we usually have). Suppose there are n non-censored points and c left-censored points. Let x i be the higher of the censoring point and the data, i  1, 2, . . . , n + c, and let d i be the right-truncation point for each of the n + c observations. Then n  θˆ  P  n+c i1 (1/x i ) − (1/d i )

33.3.3

Weibull distribution

The Weibull distribution is a transformed exponential distribution. If X has an exponential distribution with mean θ, then X 1/τ has a Weibull distribution. However, when performing this transformation, we do not get the parametrization in the Loss Models appendix, since Loss Models uses scale parametrization. The θ in the Loss Models parametrization is the τth root of the θ you get when you apply the transformation. Therefore, to calculate the MLE of θ of a Weibull distribution, if τ is fixed, you average the τ th powers of the observations, or in other words you calculate the τth moment of the observations, but then you take the τth root of the result. Since the exponential shortcut works for censored and truncated data, this will work even for censored and truncated data. We’ll redo Example 32A with this shortcut. Here, τ  2.

s θˆ 

P5

i1

5

x 2i

P5 2 √ i1 x i But β  θ/ 2, so βˆ  10 , the same formula as we developed above. Once again: To fit a Weibull with fixed τ (an exponential would be the special case τ  1), the formula is

q

θˆ 

rP τ

x iτ −

P

d iτ

n

where n is the number of uncensored observations x i are both actual and censored observations d i are truncation points (0 for data not truncated)

?

Quiz 33-1 In a mortality study, you have the following data for five individuals: i

di

ui

xi

1 2 3 4 5

0 0 6 10 40

35 — 30 40 —

— 40 — — 50

The data are fitted to a Weibull distribution with τ  3. You are given that Γ (4/3)  0.89298. Determine the maximum likelihood estimate of mean survival time.

C/4 Study Manual—17th edition Copyright ©2014 ASM

(33.1)

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

606

33.4

Special distributions

33.4.1

Uniform distribution

You should do at least one problem with the uniform distribution and complete individual data to appreciate the situation. Example 33E You are given the following observations of X: 4.9, 1.8, 3.4, 6.9, 4.0. You fit a uniform distribution on [0, θ] to this data using the maximum likelihood method. Determine θ. Answer: What is the likelihood function? The uniform density is the constant θ1 , so the likelihood of the five observations is just this constant multiplied by itself 5 times, or L (θ) 

1 θ5

Now you can see that logging and differentiating is unnecessary—this function will increase as long as θ is decreased. So for what θ is the maximum attained? Well, there is one important constraint; the likelihood of an observation is 0 if the observation is more than θ. This means that if even one observation is more than θ, L ( θ )  0 which is certainly not a maximum! So we must set θ to no less than the maximum of the observations, here 6.9. Hence θ  6.9 .  So we see that for a uniform distribution with complete individual data, the MLE is the maximum of the data. If the data are grouped, or if some observations are censored at a single censoring point: • If there is at least one censored observation, the MLE is the censoring point times the total number of observations divided by the number of observations below this censoring point. The breakdown of the observations below the censoring point has no effect on the MLE. For example, if there are 10 observations below 90 and 1 above 90, the MLE is

11 10 (90)

 99.

• If the data are grouped but all groups are bounded (there is no group with an upper bound of infinity), then the MLE is the lower of the following two items: 1. The upper bound of the highest interval with data. 2. The lower bound of the highest interval with data, times the total number of observations divided by the number of observations below that number. In other words, the lower bound of the highest interval is treated as if it is the censoring point. For example, if there are 20 observations between 0 and 100 and 30 between 100 and 200, the MLE is 200, because 20+30 20 (100) is greater than 200. If there are 20 observations between 0 and 100 and 10 between 100 and 200, the MLE is 150, which is 20+10 20 (100) .

?

Quiz 33-2 You have the following data for losses from an insurance coverage:

C/4 Study Manual—17th edition Copyright ©2014 ASM

Range

Number of Observations

(0, 1000) [1000, 5000) [5000, 10000] (10000, ∞)

18 32 30 0

33.4. SPECIAL DISTRIBUTIONS

607

The underlying distribution is assumed to be uniform on (0, θ]. The parameter θ is estimated using maximum likelihood. Determine θ.

33.4.2

Pareto distribution

Let’s derive a formula to fit α for a Pareto distribution with fixed θ using maximum likelihood. Let’s say there are observations x1 , . . . x n plus c observations censored from above at u and all observations truncated from below at d. (If there is no truncation, set d  0.) Then the likelihood of the x i ’s, ignoring αθ α α truncation, is ( θ+x α+1 . However, we can make the denominator ( θ + x i ) (in other words, drop the 1) since i) θ + x i is a multiplicative constant; only α is a variable. The likelihood of each censored observation, ignoring truncation, is θ α / ( θ + u ) α . Now you see why I dropped the 1 in the previous likelihood, so that I could combine those denominators with these. θα α For each observation, we must divide by 1 − F ( d ) for truncation, or by ( θ+d ) α . The θ ’s will cancel. The likelihood of the data is then α n ( θ + d ) ( n+c ) α L ( α )  Q  α n c i1 ( θ + x i ) ( θ + u ) Logging

( θ + d ) n+c l ( α )  n ln α + α ln Qn+c i1 ( θ + x i )

where x i  u for i > n. Let n+c

X ( θ + d ) n+c K  ln Qn+c  ( n + c ) ln ( θ + d ) − ln ( θ + x i ) i1 ( θ + x i ) i1 Then dl n  +K 0 dα α n αˆ  − K This is something like a geometric average. The formula for αˆ for a two-parameter Pareto may be generalized for varying d i instead of a fixed d by defining K as follows: K

n+c X i1

ln ( θ + d i ) −

n+c X

ln ( θ + x i )

i1

Example 33F An insurance coverage has an ordinary deductible of 500 and a maximum covered loss of 10,000. Reported losses (including the deductible) were 1000, 2000, 4000, 8000, and 2 losses above 10,000. The loss distribution is fitted to a Pareto with θ  5000. Determine the maximum likelihood estimator of α. Answer: Defining K as above, K  ln

55006  −3.68746 (6000)(7000)(9000)(13,000)(15,0002 )

Since there are 4 uncensored losses, αˆ  4/3.68746  1.08476 . C/4 Study Manual—17th edition Copyright ©2014 ASM



33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

608

For a single-parameter Pareto, set K as follows: n+c

X  d n+c    ( n + c ) ln d − ln x i ln Q   n+c    i1 x i i1 K n+c  X   θ n+c   ln x i   ln Qn+c x  ( n + c ) ln θ − i1 i  i1

if d > θ if d < θ

Then αˆ  −n/K again. The formulas for single- and two- parameter Paretos can be generalized to allow for deductibles d i varying by observation. The generalized formulas are shown in Table 33.1.

?

Quiz 33-3 On an insurance coverage with ordinary deductible 500 and policy limit 10,000, there are losses of 1000, 3000, and 8000, and two losses at the limit. Losses are fitted to a two-parameter Pareto with θ  6000 using maximum likelihood. Determine the estimate of α.

33.4.3

Beta distribution

Let’s derive a formula to fit a or b for a beta distribution when the other parameter (b or a respectively) is 1 and θ is fixed. If b  1, then the density function is ax a−1 θa

f (x ) 

0 0.

A random sample of three observations of X yields the values 0.30,

0.55,

0.80.

ˆ the maximum likelihood estimator of θ. Determine the value of θ, (A) (B) (C) (D) (E)

Less than 0.5 At least 0.5, but less than 1.0 At least 1.0, but less than 1.5 At least 1.5, but less than 2.0 At least 2.0

33.2. [160-S90:16] Ten laboratory mice are observed for a period of five days. You are given: (i)

(ii)

Seven mice die during the observation period, with the following distribution of deaths: Exact Time of Death in Days

Number of Deaths

2 3 4 5

1 2 1 3

The lives in the study are subject to an exponential survival function with mean θ.

Determine θˆ by the method of maximum likelihood. (A) 4.0

(B) 4.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 5.0

(D) 5.5

(E) 6.0

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

615

33.3. [160-S87:15] In a mortality study, five individuals are observed from time 0 to time 1. The following data summarizes the results of the study: Individual

Time At Entry

Time of Death

1 2 3 4 5

0 0 0 1/3 2/3

— — 1/2 5/6 —

Individual 2 leaves the study at time 1/3. Individuals 1 and 5 survive to the end of the study. The Nelson-Åalen estimate of the cumulative hazard function at time 1 is denoted by Hˆ (1) . The maximum likelihood estimate of the cumulative hazard function at time 1 assuming a constant force of hazard is denoted by H˜ (1) . Determine Hˆ (1) − H˜ (1) . (A) −0.08

(B) −0.06

(C) 0.00

(D) 0.06

(E) 0.08

[4-F04:36] You are given:

33.4. (i) (ii) (iii)

The following is a sample of 15 losses: 11, 22, 22, 22, 36, 51, 69, 69, 69, 92, 92, 120, 161, 161, 230 ˆ H1 ( x ) is the Nelson-Åalen empirical estimate of the cumulative hazard rate function. Hˆ 2 ( x ) is the maximum likelihood estimate of the cumulative hazard rate function under the assumption that the sample is drawn from an exponential distribution.

Calculate | Hˆ 2 (75) − Hˆ 1 (75) |. (A) 0.00

(C) 0.22

(D) 0.33

(E) 0.44

[4B-S98:20] (1 point) You are given the following:

33.5. •

(B) 0.11

The random variable X has the density function f (x ) 

1 −x/λ , λe

0 < x < ∞,

λ > 0.



λ is estimated by an estimator λ˘ based on a large random sample of size n.



p is the proportion of the observations in the sample that are greater than 1.



The probability that X is greater than 1 is estimated by the estimator e

− 1˘

λ

.

Determine the estimator for the probability that X is greater than 1 if λ˘ is the maximum likelihood estimator. (A) X¯

¯

(B) e −1/X

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) p

(D) − ln p

(E) − ln1p

Exercises continue on the next page . . .

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

616

33.6. A policy has an ordinary deductible of 100 and a maximum covered loss of 1000. You observe the following 10 payments: 15

50

100

215

400

620

750

900

900

900

An exponential distribution is fitted to the ground-up distribution function, using maximum likelihood. Determine the estimated parameter θ. 33.7. A group of 10 individuals is observed from birth to age 65. One dies at 45 and one dies at 55. The rest of the group survives to age 65. The survival function for individuals in this group is hypothesized to be exponential. Estimate the mean lifetime for these individuals using maximum likelihood. 33.8.

[160-F86:8] A cohort of four individuals is observed from time t  0 to time t  4. You are given: Individual

Time of Entry

Time of Death

A B C D

0 0 1 1

1 2 2 –

Individual D survives beyond time t  4. Assuming that the survival function for each individual is exponential with mean θ, determine the maximum likelihood estimate of θ. (A) 3/2

(B) 7/4

(C) 2

(D) 7/3

(E) 3

33.9. [160-F87:18] A mortality study is made of five individuals. An exponential is fitted using maximum likelihood. You are given: (i) Deaths are recorded at times 2, 3, 5, 8, and s, where 0 < s < 6. (ii) θˆ 1 is the maximum likelihood estimator. (iii) θˆ 2 is the maximum likelihood estimator if the study is ended at time t  6 with one individual still alive. (iv) θˆ 2 − θˆ 1  0.50. Determine θˆ 2 .

(A) 3.0

(B) 3.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 4.0

(D) 4.5

(E) 5.0

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

617

33.10. [160-F90:20] You are given: (i)

The following two patients are observed for calendar year 1999 following an operation: Date of Date of Patient Operation Death 1 2

July 1, 1998 October 1, 1998

July 1, 1999 —

(ii) The underlying mortality distribution is exponential. (iii) The maximum likelihood estimate of the exponential parameter is θ  1.00. Calculate the cause and date of termination of Patient 2. (A) (B) (C) (D) (E)

Died April 1, 1999 Withdrew April 1, 1999 Withdrew July 1, 1999 Died October 1, 1999 Withdrew October 1, 1999

33.11. [160-82-96:14] A sample of 500 light bulbs are tested for failure beginning at time t  0. You are given: (i) The study is ended at time t  4. (ii) Five light bulbs fail before time t  4 with times at failure of 1.1, 3.2, 3.3, 3.5, and 3.9. (iii) Time of failure is subject to the following distribution: S ( t )  e −at/2 for t > 0. Calculate the maximum likelihood estimate of a. (A) 0.00498

(B) 0.00501

(C) 0.00505

(D) 0.00509

(E) 0.00512

33.12. [4-S01:7] You are given a sample of losses from an exponential distribution. However, if a loss is 1000 or greater, it is reported as 1000. The summarized sample is: Reported Loss

Number

Total Amount

Less than 1000 1000 Total

62 38 100

28,140 38,000 66,140

Determine the maximum likelihood estimate of θ, the mean of the exponential distribution. (A) (B) (C) (D) (E)

Less than 650 At least 650, but less than 850 At least 850, but less than 1050 At least 1050, but less than 1250 At least 1250

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

618

33.13. You are given a sample of losses from an exponential distribution. However, if a loss is 1000 or greater, it is reported as 1000. The summarized sample is: Reported Loss

Number

Less than 1000 1000 Total

62 38 100

Determine the maximum likelihood estimate of θ, the mean of the exponential distribution. (A) (B) (C) (D) (E)

Less than 650 At least 650, but less than 850 At least 850, but less than 1050 At least 1050, but less than 1250 At least 1250

33.14. [4-S01:30] The following are ten ground-up losses observed in 1999: 18

78

125

168

250

313

410

540

677

1100

You are given: (i) The sum of the ten losses equals 3679. (ii) Losses are modeled using an exponential distribution with maximum likelihood estimation. (iii) 5% inflation is expected in 2000 and 2001. (iv) All policies written in 2001 have an ordinary deductible of 100 and a maximum covered loss of 1000. (The maximum payment per loss is 900.) Determine the expected amount paid per loss in 2001. (A) 256

(B) 271

(C) 283

(D) 306

(E) 371

33.15. [4-F01:10] You observe the following five ground-up claims from a data set that is truncated from below at 100: 125

150

165

175

250

You fit a ground-up exponential distribution using maximum likelihood estimation. Determine the mean of the fitted distribution. (A) 73

(B) 100

(C) 125

(D) 156

(E) 173

33.16. [4-F04:26] You are given: (i)

A sample of losses is 600

700

900

(ii) No information is available about losses of 500 or less. (iii) Losses are assumed to follow an exponential distribution with mean θ. Determine the maximum likelihood estimate of θ. (A) 233

(B) 400

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 500

(D) 733

(E) 1233

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

619

33.17. You are given a sample of 7 items: 5

6

9

12

18

20

30

The underlying distribution for the population has hazard rate function

  λ1 h (x )    λ2

0 < x ≤ 10 x > 10

 Determine the maximum likelihood estimators of λ 1 and λ 2 . 33.18.

You are given a sample of 6 claims: 10

30

40

60

90

150

In addition, 2 claims are known to be greater than 200, but their exact amounts are unknown. The underlying distribution for the population has hazard rate function

  λ1 h (x )    λ2

0 < x ≤ 50 x > 50

 Determine the maximum likelihood estimators of λ 1 and λ 2 . 33.19. (i)

[160-81-99:14] For a study of the failure times of batteries, you are given: Failures are assumed to follow a distribution with hazard rate function

( h (t )  (ii)

λ1 λ2

0≤t 0, t > 0. (ii) The exact times of death are 1, 1, 2, 4, 5, 6, and 6. Calculate the maximum likelihood estimate of k. (A) 0.09

(B) 0.12

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.18

(D) 0.25

(E) 0.28

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

621

33.24. [4-S01:16] A sample of ten losses has the following statistics: 10 X

10 X

X −2  0.00033674

i1 10 X

i1 10 X

X −1  0.023999

i1 10 X

X 0.5  488.97 X  31,939

i1 10 X

X −0.5  0.34445

i1

X 2  211,498,983

i1

You assume that the losses come from a Weibull distribution with τ  0.5. Determine the maximum likelihood estimate of the Weibull parameter θ. (A) (B) (C) (D) (E)

Less than 500 At least 500, but less than 1500 At least 1500, but less than 2500 At least 2500, but less than 3500 At least 3500

Use the following information for questions 33.25 and 33.26: An auto collision coverage has a deductible of 500. Under this coverage, you experience loss amounts (including the deductible) of 600, 700, 900, 1000, and 1500. Assume that claims for losses below the deductible are not submitted. 33.25. You shift the data by 500 and fit it to a Weibull with τ  2 using maximum likelihood. Determine the median fitted loss size, including the deductible, for submitted claims. 33.26. You fit the original data to a Weibull with τ  2 using maximum likelihood. Determine the median fitted loss size, including the deductible, for submitted claims. 33.27. You examine a set of claims on a policy with a policy limit of 100,000. The claims are: 200

1,000

5,000

10,000

20,000

50,000

100,000

The claim for 100,000 was censored at the policy limit. You fit a Weibull distribution with a fixed τ  2 to the loss distribution (which includes losses greater than 100,000) using maximum likelihood. Determine the resulting estimate of θ.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

622

33.28. [4-F00:22] You are given the following information about a random sample: (i) The sample size equals five. (ii) The sample is from a Weibull distribution with τ  2. (iii) Two of the sample observations are known to exceed 50, and the remaining three observations are 20, 30 and 45. Calculate the maximum likelihood estimate of θ. (A) (B) (C) (D) (E)

Less than 40 At least 40, but less than 45 At least 45, but less than 50 At least 50, but less than 55 At least 55

33.29. [C-S05:31] Personal auto property damage claims in a certain region are known to follow the Weibull distribution: 0.2 F ( x )  1 − e − ( x/θ ) , x>0 A sample of four claims is: 130

240

300

540

The values of two additional claims are known to exceed 1000. Determine the maximum likelihood estimate of θ. (A) (B) (C) (D) (E)

Less than 300 At least 300, but less than 1200 At least 1200, but less than 2100 At least 2100, but less than 3000 At least 3000

33.30. [160-S87:16] Five mice are observed over the interval [0, 4]. The time at which each mouse dies is indicated below: Mouse

Exact Time of Death

1 2 3 4 5

2 1 2 — 3

Mouse 4 survives to the end of the observation period. The survival function is assumed to be a Weibull distribution with τ  3. Determine the maximum likelihood estimator for θ.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

623

33.31. The following claim experience is observed: Claim Size

Number of Claims

0–1000 1000–2000 2000–3000 3000+

30 25 20 25

You fit a uniform distribution F ( x )  x/θ, 0 ≤ x ≤ θ to claim sizes using maximum likelihood.

Determine the estimate for θ.

33.32. [4B-S97:8] (2 points) You are given the following: •

The random variable X has a uniform distribution on the interval [0, θ].



A random sample of three observations of X has been recorded and grouped as follows: Interval

Number of Observations

[0, k ) [k, 5) [5, θ]

1 1 1

Determine the maximum likelihood estimate for θ. (A) 5

(B) 7.5

(C) 10

(D) 5 + k

(E) 10 − k

33.33. [4B-S99:3] (2 points) You are given the following: •

The random variable X has a uniform distribution on the interval (0, θ ) , θ > 2.



A random sample of four observations of X has been recorded and grouped as follows: Number of Interval Observations (0,1] (1,2] (2, θ )

1 2 1

Determine the maximum likelihood estimate of θ. (A) 8/3

(B) 11/4

(C) 14/5

(D) 20/7

(E) 3

33.34. [160-F87:3] You have observed 20 lives over the time interval (0, 15). Four deaths occurred during that interval, and the remaining 16 lives are active at t  15. You are modeling the experience using the following density function: f (t ) 

1 , w

0 ≤ t ≤ w.

Calculate the maximum likelihood estimate of w. (A) 55

(B) 60

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 65

(D) 70

(E) 75

Exercises continue on the next page . . .

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

624

33.35. [160-83-97:11] Two research teams studied five diseased cows. You are given: (i) The survival function is S ( t )  ( ω − t ) /ω, 0 ≤ t ≤ ω. (ii) Each cow came under observation at t  0. (iii) The times of death were 1, 3, 4, 4, 6. (iv) Research team X, impatient to publish results, terminated its observations at t  5 and estimated ω using the method of maximum likelihood with incomplete data. (v) Research team Y waited for the last cow to die and estimated ω using the method of maximum likelihood with complete data. Compute the absolute value of the difference between Research Team X’s and Research Team Y’s maximum likelihood estimates of ω. (A) 0.25

(B) 0.30

(C) 0.35

(D) 0.40

(E) 0.45

33.36. [160-83-97:13] For a study of four automobile engines, you are given: (i) The engines are subject to a uniform survival distribution with parameter ω. (ii) Failures occurred at times 4, 5, and 7; the remaining engine was still operational at time r. (iii) The observation period was from time 3 to time r. (iv) The MLE of ω is 13.67. Determine r. (A) 11

(B) 12

(C) 13

(D) 14

(E) 15

33.37. [4B-S98:6] (2 points) You are given the following: •

The random variable X has the density function f ( x )  α ( x + 1) −α−1 , 0 < x < ∞, α > 0.



A random sample of size n is taken of the random variable X.

Determine the limit of αˆ as the sample mean goes to infinity, where αˆ is the maximum likelihood estimator of α. (A) 0

(B) 1/2

(C) 1

(D) 2

(E) ∞

33.38. The following claim sizes are experienced on an insurance coverage: 100

500

2,000

5,000

10,000

You fit these claim sizes to the distribution 100 F (x )  1 − x





x ≥ 100

using maximum likelihood. Determine the estimate of α.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

625

33.39. A policy has an ordinary deductible of 100 and a maximum covered loss of 10,000. You observe the following six payments: 100

200

500

600

1000

5000

In addition, there are three payments at the limit. You fit a single parameter Pareto distribution with θ  50 to the ground up distribution using maximum likelihood. Determine the estimated parameter α. 33.40. [160-81-96:14] Two patients were observed over (0, 5]. You are given: (i) One died at t  4. (ii) The other was still alive at t  5. (iii) The survival function is of the form



S (t )  1 +

t 10

 −θ

t > 0, θ > 0.

,

Determine the maximum likelihood estimate for θ. (A) 0.6

(B) 0.9

(C) 1.1

(D) 1.3

(E) 2.5

33.41. [4-S00:21] You are given the following five observations: 521

658

702

819

1217

You use the single-parameter Pareto with cumulative distribution function 500 F (x )  1 − x

!α ,

x > 500, α > 0

Calculate the maximum likelihood estimate of the parameter α. (A) 2.2

(B) 2.5

(C) 2.8

(D) 3.1

(E) 3.4

33.42. [4-F02:10] A random sample of three claims from a dental insurance plan is given below: 225

525

950

Claims are assumed to follow a Pareto distribution with parameters θ  150 and α. Determine the maximum likelihood estimate of α. (A) (B) (C) (D) (E)

Less than 0.6 At least 0.6, but less than 0.7 At least 0.7, but less than 0.8 At least 0.8, but less than 0.9 At least 0.9

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

626

33.43. [4-F03:6] You are given: (i)

Losses follow a Single-parameter Pareto distribution with density function: f (x ) 

(ii)

α , x α+1

x > 1,

0 500) . 33.51. [160-S91:18] You are given: (i) Four independent lives are observed from time t  0 until death. (ii) Deaths occur at times t  1, 2, 3, and 4. (iii) The lives are assumed to be subject to the probability density function f (t ) 

te −t/c , c2

t > 0.

Calculate the maximum likelihood estimate for c. (A) 0.20

(B) 0.80

(C) 1.25

(D) 2.50

(E) 5.00

33.52. [160-F89:12] You are given: (i) 1000 persons enter a mortality study at exact age x. (ii) 200 withdrawals occur at age x + 0.4. (iii) 38 deaths occur between age x and age x + 0.4. (iv) 51 deaths occur between age x + 0.4 and age x + 1. Calculate the absolute difference between the maximum likelihood estimator of q x under the uniform distribution and the product limit estimator of q x . (A) 0.0001

(B) 0.0004

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.0007

(D) 0.0010

(E) 0.0013

Exercises continue on the next page . . .

EXERCISES FOR LESSON 33

629

33.53. [160-81-99:10] For a fertility study over a 1 year period, you are given: (i) Pregnancies are assumed to be subject to a uniform distribution. (ii) Number Who Scheduled Became Number of Lives Time of Entry Ending Time Pregnant 10 20

0.4 0.0

1.0 1.0

6 8

Calculate the maximum likelihood estimate of the pregnancy rate, q x . (A) 0.51

(B) 0.53

(C) 0.55

(D) 0.57

(E) 0.59

33.54. [160-83-94:10] In a one year mortality study on ten lives of age x, three withdrawals occur at time 0.4 and one death is observed. Mortality is assumed to have a uniform distribution. Determine the maximum likelihood estimate of q x . (A) 0.120

(B) 0.121

(C) 0.122

(D) 0.123

(E) 0.124

33.55. Auto liability policies sold by your company have a policy limit of 100,000. You have the following information on observed payments on these policies: Range of payments

Number of losses in this range

0– 50,000 50,000–100,000

125 75

Losses on this policy are believed to follow an inverse exponential distribution. The parameter is estimated using maximum likelihood. Calculate the fitted probability that a loss is greater than 100,000. 33.56. [4B-S94:12] (3 points) You are given the following: •

Four observations have been made of a random variable having the density function 2

f ( x )  2λxe −λx , •

x > 0.

Only one of the observations was less than 2. Determine the maximum likelihood estimate of λ.

(A) (B) (C) (D) (E)

Less than 0.05 At least 0.05 but less than 0.06 At least 0.06 but less than 0.07 At least 0.07 Cannot be determined from the given information

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

33. MAXIMUM LIKELIHOOD ESTIMATORS—SPECIAL TECHNIQUES

630

33.57. Auto collision policies are subject to a deductible of 1000. You have the following information on observed payments on these policies. There is no information for losses on which no payments are made. Number of losses with payments in this range

Range of payments

(0, 1000) [1000, ∞)

20 80

Losses on this policy are believed to follow an inverse exponential distribution with parameter θ. The parameter θ is estimated using maximum likelihood. Calculate the first quartile of losses in the fitted distribution. Additional released exam questions: C-F05:6, C-F06:18, C-S07:31

Solutions 33.1.

The MLE of an exponential is the mean, or 0.55 . (B)

33.2. Notice that there are 10 mice, 3 of whom survive the observation period of 5 days. Using the exponential shortcut, exposure is 1 (2) + 2 (3) + 1 (4) + 3 (5) + 3 (5)  42, and there are 7 events. 42/7= 6 . (E) 33.3. The two event times are 1/2 and 5/6. At time 1/2, individuals 1, 2, and 4 are in the risk set. At time 5/6, individuals 1, 4, and 5 are in the risk set. There is one death apiece at each of those two times, so the Nelson-Åalen estimate of H (1) is 1 1 2 Hˆ (1)  +  3 3 3 A constant hazard rate leads to an exponential distribution. The MLE of θ for an exponential distribution is exact exposure divided by deaths. There are 2 deaths, and exact exposure is leaving time (whether by death or otherwise) minus starting time. The starting times add up to 0 + 0 + 0 + 1/3 + 2/3  1 and the leaving times add up to 1 + 1/3 + 1/2 + 5/6 + 1  11/3. So the exposure is 11/3 − 1  8/3 and ˜ or 0.75. The cumulative hazard is the integral of θ˜  (8/3) /2  4/3. The hazard rate is the reciprocal of θ, ˜ the constant hazard rate from 0 to 1, or 0.75. Thus H (1)  0.75, and Hˆ (1) − H˜ (1)  2/3 − 0.75  −0.08. (A) 33.4.

yi

ri

si

11 22 36 51 69

15 14 11 10 9

1 3 1 1 3

1 3 1 1 3 Hˆ 1 (75)  + + + +  0.805195 15 14 11 10 9 For the exponential estimate, the estimate of θ is the sample mean, or 1227 15  81.8. This is the reciprocal of the constant hazard rate. The cumulative hazard rate at 75 is 75 times the constant hazard rate, or 75 81.8  0.916870. The difference between estimates is 0.916870 − 0.805196  0.11 . (B) ¯ and e −1/λ , the estimator of the probability that 33.5. For an exponential, the MLE is the mean, so λ˘  X, ¯

X is greater than 1, is e −1/X . (B)


33.6. The exponential shortcut tells us that the mean is the sum of exposure over the number of uncensored claims. Exposure is the sum of the payments (the payments already account for the limit and the deductible), and there are 7 uncensored claims, so the answer will be (Σx_i)/7, x_i being the 10 payments. But we'll work it out without the shortcut:

    L(θ) = (1/θ)⁷ e^{−(15+50+100+215+400+620+750)/θ} e^{−3(900)/θ}

We can ignore the deductible only because this is an exponential distribution; otherwise we couldn't be so sloppy. For an exponential distribution, the conditional distribution of X − 100 | X > 100 is the same as the unconditional distribution of X.

    l(θ) = −7 ln θ − 4850/θ
    dl/dθ = −7/θ + 4850/θ² = 0
    −7θ + 4850 = 0
    θ̂ = 4850/7 = 692 6/7
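A one-line numeric check (mine, not from the original solution): the exponential shortcut is total exposure over the uncensored count.

    uncensored = [15, 50, 100, 215, 400, 620, 750]
    censored = [900, 900, 900]       # the three payments at the limit
    print((sum(uncensored) + sum(censored)) / len(uncensored))   # 692.857... = 4850/7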

33.7. Let θ be the parameter of the exponential.

    L(θ) = (1/θ²) e^{−45/θ − 55/θ − 8(65/θ)}
    l(θ) = −2 ln θ − 620/θ
    dl/dθ = −2/θ + 620/θ² = 0
    θ̂ = 310

We could also use the exponential shortcut. Exposure is 45 + 55 + 8(65) = 620, and there are two uncensored observations, so θ = 620/2 = 310. In light of the absurd answer, the moral of the story is that lifetime is not exponential.

33.8. Using the exponential shortcut, exposure is 1 + 2 + 1 + 3 = 7 and the number of uncensored events is 3, so the parameter is 7/3. (D)

33.9. In the second study, the exposure of the life dying at time 8 is 6, and there are four uncensored observations. Using the exponential shortcut for both θ̂₁ and θ̂₂,

    θ̂₁ = (2 + 3 + 5 + s + 8)/5 = (18 + s)/5
    θ̂₂ = (2 + 3 + 5 + s + 6)/4 = (16 + s)/4
    (16 + s)/4 − (18 + s)/5 = 0.50
    80 + 5s − 72 − 4s = 10
    s = 2
    θ̂₂ = (16 + 2)/4 = 4.5   (D)


33.10. Let x be the departure time of patient 2. Exposure in 1999 is 1/2 + x, and there are one or two events. (1/2 + x)/2 is certainly less than 1 for x ≤ 1, so there is only one event, and 1/2 + x = 1. But if there are 2 events, then x = 1/2. (C)

33.11. They try to confuse you by making the exponential parameter 2/a. There are 5 events and exposure is 495(4) + 1.1 + 3.2 + 3.3 + 3.5 + 3.9 = 1995.

    2/a = 1995/5 = 399
    a = 2/399 = 0.00501   (B)

33.12. We use the exponential shortcut (page 601). The exposure is 66,140, and there are 62 uncensored claims, so the MLE is θ̂ = 66,140/62 = 1066.7742. (D)

33.13. Now that you're not given the amounts, as you were in the previous exercise, there are only two categories of loss—less than 1000 and greater than 1000—so the Bernoulli technique is appropriate. Maximum likelihood sets the probability of X < 1000 equal to the sample proportion, or 0.62, and Pr(X < 1000) = 1 − e^{−1000/θ}, so

    1 − e^{−1000/θ} = 0.62
    1000/θ = −ln 0.38 = 0.967584
    θ̂ = 1000/0.967584 = 1033.502

33.14. The exponential shortcut estimates θ̂ = 3679/10 = 367.9. We inflate by multiplying the scale parameter θ by 1.05² to get 405.6095. The expected payment per loss considering the deductible and limit is

    E[X ∧ 1000] − E[X ∧ 100] = 405.6095(1 − e^{−1000/405.6095}) − 405.6095(1 − e^{−100/405.6095}) = 282.5175   (C)

where the LEV formulas are taken from the Loss Models Appendix.

33.15. A very frequent trick on exams. Since the exponential is memoryless, a ground-up loss of 125 when the loss has to be more than 100 has the same meaning as a ground-up loss of 25 with no deductible. Putting it another way, the exponential shortcut says that the exposure is 25 + 50 + 65 + 75 + 150 = 365, since claims below 100 are impossible. So the mean is 365/5 = 73. (A) If you answered E, make sure you understand why you're wrong.

33.16. Exposure is each loss minus 500, or 100 + 200 + 400 = 700. θ = 700/3 = 233 1/3. (A)

33.17. The cumulative hazard rate is the integral of h(u) from 0 to x, or

    H(x) = { xλ₁,                    0 < x ≤ 10
           { 10λ₁ + (x − 10)λ₂,      x > 10

The survival function is the negative exponential of H(x), or

    S(x) = { e^{−xλ₁},               0 < x ≤ 10
           { e^{−10λ₁−(x−10)λ₂},     x > 10


The probability density function is the negative derivative of S(x), or

    f(x) = { λ₁ e^{−xλ₁},            x ≤ 10
           { λ₂ e^{−10λ₁−(x−10)λ₂},  x > 10

Let x_i be the 3 observations up to 10 and y_i be the other 4 observations. Then the likelihood function is

    L(λ₁, λ₂) = λ₁³ e^{−λ₁ Σx_i} · λ₂⁴ e^{−40λ₁} e^{(40 − Σy_i)λ₂}

Notice that the λ₁ and λ₂ terms are in separate factors, and so can be maximized separately. We can use the second formula in Table 33.2 for each parameter. For λ₁, the function to maximize is λ₁³ e^{−λ₁(40 + Σx_i)}, so λ̂₁ = 3/(40 + Σx_i) = 0.05. For λ₂, the function to maximize is λ₂⁴ e^{−(Σy_i − 40)λ₂}, so λ̂₂ = 4/(Σy_i − 40) = 0.1.

A way to get the answer quickly is to treat the study as if it were two separate studies: one from time 0 to 10, another from time 10 and on. For each study, the parameter is the number of observations in the range of the study divided by the total exposure of all observations contained within the study. Thus the first 10 of every observation counts as part of the (0, 10] study and not as part of the (10, ∞) study when measuring exposure.

33.18. By integrating the hazard rate function and then exponentiating it, we get the survival function:

    S(x) = { e^{−λ₁x},              x ≤ 50
           { e^{−50λ₁−(x−50)λ₂},    x > 50

and then by differentiating and negating, we get the density function

    f(x) = { λ₁ e^{−λ₁x},           x ≤ 50
           { λ₂ e^{−50λ₁−(x−50)λ₂}, x > 50

The likelihood of the observations is f(x), except that for the two observations censored at 200 it is S(200). The observations up to 50 sum to 80 and the uncensored ones above 50 sum to 300. The likelihood function is

    L(λ₁, λ₂) = λ₁³ e^{−80λ₁} λ₂³ e^{−150λ₁−(300−150)λ₂} (e^{−2(50)λ₁} e^{−2(200−50)λ₂})
              = λ₁³ e^{−330λ₁} λ₂³ e^{−450λ₂}

where the parenthesized term on the first line represents the likelihood of the two censored observations. Since λ₁ and λ₂ are in two separate factors, they can be maximized separately. Each one has an expression of the form of the second line of Table 33.2. For λ₁, we have λ₁³ e^{−330λ₁}, which is maximized at λ̂₁ = 3/330 = 1/110. For λ₂, we have λ₂³ e^{−450λ₂}, which is maximized at λ̂₂ = 3/450 = 1/150.

As in the previous question, the shortcut would be to divide the observations into two studies; each parameter is then the count of uncensored observations within the study over the exposure of all observations within the study period.
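This "two studies" shortcut is easy to mechanize. The sketch below (my own illustration, using only the summary totals quoted in the solution to 33.18) reproduces both estimates:

    # Piecewise constant hazard split at t = 50; each rate is
    # (uncensored events in the interval) / (exposure in the interval).
    n_below, sum_below = 3, 80       # uncensored observations in (0, 50]
    n_above, sum_above = 3, 300      # uncensored observations above 50
    n_cens, cens_at = 2, 200         # observations censored at 200

    exposure1 = sum_below + 50 * (n_above + n_cens)                   # 330
    exposure2 = (sum_above - 50 * n_above) + n_cens * (cens_at - 50)  # 450
    print(n_below / exposure1, n_above / exposure2)                   # 1/110, 1/150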

33.19. Since exponentials have no memory, we can treat this like two separate studies, one for times 0 to 2 and another for times past 2. For each study, we will use the exponential shortcut. Note that the hazard rate function is the reciprocal of the parameter θ.

For the [2, 7] study, only batteries 3 through 5 are involved (we're not sure about battery 5); exposure is 0.6 + 1.3 + max(0, φ₅ − 2) and there is one event, so 1/0.2941 = 3.4 = 1.9 + max(0, φ₅ − 2), forcing φ₅ = 3.5.

For the [0, 2] study, there is one event and exposure is 1.7 + φ₂ + 2 + 2 + 2 = 7.7 + φ₂ = 1/0.1087 = 9.2, so φ₂ = 9.2 − 7.7 = 1.5, and |φ₂ − φ₅| = 3.5 − 1.5 = 2.0. (B)


33.20. The exponential shortcut helps, since you know the mean of Phil's bulbs is 1000 and the mean of Sylvia's is 1500. The likelihood for each of Phil's bulbs is

    e^{−x/θ_P}/θ_P

and for Sylvia's, if θ_S = 2θ_P,

    e^{−x/2θ_P}/(2θ_P)

Let x_i be the 20 observations for Phil's bulbs and y_i the 10 observations for Sylvia's bulbs. We know that Σx_i = 20x̄ = 20,000 and Σy_i = 10ȳ = 15,000. Multiplying 20 of Phil's and 10 of Sylvia's, and dropping the 2's in the denominator as a multiplicative constant:

    L(θ_P) = e^{−(Σx_i/θ_P + Σy_i/2θ_P)}/θ_P³⁰ = e^{−27,500/θ_P}/θ_P³⁰
    l(θ_P) = −30 ln θ_P − 27,500/θ_P
    dl/dθ_P = −30/θ_P + 27,500/θ_P² = 0
    θ̂_P = 27,500/30 = 917   (B)

33.21. The likelihood for a medium-hazard risk is (1/2θ) e^{−x/(2θ)}, and for a high-hazard risk replace the 2's with 3's. Ignoring the 2's and 3's in the denominators as multiplicative constants,

    L(θ) = e^{−((1+2+3)/2θ + 15/3θ)}/θ⁴ = e^{−8/θ}/θ⁴
    l(θ) = −4 ln θ − 8/θ
    dl/dθ = −4/θ + 8/θ² = 0
    θ̂ = 2   (B)

33.22.

    S(x; θ) = e^{−(x/θ)²}
    f(x; θ) ∝ (1/θ²) e^{−(x/θ)²}

We'll pretend it's equal; you can ignore multiplicative constants when maximizing likelihood.

    L(θ) = (1/θ⁸) e^{−(1+4+9+16)/θ²}
    l(θ) = −8 ln θ − 30/θ²
    dl/dθ = −8/θ + 60/θ³ = 0
    −8θ² + 60 = 0
    θ² = 15/2
    θ̂ = 2.7386

Alternatively, we can use the Weibull shortcut here: θ = √(Σx_i²/4).

33.23. This is a Weibull with τ = 2. If we integrate and exponentiate the hazard rate function to get the survival function, we find that S(t) = e^{−kt²/2}. Reparametrizing with parameter θ, the form is S(t) = e^{−(t/θ)²}, so k = 2/θ². Using our shortcut to solve for θ:

    θ̂² = (1² + 1² + 2² + 4² + 5² + 6² + 6²)/7 = 119/7 = 17

So k = 2/17 = 0.12. (B)

33.24. We use the Weibull shortcut on page 605. This says that we take the τth root (which would be squaring here) of the τth moment (488.97/10 in our case). So the answer is 48.897² = 2390.9166. (C)

33.25. Using the Weibull shortcut,

    θ̂ = √((100² + 200² + 400² + 500² + 1000²)/5) = √(1,460,000/5) = 540.37

The median of the payment is the point x such that e^{−(x/θ)²} = 1/2, or (x/θ)² = ln 2, or x = θ√ln 2. The claim size is 500 higher than the payment, so the answer is 500 + 540.37√ln 2 = 949.89.

33.26. The data are truncated at 500, so the likelihood is L(θ) = Π f(x_i; θ)/(1 − F(500; θ)). From basic principles:

    L(θ) = (1/θ¹⁰) e^{−(Σx_i²)/θ²} / e^{−5(500²)/θ²}
    l(θ) = −(1/θ²)(Σx_i² − 5(500²)) − 10 ln θ
    Σx_i² − 5(500²) = 3,660,000
    dl/dθ = (2/θ³)(3,660,000) − 10/θ = 0
    θ̂ = √(3,660,000/5) = 855.57

Alternatively, we could use the Weibull shortcut. Undoing the transformation from an exponential means squaring the observations and the deductible, so we'd divide Σx_i² − 5(500²) by the number of uncensored observations. Here, all 5 observations are uncensored, so we get (Σx_i² − 5(500²))/5. As usual, we have to take the τth root of this to get θ.

The median of the claims is x such that F(x) is halfway between F(500) and 1.

    F(500) = 1 − e^{−(500/855.57)²} = 0.2893

Therefore the median is x such that F(x) = (1 + 0.2893)/2 = 0.6447. We have:

    1 − e^{−(x/θ̂)²} = 0.6447
    e^{−(x/θ̂)²} = 0.3553
    (x/θ̂)² = −ln 0.3553
    x = θ̂√(−ln 0.3553) = 855.57√(−ln 0.3553) = 870.33
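As a numeric check of 33.26 (my addition; it uses only the adjusted sum of squares quoted above):

    from math import exp, log, sqrt

    # Weibull (tau = 2) MLE with data truncated at 500.
    adj_sum_sq = 3_660_000                 # sum of x_i^2 minus 5 * 500^2
    theta = sqrt(adj_sum_sq / 5)           # 855.57
    F500 = 1 - exp(-(500 / theta) ** 2)    # 0.2893
    median = theta * sqrt(-log((1 - F500) / 2))
    print(theta, median)                   # 855.57, 870.33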

33.27. The Weibull shortcut, which works even with censoring, tells us to divide the sum of the squares of all observations (including the censored one) by the number of uncensored observations (6), and then to take the τth root, to obtain

    θ = √(Σx_i²/6) = √(1.302604 × 10¹⁰/6) = 46,594

To work it out from first principles: the density function is f(x_i) = (2x_i/θ²) e^{−(x_i/θ)²}. Multiplying this over the 6 uncensored observations, we get

    2⁶ (Π_{i=1}^{6} x_i) e^{−Σ_{i=1}^{6} x_i²/θ²} / θ¹²

The likelihood of the observation censored at 100,000 is its survival probability, S(100,000) = e^{−(100,000/θ)²}. Multiplying this by the above product results in the likelihood function. If we drop the constants and let x₇ = 100,000, we get:

    L(θ) = (1/θ¹²) e^{−Σ_{i=1}^{7} x_i²/θ²}
    l(θ) = −12 ln θ − (Σ_{i=1}^{7} x_i²)/θ²
    dl/dθ = −12/θ + 2(Σ_{i=1}^{7} x_i²)/θ³ = 0
    θ = √((Σ_{i=1}^{7} x_i²)/6)

and since Σ_{i=1}^{7} x_i² = 1.302604 × 10¹⁰, θ̂ = 46,594.

33.28. We could use the Weibull shortcut, but we'll do it from first principles. The density is

    f(x; θ) = (2x/θ²) e^{−(x/θ)²}

and the probability of exceeding 50 is the survival function, or

    Pr(X > 50) = e^{−(50/θ)²}

Multiplying the three observations and two factors for the censored distributions, and dropping some multiplicative constants,

    L(θ) = (1/θ⁶) e^{−(20²+30²+45²)/θ²} e^{−2(50²)/θ²} = (1/θ⁶) e^{−8325/θ²}
    l(θ) = −6 ln θ − 8325/θ²
    dl/dθ = −6/θ + 2(8325)/θ³ = 0
    −6θ² + 16,650 = 0
    θ̂ = √(16,650/6) = 52.6783   (D)

33.29. This exercise can be done from first principles, or using our Weibull shortcut. Directly, the likelihood, ignoring constants, is

    L(θ) = (1/θ^0.2)⁴ e^{[−130^0.2 − 240^0.2 − 300^0.2 − 540^0.2 − 2(1000^0.2)]/θ^0.2}
         = (1/θ^0.8) e^{−20.25053/θ^0.2}
    l(θ) = −0.8 ln θ − 20.25053/θ^0.2
    l′(θ) = −0.8/θ + 0.2(20.25053)/θ^1.2 = 0
    20.25053 − 4θ^0.2 = 0
    θ̂ = (20.25053/4)⁵ = 3325.69   (E)

The shortcut would allow us to write the last line immediately: take the sum of all observations, including censored observations, each raised to the 0.2 power, divide by the number of uncensored observations (4), and then take the 0.2th root (i.e., the 5th power) of the result.

33.30. Using the Weibull shortcut,

    θ̂ = ∛((2³ + 1³ + 2³ + 4³ + 3³)/4) = ∛(108/4) = 3

33.31. This exercise can be done using the shortcut for the uniform distribution mentioned in the lesson. There are 100 total observations and 75 below 3000, so θ̂ = (100/75)(3000) = 4000. The following shows you how to do it from first principles.

θ must be at least 3000, or else the likelihood is 0. The likelihood of an observation being in a bounded interval of length c is c/θ. The intervals up to 3000 in this problem all have length c = 1000. The likelihood of an observation being in an interval above 3000, since it also has to be below θ, is (θ − 3000)/θ.

    L(θ) = (1000/θ)⁷⁵ ((θ − 3000)/θ)²⁵
    l(θ) = −100 ln θ + 25 ln(θ − 3000)        (ignoring constants)
    dl/dθ = −100/θ + 25/(θ − 3000) = 0
    100θ − 300,000 = 25θ
    θ̂ = 4000
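A quick computational check of 33.31 (my own; it confirms the shortcut against a brute-force search of the loglikelihood):

    from math import log

    print(100 / 75 * 3000)                            # shortcut: 4000.0

    # l(theta) = -100 ln(theta) + 25 ln(theta - 3000), theta > 3000
    ll = lambda t: -100 * log(t) + 25 * log(t - 3000)
    print(max(range(3001, 10000), key=ll))            # 4000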


33.32. The shortcut for the uniform distribution says to multiply the lower bound of the interval going to ∞, 5, by the ratio of the total number of observations, 3, over the number of observations below the lower bound, 2, leading to the answer θ̂ = (3/2)(5) = 7.5. Doing it the long way:

    L(θ) = (k/θ)((5 − k)/θ)((θ − 5)/θ)
    l(θ) = ln(θ − 5) − 3 ln θ        (ignoring constants)
    dl/dθ = 1/(θ − 5) − 3/θ = 0
    −2θ + 15 = 0
    θ̂ = 7.5   (B)

33.33. Using the shortcut for the uniform distribution, we multiply the lower bound of the interval going to ∞, 2, by the ratio of the total number of observations, 4, over the number of observations below the lower bound, 3. Then θ̂ = (4/3)(2) = 8/3. (A)

33.34. The distribution is uniform on [0, w]. The uniform shortcut applies, so we multiply the lower bound of the unbounded interval, 15, by the total number of lives, 20, divided by the number of observations below 15, 4, to get

    (20/4)(15) = 75   (E)

33.35. The distribution is uniform on [0, ω]. The uniform shortcut applies. For research team Y, ω̂ is the maximum, or 6. For research team X, ω̂ is the lower bound of the unbounded interval, 5, times the total number of lives, 5, divided by the number of uncensored observations, 4. Then (5/4)(5) = 6.25. Thus the difference in the estimates is 6.25 − 6 = 0.25. (A)

33.36. The distribution is uniform between 3 and ω, so we can still use the uniform shortcut with care. If we subtract 3 from all times, we have the usual situation for the uniform shortcut, with a uniform distribution on [0, ω − 3]. The MLE is then the censoring point minus 3 (we've subtracted 3 from all times), times 4/3. We then add back 3, so the MLE is 3 + (4/3)(r − 3). We can now back out r.

    3 + (4/3)(r − 3) = 13.67
    r − 3 = 8
    r = 11   (A)

33.37. This is a Pareto with θ = 1. As indicated in Table 33.1, the maximum likelihood estimator is

    α̂ = n / Σ ln(1 + x_i)

If the sample mean Σx_i/n = x̄, then at least one of the x_i's is greater than or equal to x̄. The other x_i's are nonnegative. So Σ_{i=1}^{n} ln(1 + x_i) ≥ ln(1 + x̄). As x̄ → ∞, ln(1 + x̄) → ∞. Therefore α̂ → 0. (A)

33.38. This is a single-parameter Pareto with θ = 100. The density function is found in the Loss Models appendix, or you could differentiate F with respect to x.

    f(x) = 100^α α / x^{α+1}
    L(α) = 100^{5α} α⁵ / Π(x_i)^{α+1}
    l(α) = 5α ln 100 + 5 ln α − (α + 1) ln Πx_i
    dl/dα = 5 ln 100 + 5/α − ln Πx_i = 0
    ln Πx_i = 36.1482
    α̂ = 5/(36.1482 − 5 ln 100) = 0.3810
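To check 33.38 numerically (my addition; only the logged product of the data, 36.1482, is needed):

    from math import log

    # Single-parameter Pareto with theta = 100 fixed:
    # alpha-hat = n / (sum of ln x_i - n ln theta).
    print(5 / (36.1482 - 5 * log(100)))   # 0.3810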

33.39. The density function for a single-parameter Pareto with θ = 50 is

    f(x; α) = α(50^α)/x^{α+1},   x ≥ 50,

and the distribution function is

    F(x; α) = 1 − (50/x)^α

The likelihood of each of the 6 uncensored payments x_i is

    f(x_i + 100)/(1 − F(100)) = (α 50^α/(x_i + 100)^{α+1}) / (50^α/100^α) = 100^α α/(x_i + 100)^{α+1}

The likelihood of the three censored payments of 9900 is

    (1 − F(10,000))/(1 − F(100)) = (50/10,000)^α / (50/100)^α = 1/100^α

Multiplying these 9 likelihoods together we get the likelihood function.

    L(α) = 100^{6α} α⁶ / (Π_{i=1}^{6} (x_i + 100)^{α+1} · 100^{3α}) = 100^{3α} α⁶ / Π_{i=1}^{6} (x_i + 100)^{α+1}
    l(α) = 3α ln 100 + 6 ln α − (α + 1) ln Π_{i=1}^{6} (x_i + 100)
    dl/dα = 3 ln 100 + 6/α − ln Π_{i=1}^{6} (x_i + 100) = 0
    6/α = 25.6747
    α̂ = 0.23369

Alternatively, you can use the formula in Table 33.1 for a single-parameter Pareto with fixed θ:

    K = ln 200 + ln 300 + ln 600 + ln 700 + ln 1100 + ln 5100 + 3 ln 10,000 − 9 ln 100 = 25.67466
    α̂ = 6/25.67466 = 0.23369
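The K computation in 33.39 is easy to verify (my own check, using the shifted values x_i + 100 and the censoring point):

    from math import log

    obs = [200, 300, 600, 700, 1100, 5100]   # the six uncensored x_i + 100
    K = sum(log(v) for v in obs) + 3 * log(10_000) - 9 * log(100)
    print(K, 6 / K)                          # 25.67466, 0.23369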


33.40. This is a Pareto distribution. The density function is

    f(t; θ) = (θ/10)(1 + t/10)^{−θ−1}

The likelihood of the death is

    f(4; θ) = (θ/10)(1.4)^{−θ−1} = Cθ/1.4^θ

where C is a constant. The likelihood of surviving to 5 is

    S(5; θ) = 1.5^{−θ}

Multiplying these two together and ignoring the constant, we have

    L(θ) = θ/((1.4^θ)(1.5^θ))
    l(θ) = ln θ − θ ln 2.1
    dl/dθ = 1/θ − ln 2.1 = 0
    θ̂ = 1/ln 2.1 = 1.3478   (D)

33.41. The density function for the single-parameter Pareto (you can either differentiate F(x) or look up the tables) is

    f(x) = α(500^α)/x^{α+1}

Let the five observations be x_i. The likelihood is the product of the 5 densities, or

    L(α) = α⁵ 500^{5α} / Π_i (x_i)^{α+1}
    l(α) = 5 ln α + α(5 ln 500 − ln Π_i x_i)
    dl/dα = 5/α + 5 ln 500 − ln Π_i x_i = 0
    5/α = −5 ln 500 + ln Π_i x_i = −5(6.2146) + 33.1111 = 2.0381
    α̂ = 5/2.0381 = 2.4533   (B)

33.42. Let x_i be the three observations (225, 525, 950). Then

    L(α) = α³(150)^{3α} / Π(150 + x_i)^{α+1}
    l(α) = 3 ln α + α(3 ln 150) − (α + 1) ln Π(150 + x_i)
    dl/dα = 3/α + 3 ln 150 − ln Π(150 + x_i) = 0
    3/α = −3 ln 150 + ln Π(150 + x_i) = −3(5.0107) + ln(375 · 675 · 1100) = −3(5.0107) + 19.4470 = 4.4128
    α̂ = 3/4.4128 = 0.6798   (B)

33.43. The survival function is

    1 − F(x) = (1/x)^α

so the likelihood of the 3 losses and the 2 censored losses is

    L(α) = α³ / ((3 · 6 · 14 · 25²)^α (3)(6)(14))
    l(α) = 3 ln α − α ln 157,500 − ln 252
    dl/dα = 3/α − ln 157,500 = 0
    α̂ = 3/ln 157,500 = 0.2507   (A)

33.44. Based on subsection 33.4.2, we must calculate K. We are given that θ = 10,000.

    K = ln ( (θ + 200)³ θ¹³ (θ + 300)⁴ / ((θ + 750)³ (θ + 200)³ (θ + 300)⁴ (θ + 10,000)⁶ (θ + 400)⁴) )
      = ln ( 10,000¹³ / ((10,750³)(20,000⁶)(10,400⁴)) )
      = −4.532728

There are n = 14 uncensored observations, so

    α̂ = 14/4.532728 = 3.08865   (C)

33.45. Let α₂ be α for Region 2. The expected value of a single-parameter Pareto with θ = 1 is α/(α − 1), so

    α₂/(α₂ − 1) = 1.5α/(α − 1)
    α₂(α − 1) = 1.5α(α₂ − 1)
    αα₂ − α₂ = 1.5αα₂ − 1.5α
    α₂(1 + 0.5α) = 1.5α
    α₂ = 3α/(2 + α)

The likelihood equation is

    L(α) = (αⁿ/(Πx_i)^α) ((3α/(2 + α))^m / (Πy_i)^{3α/(2+α)})
    l(α) = n ln α − α Σ ln x_i + m ln 3α − m ln(2 + α) − (3α/(2 + α)) Σ ln y_i

The derivative of f(α) = 3α/(2 + α) is

    f′(α) = ((2 + α)(3) − 3α)/(2 + α)² = 6/(2 + α)²

so the derivative of the loglikelihood function is

    dl/dα = n/α − Σ ln x_i + m/α − m/(2 + α) − 6 Σ ln y_i/(α + 2)²

which reduces to (D).

33.46. The losses are truncated at 1, but this plays no role, since we are shifting the distribution and only fitting the amount of time for which payments are made. After shifting, the 10 observations for (1, 2) become 10 observations for (0, 1), the 30 observations for [2, ∞) become 30 observations for [1, ∞), and there is no truncation.

This is a perfect question for the Bernoulli technique. F(1) = 10/(10 + 30) = 0.25, so

    F(1) = 1 − θ/(θ + 1) = 0.25
    θ/(θ + 1) = 0.75
    θ̂ = 3   (E)
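A short check of 33.46 (my own illustration): the Bernoulli technique answer agrees with a brute-force maximization of the loglikelihood.

    from math import log

    print(1 / (10 / 40) - 1)               # F(1) = 1/(theta + 1) = 0.25 gives 3.0

    # l(theta) = 30 ln(theta) - 40 ln(theta + 1)
    ll = lambda t: 30 * log(t) - 40 * log(t + 1)
    print(max((t / 100 for t in range(1, 1001)), key=ll))   # 3.0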

The following is if you wanted to do it the long way. The distribution function for a Pareto with α = 1 is

    F(x) = 1 − θ/(θ + x)

The likelihood of each of the 10 losses in (0, 1) is

    F(1) − F(0) = F(1) = 1 − θ/(θ + 1) = 1/(θ + 1),

while the likelihood of each of the 30 losses in [1, ∞) is

    1 − F(1) = θ/(θ + 1).

We are ready to calculate the likelihood function.

    L(θ) = (1/(θ + 1))¹⁰ (θ/(θ + 1))³⁰
    l(θ) = 30 ln θ − 40 ln(θ + 1)
    dl/dθ = 30/θ − 40/(θ + 1) = 0
    30(θ + 1) = 40θ
    10θ = 30
    θ̂ = 3   (E)


33.47. This can be done using the shortcut for the lognormal distribution mentioned in the Transformation subsection of the lesson. We have

    µ̂ = (Σ ln x_i)/5 = ln(100 · 500 · 1000 · 5000 · 10,000)/5 = 7.091013
    Σ_{i=1}^{5} ln² x_i = 21.2076 + 38.6214 + 47.7171 + 72.5426 + 84.8304 = 264.9190
    σ̂² = 264.9190/5 − 7.091013² = 2.7013
    σ̂ = √2.7013 = 1.6436

33.48. Since there are three ranges and two parameters to fit, we use the Bernoulli technique. Thus for claim size X, Pr(X < 2500) = 250/500 = 0.5 and Pr(X < 5000) = (250 + 150)/500 = 0.8. The question is reduced to percentile matching at the 50th and 80th percentiles. For the standard normal distribution, the 50th percentile is 0 and the 80th percentile is 0.842.

    µ̂ = ln 2500 = 7.824
    µ + 0.842σ = ln 5000
    σ̂ = (ln 5000 − ln 2500)/0.842 = 0.823

The third quartile, or 75th percentile, of a standard normal distribution is 0.67. Therefore the third quartile of the fitted lognormal distribution is e^{7.824+0.67(0.823)} = 4339.

33.49. We can eliminate (C), (D), and (E) without doing any work, since these are not asymptotically unbiased estimators of σ, whereas maximum likelihood is asymptotically unbiased, as we'll learn in Lesson 34. ((C) and (E) are asymptotically unbiased estimators of σ².) We know (A) is the correct answer because that's what shortcut 3 said. But if you want to prove this (don't forget that µ = 0):

    L(σ) = (1/σⁿ) e^{−0.5 Σx_i²/σ²}        (omitting the constant)
    l(σ) = −n ln σ − Σx_i²/(2σ²)
    dl/dσ = −n/σ + Σx_i²/σ³ = 0
    σ̂² = Σx_i²/n
    σ̂ = √(Σx_i²/n)   (A)

33.50. This is a gamma distribution with α = 2. We could use the shortcut mentioned in the lesson, that the MLE is the same as the method of moments estimator, which is 2θ = x̄ = 1100, so θ̂ = 550. If we want to calculate θ̂ without using the shortcut, we have:

    L(θ) = (1/θ¹²) e^{−Σx_i/θ}        (ignore Πx_i, a constant)
    l(θ) = −12 ln θ − (1/θ)Σx_i
    dl/dθ = −12/θ + (1/θ²)Σx_i = 0
    θ̂ = 550

Now we calculate the estimated probability of X > 500.

    P̂r(X > 500) = ∫_{500}^{∞} (x/550²) e^{−x/550} dx
                = (1/550²) (−550x e^{−x/550} |_{500}^{∞} + 550 ∫_{500}^{∞} e^{−x/550} dx)
                = (1/550²) (550(500) e^{−10/11} + 550² e^{−10/11})
                = (21/11) e^{−10/11} = 0.7692

33.51. This is a gamma distribution with α = 2, θ = c, so by the shortcut the MLE of θ is the same as the method of moments estimator. The average death time of 2.5 equals 2c (the mean of the gamma distribution); ĉ = 1.25. (C)

33.52. First let's work out the MLE. For the MLE, when using a uniform distribution, it doesn't matter when the death occurred; the likelihood is some constant times q_x. This is true even if death occurs in the interval (x + 0.4, x + 1). It is true that the conditional probability of death in that interval, given survival to time x + 0.4, is 0.6q_x/(1 − 0.4q_x). But the unconditional probability of death—the probability of survival to time x + 0.4 followed by death thereafter—is 0.6q_x. After all, that's what uniform deaths means! So the likelihood term for the deaths is q_x^{38+51} = q_x^{89} (as usual, we can drop the constants). For the 200 withdrawals, the likelihood of survival to time x + 0.4 is 1 − 0.4q_x. That leaves the 1000 − 200 − 38 − 51 = 711 survivors, and the likelihood of survival for one year is 1 − q_x. So we have the likelihood function and can proceed.

    L(q_x) = q_x^{89} (1 − 0.4q_x)^{200} (1 − q_x)^{711}
    l(q_x) = 89 ln q_x + 200 ln(1 − 0.4q_x) + 711 ln(1 − q_x)
    dl/dq_x = 89/q_x − 80/(1 − 0.4q_x) − 711/(1 − q_x) = 0
    89 − 124.6q_x + 35.6q_x² − 80q_x + 80q_x² − 711q_x + 284.4q_x² = 0
    400q_x² − 915.6q_x + 89 = 0
    q̂_x^{MLE} = (915.6 − √(915.6² − 4(400)(89)))/800 = 0.101725

For the product limit estimator, deaths before and after the withdrawals can be grouped, so we have

    Ŝ(x) = (962/1000)(711/762) = 0.897614
    q̂_x^{PL} = 1 − Ŝ(x) = 0.102386

The difference between the two estimators is 0.102386 − 0.101725 = 0.000661. (C)
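The two estimates in 33.52 can be checked in a couple of lines (my addition):

    from math import prod, sqrt

    q_mle = (915.6 - sqrt(915.6**2 - 4 * 400 * 89)) / 800   # smaller root
    q_pl = 1 - prod([962 / 1000, 711 / 762])
    print(q_mle, q_pl, q_pl - q_mle)   # 0.101725, 0.102386, 0.000661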

33.53. The likelihood of pregnancy for those entering at time 0 is q_x, and the likelihood of no pregnancy is 1 − q_x, so the 20 lives entering at 0.0 contribute q_x⁸(1 − q_x)¹² to the likelihood function. For those entering at time 0.4, the likelihood of pregnancy is 0.6q_x/(1 − 0.4q_x) and the likelihood of no pregnancy is the complement, (1 − q_x)/(1 − 0.4q_x). Multiplying them together, we get

    L(q_x) = (0.6q_x/(1 − 0.4q_x))⁶ ((1 − q_x)/(1 − 0.4q_x))⁴ q_x⁸ (1 − q_x)¹²
    l(q_x) = 6 ln 0.6 + 14 ln q_x + 16 ln(1 − q_x) − 10 ln(1 − 0.4q_x)
    dl/dq_x = 14/q_x − 16/(1 − q_x) + (10)(0.4)/(1 − 0.4q_x) = 0
    14(1 − q_x)(1 − 0.4q_x) − 16q_x(1 − 0.4q_x) + 4q_x(1 − q_x) = 0
    14 − 14(1.4)q_x + 5.6q_x² − 16q_x + 6.4q_x² + 4q_x − 4q_x² = 0
    8q_x² − 31.6q_x + 14 = 0
    q̂_x = (31.6 − √(31.6² − 4(8)(14)))/16 = 0.5085   (A)

33.54. The likelihood of the death is q_x regardless of when it happened. The likelihood of survival to time 0.4 is 1 − 0.4q_x and the likelihood of survival to time 1.0 is 1 − q_x. The rest is mechanical.

    L(q_x) = q_x (1 − q_x)⁶ (1 − 0.4q_x)³
    l(q_x) = ln q_x + 6 ln(1 − q_x) + 3 ln(1 − 0.4q_x)
    dl/dq_x = 1/q_x − 6/(1 − q_x) − 1.2/(1 − 0.4q_x) = 0
    (1 − 0.4q_x)(1 − q_x) − 6q_x(1 − 0.4q_x) − 1.2q_x(1 − q_x) = 0
    1 − 1.4q_x + 0.4q_x² − 6q_x + 2.4q_x² − 1.2q_x + 1.2q_x² = 0
    4q_x² − 8.6q_x + 1 = 0
    q̂_x = (8.6 − √(8.6² − 16))/8 = 0.123357   (D)

33.55. The policy limit plays no role, since in any case we only know which losses are above or below 50,000, but have no more detail.

Let X be loss size. Using the Bernoulli technique, P̂r(X ≤ 50,000) = 125/(125 + 75) = 0.625. For an inverse exponential, F(x) = e^{−θ/x}. Therefore

    e^{−θ̂/50,000} = 0.625

We need P̂r(X > 100,000), and

    P̂r(X > 100,000) = 1 − e^{−θ̂/100,000} = 1 − (e^{−θ̂/50,000})^{0.5} = 1 − 0.625^{0.5} = 0.2094

33.56. This is grouped data with two intervals (below 2, above 2) and one parameter, so the Bernoulli technique applies. Maximum likelihood will set Pr(X < 2) = 0.25, since 1/4 of the observations are below 2. Therefore

    Pr(X < 2) = ∫₀² 2λx e^{−λx²} dx = 0.25
    ∫₀² 2λx e^{−λx²} dx = −e^{−λx²} |₀² = 1 − e^{−4λ}
    e^{−4λ} = 0.75
    λ̂ = −(ln 0.75)/4 = 0.07192   (D)


33.57. The conditional probability of a payment greater than 1000, given that the loss is above 1000, is

    Pr(X > 2000 | X > 1000) = (1 − F(2000))/(1 − F(1000)) = (1 − e^{−θ/2000})/(1 − e^{−θ/1000})

Let x = e^{−θ/2000}. Then we can express this probability as

    Pr(X > 2000 | X > 1000) = (1 − x)/(1 − x²) = 1/(1 + x)

By the Bernoulli technique, this probability is the proportion of payments in the range, or 80/(80 + 20) = 0.8. Then

    1/(1 + x) = 0.8
    x = 0.25
    e^{−θ/2000} = 0.25

The first quartile of (ground-up) losses, or the 25th percentile, is π such that F(π) = 0.25, or

    e^{−θ/π} = 0.25

But we have already shown that e^{−θ/2000} = 0.25, so π̂ = 2000.
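As a check of 33.57 (my own; it just retraces the algebra numerically):

    from math import log

    x = 1 / 0.8 - 1                    # x = exp(-theta/2000) = 0.25
    theta = -2000 * log(x)             # 2772.6
    print(-theta / log(0.25))          # first quartile: 2000.0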

Quiz Solutions

33-1. There are two uncensored observations, #2 and #5.

    θ̂ = ∛((35³ + 40³ + 30³ + 40³ + 50³ − 6³ − 10³ − 40³)/2) = 50.5055

Mean survival time is Γ(4/3)(50.5055) = (0.89298)(50.5055) = 45.10.

33-2. The total number of observations is 80, and the number of observations below 5000 is 50. The upper bound of the highest interval is 10,000, and is compared to (80/50)(5000) = 8000, which is less than 10,000, so θ = 8000.

33-3. The maximum covered claim is 10,500.

    K = ln(6500⁵/(7000 · 9000 · 14,000 · 16,500²)) = −3.02990
    α̂ = −3/(−3.02990) = 0.9901

33-4. Fitted probabilities match the sample, so P̂r(X > 1000) = e^{−1000/θ} = 0.465, and

    θ̂ = −1000/ln 0.465 = 1305.96


Lesson 34

Variance of Maximum Likelihood Estimators

Reading: Loss Models Fourth Edition 13.3–13.5

34.1 Information matrix

34.1.1 Calculating variance using the information matrix

Maximum likelihood estimators are asymptotically unbiased, consistent, and asymptotically normally distributed, as long as certain regularity conditions hold. No unbiased estimator has a lower variance. All of these properties are only true asymptotically; for finite sample sizes, however, the estimators may, for example, be biased. The asymptotic covariance matrix of the maximum likelihood estimator is the inverse of the (Fisher's) information matrix, whose entries are

" I ( θ )rs  − EX

∂2 ∂ ∂ l ( θ )  EX l (θ) l (θ) ∂θs ∂θr ∂θr ∂θs

#

"

# (34.1)

The inverse of this matrix is the Cramér–Rao lower bound for the variance of any unbiased estimator. Thus asymptotically, the maximum likelihood estimator's variance is the lowest of any unbiased estimator.¹ You should be able to invert 2 × 2 matrices. It is unlikely that you would be expected to invert larger matrices. In case you never learned how to invert 2 × 2 matrices, a simple method is presented in a sidebar. When there is only one parameter θ, the information matrix has a single entry whose value is

" I ( θ )  − EX

d2 l  EX dθ 2

#

! 2   dl   dθ 

If the sample has n independent identically distributed observations, then l ( θ )  information matrix in the one-parameter case is

" I ( θ )  −n EX

d2 ln f ( x i ; θ )  n EX dθ 2

#

(34.2)

Pn

i1 ln

f ( x i ; θ ) and the

! 2   d ln f ( x i ; θ )  dθ  

The equality of the negative of the expected value of the second derivative and the expected value of the square of the first derivative is not proved in the book, and you are not responsible for its proof. For the curious student, however, I've included a proof for the one-variable case in a sidebar. It is usually easier to use the expression with second derivatives rather than the expression with squares or products of first derivatives. For one thing, the random variable X often disappears in the differentiation process; in this case, the expected value is the second derivative itself. In addition,

647

34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

648

How to invert a 2 × 2 matrix

The adjoint method is the best method for inverting a 2 × 2 matrix. The cofactor of a matrix element a i j is (−1) i+j times the determinant of the minor formed by removing the i th row and j th column from the matrix. In the case of a 2 × 2 matrix, it is simply the element opposite a i j , multiplied by −1 if i + j is odd. Let b i j be the cofactor of a i j . Then we can form the matrix B  {b i j } of the cofactors. B, the matrix formed by the b i j , is called the adjoint of A, the matrix formed by the a i j . The inverse of A is the transpose of B, divided by the determinant of A: BT A−1  det A In the 2 × 2 case, b i j  (−1) i+ j a3−i,3− j , and the determinant of A is a11 a22 − a12 a21 , so a 22 −a21

A−1  3 Example 34A Calculate the inverse of 4

−a12 a11

!

a11 a22 − a12 a21

2 . 6

!

Answer: 3 4

2 6

! −1

6 −4

−2 3

!

0.6   −0.4 (3)(6) − (2)(4)

−0.2 0.3

! 

In this lesson, the matrices are all symmetric (a21  a12 ), making the formula even simpler.

it’s usually easier to differentiate twice than to calculate the square or product of first derivatives. The formula using second derivatives is taken for granted so much that there was a defective exam question (Fall 2000:13) which only worked if you took second derivatives but not if you used the square of the first derivative! Once the variance of the parameters is known, confidence intervals can be built for probabilities of events by using the delta method to calculate the variance of probabilities of events. We will discuss confidence intervals later. Example 34B [4B-S90:54] (3 points) A single observation, x, is taken from a normal distribution with mean µ  0 and variance σ2  θ. The normal distribution has its probability density function given by f (x ) 

2 2 1 √ e − ( x−µ) /2σ σ 2π

Let θˆ be the maximum likelihood estimator of θ. ˆ Which of the following is the asymptotic variance of θ? (A) 1/θ

(B) 1/θ2

(C) 1/2θ

(D) 2θ

(E) 2θ 2

Answer: We will illustrate the use of the information matrix, although that was not the intent of the original exam question. First, the likelihood function for θ with one observation is obtained by substituting √ √ µ  0 and σ  θ into the probability density function. As usual, multiplicative constants (1/ 2π here) C/4 Study Manual—17th edition Copyright ©2014 ASM

34.1. INFORMATION MATRIX

649

Proof of the equality of the two expressions in equation (34.2)

Let L ( θ ) be the likelihood function. The first derivative of the loglikelihood is dl d ln L dL/dθ   dθ dθ L (θ) Therefore, the second expression, the expectation of the square of the first derivative, is the expectation of (dL/dθ ) 2 L (θ)2 The second derivative of the loglikelihood is, by the quotient rule for derivatives applied to

dL/dθ : L (θ)

L ( θ ) d2 L/dθ 2 − (dL/dθ ) 2 d2 l  dθ 2 L (θ)2 The second part of the numerator divided by the denominator is negative the square of the first derivative. We only have to prove that the expected value of the first part of the numerator divided by the denominator is 0. This expected value is taken over the likelihood function’s probability space, so it is

Z

L ( θ ) d2 L/dθ 2 L ( θ ) dx  L (θ)2

Z

d2 L d2 dx  dθ 2 dθ 2

Z

L ( θ ) dx

since differentiation can be moved out of the integral. However, the likelihood is a proper probability function whose integral over the entire space is 1, so the derivative of this integral is 0. This completes the proof.

will be ignored:

2 1 L ( θ )  √ e −x /2θ θ

Log the function and differentiate it twice. 1 x2 l ( θ )  − ln θ − 2 2θ dl 1 x2 − + dθ 2θ 2θ2 d2 l 1 x2  − 3 2 2 dθ 2θ θ

(*)

However, E X 2  σ2  θ, since the second moment of a normal random variable with mean 0 is its variance. So the information matrix is

f

g

d2 l 1 θ 1 −E − 2 + 3  dθ 2 2θ θ 2θ 2

"

#

Inverting this yields the asymptotic variance:

L (θ)  Var C/4 Study Manual—17th edition Copyright ©2014 ASM

1  2θ 2 1/2θ 2

(E)

34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

650

We will also calculate the information matrix using the alternative first derivative expression, although you will probably never use this alternative for any exam question. Squaring expression (*), dl dθ

!2 

2x 2 x4 1 − + 4θ 2 4θ 3 4θ 4

The expected value of x 2 is the second moment of the normal distribution, or θ. The expected value of x 4 is the fourth moment of the normal distribution. Since the mean is 0, the fourth moment is the coefficient of kurtosis times the variance squared. The coefficient of kurtosis for a normal distribution is 3, so the fourth moment is 3θ 2 . Then

! 2  2θ 3θ 2 1 dl   − + E   dθ  4θ2 4θ3 4θ4 

2 3 1 1 − +  , 4θ 2 4θ 2 4θ 2 2θ 2

the same as above. ˆ (Not the asymptotic The original exam question asked “Which of the following is the variance of θ?” variance.) To answer this, we cannot use the information matrix but must calculate the variance directly. dl Setting dθ  0, we obtain dl 1 x2 0 − + dθ 2θ 2θ 2 −θ + x 2  0 θˆ  x 2 The variance of θˆ is

Var ( θˆ )  Var ( X 2 )  E[X 4 ] − E[X 2 ]



2

(∗)

X is drawn from a normal distribution with mean 0, so E[X 2 ] is the variance of the normal distribution, or θ. Also, E[X 4 ] is the kurtosis of the normal distribution times the variance squared, and the kurtosis of a normal distribution is 3, so E[X 4 ]  3θ 2 . Substituting E[X 2 ] and E[X 4 ] into (∗), Var ( X 2 )  3θ 2 − θ 2  2θ2



When the MLE for a single-parameter distribution is the same as the method of moments estimator, a shortcut for calculating the variance of the MLE is available: the variance of the sample mean is σ2 /n. In fact, this shortcut is better than the regular method, since it is true for any n, not just asymptotically. Example 34C A random sample of n claims, x1 , x2 , . . . , x n , is taken from the following exponential distribution: f ( x )  θ1 e −x/θ , x > 0. ˆ the maximum likelihood estimator for θ. 1. [4B-S92:21] (2 points) Determine θ, (A) (B) (C) (D) (E)

1 Pn i1 ln ( x i ) n 1 Pn xi n Pi1n 1 √ i1 ln ( x i ) n P n √1 i1 x i n P n 1 xi √ i1 e n

C/4 Study Manual—17th edition Copyright ©2014 ASM

34.1. INFORMATION MATRIX

651

ˆ 2. [4B-S92:22] (2 points) Determine the variance of θ. (A)

1 n

(B)

1 n2

P

n i1

P

xi

n i1

2

xi

2

(C) θ/2n (D) θ 2 /n (E) θ 2 /2n Answer:

1. As discussed in the previous lesson, θˆ is the sample mean,

1 n

Pn i1

x i . (B)

2. The variance of θˆ is the distribution variance divided by n. The variance of the exponential distribution is θ 2, so the variance of θˆ is θ 2 /n . (D) 

34.1.2

Asymptotic variance of MLE for common distributions

For simplicity we’ll only discuss variance of MLE for complete data. It is hard to deal with incomplete data. However, the last two examples will illustrate calculating the asymptotic variance with truncated and censored data. Uniform distribution For a uniform distribution on [0, θ], the theorem relating the asymptotic variance to the information matrix does not apply, since the MLE is not determined through differentiation of the loglikelihood function. The MLE of θ is the maximum of the observations. The variance of the MLE is the variance of the maximum observation, which is given by formula (21.2): Var ( θˆ ) 

nθ 2 ( n + 1) 2 ( n + 2)

Asymptotically this is θ 2 /n 2 . Most of the variances of the estimators for the other distributions do not converge to 0 so rapidly, but are proportional to θ 2 /n instead. Exponential distribution For an exponential distribution, since the MLE of θ is the sample mean, the variance of it is θ 2 /n. Weibull distribution For a Weibull distribution with fixed τ, let’s work out the asymptotic variance of ˆ θ. L (θ)  l (θ) 

e−

P

x iτ /θ τ

nτ xτ − τi Pθ τ x iτ θ τ+1



− nτ ln θ

dl nτ  − dθ Pθ τ 2 τ ( τ + 1 ) xi d l nτ − 2  − 2 dθ θ τ+2 θ However, for a Weibull, E[X τ ]  θ τ , so the expected value of this expression is nτ ( τ + 1) nτ nτ 2 − 2  2 θ2 θ θ C/4 Study Manual—17th edition Copyright ©2014 ASM

34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

652

and the asymptotic variance of θˆ is θ 2 / ( nτ2 ) .

Pareto distribution For a Pareto distribution with fixed θ, we have (dropping the + 1 from the denominator’s exponent, as usual) L (α)  Q

α n θ nα (θ + xi )α

l ( α )  n ln α + nα ln θ − α ln

n dl  + n ln θ − ln dα α d2 l n − 2  2 dα α

Y

Y

(θ + xi )

(θ + xi )

and the asymptotic variance is α 2 /n. You can also derive the asymptotic variance of the MLE of θ for a Pareto with fixed α, even though ˆ We have you usually need numerical techniques to calculate θ. L (θ)  Q

α n θ nα ( θ + x i ) α+1

l ( θ )  n ln α + nα ln θ − ( α + 1)

X

ln ( θ + x i )

dl nα 1  − ( α + 1) dθ θ θ + xi X nα 1 d2 l − 2  2 − ( α + 1) dθ θ (θ + xi )2

X

We need to calculate E[1/ ( θ + X ) 2 ], which we’ll do by integrating f ( x ) / ( θ + x ) 2 . 1 E  (θ + X )2

"

#



Z 0

αθ α dx ( θ + x ) α+3 ∞

αθ α α+2 ( α + 2)( θ + x ) 0 α  ( α + 2) θ 2 −

so the information is I(θ) 

nα nα ( α + 1) nα −  θ2 ( α + 2) θ 2 ( α + 2) θ 2

(34.3)

and the variance is ( α + 2) θ 2 / ( nα ) .

Lognormal distribution The lognormal distribution is the only one for which you can develop the information matrix for a two-parameter distribution without using numerical techniques. As usual, we’ll C/4 Study Manual—17th edition Copyright ©2014 ASM

34.1. INFORMATION MATRIX

653

√ drop the constant (1/x i 2π) in the likelihood function. L ( µ, σ ) 

e−

P

(ln x i −µ ) 2 /2σ2

σn (ln x i − µ ) 2 − n ln σ l ( µ, σ )  − 2σ2 P (ln x i − µ ) ∂l  ∂µ σ2

P

∂2 l n − 2 ∂µ2 σ The expected value of negative this partial derivative is n/σ 2 . ∂2 l 2 − ∂µ∂σ

P

(ln x i − µ ) σ3

The expected value of ln x i − µ is 0, so the expected value of this partial derivative is 0.

(ln x i − µ ) 2 n ∂l −  σ ∂σ σ3 P 3 (ln x i − µ ) 2 ∂2 l n + 2 − 2 4 ∂σ σ σ P

The expected value of (ln x i − µ ) 2 is σ2 , so the expected value of negative this partial derivative is n 2n 3nσ2 − 2  2 σ σ σ4 The information matrix is I ( µ, σ ) 

0

n σ2

!

2n σ2

0

The inverse of a diagonal matrix (one with entries only along the main diagonal) is obtained by inverting each element of the matrix. Therefore, the covariance matrix is Σ

σ2 n

0

0

!

σ2 2n

The covariance of the two estimators is 0. This is true for the MLE estimators of the two parameters of a lognormal distribution, but for other two-parameter distributions (like Pareto) the covariance of the estimators of the two parameters is not 0.

?

Quiz 34-1 A sample of 10 observations is fitted to a distribution with density function f ( x )  ax a−1

0≤x≤1

using maximum likelihood to estimate a. Calculate the asymptotic variance of the estimator of a as a function of a. Table 34.1 summarizes asymptotic variances that we developed, but is probably not worth memorizing. Let’s now work out examples with truncated and censored data. C/4 Study Manual—17th edition Copyright ©2014 ASM

34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

654

Table 34.1: Asymptotic variance of MLE’s for common distributions

Distribution

Formula

Exponential

Var ( θˆ ) 

θ2

Uniform [0, θ]

Var ( θˆ ) 

nθ 2 ( n + 1) 2 ( n + 2)

Weibull, fixed τ

Var ( θˆ ) 

θ2 nτ 2

Pareto, fixed θ

Var ( αˆ ) 

α2 n

n

Distribution

Formula

Pareto, fixed α

( α + 2) θ 2 Var ( θˆ )  nα Var ( µˆ ) 

σ2 n

ˆ σˆ )  0 Cov ( µ,

Lognormal

Var ( σˆ ) 

σ2 2n

In this table, n is the sample size and Var means asymptotic variance. Example 34D Losses follow an exponential distribution with mean θ. On an insurance coverage with policy limit 10,000, the following 4 losses are observed: 1000

3000

4000

5000

In addition, there are 2 losses at the limit. Calculate the asymptotic variance of the maximum likelihood estimator of θ. Answer: Let m be the number of uncensored observations. In this case m is the number of observations not exceeding 10,000. The loglikelihood of an uncensored observation is −x i /θ − ln θ, while the loglikelihood of a censored observation is −10,000/θ. So the loglikelihood function of the sample is

Pm

i1

l (θ)  −

x i + 10,000 (6 − m ) − m ln θ θ

Differentiating twice, we get dl  dθ

Pm

d2 l − dθ 2

x i + 10,000 (6 − m )

m − 2 θ θ P  m 2 i1 x i + 10,000 (6 − m ) i1

θ3

+

m θ2

The expected value of m is 6 Pr ( X ≤ 10,000)  6 (1 − e −10,000/θ ) . Let p  1 − e −10,000/θ , so that E[m]  6p. The expected value of x i is the expected value of an exponential random variable X given that it is 10,000

no greater than 10,000, or 0 x f ( x ) dx/p. The integral is E[X ∧ 10,000] − 10,000 (1 − p ) , and looking E[X ∧ 10,000] up in the tables, E[X ∧ 10,000]  θp. So the expected value of x i is

R

E[X | X ≤ 10,000] 

θp − 10,000 (1 − p )  θ − 10,000 (1 − p ) /p p

The expected value of a sum of m x i s is the expected value of m times the expected value of x i . The information matrix is the negative of the expected value of the second derivative of the loglikelihood C/4 Study Manual—17th edition Copyright ©2014 ASM

34.1. INFORMATION MATRIX

655

function. Substituting the values we calculated, the information matrix is 2 6p θ − 10,000 (1 − p ) /p + 10,000 (6 − 6p )









6p − 2 θ3 θ 12pθ − 120,000 (1 − p ) + 120,000 (1 − p ) − 6pθ  θ3 6p  2 θ

The asymptotic variance is the inverse of this. Estimate the asymptotic variance by using the fitted value of θ in the formula. The MLE of θ is 33,000/4  8250. So the estimated asymptotic variance is p  1 − e −10,000/8250  0.702435

L ( θˆ )  Var

82502  16,149,190 6 (0.702435)



Example 34E Losses follow a two-parameter Pareto distribution with θ  5000. On an insurance coverage with ordinary deductible 500 and maximum covered loss 10,000, the following loss sizes are observed: 1000

2000

3000

6000

8000

and 3 additional losses are at the limit. Estimate the variance of the maximum likelihood estimate of α. Answer: The probability density function, taking into account truncation at 500, is f ( x; α ) 

α (5500α ) α (5000α ) / (5000 + x ) α+1  α (5000/5500) (5000 + x ) α+1

and the survival function at 10,000, taking into account truncation at 500, is

(5000/15,000) α 5500 S (10,000; α )   α (5000/5500) 15,000



Let m be the the number of observations not exceeding 10,000. Dropping the +1 in the exponent of the denominator, which is a constant, the likelihood function is α m (55008α ) α (8−m ) α ) i1 (5000 + x i ) (15,000

L ( α )  Qm

where the x i are the uncensored observations. We log and calculate the second derivative. l ( α )  m ln α + α *8 ln 5500 −

, m * dl  + 8 ln 5500 − dα α d2 l m − 2 dα 2 α C/4 Study Manual—17th edition Copyright ©2014 ASM

,

m X i1

m X i1

ln (5000 + x i ) − (8 − m ) ln 15,000+

-

ln (5000 + x i ) − (8 − m ) ln 15,000+

-

34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

656

The expected value of m is 8 times the probability that an observation is less than 10,000 given that it is greater than 500, or 1 − (5500/15,000) α , so d2 l 8 (1 − (5500/15,000) α ) E − 2  dα α2

"

#

and the variance is the reciprocal of this expression. The variance can be estimated by using the estimate for α, obtained by setting the first derivative to 0 and setting the x i s equal to the 5 observations and m  5: 5 − (8 ln 5500 − ln 6000 − ln 7000 − ln 8000 − ln 11,000 − ln 13,000 − 3 ln 15,000)  0 α 5  0.949465 αˆ  − −5.26612 0.9494652 L ( αˆ )  Var   0.183448 8 1 − (5500/15,000) 0.949465

34.1.3



True information and observed information

So far, I have not heard reports of questions on observed information appearing on exams, so you are probably safe skipping this subsection. However, observed information will be used in the derivation of Section 34.4, a section which has material added to the October 2013 syllabus. The information matrix is negative the expected value of the second derivative. This expected value is usually a function of the parameters. Since the parameters are unknown in real life, the parameter estimates would be used to compute the value of the information matrix. We’ll call this true information, even though the parameter estimates are used, to contrast it with the concept of the next paragraph. Up to now in this lesson, every information matrix was true information. Often the expected value itself cannot be calculated. In other words, you cannot even develop a closed form expression for the information matrix in terms of the parameters. If so, an approximation of the information matrix would be to plug in the observed values rather than calculating the expected value. This approximation is called observed information. True information and observed information are contrasted in the following example. Example 34F You are given the following two observations: 1

4

The observations are fitted to a two-parameter Pareto distribution with α  1. The parameter θ is estimated using maximum likelihood. Determine the absolute difference between the true information and the observed information if the parameter estimate θˆ is used to approximate θ in the information expressions. Answer: For a Pareto with α  1, f ( x )  θ/ ( θ + x ) 2 . Then for n observations, L (θ)  Q

θn (θ + xi )2

l ( θ )  n ln θ − 2

n X

ln ( θ + x i )

i1

n

X 1 dl n  −2 dθ θ θ + xi i1

d2 l

dθ 2 C/4 Study Manual—17th edition Copyright ©2014 ASM

n

−

X n 1 +2 2 θ (θ + xi )2 i1

(*)

34.1. INFORMATION MATRIX

657

The maximum likelihood estimate for θ is calculated by setting

dl dθ

 0.

2 2 2 − − 0 θ θ+1 θ+4 We divide by 2 and multiply the denominators.

( θ + 1)( θ + 4) − θ2 − 4θ − θ2 − θ  0 θ 2 + 5θ + 4 − 2θ 2 − 5θ  0 θ2 − 4  0 θˆ  2

The observed information is obtained by negating (*) and setting θ  2 and setting x1  1, x 2  4. 1 1 1 2 1 2 2 −2 + +   −2 4 9 36 9 θ2 ( θ + 1) 2 ( θ + 4) 2

!





To calculate true information, we use formula (34.3) with α  1. I (θ) 



( α + 2) θ 2



(2)(1) 1  (3)(22 ) 6

The absolute difference between the true information and the observed information is

1 2 1 −  . 9 6 18

Let’s also redo Example 34D using observed information. Example 34G Losses follow an exponential distribution with mean θ. On an insurance coverage with policy limit 10,000, the following 4 losses are observed: 1000

3000

4000

5000

In addition, there are 2 losses at the limit. Calculate the asymptotic variance of the maximum likelihood estimator of θ using observed information. Answer: The likelihood of the four observations x i is e −x i /θ /θ, and the likelihood of the losses at the limit is e −10,000/θ . So the loglikelihood function, using the observed values, is l (θ)  − Also,

P

P

x i + 20,000 − 4 ln θ θ

x i + 20,000  33,000. Differentiating twice, we get dl 33,000 4  − dθ θ θ2 2 d l 2 (33,000) 4 − + 2 dθ 2 θ3 θ

The MLE is 33,000/4  8250, and the information matrix is 66,000 4 − 3 8250 82502 so the asymptotic variance is the inverse of this, or 17,015,625 . C/4 Study Manual—17th edition Copyright ©2014 ASM



34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

658

Many times, the true information approximated with the fitted parameter values is equal to the observed information. Example 34H A sample of n data points is fitted to an exponential using maximum likelihood. The variance of the fitted parameter is estimated using 1. The true information, approximated using the fitted value of θ 2. The observed information Show that both estimates are the same if the parameter θ in the information expressions is approximated using the fitted value of θ. Answer: For an exponential, f ( x )  θ1 e −x/θ . ¯ the sample We know from our previous work that the maximum likelihood estimate of θ is θˆ  x, mean. Let’s calculate the information. L (θ) 

1 − P x i /θ e θn P

l ( θ )  −n ln θ −

dl n xi − + 2 dθ θ θ P n 2 xi d2 l  − dθ 2 θ 2 θ3

xi θ

P

At this point, let’s go over the definitions of observed and true information. Observed information is negative the second derivative of the loglikelihood, with the x i ’s replaced with the observations. Here, it is n 2 xi Observed information  − 2 + θ θ3

P

is

True information is the expected value of negative the second derivative of the loglikelihood. Here, it

n E[X] + 2n 3 θ2 θ since E[x i ]  E[X] for every i. See the difference? Both expressions have θ’s in them, and we will need to estimate θ with θˆ to obtain a numerical result. ¯ For example, if there were three To get a numerical value for observed information, we’d use θˆ  x. P ) observations 2, 3, 7, then x i  12, θˆ  4 and the observed information would be − 432 + 2 (412  0.1875. 3 To get a numerical value for true information, note that E[X]  θ, since X is exponential with mean θ. Hence True information  −

d2 l dθ 2 n 2nθ − 2 + 3 θ θ

"

#

I (θ)  − E

¯ this becomes If we estimate θ with the fitted value x, − which is the same as the observed information. C/4 Study Manual—17th edition Copyright ©2014 ASM

n 2n x¯ + 3 2 θ θ



34.2. THE DELTA METHOD

34.2

659

The delta method

The delta method is frequently on exams; some exams have two questions on it. Even if you don’t like matrix arithmetic, do not skip this section. While we have developed an estimate of the variance of a parameter estimated with maximum likelihood, we often want the variance of a function of this parameter. For example, we may have developed the ˆ but want the variance of the estimated mean of the distribution, θ/ ( α − 1) . variance of the Pareto αˆ and θ, The delta method is a method of estimating the variance of a function of a (possibly multivariate) random variable from the variance of the random variable. To motivate it, let’s begin with a simple example. Example 34I A random variable X has mean 3 and variance 10. Estimate the variance of X 2 . Answer: There is no way to calculate the exact variance without knowing more about X. The variance of

g2

X 2 is E X 4 − E X 2 , and the fourth moment could be anything (almost). So what can we do? The only functions we know how to calculate variance for are linear functions. We know that

f

g

f

Var ( aX + b )  a 2 Var ( X ) But we can linearize a function by using its Taylor series and dropping all terms past the linear term. We have to pick a point around which to construct a Taylor series. The mean is the only natural point. Now, g ( x )  x 2 , the function of X we are being asked to calculate variance for, is already a polynomial, so the Taylor series will be a finite series. The Taylor series around x 0 is g ( x )  g ( x0 ) + g 0 ( x 0 )( x − x0 ) +

g 00 ( x0 ) ( x − x0 ) 2 + · · · 2!

In this case, where g ( x )  x 2 , g 0 ( x )  2x, g 00 ( x )  2, and x0  3, the Taylor series is g 00 (3) ( x − 3) 2 2  32 + 2 (3)( x − 3) + ( x − 3) 2

g ( x )  g (3) + g 0 (3)( x − 3) +

 9 + 6 ( x − 3) + ( x − 3) 2

As promised, we drop the quadratic term, and calculate the variance of the linear part. Var ( X 2 ) ≈ Var 9 + 6 ( X − 3)  62 Var ( X )





So the estimated variance is 62 (10)  360 .



This example illustrates the delta method. More generally, the formula for the delta method is that the approximate variance is the variance of the variable multiplied by the square of the derivative evaluated at the mean: Delta Method Formula—One Variable Var g ( X ) ≈ Var ( X )





dg dx

!2

(34.4)

To state this method for the more general case of a random vector (a vector of random variables), we recall the definition of the covariance matrix. If the random vector is ( X1 , X2 , . . . X n ) , let σi2 denote Var ( X i ) C/4 Study Manual—17th edition Copyright ©2014 ASM

34. VARIANCE OF MAXIMUM LIKELIHOOD ESTIMATORS

660

and let σi j denote Cov ( X i , X j ) . Then the covariance matrix Σ is defined by: σ2

*. 1 . σ21 Σ  .. . . .. ,σn1

σ12 σ22 .. . σn2

··· ··· .. . ···

σ1n + σ2n // .. // . / σn2 -

This matrix used to be called the variance-covariance matrix in the textbook used on the syllabus before 2003 (and therefore you may find references to the variance-covariance matrix on older published exams), and is symmetric. Suppose X  ( X1 , . . . , X k ) is a k dimensional random variable, θ is its mean, and Σ is its covariance matrix. If g (X) is a function of X, then the delta method approximation of the variance is Delta Method Formula—General Var g (X) ≈ ( ∂g)0Σ ( ∂g)



 ∂g

∂g

(34.5)



0

where ∂g  ∂x1 , . . . , ∂x k is the column vector of partial derivatives of g evaluated at θ, and prime indicates a transpose. Notice how this formula generalizes the single variable formula by multiplying the “variance” (which is now a covariance matrix) by the ”square” of the derivative evaluated at the mean (which is done by left- and right-multiplying a vector of partial derivatives). In the two-variable case (the most variables you are likely to have in an exam question), if we call the variables X and Y, equation 34.5 reduces to Delta Method Formula—Two Variables ∂g Var g ( X, Y ) ≈ Var ( X ) ∂x





!2

∂g ∂g ∂g + 2 Cov ( X, Y ) + Var ( Y ) ∂x ∂y ∂y

!2

(34.6)

All derivatives in this formula are evaluated at the mean. Make sure to learn these important formulas! Example 34J Claim size X follows a single parameter Pareto distribution with known parameter θ  100. α is estimated as 3, with variance 0.5. Determine the variance of the estimate for Pr ( X < 200) .

D ( X < 200) . Only α is estimated, so we use formula (34.4). Answer: Let g ( α )  Pr 100 g (α)  1 − 200 dg 1 − dα 2





ln

1 2

1 1− 2



!

!3 ! 2  1 1 + * D Var Pr ( X < 200)  0.5 ln  0.003754 2 2 , 



Example 34K Claim size $X$ follows a two-parameter Pareto distribution with estimated parameters $\alpha = 3$ and $\theta = 100$. The covariance matrix is $\begin{pmatrix}0.6 & 0.2\\ 0.2 & 0.4\end{pmatrix}$.
Determine the variance of the estimate of $\Pr(X < 200)$.

Answer: Here, $\Pr(X < 200)$ is a function of two variables, $\alpha$ and $\theta$. Let $g(\alpha, \theta) = \Pr(X < 200)$. The two parameters $\alpha$ and $\theta$ are estimated, not constants. To estimate the variance of $g(\alpha, \theta)$, we use formula (34.5).
\begin{align*}
g(\alpha, \theta) &= 1 - \left(\frac{\theta}{\theta + 200}\right)^{\alpha}\\
\frac{\partial g}{\partial \alpha} &= -\left(\frac{\theta}{\theta + 200}\right)^{\alpha}\ln\frac{\theta}{\theta + 200} = -\left(\frac13\right)^3\ln\frac13 = 0.04069\\
\frac{\partial g}{\partial \theta} &= -\alpha\left(\frac{\theta}{\theta + 200}\right)^{\alpha - 1}\frac{200}{(\theta + 200)^2} = -3\left(\frac13\right)^2\left(\frac{200}{900^2}\right) = -0.0007407\\
\operatorname{Var}\bigl(\widehat{\Pr}(X < 200)\bigr) &= \begin{pmatrix}0.04069 & -0.0007407\end{pmatrix}\begin{pmatrix}0.6 & 0.2\\ 0.2 & 0.4\end{pmatrix}\begin{pmatrix}0.04069\\ -0.0007407\end{pmatrix} = 0.0009816
\end{align*}
Equivalently, we could expand this quadratic form and calculate the variance of $g(\alpha, \theta) = \Pr(X < 200)$ by using formula (34.6).
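If you would like to verify the matrix arithmetic, here is a short Python sketch, not from the manual, that reproduces Example 34K with NumPy (the gradient components are the two partial derivatives computed above):

```python
import numpy as np

# Sketch: multivariate delta method (equation 34.5) applied to Example 34K.
# g(alpha, theta) = Pr(X < 200) = 1 - (theta/(theta+200))^alpha, two-parameter Pareto.
alpha, theta = 3.0, 100.0
cov = np.array([[0.6, 0.2],
                [0.2, 0.4]])            # covariance matrix of (alpha-hat, theta-hat)

u = theta / (theta + 200)               # = 1/3
dg_dalpha = -(u ** alpha) * np.log(u)   # 0.04069
dg_dtheta = -alpha * u ** (alpha - 1) * 200 / (theta + 200) ** 2   # -0.0007407

grad = np.array([dg_dalpha, dg_dtheta])
print(grad @ cov @ grad)                # (dg)' Sigma (dg) = 0.0009816
```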

Quiz 34-2 The parameter $\theta$ of a uniform distribution on $[0, \theta]$ is estimated using maximum likelihood methods to be $\hat\theta = 50$. The variance of $\hat\theta$ is 0.0048. Let $\hat v$ be the estimated variance of the uniform distribution. Using the delta method, estimate $\operatorname{Var}(\hat v)$.

34.3 Confidence Intervals

34.3.1 Normal Confidence Intervals

A normal confidence interval for a quantity estimated by maximum likelihood is constructed by adding and subtracting $z_p\sigma$, where $z_p$ is an appropriate standard normal percentile and $\sigma$ is the estimated standard deviation.

Example 34L You are given:
(i) A random variable has probability density function $f(x) = ax^{a-1}$, $0 \le x \le 1$, $a > 0$.
(ii) The parameter $a$ is estimated by maximum likelihood.
(iii) A random sample of observations of $X$ is 0.3, 0.6, 0.6, 0.8, 0.9.
Construct a 95% normal confidence interval for $\Pr(X < 0.5)$.

Answer: This question is too long for an exam, but working it out will summarize a lot of what we learned. Let's calculate the estimate of $a$.
\begin{align*}
L(a) &= a^5\left(\prod_{i=1}^{5} x_i\right)^{a-1}\\
l(a) &= 5\ln a + (a-1)\ln\prod_{i=1}^{5} x_i\\
\frac{dl}{da} &= \frac{5}{a} + \sum_{i=1}^{5}\ln x_i = 0\\
\hat a &= -\frac{5}{\ln\prod_{i=1}^{5} x_i} = 1.95762
\end{align*}
Now let's calculate the asymptotic variance.
\[-\frac{d^2l}{da^2} = \frac{5}{a^2} \qquad -\operatorname{E}\!\left[\frac{d^2l}{da^2}\right] = \frac{5}{a^2} \qquad \widehat{\operatorname{Var}}(\hat a) = \frac{\hat a^2}{5} = 0.76645\]
Next we use the delta method. Let $g(a)$ be the function of $a$ represented by $\Pr(X < 0.5)$.
\begin{align*}
g(a) = \widehat{\Pr}(X < 0.5) &= \int_0^{0.5} ax^{a-1}\,dx = 0.5^a\\
g(1.95762) &= 0.5^{1.95762} = 0.25745\\
\frac{dg}{da} &= 0.5^a\ln 0.5\\
\widehat{\operatorname{Var}}\bigl(\widehat{\Pr}(X < 0.5)\bigr) &= \bigl(0.5^{\hat a}\ln 0.5\bigr)^2(0.76645) = 0.02441
\end{align*}
The confidence interval is $0.25745 \pm 1.96\sqrt{0.02441} = (-0.04876, 0.56366)$. Since probabilities can't be less than 0, the actual confidence interval would be $[0, 0.56366)$. This is a pretty wide confidence interval, but there were only five observations in the sample.
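For the computationally inclined, a minimal Python sketch (not part of the manual) reproducing the whole chain of Example 34L, from the MLE through the delta-method interval, is:

```python
import numpy as np

# Sketch of Example 34L: MLE of a for f(x) = a*x^(a-1) on [0,1], its asymptotic
# variance, and a delta-method normal confidence interval for Pr(X < 0.5).
x = np.array([0.3, 0.6, 0.6, 0.8, 0.9])
n = len(x)

a_hat = -n / np.log(x).sum()            # 1.95762
var_a = a_hat**2 / n                    # inverse information, 0.76645

p_hat = 0.5 ** a_hat                    # Pr(X < 0.5) = 0.5^a = 0.25745
dp_da = 0.5 ** a_hat * np.log(0.5)      # derivative for the delta method
var_p = dp_da**2 * var_a                # 0.02441

half_width = 1.96 * np.sqrt(var_p)
print(max(p_hat - half_width, 0.0), p_hat + half_width)   # [0, 0.56366)
```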

34.3.2 Non-Normal Confidence Intervals

Section 13.4 of Loss Models fourth edition discusses non-normal confidence intervals for parameters. The use of normal confidence intervals assumes the maximum likelihood estimator is normally distributed, which is true asymptotically but not for small samples, and that building separate confidence intervals for each parameter is optimal. An alternative method for building confidence intervals is to solve an inequality for the loglikelihood equation. The confidence interval consists of the $k$-dimensional region in which the loglikelihood function is greater than $c$ for some constant $c$, where $k$ is the number of parameters. This region is not necessarily cubical, unlike the normal confidence region. As we'll learn later when we're studying likelihood ratio tests in Lesson 40, if we want $p$ confidence, $c$ is selected to be the maximum loglikelihood minus $0.5w$, where $w$ is the $100p$-th percentile of the chi-square distribution with $k$ degrees of freedom. Once again, $k$ is the number of parameters being estimated.

To illustrate non-normal confidence intervals, let's repeat the previous example.

Example 34M You are given:
(i) A random variable has probability density function $f(x) = ax^{a-1}$, $0 \le x \le 1$, $a > 0$.
(ii) The parameter $a$ is estimated by maximum likelihood.
(iii) A random sample of observations of $X$ is 0.3, 0.6, 0.6, 0.8, 0.9.
Construct a 95% non-normal confidence interval for $a$.

Answer: There is only one parameter. We select a region in which $l(a) = 5\ln a + (a-1)\ln\prod_{i=1}^5 x_i > c$ for some $c$. Since $\ln\prod_{i=1}^5 x_i = -2.55413$, this reduces to the set
\[\{a \mid 5\ln a - 2.55413(a-1) > c\}\]
In our example, the maximum value of the loglikelihood function is
\[l(1.95762) = 5\ln 1.95762 + (0.95762)(-2.55413) = 0.91276\]
The 95th percentile of chi-square with one degree of freedom is 3.841, so we want
\[l(a) \ge 0.91276 - 0.5(3.841) = -1.0077\]
The equation $l(a) = 5\ln a - 2.55413(a-1) = -1.0077$ requires a numerical technique to solve. Using Excel's Solver, I obtained the interval $(0.70206, 4.20726)$ for $a$. Compare this to the normal confidence interval $1.95762 \pm 1.96\sqrt{0.76645} = (0.24169, 3.67354)$.
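If you prefer a root finder to Excel's Solver, here is a hedged Python sketch (assuming SciPy is available); the two interval endpoints are the roots of $l(a) = c$ on either side of $\hat a$:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

# Sketch of Example 34M: endpoints of the 95% non-normal (likelihood-based)
# confidence interval solve l(a) = l(a_hat) - chi2(1).ppf(0.95)/2.
sum_log_x = np.log([0.3, 0.6, 0.6, 0.8, 0.9]).sum()     # -2.55413

def loglik(a):
    return 5 * np.log(a) + (a - 1) * sum_log_x

a_hat = -5 / sum_log_x                                  # 1.95762
c = loglik(a_hat) - chi2.ppf(0.95, df=1) / 2            # 0.91276 - 1.9205

lower = brentq(lambda a: loglik(a) - c, 1e-6, a_hat)    # 0.70206
upper = brentq(lambda a: loglik(a) - c, a_hat, 100.0)   # 4.20726
print(lower, upper)
```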

Note that in calculating $c$, the exact loglikelihood is needed. We cannot drop multiplicative constants. Since virtually any loglikelihood formula involves a combination of logs and polynomial terms, equating the loglikelihood to a constant usually requires a numerical technique. Therefore, there are very few possibilities for exam questions.

One possibility for an exam question is to calculate a non-normal confidence interval for the parameter $\theta$ of a uniform distribution on $[0, \theta]$. Note that the asymptotic theory of this lesson does not apply to a uniform distribution, because its maximum likelihood estimator is not determined through differentiation. For a random variable having a uniform distribution on $[0, \theta]$, the MLE is the maximum observation. The variance of the maximum observation of a uniform distribution (if you want to construct a normal confidence interval) can be determined directly; see Example 21E on page 357. The likelihood function of a sample of $n$ is $1/\theta^n$, with loglikelihood $-n\ln\theta$. The likelihood function is maximized at $\max x_i$. You can set the loglikelihood function equal to its maximum value minus some constant, and solve the resulting equation for $\theta$. Note that the likelihood is 0 for $\theta < \max x_i$, so any confidence interval for $\theta$ is a one-sided interval with left boundary $\max x_i$.

Example 34N A random variable follows a uniform distribution on $[0, \theta]$. A sample of 100 observations of the random variable has maximum 50. The parameter $\theta$ is estimated using maximum likelihood. Construct a 95% non-normal confidence interval for $\theta$.

Answer: The loglikelihood function is $l(\theta) = -100\ln\theta$ for $\theta \ge 50$, and is maximized at $\theta = 50$ as $-100\ln 50 = -391.202$. Subtracting half of the 95th percentile of chi-square with one degree of freedom, or $0.5(3.841)$, from the maximum loglikelihood, we have
\begin{align*}
-100\ln\theta &\ge -391.202 - 0.5(3.841) = -393.123\\
\ln\theta &\le 3.93123\\
\theta &\le e^{3.93123} = 50.97
\end{align*}
The confidence interval is $[50, 50.97)$.

Another possibility for an exam question is the confidence interval for the estimator of $\mu$ of a normal or lognormal distribution with fixed $\sigma$. However, as the next example shows, this is not too interesting.

Example 34O A random variable $X$ has a normal distribution with variance 100. A random sample of 25 observations has sample mean $\bar x$. The parameter $\mu$ is estimated from these observations using maximum likelihood. Construct a 95% non-normal confidence interval for $\mu$.

Answer: The likelihood and loglikelihood (expressed generally in terms of $\sigma = 10$ and $n = 25$) are
\begin{align*}
L(\mu) &= \frac{e^{-\sum(x_i-\mu)^2/2\sigma^2}}{(\sigma\sqrt{2\pi})^n}\\
l(\mu) &= -n\ln(\sigma\sqrt{2\pi}) - \frac{\sum(x_i-\mu)^2}{2\sigma^2}
\end{align*}
The maximum likelihood estimate of $\mu$ is $\bar x$, which you can prove by differentiating $l(\mu)$ and setting the derivative equal to 0. So the maximum loglikelihood is
\[l(\bar x) = -n\ln(\sigma\sqrt{2\pi}) - \frac{\sum(x_i-\bar x)^2}{2\sigma^2}\]
Now, suppose $\mu_0$ is in the 95% confidence interval. In that case, we need $2|l(\mu_0) - l(\bar x)| \le 3.841$. Note that in the expression $|l(\mu_0) - l(\bar x)|$, the first summand $-n\ln(\sigma\sqrt{2\pi})$ cancels. We are left with
\[2\bigl|l(\mu_0) - l(\bar x)\bigr| = \frac{\sum\bigl((x_i-\mu_0)^2 - (x_i-\bar x)^2\bigr)}{\sigma^2}\]
Each summand in the numerator is a difference of squares,² and $a^2 - b^2 = (a+b)(a-b)$, so
\[(x_i-\mu_0)^2 - (x_i-\bar x)^2 = (\bar x-\mu_0)(2x_i - \mu_0 - \bar x)\]
and the sum of $n$ terms is $(\bar x-\mu_0)(2n\bar x - n\mu_0 - n\bar x) = n(\mu_0 - \bar x)^2$. So
\[2\bigl|l(\mu_0) - l(\bar x)\bigr| = \frac{n(\bar x - \mu_0)^2}{\sigma^2}\]
In our case, $n = 25$ and $\sigma^2 = 100$. So we need
\[0.25(\bar x - \mu_0)^2 \le 3.841 \qquad |\bar x - \mu_0| \le \sqrt{15.364} = 3.920\]
and that defines the non-normal confidence interval. Now note that the variance of the sample mean is $100/25 = 4$, whose square root is 2, so the normal confidence interval would also be $\bar x \pm 2(1.96) = \bar x \pm 3.92$. So both confidence intervals are identical. But what did you expect for a normal random variable?

A remaining possibility for an exam question would be to ask whether a specific point is in the non-normal confidence interval.

²This trick was shown to me by Axiom Choi.

34.4 Variance of Exact Exposure Estimate of $\hat q_j$

We'll now derive formula (28.3). The estimate of the hazard rate $\lambda$ is the maximum likelihood estimate for an exponential distribution. We'll calculate observed information by differentiating twice. The likelihood is
\[L(\lambda) = \lambda^{d_j}e^{-e_j\lambda}\]
Log and differentiate twice.
\begin{align*}
l(\lambda) &= d_j\ln\lambda - e_j\lambda\\
l'(\lambda) &= \frac{d_j}{\lambda} - e_j\\
l''(\lambda) &= -\frac{d_j}{\lambda^2}
\end{align*}
Notice that this is observed information, not true information, because we used the actual values of $d_j$ and $e_j$ in the calculation. To calculate information, we negate $l''(\lambda)$ and plug in the estimate of $\lambda$, $d_j/e_j$. The resulting approximate information is $e_j^2/d_j$ and the approximate variance of $\hat\lambda$ is $d_j/e_j^2$. We use the delta method to calculate the variance of $\hat q_j$.
\begin{align*}
g(\lambda) = \hat q_j &= 1 - e^{-n\lambda}\\
g'(\lambda) &= ne^{-n\lambda} = n(1-\hat q_j)\\
\widehat{\operatorname{Var}}(\hat q_j) = g'(\hat\lambda)^2\,\widehat{\operatorname{Var}}(\hat\lambda) &= n^2(1-\hat q_j)^2\left(\frac{d_j}{e_j^2}\right) \tag{28.3}
\end{align*}
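As a quick numerical sketch (mine, not the manual's), here is formula (28.3) in Python; the inputs mirror exercise 34.42 below, where $d_j = 98$ deaths are observed over an exact exposure of $e_j = 930.14$ in a one-year ($n = 1$) interval:

```python
import numpy as np

# Sketch of formula (28.3): variance of the exact-exposure estimate of q_j.
d_j, e_j, n = 98, 930.14, 1
lam = d_j / e_j                          # exponential hazard MLE
q_j = 1 - np.exp(-n * lam)               # 0.100
var_q = (n * (1 - q_j)) ** 2 * d_j / e_j**2
print(q_j, var_q)                        # 0.100, 0.0000917
```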

Exercises

34.1. [4B-S93:15] (1 point) Which of the following are basic properties of maximum likelihood estimators?
1. The variance of the maximum likelihood estimator is equal to its mean square error.
2. Maximum likelihood estimators are asymptotically efficient.
3. Maximum likelihood estimators have an asymptotically normal distribution.

(A) 2  (B) 1,2  (C) 1,3  (D) 2,3  (E) 1,2,3

34.2. [4B-F98:20] (1 point) Which of the following statements are always true with respect to maximum likelihood estimation with small samples?
I. The estimator is unbiased.
II. The variance of the estimator is equal to the Cramér-Rao lower bound.
III. The estimator has a normal distribution.

(A) None of the above statements are always true.
(B) I  (C) II  (D) III  (E) I, II

Information matrix

Use the following information for questions 34.3 and 34.4:
You are given the following sample of claim sizes:
1500  3500  1800  4800  3900
6000  3800  5500  4200  3000
The underlying distribution is assumed to be gamma, with parameters $\alpha$ and $\theta$ unknown.

34.3. [4B-S91:47] (2 points) If it is given that $\alpha = 12$, in what range does the maximum likelihood estimator, $\hat\theta$, of $\theta$ fall?
(A) Less than 250
(B) At least 250, but less than 300
(C) At least 300, but less than 350
(D) At least 350, but less than 400
(E) At least 400

34.4. [4B-S91:48] (2 points) Which of the following is an expression for the variance of the maximum likelihood estimator, $\hat\theta$, found in the previous question?
(A) $\theta^2/1200$  (B) $1200/\theta^2$  (C) $\theta^2/120$  (D) $120/\theta^2$  (E) $38000/12$

Use the following information for questions 34.5 and 34.6:
A random sample of $n$ claims $x_1, x_2, \ldots, x_n$ is taken from the distribution function
\[F(x) = 1 - x^{-\alpha}, \qquad x > 1.\]

34.5. [4B-F92:4] (2 points) Determine $\hat\alpha$, the maximum likelihood estimator of $\alpha$.
(A) $n/\sum_{i=1}^n \ln(x_i)$  (B) $\sqrt{n}/\sum_{i=1}^n \ln(x_i)$  (C) $n/\sum_{i=1}^n x_i$  (D) $\sqrt{n}/\sum_{i=1}^n x_i$  (E) $n/\sum_{i=1}^n e^{x_i}$

34.6. [4B-F92:5] (2 points) Determine the asymptotic variance of $\hat\alpha$.
(A) $\frac1n\left(\sum_{i=1}^n x_i\right)^2$  (B) $2n/\alpha$  (C) $\alpha/2n$  (D) $n/\alpha^2$  (E) $\alpha^2/n$

34.7. A population follows a normal distribution with mean 10 and unknown variance $\sigma^2$. A sample of 20 is used to estimate $\sigma$, the square root of the population's variance. Calculate the Cramér-Rao lower bound for the variance of an unbiased estimator of $\sigma$ if the true value of $\sigma$ is 10.


Use the following information for questions 34.8 and 34.9:
A random sample of 5 claims $x_1, \ldots, x_5$ is taken from the probability density function
\[f(x_i) = \alpha\theta^{\alpha}(\theta + x_i)^{-\alpha-1}, \qquad \alpha, \theta, x_i > 0.\]
In ascending order the observations are: 43, 145, 233, 396, 775.

34.8. [4B-F93:8] (2 points) Given that $\theta = 1000$, determine $\alpha_1$, the maximum likelihood estimate of $\alpha$.
(A) Less than 2.2
(B) At least 2.2, but less than 2.7
(C) At least 2.7, but less than 3.2
(D) At least 3.2, but less than 3.7
(E) At least 3.7

34.9. [4B-F93:9] (2 points) Determine the asymptotic variance of $\alpha_1$ from question #8.
(A) $10/\alpha^2$  (B) $\alpha^2/10$  (C) $5/\alpha$  (D) $\alpha^2/5$  (E) $\alpha^2/25$

34.10. [4B-F94:9] (2 points) You are given the following:
• A random sample of 40 observations, $x_1, \ldots, x_{40}$, has been observed from the distribution
\[f(x) = \frac{1}{\sqrt{2\pi\theta}}e^{-x^2/2\theta}, \qquad -\infty < x < \infty.\]
• The maximum likelihood estimator of $\theta$ is $\hat\theta$.
• The lone element of the estimated one-by-one information matrix of $\hat\theta$ is $20/\hat\theta^2$.
If $\hat\theta = 2.00$, estimate the Mean Square Error of $\hat\theta$.
(A) 0.20  (B) 0.45  (C) 5.00  (D) 10.00
(E) Cannot be determined from the given information.

34.11. [4B-F95:22] (2 points) You are given the following:
• The parameters of a loss distribution are $\alpha$ and $\beta$.
• The maximum likelihood estimators of these parameters, $\hat\alpha$ and $\hat\beta$, have information matrix
\[I(\alpha, \beta) = \begin{pmatrix}75 & -20\\ -20 & 6\end{pmatrix}\]
Determine the approximate variance of $\hat\alpha$.
(A) 0.12  (B) 0.40  (C) 1.50  (D) 6.00  (E) 75.00

34.12. [4B-F97:16] (2 points) You are given the following:
• The parameters of a loss distribution are $\alpha$ and $\beta$.
• The maximum likelihood estimators of these parameters, $\hat\alpha$ and $\hat\beta$, have information matrix
\[I(\alpha, \beta) = \begin{pmatrix}6 & -20\\ -20 & 75\end{pmatrix}\]
Determine the approximate variance of $\hat\alpha$.
(A) 0.12  (B) 0.40  (C) 1.50  (D) 6.00  (E) 75.00

Use the following information for questions 34.13 and 34.14:
You are given the following:
• The random variable $X$ has the density function $f(x) = \frac1\theta e^{-x/\theta}$, $x > 0$.
• A random sample of three observations of $X$ yields the values $x_1$, $x_2$, and $x_3$.

34.13. [4B-S96:14] (1 point) Determine the maximum likelihood estimator of $\theta$.
(A) $(x_1 + x_2 + x_3)/3$
(B) $(\ln x_1 + \ln x_2 + \ln x_3)/3$
(C) $3\big/\left(\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3}\right)$
(D) $e^{-(x_1+x_2+x_3)/3}$
(E) $\sqrt[3]{x_1x_2x_3}$

34.14. [4B-S96:15] (2 points) If $x_1$, $x_2$, and $x_3$ are 10, 20, and 30, respectively, estimate the variance of $\hat\theta$, the maximum likelihood estimator of $\theta$.
(A) Less than 50
(B) At least 50, but less than 150
(C) At least 150, but less than 250
(D) At least 250, but less than 350
(E) At least 350

34.15. [4-F03:18] The information associated with the maximum likelihood estimator of a parameter $\theta$ is $4n$, where $n$ is the number of observations. Calculate the asymptotic variance of the maximum likelihood estimator of $2\theta$.
(A) $1/(2n)$  (B) $1/n$  (C) $4/n$  (D) $8n$  (E) $16n$

34.16. [4B-S99:22] (2 points) You are given the following:
• The parameters of a loss distribution are $\alpha$ and $\beta$.
• The maximum likelihood estimators of these parameters, $\hat\alpha$ and $\hat\beta$, have covariance matrix
\[\begin{pmatrix}0.12 & 0.40\\ 0.40 & 1.50\end{pmatrix}\]
Determine the approximate correlation coefficient of $\hat\alpha$ and $\hat\beta$.
(A) Less than $-0.90$
(B) At least $-0.90$, but less than $-0.30$
(C) At least $-0.30$, but less than 0.30
(D) At least 0.30, but less than 0.90
(E) At least 0.90

34.17. You are given the following sample:
5  5  7  10  12  20
The data are fitted to a Weibull distribution with $\tau = 3$. The parameter $\theta$ is estimated using maximum likelihood. Calculate the observed information of the sample, using the maximum likelihood estimate of $\theta$.

34.18. You are given the following sample:
2  2  3  5  6
The data are fitted to a Weibull distribution with $\theta = 1$. The maximum likelihood estimate of $\tau$ is 0.61530. Calculate the observed information of the sample.

34.19. You are given the following sample:
1  5  7  11  15  20
The data are fit to a uniform distribution on $(0, \theta]$ using maximum likelihood. Express the variance of the fitted parameter $\hat\theta$ as a function of $\theta$. More generally, assuming there are $n$ observations, the asymptotic variance of the maximum likelihood estimator of $\theta$ is $\theta^a/n^b$. Determine $a$ and $b$.

Delta method

34.20. Claim size $X$ follows an exponential distribution with mean $\theta$. An estimator for $\theta$ is $\hat\theta$. You are given:
(i) $\hat\theta = 1000$.
(ii) $\operatorname{Var}(\hat\theta) = 10{,}000$.
Estimate the variance of $\Pr(X < 500)$ when calculated using $\hat\theta$.

Use the following information for questions 34.21 and 34.22:
The random variables $(X_1, X_2)$ have means $(0, 10)$ and covariance matrix $\begin{pmatrix}2 & 3\\ 3 & 8\end{pmatrix}$.

34.21. Estimate the variance of $X_1 + X_2$.

34.22. Estimate the variance of $X_1X_2$.

34.23. Claim size $X$ follows an inverse Pareto distribution. The two parameters $\tau$ and $\theta$ are estimated as $(2, 100)$; the covariance matrix of these estimators is $\begin{pmatrix}0.01 & -0.1\\ -0.1 & 5\end{pmatrix}$.
Estimate the variance of $\Pr(X > 500)$ when calculated using these estimators.

34.24. [4B-F96:16, 1999 C4 Sample:33] (2 points) You are given the following:
• The random variable $X$ has the density function $f(x) = \frac1\theta e^{-x/\theta}$, $0 < x < \infty$, $\theta > 0$.
• $\theta$ is estimated by the maximum likelihood estimator $\hat\theta$ based on a large sample of data.
• The probability that $X$ is greater than $k$ is estimated by the estimator $e^{-k/\hat\theta}$.
Determine the approximate variance of the estimator for the probability that $X$ is greater than $k$.
(A) $\operatorname{Var}(\hat\theta)$
(B) $e^{-k/\theta}\operatorname{Var}(\hat\theta)$
(C) $e^{-2k/\theta}\operatorname{Var}(\hat\theta)$
(D) $(k/\theta^2)e^{-k/\theta}\operatorname{Var}(\hat\theta)$
(E) $(k^2/\theta^4)e^{-2k/\theta}\operatorname{Var}(\hat\theta)$

Use the following information for questions 34.25 and 34.26:
A random sample of $n$ claims $x_1, x_2, \ldots, x_n$ is taken from the probability density function
\[f(x_i) = \frac{1}{\sqrt{2\pi\theta}}e^{-(x_i-1000)^2/2\theta}, \qquad -\infty < x_i < \infty.\]

34.25. [4B-S93:7] (2 points) Determine $\theta_1$, the maximum likelihood estimate of $\theta$.
(A) $\left(\frac1n\sum_{i=1}^n (x_i-1000)^2\right)^{1/2}$
(B) $\frac1n\sum_{i=1}^n (x_i-1000)^2$
(C) $\left(\frac1n\sum_{i=1}^n \ln(x_i-1000)^2\right)^{1/2}$
(D) $\frac1n\sum_{i=1}^n \ln(x_i-1000)^2$
(E) $\frac{1}{\sqrt n}\sum_{i=1}^n \ln(x_i-1000)^2$

34.26. [4B-S93:8] (2 points) Determine the asymptotic variance of $\theta_1$.
(A) $\frac1n\sum_{i=1}^n (x_i-1000)^2$
(B) $\theta/2n$
(C) $2n/\theta$
(D) $2\theta^2/n$
(E) $\theta^2/n$

34.27. [4-S00:25] You model a loss process using a lognormal distribution with parameters $\mu$ and $\sigma$. You are given:
• The maximum likelihood estimates of $\mu$ and $\sigma$ are $\hat\mu = 4.215$, $\hat\sigma = 1.093$.
• The estimated covariance matrix of $\hat\mu$ and $\hat\sigma$ is
\[\begin{pmatrix}0.1195 & 0\\ 0 & 0.0597\end{pmatrix}\]
• The mean of the lognormal distribution is $\exp\left(\mu + \frac{\sigma^2}{2}\right)$.
Estimate the variance of the maximum likelihood estimate of the mean of the lognormal distribution, using the delta method.
(A) Less than 1500
(B) At least 1500, but less than 2000
(C) At least 2000, but less than 2500
(D) At least 2500, but less than 3000
(E) At least 3000

34.28. [4-S01:25] You have modeled eight loss ratios as $Y_t = \alpha + \beta t + \varepsilon_t$, $t = 1, 2, \ldots, 8$, where $Y_t$ is the loss ratio for year $t$ and $\varepsilon_t$ is an error term. You have determined:
\[\begin{pmatrix}\hat\alpha\\ \hat\beta\end{pmatrix} = \begin{pmatrix}0.50\\ 0.02\end{pmatrix} \qquad \operatorname{Var}\begin{pmatrix}\hat\alpha\\ \hat\beta\end{pmatrix} = \begin{pmatrix}0.00055 & -0.00010\\ -0.00010 & 0.00002\end{pmatrix}\]
Estimate the standard deviation of the forecast for year 10, $\hat Y_{10} = \hat\alpha + \hat\beta\cdot 10$, using the delta method.
(A) Less than 0.01
(B) At least 0.01, but less than 0.02
(C) At least 0.02, but less than 0.03
(D) At least 0.03, but less than 0.04
(E) At least 0.04

34.29. [4-F01:22] You fit an exponential distribution to the following data:
1000  1400  5300  7400  7600
Determine the coefficient of variation of the maximum likelihood estimate of the mean, $\theta$.
(A) 0.33  (B) 0.45  (C) 0.70  (D) 1.00  (E) 1.21

34.30. You fit an inverse exponential $X$ using maximum likelihood to estimate $\theta$. The resulting estimate of $\theta$ is 2.07. The variance of the estimate is 0.00264. You then estimate $Y = \operatorname{E}[X^{-1}]$. Using the delta method, estimate the variance of $Y$.

34.31. You fit a Weibull distribution to data using maximum likelihood. The resulting estimates of the parameters $(\theta, \tau)$ are 5000 and 2 with covariance matrix
\[\begin{pmatrix}62{,}500 & 0\\ 0 & 0.0008\end{pmatrix}\]
You then estimate $\operatorname{VaR}_{0.95}(X)$ for the fitted Weibull distribution. Calculate the variance of the estimate of $\operatorname{VaR}_{0.95}(X)$ using the delta method.

Use the following information for questions 34.32 and 34.33:
The time to an accident follows an exponential distribution. A random sample of size two has a mean time of 6. Let $Y$ denote the mean of a new sample of size two.

(B) 0.07

(C) 0.11

(D) 0.15

(E) 0.19

34.33. [C-S05:10] Use the delta method to approximate the variance of the maximum likelihood estimator of FY (10) . (A) 0.08

(B) 0.12

(C) 0.16

(D) 0.19

(E) 0.22

Confidence intervals

34.34. [4B-F92:25] (2 points) You are given the following information:
• A random sample of 30 observations, $x_1, x_2, \ldots, x_{30}$, has been observed from the distribution
\[f(x) = \frac{1}{\sqrt{2\pi\theta}}e^{-x^2/2\theta}, \qquad -\infty < x < \infty.\]
• The maximum likelihood estimator of $\theta$ is $\hat\theta$.
• The lone element of the estimated information matrix of $\hat\theta$ is $15/\hat\theta^2$.
If $\hat\theta = 1.85$, determine a 95% confidence interval for $\theta$.
(A) (1.403, 2.297)  (B) (0.914, 2.786)  (C) (1.162, 2.538)  (D) (1.499, 2.201)
(E) Cannot be determined

Use the following information for questions 34.35 and 34.36:
You are given the following:
(i) Claim sizes follow a distribution with density function
\[f(x) = \frac1\theta e^{-x/\theta}, \qquad 0 < x < \infty, \quad \theta > 0\]
(ii) A random sample of 100 claims yields total aggregate losses of 12,500.

34.35. [4B-S99:14] (2 points) Using the maximum likelihood estimate of $\theta$, estimate the proportion of claims that are greater than 250.
(A) Less than 0.11
(B) At least 0.11, but less than 0.12
(C) At least 0.12, but less than 0.13
(D) At least 0.13, but less than 0.14
(E) At least 0.14

34.36. [4B-S99:15] (3 points) Determine the length of an approximate 95% confidence interval for the proportion of claims that are greater than 250.
(A) Less than 0.025
(B) At least 0.025, but less than 0.05
(C) At least 0.05, but less than 0.075
(D) At least 0.075, but less than 0.100
(E) At least 0.100

34.37. Size of loss has been fitted to a Pareto distribution using maximum likelihood. Estimated parameters are $\hat\alpha = 3$, $\hat\theta = 5000$. The information matrix for $(\hat\alpha, \hat\theta)$ is $\begin{pmatrix}104 & 12\\ 12 & 11\end{pmatrix}$.
Determine the upper bound of a 95% confidence interval for the mean of the Pareto distribution.

34.38. [4B-S94:28] (2 points) You are given the following:
• A random sample of 40 observations, $x_1, x_2, \ldots, x_{40}$, has been observed from the distribution $f(x) = \theta e^{-\theta x}$, $x > 0$.
• $\hat\theta$ is the maximum likelihood estimator of $\theta$.
• The lone element of the estimated one-by-one information matrix of $\hat\theta$ is $40/\hat\theta^2$.
If $\hat\theta = 5.00$, determine a 95% confidence interval for $\theta$.
(A) (4.375, 5.625)  (B) (4.209, 5.791)  (C) (3.775, 6.225)  (D) (3.450, 6.550)  (E) (2.521, 7.479)

34.39. [4B-F94:22] (3 points) You are given the following:
• The claims frequency rate for a group of insureds is believed to be an exponential distribution with unknown parameter $\theta$; i.e., $F(x) = 1 - e^{-x/\theta}$, $x > 0$, where $X$ = the claims frequency rate.
• Ten random observations of $X$ yield the following sample in ascending order:
0.001, 0.003, 0.053, 0.062, 0.127, 0.131, 0.377, 0.382, 0.462, 0.481
• Summary statistics for the sample data are:
\[\sum_{i=1}^{10} x_i = 2.079; \qquad \sum_{i=1}^{10} x_i^2 = 0.773; \qquad \sum_{i=1}^{10}\ln(x_i) = -25.973\]
• $\hat\theta$ is the maximum likelihood estimator for $\theta$.
Use the normal distribution to determine a 95% confidence interval for $\theta$ based upon the sample data.

34.40. Losses follow a Pareto distribution with parameters $\theta = 1000$ and $\alpha$. Five observed losses were
500  500  800  2000  10,000
The parameter $\alpha$ is estimated using maximum likelihood. Select the smallest of the following numbers that is contained in a 95% non-normal confidence interval for $\alpha$.
(A) 0.2  (B) 0.3  (C) 0.4  (D) 0.5  (E) 0.6

34.41. $X$ follows a lognormal distribution with parameters $\mu$ and $\sigma = 2$. A sample of 25 observations has geometric mean $e^5$. The parameter $\mu$ is estimated using maximum likelihood. Construct 95% asymptotic normal and 95% non-normal confidence intervals for $\mu$.

34.42. [Sample:303] An actuary observed a sample of 1000 policyholders during the interval from age 30 to age 31 and found that 98 of them died in this age interval. Based on the assumption of a constant hazard rate in this age interval, the actuary obtained a maximum likelihood estimate of 0.100 for the conditional probability that a policyholder alive at age 30 dies before age 31. Calculate the estimate of the variance of this maximum likelihood estimator, using the delta method.
(A) 0.000079  (B) 0.000083  (C) 0.000086  (D) 0.000092  (E) 0.000097

Additional released exam questions: C-F05:14,18,20, C-F06:34, C-S07:18

Solutions

34.1.
1. This statement implies MLE's are always unbiased, which isn't true. ✗
2. "Efficient" is vague, but an estimator is more efficient than another estimator if its variance is lower. Asymptotically, MLE's have the lowest variance of any unbiased estimator, so they're efficient at least among unbiased estimators. ✓
3. True, as we mentioned in the first paragraph of this lesson. ✓
(D)

34.2. All of these properties are true asymptotically as the size of the sample goes to infinity, but are not true about the estimator itself. (A)

34.3. By the gamma shortcut for maximum likelihood estimation, shortcut #2 on page 601, the MLE for $\theta$ is the sample mean divided by the parameter $\alpha$, or 12.
\[\bar X = \frac{1500 + 3500 + 1800 + 4800 + 3900 + 6000 + 3800 + 5500 + 4200 + 3000}{10} = 3800\]
\[\hat\theta = \frac{\bar X}{12} = \frac{3800}{12} = 316\tfrac23 \quad \text{(C)}\]

34.4. In the previous exercise, we saw that $\hat\theta = \bar X/12$. Therefore
\[\operatorname{Var}(\bar X) = \frac{\operatorname{Var}(x_i)}{10} = \frac{12\theta^2}{10}\]
so
\[\operatorname{Var}(\hat\theta) = \frac{1}{12^2}\cdot\frac{12\theta^2}{10} = \frac{\theta^2}{120} \quad \text{(C)}\]

34.5.
\[L(\alpha) = \frac{\alpha^n}{\left(\prod x_i\right)^{\alpha+1}} \qquad l(\alpha) = n\ln\alpha - (\alpha+1)\sum\ln x_i\]
\[\frac{dl}{d\alpha} = \frac{n}{\alpha} - \sum\ln x_i = 0 \qquad \hat\alpha = \frac{n}{\sum\ln x_i} \quad \text{(A)}\]

34.6. The information matrix is $-\operatorname{E}[d^2l/d\alpha^2] = n/\alpha^2$. The variance is the inverse of this, or $\alpha^2/n$. (E)

34.7. The Cramér-Rao lower bound for the variance of an unbiased estimator is the inverse of the information matrix. Let's therefore calculate the information matrix. The density of a normal distribution, dropping constants (anything not a function of $\sigma$ is a constant), is
\[f(x) \sim \frac{e^{-(x-\mu)^2/2\sigma^2}}{\sigma}\]
The loglikelihood of 20 observations $x_i$ is
\[l(\sigma) = -\frac{\sum_{i=1}^{20}(x_i-\mu)^2}{2\sigma^2} - 20\ln\sigma\]

Differentiating twice,
\[\frac{dl}{d\sigma} = \frac{\sum_{i=1}^{20}(x_i-\mu)^2}{\sigma^3} - \frac{20}{\sigma} \qquad -\frac{d^2l}{d\sigma^2} = \frac{3\sum_{i=1}^{20}(x_i-\mu)^2}{\sigma^4} - \frac{20}{\sigma^2}\]
The expected value of $(x_i-\mu)^2$ is $\sigma^2$, so the information matrix, the expected value of the above line, is
\[I(\sigma) = \frac{60\sigma^2}{\sigma^4} - \frac{20}{\sigma^2} = \frac{40}{\sigma^2}\]
The lower bound for the variance of an unbiased estimator is $\sigma^2/40$, and if $\sigma = 10$, it is 2.5.

34.8. We have a shortcut for the MLE of a Pareto, but we'll work it out.
\[L(\alpha) = \frac{\alpha^5\,1000^{5\alpha}}{\prod(1000+x_i)^{\alpha+1}} \qquad l(\alpha) = 5\ln\alpha + 5\alpha\ln 1000 - (\alpha+1)\sum\ln(1000+x_i)\]
\[\frac{dl}{d\alpha} = \frac{5}{\alpha} + 5\ln 1000 - \sum\ln(1000+x_i) = 0\]
\[\sum\ln(1000+x_i) = \ln(1043\cdot 1145\cdot 1233\cdot 1396\cdot 1775) = 35.8331\]
\[\hat\alpha = \frac{5}{35.8331 - 5\ln 1000} = 3.8629 \quad \text{(E)}\]

34.9. We differentiate the loglikelihood function from the previous exercise another time to get the negative of the information matrix, then invert to get the asymptotic variance.
\[-\frac{d^2l}{d\alpha^2} = \frac{5}{\alpha^2} \qquad \operatorname{Var}(\hat\alpha) = \frac{\alpha^2}{5} \quad \text{(D)}\]

34.10. MLE's are asymptotically unbiased, so the asymptotic MSE is the asymptotic variance, which is the inverse of the information matrix.
\[\operatorname{MSE}(\hat\theta) = \frac{\hat\theta^2}{20} = \frac{2^2}{20} = 0.2 \quad \text{(A)}\]

34.11. We invert the information matrix to obtain the asymptotic covariance matrix. The variance of the first parameter will be the upper left entry of the matrix. The order of the parameters matters!
\[I^{-1} = \frac{1}{50}\begin{pmatrix}6 & 20\\ 20 & 75\end{pmatrix} \qquad \operatorname{Var}(\hat\alpha) = \frac{6}{50} = 0.12 \quad \text{(A)}\]

34.12. Same as the last exercise, except the diagonal entries are switched. $I^{-1} = \frac{1}{50}\begin{pmatrix}75 & 20\\ 20 & 6\end{pmatrix}$, so $\operatorname{Var}(\hat\alpha) = \frac{75}{50} = 1.5$. (C)

34.13. $X$ is exponential, so the MLE is the sample mean. (A)

34.14. The MLE is the sample mean. The variance of the sample mean is the true mean (which we estimate with the sample mean) squared divided by $n$. The sample mean is 20, so the answer is $20^2/3 = 133\tfrac13$. (B)

34.15. The asymptotic variance of the estimator of $\theta$ is the inverse of the information, or $1/(4n)$. The variance of $2\theta$ is then $2^2$ times the variance of $\theta$, or $1/n$. (B)

34.16. They made things easy for you here: you have the covariance matrix, not the information matrix, so you use it as is, you don't invert it. Let $\rho$ be the correlation coefficient.
\[\rho = \frac{\operatorname{Cov}(\hat\alpha, \hat\beta)}{\sqrt{\operatorname{Var}(\hat\alpha)\operatorname{Var}(\hat\beta)}} = \frac{0.40}{\sqrt{(0.12)(1.50)}} = \frac{0.40}{\sqrt{0.18}} = 0.9428 \quad \text{(E)}\]

34.17. From the Weibull shortcut, the parameter estimate is
\[\hat\theta = \sqrt[3]{\frac{5^3 + 5^3 + 7^3 + 10^3 + 12^3 + 20^3}{6}} = 12.35695\]
The second derivative of the loglikelihood function is calculated as follows:
\[L(\theta) = \frac{e^{-\sum x_i^3/\theta^3}}{\theta^{3n}}\,{\textstyle\prod} x_i^{\,\cdot} \qquad l(\theta) = -\frac{\sum x_i^3}{\theta^3} - 3n\ln\theta + \text{const}\]
\[\frac{dl}{d\theta} = \frac{3\sum x_i^3}{\theta^4} - \frac{3n}{\theta} \qquad -\frac{d^2l}{d\theta^2} = \frac{12\sum x_i^3}{\theta^5} - \frac{3n}{\theta^2}\]
Substituting $n = 6$, $\theta = 12.35695$, and the values of $x_i$, the observed information is
\[\frac{12(11{,}321)}{12.35695^5} - \frac{18}{12.35695^2} = 0.35365\]

34.18. The second derivative of the loglikelihood function is calculated as follows. Keep in mind that $\tau$ is not constant, so unlike the previous exercise it is not dropped from the likelihood function.
\[L(\tau) = \tau^n\prod x_i^{\tau-1}e^{-\sum x_i^\tau} \qquad l(\tau) = n\ln\tau + (\tau-1)\sum\ln x_i - \sum x_i^\tau\]
\[\frac{dl}{d\tau} = \frac{n}{\tau} + \sum\ln x_i - \sum x_i^\tau\ln x_i \qquad -\frac{d^2l}{d\tau^2} = \frac{n}{\tau^2} + \sum x_i^\tau(\ln x_i)^2\]

Substituting $n = 5$, $\tau = 0.61530$, and the values of $x_i$, the observed information is
\[\frac{5}{0.61530^2} + 2(2^{0.61530})(\ln 2)^2 + \cdots + 6^{0.61530}(\ln 6)^2 = 33.6930\]

34.19. This is an example of when the formula using the Fisher information matrix doesn't apply, because the maximum likelihood estimator is not found through differentiation. The MLE is the maximum, and equation (21.2), page 357, gives the variance of the maximum as
\[\frac{n\theta^2}{(n+1)^2(n+2)} = \frac{6\theta^2}{7^2(8)} = 0.015306\theta^2\]
To approximate this, the fitted value of $\hat\theta$ would be used for $\theta$. More generally, as $n \to \infty$, $n$, $n+1$, and $n+2$ become more or less the same, making the asymptotic variance $\theta^2/n^2$, and $a = 2$, $b = 2$.

34.20. We apply the delta method for the function $g(\theta) = \Pr(X < 500) = F_X(500; \theta) = 1 - e^{-500/\theta}$.
\[g'(\theta) = -e^{-500/\theta}\left(\frac{500}{\theta^2}\right) \qquad g'(\hat\theta) = -\frac{1}{2000}e^{-1/2}\]
\[\widehat{\operatorname{Var}}\bigl(g(\theta)\bigr) = 10{,}000\left(\frac{1}{2000}e^{-1/2}\right)^2 = 0.0009197\]

34.21. $\operatorname{Var}(X_1 + X_2) = \operatorname{Var}(X_1) + \operatorname{Var}(X_2) + 2\operatorname{Cov}(X_1, X_2) = 2 + 8 + 2(3) = 16$. The delta method isn't needed for this, and the answer is exact, not an estimate.

34.22. Since $X_1$ and $X_2$ are not independent, the delta method is needed to estimate the variance of $X_1X_2$. Let $g(X_1, X_2) = X_1X_2$.
\[\frac{\partial g}{\partial X_1} = X_2 = 10 \qquad \frac{\partial g}{\partial X_2} = X_1 = 0\]
\[\widehat{\operatorname{Var}}(X_1X_2) = \begin{pmatrix}10 & 0\end{pmatrix}\begin{pmatrix}2 & 3\\ 3 & 8\end{pmatrix}\begin{pmatrix}10\\ 0\end{pmatrix} = 200\]

34.23. For an inverse Pareto distribution, $\Pr(X > 500) = 1 - F_X(500; \tau, \theta) = 1 - \left(\frac{500}{500+\theta}\right)^\tau$. We apply the delta method for this function.
\[g(\tau, \theta) = 1 - \left(\frac{500}{500+\theta}\right)^\tau\]
\[g_\tau(\hat\tau, \hat\theta) = -\left(\frac{500}{500+\hat\theta}\right)^{\hat\tau}\ln\frac{500}{500+\hat\theta} = -\left(\frac56\right)^2\ln\frac56 = 0.1266\]
\[g_\theta(\hat\tau, \hat\theta) = \hat\tau\left(\frac{500}{500+\hat\theta}\right)^{\hat\tau-1}\frac{500}{(500+\hat\theta)^2} = 2\left(\frac56\right)\left(\frac{500}{3600(100)}\right) = 0.002315\]
\[\widehat{\operatorname{Var}}\bigl(g(\hat\tau, \hat\theta)\bigr) = \begin{pmatrix}0.1266 & 0.002315\end{pmatrix}\begin{pmatrix}0.01 & -0.1\\ -0.1 & 5\end{pmatrix}\begin{pmatrix}0.1266\\ 0.002315\end{pmatrix} = 0.0001285\]

34.24. We apply the delta method for the function $e^{-k/\theta}$.
\[g(\theta) = e^{-k/\theta} \qquad g'(\theta) = \frac{k}{\theta^2}e^{-k/\theta}\]

\[g'(\theta)^2 = \frac{k^2}{\theta^4}e^{-2k/\theta} \qquad \widehat{\operatorname{Var}}\bigl(g(\hat\theta)\bigr) = \frac{k^2}{\theta^4}e^{-2k/\theta}\operatorname{Var}(\hat\theta) \quad \text{(E)}\]

34.25. According to shortcut #3 on page 601, the MLE of the variance is the sample variance with division by $n$, or (B).

34.26. As in Example 34B, we can calculate this directly, without calculating the information matrix. We use the fact that $\operatorname{Var}(Y) = \operatorname{E}[Y^2] - \operatorname{E}[Y]^2$ for any $Y$, and set $Y = (X_i - 1000)^2$ to obtain:
\begin{align*}
\operatorname{Var}\bigl((X_i-1000)^2\bigr) &= \operatorname{E}\bigl[(X_i-1000)^4\bigr] - \operatorname{E}\bigl[(X_i-1000)^2\bigr]^2\\
&= 3\theta^2 - \theta^2 = 2\theta^2
\end{align*}
because the fourth moment of a normal distribution is the kurtosis coefficient (3) times the variance ($\theta$) squared. We sum up $n$ of these and divide by $n$, obtaining
\[\operatorname{Var}\left(\frac1n\sum_{i=1}^n (x_i-1000)^2\right) = \frac{1}{n^2}\sum_{i=1}^n 2\theta^2 = \frac{2\theta^2}{n} \quad \text{(D)}\]

34.27. The transforming equation is $g(\mu, \sigma) = \exp\left(\mu + \frac{\sigma^2}{2}\right)$, and its derivatives are
\[\frac{\partial g}{\partial\mu} = \exp\left(\mu + \frac{\sigma^2}{2}\right) \approx \exp\left(4.215 + \frac{1.093^2}{2}\right) = 123.0172\]
\[\frac{\partial g}{\partial\sigma} = \sigma\exp\left(\mu + \frac{\sigma^2}{2}\right) \approx (1.093)(123.0172) = 134.4578\]
By equation (34.6),
\[\operatorname{Var}\bigl(g(\mu, \sigma)\bigr) \approx (0.1195)(123.0172^2) + (0.0597)(134.4578^2) = 2887.733 \quad \text{(D)}\]

34.28. The transformation function is $g(\alpha, \beta) = \alpha + 10\beta$, whose partial derivatives are 1 and 10, so by equation (34.6), the variance is
\[0.00055(1) + 2(-0.00010)(1)(10) + 0.00002(100) = 0.00055\]
and the standard deviation is $\sqrt{0.00055} = 0.02345$. (C)
The delta method is unnecessary in this exercise, since the transformation is linear, and the formula for the variance of a linear expression gives
\[\operatorname{Var}(\hat\alpha + 10\hat\beta) = \operatorname{Var}(\hat\alpha) + 10^2\operatorname{Var}(\hat\beta) + 2(10)\operatorname{Cov}(\hat\alpha, \hat\beta)\]
The delta method reduces to the formula for variance of a linear expression if applied to a linear expression.

34.29. Since they ask for the coefficient of variation, not an approximation, it is unlikely that the actual values of the sample have to be used. In fact, the maximum likelihood estimator is the sample mean, so its expected value is the mean of the exponential, or $\theta$, and its variance is the variance of the exponential divided by the sample size, or $\theta^2/5$. The coefficient of variation is therefore
\[\frac{\sqrt{\operatorname{Var}(\hat\theta)}}{\operatorname{E}[\hat\theta]} = \frac{\theta}{\sqrt5\,\theta} = \frac{1}{\sqrt5} = 0.4472 \quad \text{(B)}\]

34.30. From the tables, $g(\theta) = \operatorname{E}[X^{-1}] = \theta^{-1}\Gamma(2)$, and $\Gamma(2) = 1$. Differentiating,
\[g'(\theta) = -\frac{1}{\theta^2}\]
The variance of $Y$ is $0.00264/(2.07^2)^2 = 0.000144$.

34.31. From the tables, $g(\theta, \tau) = \operatorname{VaR}_{0.95}(X) = \theta(-\ln 0.05)^{1/\tau}$. Differentiating and substituting the estimated values of the parameters,
\[\frac{\partial g}{\partial\theta} = (-\ln 0.05)^{1/\tau} = (-\ln 0.05)^{1/2} = 1.73082\]
\[\frac{\partial g}{\partial\tau} = -\frac{\theta(-\ln 0.05)^{1/\tau}}{\tau^2}\ln(-\ln 0.05) = \frac{5000(-\ln 0.05)^{1/2}}{-4}\ln(-\ln 0.05) = -2373.79\]
The variance of the estimate is
\[(1.73082^2)(62{,}500) + (2373.79^2)(0.0008) = 191{,}741\]

34.32. The maximum likelihood estimate of the exponential parameter is the mean, or 6. That's the easy part of the problem. The sum of two exponential random variables with the same $\theta$ is a gamma random variable with parameters $\alpha = 2$ and the same $\theta$. In fact, it is Erlang. Let $Z = 2Y$ be the sum of the two exponential random variables. We want to calculate $\Pr(Z > 20)$, or $S_Z(20)$. In Subsection 19.1.2, we developed formula (19.1) to compute that. Namely, compute the probability of fewer than 2 events by time 20 in a Poisson process. Here, the exponential $\theta = 6$, so the Poisson process has a rate of 1/6, and the Poisson parameter for the number of events in 20 units of time is $\lambda = 20/6$. Therefore
\[S_Z(20) = p_0 + p_1 = e^{-20/6}\left(1 + \frac{20}{6}\right) = 0.154587\]
If you aren't comfortable calculating the Erlang distribution function, an alternative method for calculating $\Pr(Z > 20)$ is to calculate it directly. Once again, $Z = X_1 + X_2$, where $X_1$ and $X_2$ are the two exponential observations. This is the probability that $X_1 > 20$, or both $X_1 < 20$ and $X_2 > 20 - X_1$.
\begin{align*}
\Pr(Z > 20) &= \Pr(X_1 > 20) + \Pr(X_1 < 20)\Pr(X_2 > 20 - X_1)\\
\Pr(X_1 > 20) &= e^{-20/6} \quad\text{from the exponential distribution}\\
\Pr(X_1 < 20)\Pr(X_2 > 20 - X_1) &= \int_0^{20} e^{-(20-x_1)/6}\left(\frac{e^{-x_1/6}}{6}\right)dx_1
\end{align*}

\[\int_0^{20}\frac{e^{-20/6}}{6}\,dx_1 = \frac{20}{6}e^{-20/6}\]
\[\Pr(Z > 20) = \left(1 + \frac{20}{6}\right)e^{-20/6} = \frac{13}{3}e^{-10/3} = 0.154587 \quad \text{(D)}\]

34.33. $\Pr(Y > 10) = 1 - F_Y(10)$ has the same variance, so we will calculate the variance of $\Pr(Y > 10)$. Based on the last question, this is the probability that an Erlang random variable with parameters $\alpha = 2$, $\theta$ is greater than 20. We must express this in terms of $\theta$. Based on the solution to the last question, we need $p_0 + p_1$ for a Poisson process with $\lambda = 20/\theta$, so
\[\Pr(Z > 20) = e^{-20/\theta}\left(1 + \frac{20}{\theta}\right)\]
$\hat\theta$, the maximum likelihood estimator of $\theta$, is the sample mean of a sample of size 2, so its variance is the variance of an exponential divided by 2. The variance of the exponential, $\theta^2$, is estimated as $\hat\theta^2$, or $6^2 = 36$, so $\widehat{\operatorname{Var}}(\hat\theta) = 18$. Then we calculate $g'(\theta)^2$ to apply the delta method, using the product rule for derivatives:
\begin{align*}
g(\theta) &= \left(\frac{20}{\theta} + 1\right)e^{-20/\theta}\\
g'(\theta) &= \frac{20}{\theta^2}e^{-20/\theta}\left(1 + \frac{20}{\theta}\right) - \frac{20}{\theta^2}e^{-20/\theta} = \frac{20^2}{\theta^3}e^{-20/\theta}\\
g'(6) &= (1.851852)(0.035674) = 0.066063
\end{align*}
The answer is
\[\widehat{\operatorname{Var}}(\hat\theta)\,g'(\hat\theta)^2 = 18(0.066063^2) = 0.078558 \quad \text{(A)}\]

34.34. The asymptotic variance of the estimator is the inverse of the information matrix, so the confidence interval is $\hat\theta \pm 1.96(\hat\theta/\sqrt{15}) = 1.85 \pm 1.96(1.85/\sqrt{15}) = (0.914, 2.786)$. (B)

34.35. For an exponential, the estimate of $\theta$ is the sample mean, 125. Then
\[\widehat{\Pr}(X > 250) = e^{-250/125} = 0.135335 \quad \text{(D)}\]

34.36. The length of the 2-sided confidence interval for the proportion is 2 times 1.96 (the 97.5th percentile of a standard normal distribution) times the standard deviation of the estimate for the proportion. We will use the delta method. In the previous exercise, we estimated $\hat\theta = 125$. Let $g(\theta)$ be $\Pr(X > 250 \mid \theta)$. Then $g(\theta) = S(250) = e^{-250/\theta}$. By the delta method,
\[\operatorname{Var}\bigl(g(\hat\theta)\bigr) \approx \operatorname{Var}(\hat\theta)\,g'(\hat\theta)^2\]
Since $\hat\theta$ is the sample mean, its variance is the variance of the distribution divided by the size of the sample. The variance of an exponential with mean $\theta$ is $\theta^2$, so the variance of $\hat\theta$ is
\[\operatorname{Var}(\hat\theta) = \frac{\theta^2}{100} \approx \left(\frac{125}{10}\right)^2\]

where as usual we approximate $\theta$ with its estimate.
\[g'(\theta) = \frac{250}{\theta^2}e^{-250/\theta} \qquad g'(\hat\theta) = \frac{250}{125^2}e^{-250/125} = \frac{2}{125}e^{-2}\]
\[\operatorname{Var}\bigl(g(\hat\theta)\bigr) \approx \left(\frac{125}{10}\right)^2\left(\frac{2}{125}e^{-2}\right)^2 = \bigl(0.2e^{-2}\bigr)^2\]
The standard deviation of $g(\hat\theta)$ is $0.2e^{-2}$, and the length of the confidence interval is $2(1.96)(0.2e^{-2}) = 0.1061$. (E)

34.37. We apply the delta method to the function expressing the mean of a Pareto distribution in terms of its parameters, $g(\alpha, \theta) = \operatorname{E}[X; \alpha, \theta] = \frac{\theta}{\alpha-1}$. In the following, $I$ is the information matrix.
\[\hat\mu = g(\hat\alpha, \hat\theta) = \frac{5000}{\hat\alpha - 1} = 2500\]
\[\frac{\partial g}{\partial\alpha} = \frac{-\theta}{(\alpha-1)^2} = \frac{-5000}{4} = -1250 \qquad \frac{\partial g}{\partial\theta} = \frac{1}{\alpha-1} = 0.5\]
\[I^{-1} = \frac{1}{(104)(11) - 12^2}\begin{pmatrix}11 & -12\\ -12 & 104\end{pmatrix} = \begin{pmatrix}0.011 & -0.012\\ -0.012 & 0.104\end{pmatrix}\]
\[\operatorname{Var}(\hat\mu) = \begin{pmatrix}-1250 & 0.5\end{pmatrix}\begin{pmatrix}0.011 & -0.012\\ -0.012 & 0.104\end{pmatrix}\begin{pmatrix}-1250\\ 0.5\end{pmatrix} = 17{,}202.526\]
The upper bound of the confidence interval is $2500 + 1.96\sqrt{17{,}202.526} = 2757.07$.

34.38. The asymptotic variance of the MLE is the inverse of the information matrix.
\[\operatorname{Var}(\hat\theta) = \frac{\hat\theta^2}{40} = 0.625 \qquad \text{Confidence interval} = 5 \pm 1.96\sqrt{0.625} = (3.450, 6.550) \quad \text{(D)}\]

34.39. For an exponential, $\hat\theta = \bar X = \frac{2.079}{10} = 0.2079$ and the variance is $\frac{\theta^2}{n}$, which is estimated as $\frac{0.2079^2}{10}$. ($\sum x_i^2$ was given to confuse you.) So the confidence interval is
\[0.2079 \pm 1.96\frac{0.2079}{\sqrt{10}} = (0.0790, 0.3368)\]

34.40. By the Pareto MLE shortcut, the estimate $\hat\alpha$ is obtained from
\[K = 5\ln 1000 - \ln\bigl((1500^2)(1800)(3000)(11{,}000)\bigr) = 34.53878 - 39.43400 = -4.89522\]
\[\hat\alpha = \frac{-5}{-4.89522} = 1.02140\]
and the loglikelihood function is
\[l(\alpha) = 5\ln\alpha + 5\alpha\ln 1000 - (\alpha+1)\sum\ln(1000+x_i) = 5\ln\alpha - 4.89522\alpha - 39.434\]
So for $\alpha_0$ to be in the interval, we need twice the difference between the loglikelihoods of $\hat\alpha$ and $\alpha_0$ to not exceed 3.841, or
\[\bigl|5\ln(\hat\alpha/\alpha_0) - 4.89522(\hat\alpha - \alpha_0)\bigr| \le 1.92\]
Plugging in 0.4: $5\ln(1.02140/0.4) - 4.89522(1.02140 - 0.4) = 1.645$, so 0.4 is in the interval. Plugging in 0.3: $5\ln(1.02140/0.3) - 4.89522(1.02140 - 0.3) = 2.594$, so 0.3 is out of the interval. The answer is 0.4. (C)

34.41. The loglikelihood function is
\[l(\mu) = -0.5n\ln 2\pi\sigma^2 - \sum\ln x_i - \frac{\sum(\ln x_i - \mu)^2}{2\sigma^2} \tag{*}\]
Differentiating once with respect to $\mu$ and setting equal to 0, we find
\[\frac{dl}{d\mu} = \frac{\sum(\ln x_i - \mu)}{\sigma^2} = 0\]
so $\hat\mu = (\sum\ln x_i)/n$. In our case,
\[\hat\mu = \frac{\sum\ln x_i}{n} = \ln\sqrt[n]{\prod x_i} = \ln e^5 = 5\]
Differentiating again, we get
\[-\frac{d^2l}{d\mu^2} = \frac{n}{\sigma^2}\]
making $n/\sigma^2$ the information matrix, so the asymptotic variance is $\sigma^2/n = 4/25 = 0.16$. The asymptotic normal confidence interval is $5 \pm 1.96\sqrt{0.16} = (4.216, 5.784)$.
Let $\mu_0$ be a point in the 95% non-normal confidence interval. Twice the difference between the loglikelihood of $\mu_0$ and of $\hat\mu$, since the first two summands of (*) are the same for both, is
\[\frac{\sum\bigl((\ln x_i - \mu_0)^2 - (\ln x_i - 5)^2\bigr)}{4} = \frac{25(5-\mu_0)^2}{4}\]
where the simplification of the numerator is discussed in the solution to Example 34O. So we need
\[\frac{25(5-\mu_0)^2}{4} \le 3.841 \qquad |5 - \mu_0| \le 0.4\sqrt{3.841} = 1.96(0.4)\]
and we end up with the same confidence interval as the asymptotic normal confidence interval.

34.42. This question could have been placed in Lesson 28 along with other similar exercises, but it was placed here since this is where formula (28.3) is derived by the delta method as the variance of an exponential estimator. We need the exact exposure $e_j$ in order to use that formula. From the estimate of 0.1, we have
\begin{align*}
1 - e^{-98/e_j} &= 0.1\\
e^{-98/e_j} &= 0.9\\
\frac{98}{e_j} &= -\ln 0.9\\
e_j &= -\frac{98}{\ln 0.9} = 930.14
\end{align*}
Then, since $n = 1$ for a one-year period,
\[\operatorname{Var}(\hat q_j) = (1-q_j)^2\frac{d_j}{e_j^2} = 0.9^2\cdot\frac{98}{930.14^2} = 0.00009175 \quad \text{(D)}\]

Quiz Solutions

34-1.
\[L(a) = a^{10}\prod x_i^{a-1} \qquad l(a) = 10\ln a + (a-1)\ln\prod x_i\]
\[\frac{dl}{da} = \frac{10}{a} + \sum\ln x_i \qquad -\frac{d^2l}{da^2} = \frac{10}{a^2}\]
The asymptotic variance is $a^2/10$.

34-2. The variance of a uniform distribution on $[0, \theta]$ is $g(\theta) = \theta^2/12$. Using the delta method,
\[g'(\theta) = \frac{\theta}{6} = \frac{50}{6} \qquad \widehat{\operatorname{Var}}(\hat v) = \left(\frac{50}{6}\right)^2(0.0048) = \frac13\]

Lesson 35

Fitting Discrete Distributions

Reading: Loss Models Fourth Edition 6.5, 14.1–14.4, 14.6

35.1 Poisson distribution

The Poisson distribution has one parameter, $\lambda$, and has probability function
\[p_k = e^{-\lambda}\frac{\lambda^k}{k!}\]
with mean and variance $\lambda$. If the exact claim frequency is known for each insured, $\lambda$ can be fitted with the method of moments. Maximum likelihood sets $\hat\lambda = \bar x$, so it is equivalent to the method of moments. (This fact was mentioned as shortcut 4 on page 601.) Sometimes claim frequency is grouped. The way this usually happens is that all insureds with more than a certain high number of claims are grouped together. In that case, the method of moments cannot be used, and the maximum likelihood equation usually requires a numeric technique to solve.

The variance of the sample mean is the variance of the distribution, $\lambda$, divided by $n$, the number in the sample; in other words, $\lambda/n$. This is therefore the variance of the maximum likelihood estimator when there is complete individual data.

Example 35A You are given the following information for frequency of claims in one year:

Number of claims:   0    1    2    3   4   5+
Number of policies: 131  125  64   23  7   0

Estimate the mean number of claims using maximum likelihood.

Answer: The maximum likelihood estimator is the sample mean, or
\[\bar x = \frac{125 + 64(2) + 23(3) + 7(4)}{131 + 125 + 64 + 23 + 7} = \frac{350}{350} = 1\]

Quiz 35-1 In Example 35A, estimate the variance of the maximum likelihood estimate of $\lambda$.
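A short Python sketch (mine, not the manual's) of this fit from the frequency table, which also prints the $\lambda/n$ variance asked about in Quiz 35-1:

```python
import numpy as np

# Sketch: Poisson MLE from a complete-data frequency table (Example 35A).
counts = np.array([0, 1, 2, 3, 4])
policies = np.array([131, 125, 64, 23, 7])

n = policies.sum()                        # 350
lam_hat = (counts * policies).sum() / n   # sample mean = 1
print(lam_hat, lam_hat / n)               # MLE and its estimated variance lambda/n
```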


35.2 Negative binomial

The negative binomial distribution is a two-parameter distribution. If $r$ and $\beta$ are the parameters, the probability function is:
\[p_k = \binom{r+k-1}{k}\left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^k\]
with mean $r\beta$ and variance $r\beta(1+\beta)$. With complete individual data, these parameters can be estimated with the method of moments as long as the variance of the sample is greater than the sample mean. The estimators are
\[\hat\beta = \frac{\hat\sigma^2 - \bar x}{\bar x} \qquad \hat r = \frac{\bar x^2}{\hat\sigma^2 - \bar x}\]
with $\hat\sigma^2$ being the sample variance with division by $n$. The parameter $\beta$ is 1 less than the ratio of the variance to the mean, and must be positive. As we discussed in Lesson 30, when matching moments, the variance of the sample is calculated with division by $n$ rather than by $n-1$. In Example 35A, the variance with division by $n = 350$ is
\[\hat\sigma^2 = \frac{131(0-1)^2 + 64(2-1)^2 + 23(3-1)^2 + 7(4-1)^2}{350} = \frac{350}{350} = 1\]
and is equal to the sample mean, so it is not possible to fit a negative binomial using the method of moments to this data.

With maximum likelihood, the product of the estimators of $r$ and $\beta$ equals the sample mean, or $r\beta = \bar x$ as in the method of moments, but estimating the two parameters separately requires a numeric procedure. There will be a solution only if the biased sample variance (with division by $n$) is greater than the sample mean; otherwise, the likelihood will increase as $r \to \infty$ and $\beta \to 0$. If $r$ is known, maximum likelihood selects $\beta$ so that the product is the sample mean, even if the sample variance is less than or equal to the sample mean.

Example 35B For the data in Example 35A, you wish to fit a negative binomial distribution with $r = 10$, using maximum likelihood. Determine $\beta$.

Answer: The maximum likelihood estimator is the mean, so $10\beta = 1$, and $\beta = \frac{1}{10}$.
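A hedged Python sketch of the method-of-moments fit just described; the function name and interface are my own, and it deliberately raises an error in the Example 35A situation where the variance does not exceed the mean:

```python
import numpy as np

# Sketch: method-of-moments negative binomial fit from a frequency table,
# using the biased sample variance (division by n). Feasible only if var > mean.
def nb_moments(values, freqs):
    values, freqs = np.asarray(values, float), np.asarray(freqs, float)
    n = freqs.sum()
    mean = (values * freqs).sum() / n
    var = (freqs * (values - mean) ** 2).sum() / n
    if var <= mean:
        raise ValueError("sample variance <= mean: no negative binomial fit")
    beta = var / mean - 1
    return mean / beta, beta              # (r, beta), with r * beta = mean

# Example 35A's table has var == mean == 1, so nb_moments raises; with r known
# (r = 10), maximum likelihood just sets beta = mean / r, as in Example 35B.
```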

35.3 Binomial

The binomial distribution is a two-parameter distribution. If $m$ and $q$ are the parameters, the probability function is:
\[p_k = \begin{cases}\binom{m}{k}q^k(1-q)^{m-k} & k \le m\\ 0 & k > m\end{cases}\]
with $\mu = mq$ and $\sigma^2 = mq(1-q)$.

If you attempt to estimate these parameters with the method of moments, an $m$ that is not an integer will usually result, which makes the fit invalid. Given $m$, maximum likelihood selects $q$ so that $mq = \bar x$. To maximize likelihood when both parameters are unknown, the loglikelihood is calculated for each $m$ greater than or equal to the highest number of claims for any insured. This is done for each $m$ until the loglikelihood reaches its maximum and starts declining. If the sample variance is not less than the sample mean, there will not be a maximum. On the Fall 2006 exam, they asked to calculate the loglikelihood for a particular value of $m$.

Example 35C You are given the following claim frequency data:

Number of Claims: 0   1   2   3   4+
Number of Risks:  40  52  32  4   0

Estimate the parameters using the method of moments.

Answer: We solve the moment equations:
\begin{align*}
mq = \bar x &= \frac{52 + 32(2) + 4(3)}{40 + 52 + 32 + 4} = \frac{128}{128} = 1\\
mq(1-q) = \hat\sigma^2 &= \frac{40(0-1)^2 + 32(2-1)^2 + 4(3-1)^2}{128} = \frac{88}{128} = 0.6875\\
1 - q &= 0.6875\\
q &= 0.3125 \qquad m = \frac{1}{0.3125} = 3.2
\end{align*}
Since $m$ must be an integer, we set $m = 3$. Then from the first moment equation, $q = \frac13$, even though the variance does not match with this choice.
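For the maximum likelihood version, here is a minimal Python sketch (assuming SciPy) of the likelihood-profile loop over $m$ described above; it prints the profile so you can see where the loglikelihood peaks:

```python
import numpy as np
from scipy.stats import binom

# Sketch: binomial maximum likelihood profile over m. For each candidate m,
# q-hat = mean/m; compute the loglikelihood and stop once it starts declining.
values = np.array([0, 1, 2, 3])
freqs = np.array([40, 52, 32, 4])
mean = (values * freqs).sum() / freqs.sum()     # 1.0

for m in range(values.max(), values.max() + 5):
    q = mean / m
    loglik = (freqs * binom.logpmf(values, m, q)).sum()
    print(m, q, loglik)
```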

 Calculator Tip

If you wish to use the statistics registers of the TI-30XS/B Multiview to calculate mean and variance in a question like the previous example, enter the number of claims in one column and the frequencies in another:

Clear table: data data 4
Enter number of claims in column L1: 0 ↓ 1 ↓ 2 ↓ 3 enter
Enter frequencies in column L2: → 40 ↓ 52 ↓ 32 ↓ 4 enter
Calculate statistics: 2nd [stat] 1 (select L1 for DATA and L2 for FRQ) enter

Statistic 2, $\bar x$, is 1, while statistic 4, $\sigma$, is 0.829156198, so $\sigma^2 = 0.829156198^2 = 0.6875$.

35.4 Fitting (a, b, 1) class distributions

We will use the same notation as in Section 11.2. Namely, $p_k^M$ or $p_k^T$ is the probability that the $(a,b,1)$ random variable is $k$ in the modified or truncated distribution respectively, whereas $p_k$ is the probability that the corresponding $(a,b,0)$ random variable, the one with the same $a$ and $b$, is $k$.

Maximum likelihood has the following properties when fitting $(a,b,1)$ distributions:

1. The fitted probability of 0 will match the proportion of 0 in the sample. In other words,
\[\hat p_0^M = \frac{n_0}{n}\]
where $n_0$ is the number of observed zeros and $n$ is the sample size. If the sample has any observations of 0, maximum likelihood will not fit a truncated distribution, since in such a distribution the likelihood of 0 is zero. The converse is true as well, since the fitted probability of 0 matches the observed proportion.

2. The fitted mean will equal the sample mean. As a result of this and the first property, the conditional mean given that the variable is greater than 0 will equal the sample mean of the non-zero observations.

Numerical methods are needed to calculate the parameters other than $p_0^M$ in almost every case. The textbook derives formulas, and none of these formulas allows solutions without a computer. For a zero-modified Poisson, the formula is
\[\bar x = \frac{1 - \hat p_0^M}{1 - p_0}\,\lambda\]
and since $p_0 = e^{-\lambda}$, this is a mixed exponential/linear equation (a numerical sketch follows below). Note that the right-hand side is the theoretical mean of the distribution. For a zero-modified binomial, it is necessary to construct likelihood profiles for each $m$, and once again the sample mean equals the theoretical mean:
\[\bar x = \frac{1 - \hat p_0^M}{1 - p_0}\,mq\]
Since $p_0 = (1-q)^m$, this is an $m-1$ degree polynomial. You can maximize the likelihood for $m \le 3$, but that's about it. The possibility $m = 1$ is only available if no observations are greater than 1, and is the same as fitting a Bernoulli, which you can easily calculate maximum likelihood for, so that only leaves $m = 2$ or $m = 3$.
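Here is the promised sketch for the zero-modified Poisson equation, in Python with SciPy's root finder; the inputs are the counts from Example 35D below ($n_0 = 10$ of $n = 20$ observations are zero, $\bar x = 0.7$), used purely for illustration:

```python
import numpy as np
from scipy.optimize import brentq

# Sketch: zero-modified Poisson MLE. p0M-hat is the observed proportion of zeros;
# lambda solves xbar = (1 - p0M) * lambda / (1 - e^(-lambda)).
n0, n, xbar = 10, 20, 0.7
p0M = n0 / n

f = lambda lam: (1 - p0M) * lam / (1 - np.exp(-lam)) - xbar
lam_hat = brentq(f, 1e-9, 50.0)
print(p0M, lam_hat)
```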

35.4. FITTING ( a, b, 1) CLASS DISTRIBUTIONS

691

Example 35D You have the following observations of a discrete random variable: Value

Number of Observations

0 1 2

10 6 4

You are to fit these to a zero-modified binomial distribution using maximum likelihood. Calculate the fitted value of q when m  2 and when m  3, and determine the resulting loglikelihoods. Answer: As discussed above, pˆ 0M  n0 /n  10/20  0.5. Then for m  2, 6 + 4 (2) 1 − 0.5 (2q )  x¯   0.7 20 1 − (1 − q ) 2

q  0.7 (2q − q 2 )  1.4q − 0.7q 2

0.7q  0.4 4 qˆ  7 The fitted probabilities are p 0M  0.5

0.5 2q (1 − q )



p 1M  p 2M 



1 − (1 − q ) 2

0.5q 2 1 − (1 − q ) 2

so the likelihood and loglikelihood with qˆ  4/7 are 0.520 2q (1 − q )



L ( q | m  2) 



6  4

1 − (1 − q ) 2

q2

 10

l ( q | m  2)  14 ln 0.5 + 14 ln q + 6 ln (1 − q ) − 10 ln 1 − (1 − q ) 2





 −9.70406 − 7.83462 − 5.08379 + 2.02941  −20.5931

For m  3, 0.5 (3q )  0.7 1 − (1 − q ) 3

1.5q  0.7 (3q − 3q 2 + q 3 ) 1.5  2.1 − 2.1q + 0.7q 2

0.7q 2 − 2.1q + 0.6  0 √ 2.1 − 2.73 qˆ   0.31981 1.4 The other solution to the quadratic is rejected since then q > 1. C/4 Study Manual—17th edition Copyright ©2014 ASM

35. FITTING DISCRETE DISTRIBUTIONS

692

The likelihood and loglikelihood with qˆ  0.31981 are 0.520 3q (1 − q ) 2



L ( q | m  3) 



6 

3q 2 (1 − q )

1 − (1 − q ) 3

4

 10

l ( q | m  3)  20 ln 0.5 + 10 ln 3 + 14 ln q + 16 ln (1 − q ) − 10 ln 1 − (1 − q ) 3





 −13.86294 + 10.98612 − 15.96056 − 6.16604 + 3.77900  −21.2244

Since the maximum likelihood is less for m  3 than for m  2, it will continue to decrease for higher values of m. The maximum likelihood fit is mˆ  2, qˆ  4/7.  For a zero-modified negative binomial, two parameters in addition to p0M must be fitted. We needed numerical methods even for a non-modified negative binomial, and certainly need them here. Even when r is known numerical methods are usually required to solve the rational equation equating the mean to the sample mean, with the exception of special values of r. One particularly easy case is r  1, which characterizes a zero-modified geometric. Example 35E You have the following observations of a discrete random variable: Value

Number of Observations

0 1 2

10 6 4

You are to fit these to a zero-modified geometric distribution using maximum likelihood. ˆ Determine β. Answer: Maximum likelihood reduces to method of moments. The fitted probability of 0 is the proportion of zeros in the sample. 10 of the 20 observations are 0, so the fitted probability of 0 is  In the sample, . 0.5. The sample mean is 6 (1) + 4 (2) 20  0.7. The mean of a zero-truncated geometric is 1 + β. Therefore the mean of a zero-modified geometric is (1 − p 0M )(1 + β ) . We match the fitted mean to the sample mean:

(1 − p0M )(1 + β )  0.7 0.5 (1 + β )  0.7 βˆ  0.4



Because of the difficulty in fitting ( a, b, 1) distributions, as well as the rarity of any ( a, b, 1) distribution questions on past exams, I think the likelihood of questions from this topic, which was added to the syllabus starting with the Fall 2009 exam, is low.

35.5 Adjusting for exposure

If you are given exposures and claims for several periods, you may hypothesize a distribution for each individual and fit it to the data using maximum likelihood. If the distribution is Poisson, the fitted $\lambda$ is the sample mean.

Example 35F You are given the following number of exposures and claims for 5 years:

Year   Number of Exposures   Number of Claims
2001   180                   5
2002   200                   6
2003   195                   8
2004   205                   4
2005   220                   7

For each exposure, number of claims follows a Poisson distribution with the same parameter. Estimate the parameter using maximum likelihood.

Answer: Maximum likelihood reduces to the method of moments, so the answer is
\[\hat\lambda = \frac{5+6+8+4+7}{180+200+195+205+220} = \frac{30}{1000} = 0.03\]

Choosing between distributions in the ( a, b, 0) class

Given a set of data which is to be fitted to a member of the ( a, b, 0) class, one way to choose the distribution to fit is by comparing the mean to the sample variance.1 If the variance is greater than the mean, then use a negative binomial; if equal, use a Poisson; if less, use a binomial. Another method multiplies the ( a, b, 0) class relationship by k: kp k  ak + b p k−1 If a sample has n k observations of k, and

P∞

k0

n k  n, then multiplying through by n, we have kn k  ak + b n k−1

(35.1)

so these fractions should be a line with slope a. If a is positive, a negative binomial is indicated; if zero, a Poisson is indicated; if negative, a binomial is indicated. This method cannot be used when n k  0, and is less reliable if there is a small amount of data; in particular, one wouldn’t expect a line for larger values of k. Of course, if the graph doesn’t look like a line, then using a member of the ( a, b, 0) class may be inappropriate. Example 35G You are given the following information for frequency of claims in one year: Number of claims

Number of policies

0 1 2 3 4 5 6+ Total

672 660 260 55 7 1 0 1655

Which distribution from the ( a, b, 0) class would best fit this data? 1Presumably one would want to use the unbiased sample variance, calculated with division by n −1, for this test, and that is how it is done in the next example. However, if the mean is greater than the variance calculated by dividing by n, it will be impossible to fit a negative binomial using method of moments or maximum likelihood. C/4 Study Manual—17th edition Copyright ©2014 ASM

35. FITTING DISCRETE DISTRIBUTIONS

694

¹Presumably one would want to use the unbiased sample variance, calculated with division by $n-1$, for this test, and that is how it is done in the next example. However, if the mean is greater than the variance calculated by dividing by $n$, it will be impossible to fit a negative binomial using method of moments or maximum likelihood.

[Figure 35.1: Graph of the $(a,b,0)$ ratios $kn_k/n_{k-1}$ for the observed data; the ratios decline in $k$ from about 0.98 at $k=1$ to about 0.51 at $k=4$, with an uptick at $k=5$.]

Answer: The ratios are:
\begin{align*}
\frac{n_1}{n_0} &= \frac{660}{672} = 0.982143 & \frac{2n_2}{n_1} &= \frac{2(260)}{660} = 0.787879 & \frac{3n_3}{n_2} &= \frac{3(55)}{260} = 0.634615\\
\frac{4n_4}{n_3} &= \frac{4(7)}{55} = 0.509091 & \frac{5n_5}{n_4} &= \frac{5(1)}{7} = 0.714286
\end{align*}
These are graphed in Figure 35.1. The decreasing slope indicates a binomial fit. There is a sudden increase at 5, but this is due to the small amount of data.

The sample mean is $\bigl(660(1) + 260(2) + 55(3) + 7(4) + 1(5)\bigr)/1655 = 0.8326$, and the second moment is $\bigl(660(1^2) + 260(2^2) + 55(3^2) + 7(4^2) + 1(5^2)\bigr)/1655 = 1.4091$, so the sample variance is $\frac{1655}{1654}(1.4091 - 0.8326^2) = 0.7163$, which is less than the sample mean, also indicating a binomial fit.

Table 35.1: Summary of Fitting Discrete Distributions

• For a Poisson with complete data, the method of moments and maximum likelihood estimators of ¯ λ are both x. • For a negative binomial with complete data, – The method of moments estimators are σˆ 2 − x¯ βˆ  x¯

rˆ 

x¯ 2 σˆ 2 − x¯

¯ If one of them is known, the other one is set equal to x¯ – Maximum likelihood sets rˆ βˆ  x. divided by the known one. • For a binomial with complete data, method of moments may not set m equal to an integer. Maximum likelihood proceeds by calculating a likelihood profile for each m ≥ max x i . The maximum ¯ likelihood estimate of q given m is x/m. When the maximum likelihood for m + 1 is less than the one for m, the maximum overall is attained at m. • For modified ( a, b, 1) distributions, pˆ 0M  n0 /n and the mean is set equal to the sample mean. • Fitting λ of a zero-modified Poisson requires numerical techniques. • Fitting q for a zero-modified binomial for fixed m requires solving a high-degree polynomial unless m ≤ 3. • Fitting β for a zero-modified negative binomial for fixed r requires numerical techniques except for special values of r, like 1.

• If you are given a table with varying exposures and claims, and individual claims have a Poisson distribution with the same λ, the maximum likelihood estimate of λ is the sample mean, or the sum of all claims over the sum of all exposures. • To choose between ( a, b, 0) distributions to fit to data, two methods are available: ¯ Choose binomial if it is less, Poisson 1. Compare the sample variance σˆ 2 to the sample mean x. if equal, and negative binomial if greater. 2. Calculate kn k /n k−1 , and observe the slope as a function of k. Choose binomial if negative, Poisson if zero, and negative binomial if positive.

C/4 Study Manual—17th edition Copyright ©2014 ASM

35. FITTING DISCRETE DISTRIBUTIONS

696

Exercises You are given the following claim frequency data:

35.1.

Frequency

Number of Insureds

0 1 2 3 4 5 6+

520 195 57 22 5 1 0

A Poisson distribution is fitted to the data using maximum likelihood. Determine the estimated probability of no claims. You are given the following claims frequency data:

35.2.

Frequency 0–1 2 3 4 5+ Number of insureds 35 10 4 1 0 Data for 0 and 1 claims has been combined. A Poisson distribution is fitted to the data, using maximum likelihood. Determine the resulting estimate of the mean. [4B-F98:17] (2 points) You are given the following:

35.3. •

The number of claims per year for a given risk follows a distribution with probability function p (n ) 



λ n e −λ , n  0, 1, . . . , λ > 0. n!

Two claims were observed for this risk during Year 1, and one claim was observed for this risk during Year 2. If λ is known to be an integer, determine the maximum likelihood estimate of λ.

(A) 1

(B) 2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 3

(D) 4

(E) 5

Exercises continue on the next page . . .

EXERCISES FOR LESSON 35

35.4.

697

[4B-S90:24] (1 point) Assume that the number of claims for an insured has a Poisson distribution e −θ · θ n . n!

p (n ) 

The following numbers of claims were observed: 3,

1,

2,

1

Calculate the method of moments estimate of θ. (A) (B) (C) (D) (E) 35.5.

Less than 1.40 At least 1.40, but less than 1.50 At least 1.50, but less than 1.60 At least 1.60, but less than 1.70 At least 1.70 You are given the following data for claim frequency: Number of Claims

Number of Policies

0 1 2 3 4 5+

134 45 15 5 1 0

The data are fitted to a geometric distribution. Determine the maximum likelihood estimate of p0 , the probability of 0 claims. 35.6.

You are given the following claim frequency data: Number of Claims Number of Insureds

0 60

1 22

2 11

3 5

4 2

5+ 0

The data are fitted to a negative binomial distribution using the method of moments. Determine the resulting estimate of the probability of 0 claims. 35.7.

You are given the following claim frequency data: Number of Claims Number of Policies

0 34

1 50

2 16

3+ 0

Calculate the maximum value of the loglikelihood function of the binomial distribution at m  4.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

35. FITTING DISCRETE DISTRIBUTIONS

698

35.8.

You are given the following observations for number of claims per year from a policy over 5 years: 0

2

0

1

1

You fit a Poisson distribution with mean λ to the number of claims using percentile matching at the median. The values of λ for which the median of the Poisson distribution equals the observed median are in an interval ( a, b]. Determine a. 35.9. [4-F02:6] The number of claims follows a negative binomial distribution with parameters β and r, where β is unknown and r is known. You wish to estimate β based on n observations, where x¯ is the mean of those observations. Determine the maximum likelihood estimate of β. ¯ 2 (A) x/r

¯ (B) x/r

(C) x¯

(E) r 2 x¯

(D) r x¯

35.10. [4-F04:32] You are given: (i) The number of claims follows a Poisson distribution with mean λ. (ii) Observations other than 0 and 1 have been deleted from the data. (iii) The data contain an equal number of observations of 0 and 1. Determine the maximum likelihood estimate of λ. (A) 0.50

(B) 0.75

(C) 1.00

(D) 1.25

(E) 1.50

35.11. You are given: (i) The number of claims follows a Poisson distribution with mean λ. (ii) Half of the observations in the data are 0. Determine the maximum likelihood estimate of λ. 35.12. You are given: (i) The number of claims follows a geometric distribution with mean β. (ii) Half of the observations of number of claims are 0. Determine the maximum likelihood estimate of β.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 35

699

35.13. You are given the following claims settlement activity for a book of automobile claims as of the end of 2008:

Year Reported 2006 2007 2008

Number of Claims Reported 6 8 6

Number of Claims Settled 2006 4

Year Settled 2007 2 5

2008 0 3 6

L  Year Settled − Year Reported is a random variable describing the time lag in settling a claim. L follows a Poisson distribution. Determine the maximum likelihood estimate of mean lag in settling a claim, E[L]. 35.14. You are given the following claims settlement activity for a book of automobile claims as of the end of 2008: Year Reported 2006 2007 2008

Number of Claims Settled Year Settled 2006 2007 4 2 5

2008 Unknown 3 6

L  Year Settled − Year Reported is a random variable describing the time lag in settling a claim. L follows a Poisson distribution. Determine the maximum likelihood estimate of mean lag in settling a claim, E[L]. 35.15. [4-S01:34] You are given the following claims settlement activity for a book of automobile claims as of the end of 1999: Year Reported 1997 1998 1999

Number of Claims Settled Year Settled 1997 1998 Unknown 3 5

1999 1 2 4

L  Year Settled − Year Reported is a random variable describing the time lag in settling a claim. The probability function of L is f L ( l )  (1 − p ) p l , for l  0, 1, 2, . . .. Determine the maximum likelihood estimate of the parameter p.

(A) 3/11

(B) 7/22

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1/3

(D) 3/8

(E) 7/15

Exercises continue on the next page . . .

35. FITTING DISCRETE DISTRIBUTIONS

700

35.16. You are given the following claims settlement activity for a book of automobile claims as of the end of 2008:

Year

Number of Claims

Reported 2006 2007 2008

Reported 7 10 12

Number of Claims Settled

Number of Claims Not Settled 1 2 6

Year Settled 2006 4

2007 2 5

2008 0 3 6

L  Year Settled − Year Reported is a random variable describing the time lag in settling a claim. L follows a geometric distribution. Determine the maximum likelihood estimate of mean lag in settling a claim, E[L]. 35.17. The variable X follows a Poisson distribution with mean λ. Determine the information matrix for 50 observations of X if λ  20. 35.18. A sample of 50 is drawn from a Poisson distribution with mean λ  2 Calculate the information matrix for λ. 35.19. A sample of 60 is drawn from a geometric distribution with mean β  5. Calculate the information matrix for β. 35.20. In a sample of ten policyholders who submitted at least one claim in a year, nine of them submitted exactly one claim and the tenth one submitted n claims. The data were fitted to a zero-truncated Poisson distribution using maximum likelihood. The estimated value of λ is 0.54986. Determine n. 35.21. You are given the following accident data from 100 insurance policies: Number of accidents 0 1 2 3 4+

Number of policies 70 16 8 6 0

The data are fitted to a zero-modified negative binomial distribution with r  −0.5. Determine the fitted value of β.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 35

701

35.22. [4B-F92:21] (1 point) A portfolio of 10,000 risks yields the following: Number of Claims 0 1 2 3 4

Number of Insureds 6,070 3,022 764 126 18

Based on the portfolio’s sample moments, which of the following distributions provides the best fit to the portfolio’s number of claims? (A) (B) (C) (D) (E)

Binomial Poisson Negative binomial Lognormal Pareto

35.23. [4-S00:40] You are given the following accident data from 1000 insurance policies: Number of accidents

Number of policies

0 1 2 3 4 5 6 7+ Total

100 267 311 208 87 23 4 0 1000

Which of the following distributions would be the most appropriate model for this data? (A) (B) (C) (D) (E)

Binomial Poisson Negative Binomial Normal Gamma

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

35. FITTING DISCRETE DISTRIBUTIONS

702

35.24. You are given the following claim frequency data: Number of Claims Number of Insureds

0 20

1 14

2 10

3 6

4 3

Which probability distribution is suggested by this data based on (i) successive ratios of probabilities, and (ii) moments? (A) (B) (C) (D) (E)

(i) Binomial, (ii) Binomial (i) Poisson, (ii) Binomial (i) Negative binomial, (ii) Poisson (i) Negative binomial, (ii) Negative binomial The correct answer is not given by (A) , (B) , (C) , or (D) .

35.25. You are given the following claim frequency data: Number of Claims Number of Insureds

0 536

1 400

2 150

3 37

4 7

Which probability distribution is suggested by this data based on (i) successive ratios of probabilities, and (ii) moments? (A) (B) (C) (D) (E)

(i) Binomial, (ii) Binomial (i) Poisson, (ii) Binomial (i) Negative binomial, (ii) Poisson (i) Negative binomial, (ii) Negative binomial The correct answer is not given by (A) , (B) , (C) , or (D) .

35.26. [4-F03:32] The distribution of accidents for 84 randomly selected policies is as follows: Number of Accidents

Number of Policies

0 1 2 3 4 5 6 Total

32 26 12 7 4 2 1 84

Which of the following models best represents these data? (A) (B) (C) (D) (E)

Negative binomial Discrete uniform Poisson Binomial Either Poisson or Binomial

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 35

703

35.27. [4-F04:8] You are given the following observations for number of claims per year from a policy over 5 years: 0

0

1

2

2

You fit a binomial distribution ( m, q ) with the following requirements: (i) The mean of the fitted model equals the sample mean. (ii) The 33rd percentile of the fitted model equals the smoothed empirical 33rd percentile of the sample. Determine the smallest estimate of m that satisfies these requirements. (A) 2

(B) 3

(C) 4

(D) 5

(E) 6

35.28. [Sample C4:3] A fleet of cars has had the following experience for the last three years: Earned Car Years

Number of Claims

500 750 1,000

70 60 100

The Poisson distribution is used to model this process. Determine the maximum likelihood estimate of the Poisson parameter for a single car year. 35.29. [4-F04:34] You are given: (i)

(ii)

The ages and number of accidents for five insureds are as follows: Insured

X=Age

Y=Number of Accidents

1 2 3 4 5

34 38 45 25 21

2 1 0 3 3

Total

163

9

Y1 , Y2 ,. . . , Y5 are independently Poisson distributed with µ i  βX i , i  1, 2, . . . , 5.

ˆ Estimate the standard deviation of β. (A) (B) (C) (D) (E)

Less than 0.015 At least 0.015, but less than 0.020 At least 0.020, but less than 0.025 At least 0.025, but less than 0.030 At least 0.030

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

35. FITTING DISCRETE DISTRIBUTIONS

704

35.30. [C-S05:13] You are given claim count data for which the sample mean is roughly equal to the sample variance. Thus you would like to use a claim count model that has its mean equal to its variance. An obvious choice is the Poisson distribution. Determine which of the following models may also be appropriate. (A) (B) (C) (D) (E)

A mixture of two binomial distributions with different means A mixture of two Poisson distributions with different means A mixture of two negative binomial distributions with different means None of (A), (B) or (C) All of (A), (B) and (C)

Additional released exam questions: C-F05:29, C-F06:12,15

Solutions 35.1. n  520 + 195 + 57 + 22 + 5 + 1  800 Since the maximum likelihood estimator is the sample mean, 195 (1) + 57 (2) + 22 (3) + 5 (4) + 1 (5)  0.5 800  0.6065

x¯  e −0.5 35.2.

The shortcut cannot be used here because of the grouped data. The likelihood of 0 or 1 claims is Pr ( N  0 or 1)  e −λ (1 + λ )

so we have: L ( λ )  e −50λ (1 + λ ) 35 λ 2 (10) +3 (4) +4 (1) l ( λ )  −50λ + 35 ln (1 + λ ) + 36 ln λ dl 35 36  −50 + + 0 dλ 1+λ λ 50λ 2 − 21λ − 36  0 √ 21 + 441 + 7200  1.0841 λ 100 35.3. If not for the constraint that λ has to be an integer, the answer would be the sample mean over 2 −λ λ and the likelihood of years, or 2+1 2  1.5. We thus should check 1 and 2. The likelihood of 1 claim is e 1! 2 2 claims is e −λ λ2! . Multiplying these together, we have L ( λ )  e −2λ

λ3 2

1 −2 e  0.068 2 L (2)  4e −4  0.073

L (1) 

The answer is 2 . (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 35

35.4. 35.5.

705

The estimate is θ˜  (3 + 1 + 2 + 1) /4  1.75 . (E)

45 (1) + 15 (2) + 5 (3) + 1 (4) 94   0.47. 134 + 45 + 15 + 5 + 1 200 The maximum likelihood estimator is the same as the method of moments estimator (by the shortcut; a geometric distribution is a negative binomial with r  1) , so the mean of the geometric distribution, β, is 1  0.6803 . 0.47. Then p 0  1.47 x¯ 

35.6. x¯  0.67  rβ 1 σˆ 2  (143) − 0.672  0.9811  rβ (1 + β ) 100 0.9811 1+β   1.4643 0.67 0.67 β  0.4643, r  0.4643 p0  (1 + β ) −r  0.5767 35.7. L l q

X

Y 4! xi

X

ln

q

P

xi

(1 − q ) 400−

P

xi

4 + 82 ln q + 318 ln (1 − q ) xi

!

82  0.205 400

4 ln  50 ln 4 + 16 ln 6  97.9829 xi

!

l  97.9829 + 82 ln (0.205) + 318 ln (0.795)  −104.92 35.8. The sample median is 1. Thus you want to fit a Poisson distribution for which 1 is the median. Unlike for continuous distributions, you cannot find the median by setting F ( x )  0.5 for a discrete distribution. The median is defined as x such that Pr ( X < x ) < 0.5 and Pr ( X ≤ x ) ≥ 0.5. This means we need F (1) ≥ 0.5 and F (0) < 0.5. The higher F (0) is, the lower λ is. So to get the lowest value of λ, a, (or more strictly speaking, the greatest lower bound of the possible values of λ), we set F (0)  0.5. (Setting F (1)  0.5 will get b.) Thus e −a  0.5, and a  − ln (0.5)  0.6931 . 35.9.

¯ so β  xr¯ . (B) The maximum likelihood estimator of the mean rβ is the sample mean, x,

35.10. Since there are only two classes (0 or 1), this question is best done using the Bernoulli technique (Section 33.5). By that technique, the fitted probabilities equal the sample probabilities, so Pr ( N  0 | N < 2)  Pr ( N  1 | N < 2) , or p0 p1  p0 + p1 p0 + p1 p0  p1 e

−λ

 λe −λ

λ 1

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

35. FITTING DISCRETE DISTRIBUTIONS

706

35.11. Using the Bernoulli technique, p0  e −λ is the observed frequency, or 12 , so λ  − ln (1/2)  0.6931 . 35.12. Using the Bernoulli technique, Pr ( N  0)  0.5. However, for a geometric, Pr ( N  0) 

1 1+β ,

so

β 1. 35.13. Putting all the data together, there were 20 claims, of which 15 were settled with L. 0 and 5 with  ˆ L  1. The Poisson maximum likelihood estimate is the sample mean, or λ  15 (0) + 5 (1) 20  0.25 . 35.14. This time, the data are truncated to the right; you have no information whatsoever about claims settled in later years. Thus this question is similar to exercise 35.10. Note that 2008 claims provide no information: the likelihood of a lag of 0 in settling a claim, given that the lag is no more than 0, is of course 1. From 2006 and 2007, there are 9 claims with L  0 and 5 claims with L  1. Thus maximum likelihood (using the Bernoulli technique) sets p1  95 p0 . Let λ be the Poisson parameter. p 1  λe −λ  λp0 , so λ 

5 9

.

35.15. A lot of students had difficulty with this question. The data are truncated both on the left and on the right. For claims reported in 1997, you only have data for L  1 and L  2. If L > 2, you have no data at all, since the study is stopped in 1999. This is not censoring, where you know how many claims weren’t settled by 1999; this is right truncation. And the lack of knowledge about 1997 (L  0) is left truncation. This means that in calculating the likelihood, you must divide by the condition that the L  1 or L  2. The likelihood for the 3 settlements in 1998 is

(1 − p ) p (1 − p ) p + (1 − p ) p 2 and for the settlement in 1999

!3

(1 − p ) p 2 (1 − p ) p + (1 − p ) p 2

since the probability of L  1 or 2 is (1 − p ) p + (1 − p ) p 2 . The data in 1998 is right truncated as well; you only have it if L  0 or 1, so you must divide by the probability of L  0 or 1, which is (1 − p ) + (1 − p ) p. The likelihood of the 5 settled in 1998 is

and for the two claims settled in 1999

(1 − p ) (1 − p ) + (1 − p ) p

!5

(1 − p ) p (1 − p ) + (1 − p ) p

!2

The data in 1999 has likelihood 1. The probability that a claim reported in 1999 was settled in 1999, given that it was settled no later than 1999, is 1.

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 35

707

Multiplying the five likelihoods together, with a little simplification, we get L (p )   

1−p 1 − p2

!3

p − p2 1 − p2

!

p 3 (1 − p ) 11 (1 − p 2 ) 11

1−p 1 − p2

!5

p − p2 1 − p2

!2

p3 (1 + p ) 11

l ( p )  3 ln p − 11 ln (1 + p ) dl 3 11  − 0 dp p 1 + p 3 + 3p − 11p  0 pˆ 

3 8

(D)

That was the hard way to solve it. The easy way was to use the Bernoulli technique. We need two facts: 1. 2.

As discussed above, the claim reported in 1999 has no information. A geometric distribution has no memory at the integers. This means Pr ( N ≥ m | N ≥ k )  Pr ( N > m − k ) for m ≥ k.

With the given truncation, the claims settled in 1998 have the same likelihood regardless of whether they were reported in 1997 and 1998, and the same for claims settled in 1999. Thus there are only 2 possibilities: settled in 1998 and settled in 1999. With 2 ranges and 1 parameter, maximum likelihood gives each range the observed proportion. Thus the probability of settlement in year 0 given settlement in year 0 or 1 is 3+5 8 3 3+5+1+2  11 and the probability of settlement in year 1 given settlement in year 0 or 1 is 11 . However, Pr ( N  1 | N  0 or 1)  p Pr ( N  0 | N  0 or 1) , so p  this way, but the math was easy.

3/11 8/11



3 8

. A lot of insight was needed to solve it

35.16. Unlike in the previous exercise, the Bernoulli technique is unavailable because there are more than two possibilities for claim lag in the sample. Let β be the mean, and let p  β/ (1 + β ) . For L fol n lowing a geometric distribution with parameter β and n an integer, Pr ( L ≥ n )  β/ (1 + β )  p n ; see equation (11.1). Therefore, the likelihoods are Pr ( L  0)  1 − p

Pr ( L  1)  (1 − p ) p

Pr ( L > 0)  p

Pr ( L > 1)  p 2 Pr ( L > 2)  p 3 The likelihood function is

 5      (1 − p ) 15 (1 − p ) p p 6 ( p 2 ) 2 p 3  (1 − p ) 20 p 18

We know that whenever the likelihood function is of the form p a (1 − p ) b , the maximum will be a/ ( a + b ) ; see the discussion at the beginning of Section 33.5. So pˆ  18/38  9/19. The mean is pˆ  C/4 Study Manual—17th edition Copyright ©2014 ASM

βˆ

1 + βˆ

35. FITTING DISCRETE DISTRIBUTIONS

708

βˆ 

pˆ 9/19 9   1 − pˆ 10/19 10

35.17. The likelihood function, ignoring division by

Q

x i !, is

L ( λ )  e −50λ λ

P50

i1

xi

Logging and differentiating twice, l ( λ )  −50λ + *

50 X

x i + ln λ

, i1 P50

-

xi dl  −50 + i1 dλ λ P50 x d2 l i  − i12 dλ 2 λ The information matrix is the expected value of the negative of the second derivative. The expected value of x i is λ, so the information matrix is 50λ/λ 2  50/λ. If λ  20, this is 2.5 . Since the MLE is the sample mean, the variance of the MLE is the distribution variance over the size of the sample, or λ/50. The information matrix is the reciprocal of the variance. 35.18. Let the sample be x 1 , . . . , x50 . The probability of each item, dropping the constant 1/x i !, is e −λ λ x i , with logarithm −λ + x i ln λ, so the loglikelihood, which is the sum of these logarithms, is l ( λ )  −50λ + 50 x¯ ln λ

¯ 2 . The expected value of x¯ where we’ve used x¯  x i /50. The negative second derivative of this is 50x/λ is λ, so this is 50/λ. Since λ  2, the information matrix is 25 . Since the MLE is the sample mean, the variance of the MLE is the variance of the distribution divided by n. The variance of a Poisson is λ and n  50, so the variance of the MLE is λ/50. The inverse of this is 50/λ. This is the inverse of the true variance and the information matrix is the inverse of the asymptotic variance, so potentially they could be different, but to my knowledge, when the MLE is the sample mean, the information matrix is always equal to n divided by the distribution’s variance.

P

35.19. Let the sample be x 1 , . . . , x60 . The probability of each item is β x i / (1+β ) x i +1 , with logarithm x i ln β− ( x i + 1) ln (1 + β ) , so the loglikelihood, which is the sum of these expressions, is where we’ve used x¯ 

l ( β )  60x¯ ln β − 60 ( x¯ + 1) ln (1 + β )

P

x i /60. The negative second derivative of this is 60 x¯ 60 ( x¯ + 1) − β2 (1 + β ) 2

The expected value of x¯ is β, so this is 60 60 60 −   2 β 1 + β β (1 + β ) Since the MLE is the sample mean, the variance of the MLE is the variance of the distribution divided by n. The variance of is β (1 + β ) and n  60, so the variance of the MLE is β (1 + β ) /60. The  a geometric  inverse of this is 60/ β (1 + β ) . As in the previous exercise, the inverse of the true variance is equal to the information matrix. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 35

709

35.20. The maximum likelihood estimate of the expected value is the sample mean. The expected value of a zero-truncated Poisson is λ/ (1 − e −λ ) , and 0.54986/ (1 − e −0.54986 )  1.3. It follows that the sample mean is 1.3, or n  4 . 35.21. Maximum likelihood sets pˆ 0M equal to the proportion of observed 0’s, or 0.7. Then we match the

fitted mean to the sample mean, which is 16 (1) + 8 (2) + 6 (3) /100  0.5.



0.3



−0.5β  0.5 1 − (1 + β ) 0.5

!

−0.3β  1 − (1 + β ) 0.5

0.3β + 1  (1 + β ) 0.5 0.09β2 + 0.6β + 1  1 + β 0.09β − 0.4  0 40 βˆ  9 35.22. x¯ 

X

5,000  0.5 10,000

k 2 n k  7,500 σˆ 2 

7,500 − 0.52  0.5 10,000

Poisson (B) You could adjust the sample variance to make it unbiased by multiplying by a difference.

10,000 9,999 , but it hardly makes

35.23. The mean is x¯ 

100 (0) + 267 (1) + 311 (2) + 208 (3) + 87 (4) + 23 (5) + 4 (6) 2 1000

The second moment is µ02 

100 (0) + 267 (1) + 311 (4) + 208 (9) + 87 (16) + 23 (25) + 4 (36)  5.494 1000

The variance is 5.494 − 22  1.494. If you wish, you could multiply this by 1000 999 to obtain the unbiased sample variance, but it hardly makes a difference. The variance is significantly less than the mean, making the binomial the most appropriate distribution for this model of the three discrete distributions. The normal and gamma distributions are continuous and therefore inappropriate. Alternatively, you could use the ( a, b, 0) class criterion. Calculating nknk−1k , you get k

1

2

3

4

5

6

kn k n k−1

2.67

2.33

2.01

1.67

1.32

1.04

These ratios steadily decrease, implying a < 0 (in the ( a, b, 0) class expression ak + b, formula (35.1)) which implies the distribution is binomial. (A) C/4 Study Manual—17th edition Copyright ©2014 ASM

35. FITTING DISCRETE DISTRIBUTIONS

710

35.24. (i) k

1

2

3

4

kn k n k−1

0.7

1.43

1.8

2

Negative binomial (ii) x¯ 

64  1.2075 53

64 156 − σˆ  53 53 2

!2

 1.4853 > 1.2075

Negative binomial (D) Adjusting the sample variance by multiplying by

53 52

would make it even higher.

35.25. (i) k

1

2

3

4

kn k n k−1

0.746

0.75

0.74

0.757

Poisson (ii) 839  0.7425 1130 400 + 150 (22 ) + 37 (32 ) + 7 (42 ) 1445 µ02    1.2788 1130 1130 σˆ 2  1.2788 − 0.74252  0.7275 x¯ 

Binomial (B) If we calculate s 2  35.26. The mean is

1130 2 ˆ 1129 σ

x¯ 

 0.7281, it is closer to the mean, but still far enough to use a binomial.

32 (0) + 26 (1) + 12 (2) + 7 (3) + 4 (4) + 2 (5) + 1 (6)  1.2262 84

The second moment is µ02 

32 (0) + 26 (1) + 12 (4) + 7 (9) + 4 (16) + 2 (25) + 1 (36)  3.4167 84

The variance is 3.4167 − 1.22622  1.9131. If you wish, you could multiply this by 84 83 to obtain the unbiased sample variance, but it doesn’t make a difference. The variance is significantly greater than the mean, making the negative binomial the most appropriate distribution for this model of the three discrete distributions. Alternatively, you could use the ( a, b, 0) class criterion. Calculating nknk−1k , you get C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 35

711

k

1

2

3

4

5

6

kn k n k−1

0.8125

0.9231

1.75

2.2857

2.5

3

These ratios steadily increase, implying a > 0 (in the ( a, b, 0) class expression ak + b, formula (35.1)) which implies the distribution is negative binomial. (A) 35.27. The sample mean is 1, so condition (i) requires mq  1. Refer to page 555 for the definition of the smoothed empirical percentile. The 33rd smoothed empirical percentile is the order statistic determined from 0.33 (5 + 1)  2, so it is the second order statistic, the second lowest number in the sample, or 0. Thus we have two conditions on m: 1.

mq  1 and

2.

Pr ( N  0) ≥ 0.33.

Then

1 m ≥ 0.33 m This can be solved by trial and error. (As m → ∞, the expression goes to 1/e  0.3679). Alternatively,2 you can log the expression:   1 ≥ ln 0.33 m ln 1 − m and expand ln (1 − x ) in a Taylor series:



Pr ( N  0)  (1 − q ) m  1 −

ln (1 − x )  −

1+



x x2 x3 − − −··· 1 2 3

1 1 + · · · ≤ − ln 0.33  1.10867 2m 3m 2

Anything less than m  5 causes an overage already with just 1 + 1/2m, and m  5 gives you 1.1 when you add up two terms, but 1.1133 when you add up three terms 1 + 1/2m + 1/3m 2 , which is too high. That leaves 6 as the only possibility. 6 can then be verified directly as working. (E) 35.28. The maximum likelihood estimate is the mean, or 70 + 60 + 100 230   0.1022 500 + 750 + 1000 2250 35.29. When you think a little about it, you realize that X, age, plays the same role as exposure, and therefore we use the principle of Section 35.5. We know that the maximum likelihood estimator is the 9 mean, 163 . The variance of the mean is the variance of a single observation over the number in the sample, which here is 163 (remember, age is like exposure, so there are 163 exposures in the sample, not 5). Here, 9 . So the variance each observation is Poisson, so the variance of a single observation equals the mean, or 163 √ 2 of 163 observations is 9/163  0.000338741 and the standard deviation is 0.000338741  0.0184 . (B) 35.30. The variance of a mixture is always at least as large as the weighted average of the variances of the components, and usually greater. This follows from the conditional variance formula, equation (4.2) on page 64. If we let X be the mixed random variable and I the component of the mixture X belongs to, then f g   Var ( X )  EI VarX ( X | I ) + VarI EX [X | I] 2David Bassein showed me this method C/4 Study Manual—17th edition Copyright ©2014 ASM

712

35. FITTING DISCRETE DISTRIBUTIONS

and the weighted average of the variances is the first summand only. Since a binomial has a variance less than its mean, mixing two may make the variance equal to the mean, making (A) a possibility. The other mixtures wouldn’t work because in each case the variance would be greater than the mean.

Quiz Solutions 35-1. Since the estimator is the sample mean, the variance is λ/n, which can be estimated by using the fitted value of λ as 1/350 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 36

Hypothesis Tests: Graphic Comparison Reading: Loss Models Fourth Edition 16.1–16.3 With respect to fitting parametric models, keep the following in mind. When testing an alternative hypothesis, we do not want to accept it (and reject the null hypothesis) unless it is very unlikely that we would observe what we did if the null hypothesis were true. We make it hard to accept the alternative hypothesis, even though this means that we will often not reject the null hypothesis when the alternative hypothesis is true. The two hypotheses are not symmetric. In fitting parametric models, the null hypothesis is the model we fitted; the alternative hypothesis is that the parametric model is wrong. Loss Models 16.2 introduces the following notation, used in the rest of the chapter. Fn and f n continue to refer to the empirical distribution. The fitted distribution will now be denoted by F and f . F ∗ and f ∗ will be used to denote the fitted distribution adjusted. To make the fitted distribution consistent with the empirical distribution, it will be truncated when the empirical distribution is. In other words, if observed data are left-truncated at d, and if we let F and f refer to the unmodified distribution, F (x ) − F (d ) 1 − F (d ) f (x ) f ∗ (x )  1 − F (d )

F∗ (x ) 

(36.1)

No adjustment to F is made for censoring. In a censored distribution, we won’t have observations for values of x above the censored value since all we would know about such an observation is that it is above the censored value; thus it will not be possible to compare the fitted and actual distribution for x above the censored value.

36.1

D ( x ) plots

We can plot Fn and F ∗ , or f n and f ∗ , on the same graph and compare them. This can be done even with grouped data, in which case the histogram is used for f n and the ogive for Fn . To amplify differences, we can plot the function D ( x ) , defined by D ( x )  Fn ( x ) − F ∗ ( x ) . This plot is only for individual data.1 Example 36A You are given the following five data points: 1

2

4

9

14

You fit these to an exponential using maximum likelihood. Create a D ( x ) plot for this fit. Answer: The mean of the five points is 6, and maximum likelihood for an exponential is the same as method of moments (see item number 1 on page 601), so the fitted θ is 6. Thus we plot the empirical function minus 1 − e −x/6 , which is 1Loss Models, in Example 16.2, says this cannot be done for grouped data, but I don’t know why the ogive can’t be used for this purpose. C/4 Study Manual—17th edition Copyright ©2014 ASM

713

36. HYPOTHESIS TESTS: GRAPHIC COMPARISON

714

0.15 0.1 0.05 0 −0.05 −0.1 −0.15 2

4

6

8

10

12

14

16

18

Figure 36.1: D ( x ) plot for Example 36A

e −x/6 − 1 e −x/6 − 0.8 e −x/6 − 0.6 e −x/6 − 0.4 e −x/6 − 0.2 e −x/6 The plot is shown in Figure 36.1.

?

in the interval (0, 1) , in the interval (1, 2) , in the interval (2, 4) , in the interval (4, 9) , in the interval (9, 14) , and in the interval (14, ∞) .



Quiz 36-1 You are given the following five observations: 2

5

9

18

35

They are fitted to a single-parameter Pareto with θ  1 using maximum likelihood. Determine D (10) .

36.2

p–p plots

Another graph, useful only for individual data, is a p–p plot. Suppose the n observations, sorted, are x1 ≤ x2 ≤ · · · ≤ x n . The p–pplot plots the empirical distribution on the x axis against the fitted distribution  on the y axis—the points Fn ( x j ) , F ∗ ( x j ) . However, for this purpose, we define Fn ( x j )  j/ ( n + 1) for j  1, . . . , n instead of the usual definition Fn ( x j )  j/n. We divide by n + 1 instead of by n, since the expected value of Fn ( x j ) is j/ ( n + 1) . Thus the plot has an x axis ranging from 0 to 1 and a y axis ranging from 0 to 1. Horizontally, there is one point plotted at every multiple of 1/ ( n + 1) , except as noted in the next paragraph. If there is a tie among the x j ’s, we can either plot all the points anyway (the “y” value, F ∗ ( x j ) , will be the same for all of them, but the “x” value will be different) or use an average of the “x 00 values. Censored values do not get plotted. Example 36B (Same data as Example 36A) You are given the following five data points: 1 C/4 Study Manual—17th edition Copyright ©2014 ASM

2

4

9

14

36.2. p–p PLOTS

715

1

1

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

0

0

1/6

2/6

3/6

4/6

0

1

5/6

0

0.2

0.4

0.6

0.8

1

Figure 36.3: p – p plot for Example 36C

Figure 36.2: p – p plot for Example 36B

You fit these to an exponential using maximum likelihood. Create a p–p plot for this fit. Answer: The fitted values for each of the five observations are F ∗ ( x j )  1 − e −x j /6 : xj F∗ (x j )

1 0.15352

2 0.28347

4 0.48658

9 0.77687

14 0.90303

The plot is shown in Figure 36.2. The five points are drawn, and are connected with lines, although they don’t have to be; if there are a lot of observation points, they will have a pattern even without connecting the dots. The diagonal line from (0,0) to (1,1) is typically drawn for comparison. A perfect fit would have this line as its p–p plot. 

?

Quiz 36-2 You are given the following five observations: 2

5

9

18

35

They are fitted to a single-parameter Pareto with θ  1 and α  0.5. Determine the x and y coordinates of the point on a p–p plot for the fit corresponding to 9. For a good fit, the smoothed empirical percentiles and the fitted percentiles should be close. The graph should be close to a straight line from (0,0) to (1,1), as indicated in the answer to the previous example. If you connect consecutive points and the slope is less than 45◦ , this means that the fitted distribution is putting too little weight in that region; conversely, if the slope is greater than 45◦ , the fitted distribution is putting too much weight in that region. Example 36C A loss distribution is fit to a Pareto. The p–p plot for the fit is in Figure 36.3. In which interval(s) does the fit have too much weight? Answer: The slope of the fit is greater than 1 in the intervals (0, 0.1), (0.4, 0.6), and (0.9, 1.0). Therefore, the answer is (0, 0.1), (0.4, 0.6), (0.9, 1.0) . Notice that the weight is too high in (0, 0.1) and in (0.9, 1.0), even though you don’t plot the points (0,0) and (1,1). The fact that the p–p plot’s second coordinate is 0.15, rather than 0.1, when the first coordinate is 0.1 implies that too much weight was placed on the interval (0, 0.1). Don’t forget these intervals at the ends!  C/4 Study Manual—17th edition Copyright ©2014 ASM

36. HYPOTHESIS TESTS: GRAPHIC COMPARISON

716

Exercises 36.1.

The D ( x ) plot comparing the model to the data is shown in Figure 36.4.

In which of the following region(s) does the model put too much weight? I.

0–100

II.

100–200

III.

200–300

IV.

300–400

0.1

0.05

0

−0.05 −0.1 0

200

100

300

400

Figure 36.4: Use with exercise 36.1

36.2.

You are given the following 10 losses from an insurance coverage with a deductible of 10: 12

12

15

20

30

50

100

200

250

500

These observations are fitted to a Weibull with τ  2 by matching the Weibull’s median with the smoothed empirical median of the sample. Calculate D (30) . 36.3.

The following observed losses are fit to a uniform distribution on [0, 1000]: 70

82

125

210

330

515

690

800

870

980

Calculate the x and y coordinates in a p–p plot corresponding to the loss of 125.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 36

717

Use the following information for questions 36.4 and 36.5: You have the following observations for the loss size (before application of the deductible, but considering the limit) of a coverage with deductible 500 and policy limit 4500: 600

1200

1800

3000

5000

5000

5000

The observations of 5000 are for losses greater than 5000. 36.4. The ground-up losses are fit to an exponential with mean 4000. A D ( x ) plot is drawn to judge the goodness of the fit. Calculate D (4000) . 36.5. The ground-up losses are fit to an exponential using maximum likelihood. A D ( x ) plot is drawn to judge the goodness of the fit. Calculate D (4000) . 36.6.

[C-S05:5] You are given the following p–p plot: 1.0

F (x )

0.8 0.6 0.4 0.2 0.0 0.0

0.2

0.4

0.6

0.8

1.0

Fn (x ) The plot is based on the sample: 1

2

3

15

30

50

51

99

100

Determine the fitted model underlying the p–p plot. (A) (B) (C) (D) (E)

F ( x )  1 − x −0.25 , x ≥ 1 F ( x )  x/ (1 + x ) , x ≥ 0 Uniform on [1, 100] Exponential with mean 10 Normal with mean 40 and standard deviation 40

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

36. HYPOTHESIS TESTS: GRAPHIC COMPARISON

718

36.7. The following are observed payments on a coverage with deductible 100 and maximum covered claim 1000: 70

70

85

125

210

515

690

800

900

900

The ground-up loss distribution is fitted to a Pareto with parameters α  4, θ  2000. Calculate the x and y coordinates in a p–p plot corresponding to the payment of 125. 36.8.

For an insurance coverage, you have the following loss size data for four claims: 0.2

0.3

0.4

1.6

You fit a loglogistic distribution with parameters γ  2, θ  0.4 to this data. To determine the goodness of fit, you graph a p–p plot. Let ( x, y ) represent the coordinates of the points of this graph. Determine the y coordinate that corresponds to x  0.4. 36.9. An auto collision coverage has a deductible of 500 and a maximum covered loss of 10,000. The following payments have been made on this coverage: 100

150

150

200

250

400

920

3000

9500

9500

The payments of 9500 are for losses above the maximum covered loss. The loss distribution is fitted to a Weibull with τ  3, θ  1000. To evaluate this fit, you draw a p–p plot. Distinct points are used for the two points corresponding to payments of 150. Calculate the x and y coordinates that correspond to the payment of 150 with the lower x coordinate. 36.10. You are given the following p–p plot for a sample of 4 data points fitted to a Weibull distribution with τ  0.5, θ  1000. 1

0.75 F (x )

0.65

0.40

0.10 0

0

0.2

0.4

0.6

0.8

1

Fn (x ) Determine the four data points.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 36

719

36.11. [4-F01:6] The graph below shows a p–p plot of a fitted distribution compared to a sample. 1 0.9 0.8 0.7 0.6 Fitted

0.5 0.4 0.3 0.2 0.1 0

0.1

0

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Sample Which of the following is true? (A) (B) (C) (D) (E)

The tails of the fitted distribution are too thick on the left and on the right, and the fitted distribution has less probability around the median than the sample. The tails of the fitted distribution are too thick on the left and on the right, and the fitted distribution has more probability around the median than the sample. The tails of the fitted distribution are too thin on the left and on the right, and the fitted distribution has less probability around the median than the sample. The tails of the fitted distribution are too thin on the left and on the right, and the fitted distribution has more probability around the median than the sample. The tail of the fitted distribution is too thick on the left, too thin on the right, and the fitted distribution has less probability around the median than the sample.

36.12. For an insurance coverage with policy limit 100, you have the following six observations: 12

20

35

50

75

90

In addition, three observations are for losses greater than 100, for which 100 is paid. The data are fitted to a uniform distribution on [0, θ] using maximum likelihood. The fit is evaluated with a p–p plot. Determine the x and y coordinates corresponding to the observation 75.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

36. HYPOTHESIS TESTS: GRAPHIC COMPARISON

720

36.13. For an insurance coverage with policy limit 100, you have the following five observations: 15

25

50

75

100

The observation of 100 is for a loss with exact size 100. In addition, two observations are for losses greater than 100, for which 100 is paid. The data are fitted to a uniform distribution on [0, 200]. The fit is evaluated with a p–p plot. Determine the x and y coordinates corresponding to the observation 100. 36.14. For an insurance coverage with deductible 100, you have the following observations of payments: 200

1000

2000

3000

20000

The ground-up losses are assumed to follow a two-parameter Pareto distribution with α  3 and θ  15,000. A p–p plot of the fit is drawn. Determine the x and y coordinates corresponding to the payment 3000. Additional released exam questions: C-F05:31

Solutions 36.1. Keep in mind that D ( x )  Fn ( x ) − F ∗ ( x ) , so whenever the model is to heavy with F ∗ ( x ) increasing more rapidly than Fn ( x ) , D ( x ) will decrease. The graph declines from 0 to −0.05 in 0–100, so the model is too heavy there. It is flat in 100–200 (starting at −0.05 and ending at −0.05) and rises in 200–300 (starting at −0.05 and ending at 0.01), but then declines from 0.01 to 0 in 300–400, so once again the model is too heavy there. So the answer is I and IV . 36.2.

The truncated Weibull has survival function 2

e − ( x/θ ) S (x )  e − (10/θ ) 2 ∗

The smoothed empirical median is 40. Matching medians, e (10

2 −402 ) /θ 2

 0.5

−1500  θ 2 ln 0.5 θˆ 

We can now calculate D (30) .

p

−1500/ ln 0.5  46.5193 2 −302 ) /46.51932

D (30)  0.5 − (1 − e (10 36.3.

)  0.1910

The x coordinate is Fn (125) and the y coordinate is F ∗ (125) . Fn (125)  F (125)  ∗

The answer is



3 1 , 11 8



C/4 Study Manual—17th edition Copyright ©2014 ASM

3 11 where 125 1 1000  8

the denominator is n + 1 for this purpose

EXERCISE SOLUTIONS FOR LESSON 36

36.4.

721

The straightforward method is F7 (4000)  F ∗ (4000)  D (4000) 

4 7

(1 − e −4000/4000 ) − (1 − e −500/4000 )  0.58314 1 − (1 − e −500/4000 )

4 7

− 0.58314  −0.01171

A shortcut for calculating F ∗ (4000) is to use the fact that it is an exponential and therefore memoryless. If we let F (without an asterisk) represent an exponential not truncated at 500, then F ∗ (4000)  F (3500)  1 − e −3500/4000  0.58314.

36.5. The exposure is 100 + 700 + 1300 + 2500 + 3 (4500)  18,100. There are four losses. So the exponential parameter is 18,100/4  4525. F ∗ (4000)  D (4000) 

(1 − e −4000/4525 ) − (1 − e −500/4525 )  0.53860. 1 − (1 − e −500/4525 )

4 − 0.53860  0.03283 7

As in the last exercise, we can use the fact that the exponential is memoryless to simplify the calculation of F ∗ (4000)  F (3500)  1 − e −3500/4525  0.53860.

36.6. Since there are n  9 points, the empirical smoothed percentiles for the 9 observations are i/ ( n +1) , or 0.1 for 1, 0.2 for 2, 0.3 for 3, 0.4 for 15, . . . , 0.9 for 100. The points of the p–p plot are ( Fn ( x ) , F ( x )) , so the correspondences are: x

Fn ( x )

F (x )

1 2 3 15 30 50 51 99 100

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

F (1)  0 F (2) ≈ 0.2 F (3) ≈ 0.25 F (15) ≈ 0.55 F (30) ≈ 0.6 F (50) F (51) F (99) ≈ 0.75 F (100) ≈ 0.75

The approximate values of F ( x ) given in the above table are taken from the plot. The fact that F (1)  0 eliminates (B), (D), and (E). The fact that F (100) , 1 eliminates (C). Calculating 1 − x −0.25 for a couple of observations confirms that (A) is the answer. For comparison, the p–p plots for the other choices are shown in Figure 36.5. Considering how different these plots are from the one in the question, the question should have been easy! 36.7.

A payment of 125 corresponds to a loss of 225, so we shall calculate Fn (225) and F ∗ (225) . Fn (225) 

4 11

 0.3636 where the denominator is n + 1 for this purpose

F (225) − F (100) F (225)   1 − F (100) ∗

The answer is (0.3636, 0.2065) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

2000 4 2100

2000 4 2225 2000 4 2100



 0.2065

36. HYPOTHESIS TESTS: GRAPHIC COMPARISON

722

1.0

1.0

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

0.0 0.0

0.2

0.4

0.6

0.8

(a) Choice (B): F ( x )  x/ (1 + x )

0.0 0.0

1.0

1.0

1.0

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

0.0 0.0

0.2

0.4

0.6

0.8

(c) Choice (D): Exponential with mean 10

0.0 0.0

1.0

0.2

0.4

0.6

0.8

1.0

0.2

0.4

0.6

0.8

1.0

(b) Choice (C): Uniform on [1, 100]

(d) Choice (E): Normal with mean 40 and standard deviation 40

Figure 36.5: p – p plots for wrong choices in exercise 36.6 j

j

36.8. The x coordinates for the 4 points are n+1  5 , j  1, 2, 3, 4. 0.4  observation point 0.3. The fitted cumulative distribution function at 0.3 is F (0.3) 

2 5

corresponds to the second

(0.3/θ ) γ (0.3/0.4) 2 9/16 9     0.36 . γ 1 + (0.3/θ ) 1 + (0.3/0.4) 2 1 + 9/16 25

36.9. A payment of 150 corresponds to a loss of 650. Thus we need Fn (650) , F ∗ (650) . We want the lower x coordinate, hence need to look at the second payment in sequence.





2  0.1818 11 3 3 e − (500/1000) − e − (650/1000) ∗ F (650)   0.1390 e − (500/1000) 3

Fn (650) 

The answer is (0.1818, 0.1390) . 36.10. The y coordinates provide F ∗ ( x i ) for the four observations, so we need ( F ∗ ) −1 ( y i ) of the y-coordinates. We can look this up in the tables you are given on the exam by looking up VaR for a Weibull, since

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 36

723

−1 VaRp ( X ) is the 100p th percentile of X, or FX ( p ) . The formula is

F −1 ( p )  θ − ln (1 − p )



 1/τ

 1000 − ln (1 − p )



2

for p  0.1, 0.4, 0.65, 0.75. F −1 (0.1)  1000 (− ln 0.9) 2  11

F −1 (0.4)  1000 (− ln 0.6) 2  261

F −1 (0.65)  1000 (− ln 0.35) 2  1102 F −1 (0.75)  1000 (− ln 0.25) 2  1922 36.11. The slope of the fit is greater than 1 near 0 and less than 1 near 0.5 and 1, so it is too thick on the left and too thin on the right and in the middle. (E) The question used the term “median” to mean “sample median”, so the relevant slope is around where the x axis is 0.5. However, if “median” means the median of the fitted function, then it refers to the slope around the where the y-axis is 0.5, and the fitted distribution has more probability than the sample in that area.2 36.12. The x coordinate is the sequence number divided by one more than the number of observations, including the censored ones, or x  5/10  0.5 . The maximum likelihood estimator of θ is the censoring point (which is 100 here) times the total number of losses divided by the number of losses below the censoring point, or 100 (9/6)  150, as discussed in Subsection 33.4.1. Therefore, the y coordinate is the distribution function at 75, or y  F ∗ (75)  75/150  0.5 . 36.13. The x coordinate is the sequence number divided by one more than the number of observations, including the censored ones, or x  5/8  0.625 . The y coordinate is y  F ∗ (100)  100/200  0.5 . 36.14. The x coordinate is the sequence number divided by the number of observations plus 1, or x  4/6  2/3 . The y coordinate is the fitted distribution function adjusted for truncation at 100, or y  F ∗ (3100) 

F (3100) − F (100) 1 − F (100)

For a Pareto, the shortcut is that a truncated Pareto is a Pareto with the truncation added to θ, so F ∗ (3100) can be evaluated from a Pareto with α  3, θ  15,100 evaluated at 3000: 15,100 y 1− 15,100 + 3000

2This was pointed out to me by Simon Schurr. C/4 Study Manual—17th edition Copyright ©2014 ASM

!3  0.4194

36. HYPOTHESIS TESTS: GRAPHIC COMPARISON

724

Quiz Solutions 36-1.

Using the formula in Section 33.4.2, the maximum likelihood estimate of α is

Then D (10) is

1 K  ln Q  − ln 2 · 5 · 9 · 18 · 35  − ln 56,700  −10.9455 xi 5  0.456808 αˆ  − −10.9455 1 D (10)  F5 (10) − F (10)  0.6 − *1 − 10 , ∗

36-2.

F ∗ (9)  1 − (1/9) 0.5  2/3, so the point is (1/2, 2/3) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

! 0.456808 +  −0.0507 -

Lesson 37

Hypothesis Tests: Kolmogorov-Smirnov Reading: Loss Models Fourth Edition 16.4.1 This topic appears on about one out of every two exams. In the following, we continue to use the notation F ∗ to indicate the fitted function adjusted for truncation. See the discussion of equations (36.1) on page 713 for the definition of F ∗ .

37.1

Individual data

The Kolmogorov-Smirnov statistic D is the maximum difference, in absolute value, between the empirical and fitted distributions, max |Fn ( x ) − F ∗ ( x; θˆ ) |, t ≤ x ≤ u, where t is the lower truncation point (0 if none) and u is the upper censoring point (∞ if none).1 You need individual data to use this test.2 To evaluate D, assuming the observation points are sorted,3 x1 ≤ x2 ≤ · · · ≤ x n , at each observation point x j , take

the maximum of |F ∗ ( x j ) −

j n|

and |F ∗ ( x j ) −

j−1 n |.

If there is a tie between observations, you only have to

do this once for the tied observations; if x j  x j+1 , for example, take the maximum of |F ∗ ( x j ) − j−1 n |.

Take the maximum of these maxima over all j. j) − Here is an example.

|F ∗ ( x

j+1 n |

and

Example 37A The amount of waiting time in a supermarket line for 5 customers, in minutes, is as follows: 0.5

1.0

1.0

1.3

5.0

You fit this data to a Pareto distribution with parameters α  1, θ  1. Calculate the Kolmogorov-Smirnov statistic to test this fit. Answer: The fitted curve has the following distribution function: F∗ (x )  1 − The empirical distribution F5 ( x ) increases by

1 5

1 1+x

x≥0

at every observation and therefore is the following:

Range of x

F5 ( x )

[0, 0.5) [0.5, 1) [1, 1.3) [1.3, 5)

0 0.2 0.6 0.8

The empirical function is graphed as a dashed line and the fitted function as a solid line in Figure 37.1. We must find the largest difference between the two functions, or the largest distance between the two lines. It is clear from the graph that this occurs at one of the observation points. 1The letter D is related to the D ( x ) plot; D  max |D ( x ) |. 2In a footnote, the textbook mentions a paper discussing modifying this test for use with grouped data. 3In this lesson, we’re going to assume x1 ≤ x2 ≤ · · · ≤ x n , rather than use a special notation like y j or x ( j ) for order statistics. C/4 Study Manual—17th edition Copyright ©2014 ASM

725

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

726

(5.0,1.0)

1 (1.3,0.8)

0.8

0.6

(5.0,0.8333)

(1.3,0.5652)

(1.0,0.5) 0.4 (0.5,0.3333) 0.2

LEGEND Fitted Empirical

(1.0,0.2)

(0.5,0)

2

1

3

4

5

Figure 37.1: Calculation of Kolmogorov-Smirnov statistic in Example 37A

At x  0.5, the largest distance is from the lower empirical point, or the value of F5 (0.5− )  0, to the fitted curve; the distance is 0.3333. At x  1, the curve is between F5 (1− ) and F5 (1) ; the larger distance is to F5 (1− ) , and it is 0.3. At x  1.3, the curve is below both F5 (1.3− ) and F5 (1.3) , so the larger distance is to F5 (1.3) , and it is 0.2348. At x  5, the curve is between F5 (5− ) and F5 (5) , but unlike at x  1, the larger distance is now to the upper point, F5 (5) , and it is 0.1667. In the graph, the largest distance at each observation point is shown by an arrow and large dots on the corresponding points of the empirical and the fitted functions. Here is a table summarizing the calculations: xj

F∗ (x j )

Fn ( x −j )

Fn ( x j )

Largest Difference

0.5 1.0 1.3 5.0

0.3333 0.5 0.5652 0.8333

0 0.2 0.6 0.8

0.2 0.6 0.8 1.0

0.3333 0.3 0.2348 0.1667

The largest difference is at x  0.5, and it is D  0.3333 , which is the Kolmogorov-Smirnov statistic.

C/4 Study Manual—17th edition Copyright ©2014 ASM

37.1. INDIVIDUAL DATA

727

Calculator Tip The TI-30XS/B Multiview calculator may be useful to quickly calculate the distribution function at the observation points. You can then compare them to the empirical distribution by visual inspection. This is probably adequate. However, if you want to use the calculator to compare the fitted and empirical functions, you should enter each observation point twice, and enter the empirical distribution function right before the point and at the point on the two rows. The steps are then: 1. Enter the observations in column 1, with each unique one entered twice. 2. Enter a formula for the fitted function in column 2. 3. Overtype an entry in column 2 in order to paste values. 4. Enter 0, 1/n, 1/n, 2/n, 2/n, . . . ( n − 1) /n, ( n − 1) /n, 1 in column 3. (If there are multiple observation values, skip the inapplicable values.) 5. In column 1, enter a formula for the absolute value of the difference between columns 2 and 3, namely the square root of the square difference. 6. Check the statistics registers for the maximum, or find the maximum by inspection. Here is how this algorithm would be carried out in Example 37A. In the paste values step, 0.5 was overtyped rather than 0.3333 because it is shorter and exact. Clear table

Enter observations in column 1

Enter formula for F ∗ ( x j ) in column 2

Paste values in column 2

Enter empirical distribution in column 3

data data 4

.5

s% .5 s% 1 s% 1 s% 1.3 s% 1.3 s% 5 s% 5 enter

t% data t% 1

1 − 1 ÷ ( 1 + data 1 ) enter

s% s% .5 enter t% 0 s% .2 s% .2 s% .6 s% .6 s% .8 s% .8 s% 1 enter

L1

L3

L2

L3

L1(1)= L1 1.2 5 5 L1(9)= L1 L2 L3 0.5 0.3333 0.5 0.3333 1 0.5 1 0.5 L2(1)=0.333333333.. L1 L2 L3 0.5 0.3333 0.5 0.3333 1 0.5 1 0.5 L2(4)=0.5 L1 1.3 5 5 L3(9)=

C/4 Study Manual—17th edition Copyright ©2014 ASM

L2

L2 0.5652 0.8333 0.8333

L3 0.8 0.8 1

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

728

Calculator Tip

t%

t%

Enter formula for absolute difference in column 1

data 1 √ 2nd [ ] ( data 2 − data 3 ) x 2 ) enter

Check statistics

2nd [stat]1 (choose L1 for data)

L1 L2 L3 0.3333 0.3333 0 0.1333 0.3333 0.2 0.3 0.5 0.2 0.1 0.5 0.6 L1(1)=0.333333333..

s% s% enter q%

1-Var:L1,1 9↑Med=0.15 A:Q3=0.267391304 B:maxX=0.3333333

Statistic B is the maximum. The use of the letter D for the Kolmogorov-Smirnov statistic suggests a relationship to the D ( x ) plot. In fact, the Kolmogorov-Smirnov statistic is the maximum of the absolute value of D ( x ) in that plot. Example 37B Consider the following figure, which is the same as Figure 36.1 and is based on the data in Example 36A:

0.15 0.1 0.05 0 −0.05 −0.1 −0.15 2

4

6

8

10

12

14

16

18

Determine the Kolmogorov-Smirnov statistic for this fit. Answer: The largest absolute difference from 0 occurs at x  9. From the information in the example and its answer, the value of D ( x ) at x  9 at the bottom of the line is e −9/6 − 0.4, and |e −9/6 − 0.4|  0.17687 . Here is an example of the calculation of the Kolmogorov-Smirnov statistic from an old CAS 4B exam. Example 37C [4B-S90:49] (2 points) The observations 1.7, 1.6, 1.6, and 1.9 are taken from a random sample. You wish to test the goodness of fit of a distribution with probability density function given by f ( x )  x/2 for 0 ≤ x ≤ 2. You are given the following table of critical values for the Kolmogorov-Smirnov statistic: α c C/4 Study Manual—17th edition Copyright ©2014 ASM

0.10 √ 1.22/ n

0.01 √ 1.63/ n

37.1. INDIVIDUAL DATA

729

Which of the following should you do? (A) Accept at both levels (B) Accept at the 0.01 level but reject at the 0.10 level (C) Accept at the 0.10 level but reject at the 0.01 level (D) Reject at both levels (E) Cannot be determined Answer: We need the distribution function to calculate the Kolmogorov-Smirnov statistic, so we integrate the given density function. x

Z F (x )   

u du 2

0≤x≤2

0 x u2



4 0

x2 4

We set up a table of values of the fitted and empirical distribution functions at each observation point. xj

F∗ (x j )

Fn ( x −j )

Fn ( x j )

Largest Difference

1.6 1.7 1.9

0.64 0.7225 0.9025

0 0.50 0.75

0.50 0.75 1.00

0.64 0.2225 0.1525

√ The Kolmogorov-Smirnov statistic, the largest difference, is D  0.64. √ We multiply by n so that we can compare to the value of the numerator of c (1.22 and 1.63). 0.64 4  1.28, which is above 1.22 but below 1.63. Therefore, accept at 0.01, reject at 0.1 (B) 

?

Quiz 37-1 You are given the following sample of 8 observations: 5

10

20

37

52

65

92

100

These observations are fitted to a uniform distribution on [0, 100]. Calculate the Kolmogorov-Smirnov statistic for this fit. √ √ The critical values of the Kolmogorov-Smirnov statistic are 1.22/ n for α  0.10, 1.36/ n for α  0.05, √ and 1.63/ n for α  0.01. Note that the critical values vary with n. Do not memorize these values—you will √ get them on the exam if you need them. Just remember that the critical value is inversely proportional to n, so given a fixed test result, the higher the n, the more likely you’ll reject the model. Some comments about these critical values: • They should be smaller if u, the right censoring point, is less than ∞.

• This table of Kolmogorov-Smirnov critical values assumes that the distribution is completely specified. If parameters have been fitted, different tables are needed depending on which distribution was fitted; this complicated situation is beyond the scope of the course. The textbook suggests splitting the data into halves; use one half to estimate the parameters, and then perform the KolmogorovSmirnov test on the other half of the data.

C/4 Study Manual—17th edition Copyright ©2014 ASM

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

730

The Kolmogorov-Smirnov test is only appropriate for continuous distributions, not discrete distributions. This fact is stated in the second paragraph of Loss Models 16.4.1. As in the previous example, to make the exam question a bit more challenging, you may be given the density function and asked to calculate the Kolmogorov-Smirnov statistic. Don’t forget to first obtain the distribution function before calculating the Kolmogorov-Smirnov statistic. When data are censored, you need to check the difference between fitted and actual at the censoring point. Consider the following example: Example 37D For an insurance coverage with a policy limit of 5000, you have the following observations: 2000 4000 5000 5000 The ground-up loss distribution is assumed to be a uniform distribution on (0, 10,000] Calculate the Kolmogorov-Smirnov statistic: 1. Assuming that the two observations of 5000 are exact. 2. Assuming that the two observations of 5000 are censored. Answer: 1. If the two observations of 5000 are exact, then F4 (5000)  1. The following table computes the K-S statistic: xj

F∗ (x j )

F4 ( x −j )

F4 ( x j )

Largest Difference

2000 4000 5000

0.2 0.4 0.5

0 0.25 0.50

0.25 0.50 1.00

0.20 0.15 0.50

2. If the two observations of 5000 are censored, then F4 (5000)  F4 (5000− )  0.5. xj

F∗ (x j )

F4 ( x −j )

F4 ( x j )

Largest Difference

2000 0.2 0 0.25 0.20 4000 0.4 0.25 0.50 0.15 5000 0.5 0.50 0 See exercise 37.9 for an example of calculating the Kolmogorov-Smirnov statistic for data that is censored and truncated. A summary of the main characteristics of the Kolmogorov-Smirnov statistic is given in Table 39.1 on page 767.

37.2

Grouped data

You’re probably safe skipping this section. The exact Kolmogorov-Smirnov statistic cannot be computed for grouped data. However, exam questions before 2000 used to present grouped data, and ask you to calculate the maximum and minimum possible values for the Kolmogorov-Smirnov statistic. No such question has appeared on released exams since spring 2000, so perhaps they’ve abandoned this style of question. But if you want to know, here’s how to do it. To calculate the minimum possible value is easy. Compare the fitted distribution to the empirical distribution at the interval endpoints (other than the first and the last) only, and take the maximum absolute difference. The Kolmogorov-Smirnov statistic can be no smaller, and there is always a way to arrange the C/4 Study Manual—17th edition Copyright ©2014 ASM

37.2. GROUPED DATA

731

grouped data so that it is no larger. Even though you’re calculating the minimum Kolmogorov-Smirnov statistic, you must take the maximum of the differences, since by definition the Kolmogorov-Smirnov statistic is the maximum difference between the fitted and the empirical distributions. Although we arrange the empirical data to minimize the differences as much as we can, we still have to use the largest of the differences we’re left with. To calculate the maximum possible value is trickier. The two most extreme situations for grouped data in an interval [c j−1 , c j ) are that all the losses in the interval are for the amount c j−1 and that all the losses in the interval are for the amount c j − ε. In either case, Fn is constant throughout the interval. Since the fitted distribution F ∗ is monotonically non-decreasing, if F ∗ ( c j−1 ) < Fn∗ ( c j−1 ) , the difference will decline as x increases from c j−1 to c j , so the worst case (the case with the biggest difference) is the case where all the losses are at the beginning of the interval, which means you have to consider Fn ( c j ) − F ∗ ( c j−1 ) . If F ∗ ( c j−1 ) > Fn ( c j−1 ) , the difference will increase as x increases from c j−1 to c j . The worst case is then where all the losses are at the end of the interval, and you have to consider F ∗ ( c j ) − Fn ( c j−1 ) . So you have to consider both of these differences at each endpoint c j . To do this as simply as possible, do the following: 1. List the interval endpoints in a three-column table: c j , Fn ( c j ) starting at 0, and F ∗ ( c j ) starting at 0. 2. For each c j starting with c 1 , compute the absolute difference |Fn ( c j ) − F ∗ ( c j−1 ) |. In other words, compare column 2 row j to column 3 row j − 1. 3. For each c j starting with c 1 , compute the absolute difference |F ∗ ( c j ) − Fn ( c j−1 ) |. In other words, compare column 3 row j to column 2 row j − 1. 4. The maximum of the numbers computed in the previous 2 steps is the maximum KolmogorovSmirnov statistic. Note that the minimum Kolmogorov-Smirnov statistic is the maximum of the absolute differences |Fn ( c j ) − F ∗ ( c j ) |. Here’s an example. Example 37E You are given the following data on claim sizes: Claim Size

Number of Claims

0–1000 1000–2000 2000–3000 3000–4000

15 50 20 15

You hypothesize that claim sizes follow a uniform distribution on (0, 4000). Although the exact value of the Kolmogorov-Smirnov statistic cannot be calculated, bounds can be placed on it. Determine the minimum and maximum possible values of the Kolmogorov-Smirnov statistic. Answer: We calculate Fn at the endpoints: cj 0 1000 2000 3000 4000

Fn ( c j )

F∗ (c j )

0 0.15 0.65 0.85 1.00

0 0.25 0.50 0.75 1.00

|Fn ( c j ) − F ∗ ( c j ) | 0 0.1 0.15 0.10 0

|F ∗ ( c j ) − Fn ( c j−1 ) | 0.25 0.35 0.10 0.15

|Fn ( c j ) − F ∗ ( c j−1 ) | 0.15 0.40 0.35 0.25

The fitted uniform distribution is shown as a solid line in Figure 37.2. Diamonds in the graph represent C/4 Study Manual—17th edition Copyright ©2014 ASM

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

732

1 (3000, 0.85)

0.8 (2000, 0.65)

0.6 0.4 0.2

LEGEND Fitted Minimum empirical Maximum empirical

(1000, 0.15)

1000

2000

3000

4000

Figure 37.2: Calculation of Kolmogorov-Smirnov statistic in Example 37E

empirical observations. These are the only empirical points we are sure of. The minimum KolmogorovSmirnov statistic is based on the maximum (not the minimum!) distance of these points from the fitted curve, since we can always arrange an empirical function which goes through these points and is never any further away from the fitted curve. Therefore, the minimum Kolmogorov-Smirnov statistic is the maximum of the fourth column, or 0.15 . The two extreme possibilities for the empirical function are shown by the dashed line and the dotted line. At x  1000, for example, the empirical function can be as little as 0 (the dashed line goes through (1000, 0) or as high as 0.65 (the dotted line goes through (1000, 0.65) . At 2000, the empirical function can be as low as 0.15 or as high as 0.85. For this reason, every value of the fitted function at an observation point has to be compared to Fn of the previous observation point and to Fn of the following observation point. The maximum Kolmogorov-Smirnov statistic is therefore the maximum of the fifth and sixth columns of the table, or 0.40 . 

Exercises 37.1. [4B-F93:10] (3 points) A random sample of 5 claims x 1 , . . . , x5 is taken from the probability density function f ( x i )  αλ α ( λ + x i ) −α−1 , α, λ, x i > 0 In ascending order the observations are: 43, 145, 233, 396, 775 Suppose the parameters are α  1.0 and λ  400. Determine the Kolmogorov-Smirnov statistic for the fitted distribution. (A) (B) (C) (D) (E)

Less than 0.050 At least 0.050, but less than 0.140 At least 0.140, but less than 0.230 At least 0.230, but less than 0.320 At least 0.320

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 37

733

37.2. [4B-S95:11] (2 points) Given the sample 0.1, 0.4, 0.8, 0.8, 0.9 you wish to test the goodness of fit of the distribution with a probability density function given by f (x ) 

1 + 2x , 2

0 ≤ x ≤ 1.

Determine the Kolmogorov-Smirnov goodness of fit statistic. (A) (B) (C) (D) (E)

Less than 0.15 At least 0.15, but less than 0.20 At least 0.20, but less than 0.25 At least 0.25, but less than 0.30 At least 0.30

Use the following information for questions 37.3 and 37.4: You are given the following: •

A random sample of 20 observations of a random variable X yields the following values: 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0



The null hypothesis, H0 , is that X has a uniform distribution on the interval [0, 20].

37.3. (A) (B) (C) (D) (E) 37.4.

[4B-F95:9] (2 points) Determine the value of the Kolmogorov-Smirnov statistic used to test H0 . Less than 0.075 At least 0.075 but less than 0.125 At least 0.125 but less than 0.175 At least 0.175 but less than 0.225 At least 0.225 [4B-F95:10] (1 point) Use the following table to determine which of the following statements is true. a (significance level) c (critical value)

(A) (B) (C) (D) (E)

0.20 √ 1.07/ n

0.10 √ 1.22/ n

0.05 √ 1.36/ n

0.01 √ 1.63/ n

H0 will be rejected at the 0.01 significance level. H0 will be rejected at the 0.05 significance level, but not at the 0.01 level. H0 will be rejected at the 0.10 significance level, but not at the 0.05 level. H0 will be rejected at the 0.20 significance level, but not at the 0.10 level. H0 will be accepted at the 0.20 significance level.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

734

Use the following information for questions 37.5 and 37.6: You are given the following: •

Claim sizes follow a lognormal distribution with parameters µ and σ.



A random sample of five claims yields the values 0.1, 0.5, 1.0, 2.0, and 10.0 (in thousands).

37.5.

[4B-F98:3] (2 points) Determine the maximum likelihood estimate of σ.

(A) (B) (C) (D) (E)

Less than 1.6 At least 1.6, but less than 1.8 At least 1.8, but less than 2.0 At least 2.0, but less than 2.2 At least 2.2

37.6. [4B-F98:4] (2 points) Determine the value of the Kolmogorov-Smirnov statistic using the maximum likelihood estimates. (A) (B) (C) (D) (E)

Less than 0.07 At least 0.07, but less than 0.09 At least 0.09, but less than 0.11 At least 0.11, but less than 0.13 At least 0.13

37.7. [160-S87:5] Two lives are observed beginning at time t  0. One dies at t1  5 and the other dies at t2  9. The survival function S ( t )  1 − ( t/10) is hypothesized. Calculate the Kolmogorov-Smirnov statistic.

(A) 0.4 37.8.

(B) 0.5

(C) 0.9

(D) 1.3

(E) 1.4

[160-S90:17] From a laboratory study of nine lives, you are given:

(i) The times of death are 1, 2, 4, 5, 5, 7, 8, 9, 9 (ii) It has been hypothesized that the underlying distribution is uniform with ω  11. Calculate the Kolmogorov-Smirnov statistic for the hypothesis. 37.9. You observe the following seven losses on a coverage with deductible 500 and maximum covered loss 10,000: 620

800

1250

1250

2000

2500

4000

In addition, you observe two losses above 10,000, for which payments of 9,500 were made. You fit these losses to a two-parameter Pareto with θ  1000, α  1. Calculate the Kolmogorov-Smirnov statistic to test the fit.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 37

735

37.10. [4-S00:11] The size of a claim for an individual insured follows an inverse exponential distribution with the following probability density function: f (x | θ) 

θe −θ/x , x2

x>0

The parameter θ has a prior distribution with the following probability density function: e −θ/4 , 4

g (θ) 

θ>0

For a particular insured, the following five claims are observed: 1

2

3

5

13

Determine the value of the Kolmogorov-Smirnov statistic to test the goodness of fit of f ( x | θ  2) .

(A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20

37.11. [4-S01:12] You are given the following random observations: 0.1

0.2

0.5

1.0

1.3

You test whether the sample comes from a distribution with probability density function: f (x ) 

2

(1 + x ) 3

,x>0

Calculate the Kolmogorov-Smirnov statistic. (A) 0.01

(B) 0.06

(C) 0.12

(D) 0.17

(E) 0.19

37.12. [4-F02:17] You are given: (i)

A sample of claim payments is: 29

64

90

135

182

(ii) Claim sizes are assumed to follow an exponential distribution. (iii) The mean of the exponential distribution is estimated using the method of moments. Calculate the value of the Kolmogorov-Smirnov test statistic. (A) 0.14

(B) 0.16

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.19

(D) 0.25

(E) 0.27

Exercises continue on the next page . . .

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

736

37.13. [4-F04:38] You are given a random sample of observations: 0.1

0.2

0.5

0.7

1.3

You test the hypothesis that the probability density function is: f (x ) 

4 , (1 + x ) 5

x>0

Calculate the Kolmogorov-Smirnov test statistic. (A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.15 At least 0.15, but less than 0.25 At least 0.25, but less than 0.35 At least 0.35

37.14. [C-S05:1] You are given: (i)

A random sample of five observations from a population is: 0.2

(ii)

0.7

1.1

1.3

You use the Kolmogorov-Smirnov test for testing the null hypothesis, H0 , that the probability density function for the population is: f (x ) 

(iii)

0.9

4 , (1 + x ) 5

x>0

Critical values for the Kolmogorov-Smirnov test are: Level of Significance

0.10

0.05

0.025

0.01

Critical Value

1.22 √ n

1.36 √ n

1.48 √ n

1.63 √ n

Determine the result of the test. (A) (B) (C) (D) (E)

Do not reject H0 at the 0.10 significance level. Reject H0 at the 0.10 significance level, but not at the 0.05 significance level. Reject H0 at the 0.05 significance level, but not at the 0.025 significance level. Reject H0 at the 0.025 significance level, but not at the 0.01 significance level. Reject H0 at the 0.01 significance level.

37.15. You are given a random sample of five observations. Four of them are 1

3

6

15

You test the hypothesis that the underlying distribution is uniform on [0, 20]. The Kolmogorov-Smirnov test statistic is 0.3. Determine all possible values for the fifth observation.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 37

737

37.16. For an insurance coverage with policy limit 1000, you are given: (i) Three observed losses are for 200, 300, and 600. (ii) Two additional losses are higher than the limit. (iii) You use the Kolmogorov-Smirnov test for testing the null hypothesis, H0 , that the probability density function for the population is: f (x ) 

4 (50004 ) , (5000 + x ) 5

x>0

Determine the value of the Kolmogorov-Smirnov test statistic. 37.17. For an insurance coverage with policy limit 10,000, you are given: (i) Five observed losses are 5000, 5500, 5500, 6000, 7500. (ii) Three losses are over the limit. (iii) You use the Kolmogorov-Smirnov test for testing the null hypothesis, H0 , that the underlying distribution for losses is a single-parameter Pareto with θ  5000 and α  3. Determine the value of the Kolmogorov-Smirnov test statistic. 37.18. For an insurance coverage with deductible 500, you are given: (i) Losses below the deductible are not reported. (ii) Five observed losses are 600, 1200, 1500, 2500, 8000. (iii) You test whether the underlying ground-up loss distribution has probability density function: f (x ) 

e −x/2000 , 2000

x>0

Calculate the Kolmogorov-Smirnov test statistic. 37.19. For an insurance coverage with deductible 500, you are given: (i) Losses below the deductible are not reported. (ii) Four observed losses are 700, 1000, 2000, 4500 (iii) You test whether the underlying ground-up loss distribution has probability density function: f (x ) 

1000e −1000/x , x2

x>0

Calculate the Kolmogorov-Smirnov test statistic. 37.20. For an insurance coverage with deductible 1000 and maximum covered loss 3000, you are given: (i) Losses below the deductible are not reported. (ii) Six observed losses are 1500, 1500, 1700, 2000, 2500, 3000. (iii) Four losses are above the maximum covered loss. (iv) You test whether ground-up losses follow a uniform distribution on (0, 5000]. Calculate the Kolmogorov-Smirnov test statistic.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

738

37.21. For an insurance coverage with deductible 1000 and maximum covered loss 3000, you are given: (i) Losses below the deductible are not reported. (ii) Seven observed losses are 1500, 1500, 1700, 2000, 2500, 3000, 3000. (iii) Three losses are above the maximum covered loss. (iv) You test whether ground-up losses follow a uniform distribution on (0, 5000]. Calculate the Kolmogorov-Smirnov test statistic. The following exercises, through the end of the lesson, involve getting bounds for the KolmogorovSmirnov statistic for grouped data. As indicated in Section 37.2, exam questions on this topic are unlikely, so you may skip these questions. 37.22. [4B-S97:28 and 1999 C4 Sample:23] (2 points) You are given the following: •



Forty (40) observed losses have been recorded in thousands of dollars and are grouped as follows: Interval ($000)

Number of Losses

(1, 4/3) [4/3, 2) [2, 4) [4, ∞)

16 10 10 4

The null hypothesis, H0 , is that the random variable X underlying the observed losses, in thousands, has the density function 1 f ( x )  2 , 1 < x < ∞. x

Since exact values of the losses are not available, it is not possible to compute the exact value of the Kolmogorov-Smirnov statistic used to test H0 . However, it is possible to put bounds on the value of this statistic. Based on the information above, determine the smallest possible value and the largest possible value of the Kolmogorov-Smirnov statistic used to test H0 . (A) (B) (C) (D) (E)

Smallest possible value = 0.10; largest possible value = 0.25 Smallest possible value = 0.10; largest possible value = 0.40 Smallest possible value = 0.15; largest possible value = 0.25 Smallest possible value = 0.15; largest possible value = 0.40 Smallest possible value = 0.25; largest possible value = 0.40

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 37

739

Use the following information for questions 37.23 and 37.24: You are given the following loss experience: Payments

Number of Payments

0–1 1–2 2–3 3–4

13 15 0 12

The null hypothesis is that losses are uniformly distributed on [0,4]. Although the exact values of the Kolmogorov-Smirnov statistic cannot be computed, bounds can be put on the value of this statistic. 37.23. Determine the smallest possible value of the Kolmogorov-Smirnov statistic to test the hypothesis. 37.24. Determine the largest possible value of the Kolmogorov-Smirnov statistic to test the hypothesis. 37.25. [4B-S99:12] (2 points) You are given the following: •

One hundred claims greater than 3,000 have been recorded as follows: Interval Number of Claims (3,000, 5,000] (5,000, 10,000] (10,000, 25,000] (25,000, ∞)

6 29 39 26



Claims of 3,000 or less have not been recorded.



The null hypothesis, H0 , is that claim sizes follow a Pareto distribution α  2 and θ  25,000.

Since exact values of the claims are not available, it is not possible to compute the exact value of the Kolmogorov-Smirnov statistic used to test H0 . However, it is possible to put bounds on the value of this statistic. Referring to the information above, determine the smallest possible value of the Kolmogorov-Smirnov statistic used to test H0 . (A) (B) (C) (D) (E)

Less than 0.03 At least 0.03, but less than 0.06 At least 0.06, but less than 0.09 At least 0.09, but less than 0.12 At least 0.12

37.26. [160-S88:18] You are given the following observed data from 1600 lives retired at age 62: (i) (ii) (iii)

q62  0.087 q63  0.020 q64  0.050

You hypothesize that S ( x ) is linear over the interval (62, 65] and calculate qˆ62 , qˆ 63 , and qˆ64 . These estimates match the observed survival rate to age 65. Determine the smallest possible value of the Kolmogorov-Smirnov departure measure statistic.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

740

37.27. You are given the following information on 100 claims: Range

Number of Claims

0– 2,000 2,000– 5,000 5,000–10,000 10,000–20,000 20,000– ∞

30 25 15 15 15

The null hypothesis is that claim sizes follow the distribution with density function f (x ) 

5,000 . (5,000 + x ) 2

Although the exact values of the Kolmogorov-Smirnov statistic cannot be computed, it is possible to put bounds on the value of this statistic. Determine the largest possible value of the Kolmogorov-Smirnov statistic to test the hypothesis. 37.28. You are given the following information on 10 claims: Range

Number of Claims

0–100 100–200 200–300 300–400

2 2 3 3

The null hypothesis is that claim sizes follow a uniform distribution on [0, 400]. Although the exact values of the Kolmogorov-Smirnov statistic cannot be computed, it is possible to put bounds on the value of this statistic. Determine the largest possible value of the Kolmogorov-Smirnov statistic to test the hypothesis. Additional released exam questions: C-S07:20

Solutions 37.1.

37.2.

This is a Pareto distribution. F ( x )  1 − 400/ (400 + x ) .

Here, F ( x )  ( x +

C/4 Study Manual—17th edition Copyright ©2014 ASM

x 2 ) /2,

xj

F∗ (x j )

Fn ( x −j )

Fn ( x j )

Largest Difference

43 145 233 396 775

0.097 0.266 0.368 0.497 0.660

0.0 0.2 0.4 0.6 0.8

0.2 0.4 0.6 0.8 1.0

0.103 0.134 0.232 0.303 0.340

0 ≤ x ≤ 1.

(E)

EXERCISE SOLUTIONS FOR LESSON 37

741

xj

F∗ (x j )

Fn ( x −j )

Fn ( x j )

Largest Difference

0.1 0.4 0.8 0.9

0.055 0.280 0.720 0.855

0 0.2 0.4 0.8

0.2 0.4 0.8 1.0

0.145 0.120 0.320 0.145

(E)

37.3. Rather than testing all the values, note that the difference grows up to 5 since the slope of the observations is half of what it should be to be uniform (e.g. F (0.5)  1/40 but Fn (0.5)  1/20). The difference is constant from 5 to 15, where the slope is exactly right. Testing at 5, F (5)  0.25, Fn (5)  0.5, Fn− (5)  0.45, so the maximum difference is 0.25 . (E) √ 37.4. 0.25 20  1.12. Reject at 0.20, but not at 0.10. (D) 37.5. This is a maximum likelihood question, but sets the stage for the next question. As mentioned in Subsection 33.3.1, the estimator for µˆ is the average of ln x i , and the estimator for σ2 is the variance (with division by n) of the ln x i . You are asked to fit the claim distribution for claim sizes in units, not for claim sizes in thousands. However, if we fit a lognormal distribution to the claim size in thousands, then we only need to scale it to make it a distribution for units. As mentioned on page 29, a lognormal distribution is scaled by r by adding ln r to µ; σ is unchanged. Since we are only asked for σ, it therefore suffices to fit the claim sizes in thousands. Q P ˆ since it is needed for the next question. Note that x i  1, so ln x i  0. We will calculate µ, µˆ  σˆ  37.6.

For each x, we evaluate Φ

ln x−µ  σ

1X ln x i  0 5

qX



(ln x i ) 2 /5  1.5208 ln x  1.5208 .

F5− ( x j )

F∗ (x j )

xj

(A)

F5 ( x j )

Largest Difference

0.1

Φ

ln 0.1  1.5208

 Φ (−1.51)  0.0655

0

0.2

0.1345

0.5

Φ

ln 0.5  1.5208

 Φ (−0.46)  0.3228

0.2

0.4

0.1228

 Φ (0)  0.5000

0.4

0.6

0.1000

1.0

Φ

ln 1  1.5208

2.0

Φ

ln 2  1.5208

 Φ (0.46)  0.6772

0.6

0.8

0.1228

10.0

Φ

ln 10  1.5208

 Φ (1.51)  0.9345

0.8

1.0

0.1345

D  0.1345 . (E) 37.7. xj 5 9

C/4 Study Manual—17th edition Copyright ©2014 ASM

Fn− ( x j ) 0 1/2

Fn ( x j )

F (x j )

Largest Difference

1/2 1

1/2 9/10

0.5 0.4

(B)

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

742

37.8. xj 1 2 4 5 7 8 9

Fn− ( x j )

Fn ( x j )

0 1/9 2/9 3/9 5/9 6/9 7/9

1/9 2/9 3/9 5/9 6/9 7/9 1

F∗ (x

Largest Difference

j)

1/11 2/11 4/11 5/11 7/11 8/11 9/11

9/99 7/99 14/99 12/99 8/99 6/99 18/99

18/99  0.1818 . 37.9. F (500)  1 − F∗ (x ) 

xj 620 800 1250 2000 2500 4000 10000

Fn− ( x j ) 0 1/9 2/9 4/9 5/9 6/9 7/9

2 3



1000 1  1000 + 500 3 1000 1000+x 2 3

1−

1500 1000 + x

Fn ( x j )

F∗ (x j )

Largest Difference

1/9 2/9 4/9 5/9 6/9 7/9

0.0741 0.1667 0.3333 0.5000 0.5714 0.7000 0.8636

0.0741 0.0556 0.1111 0.0556 0.0953 0.0778 0.0858

Notice that we do not set F9 (10,000)  1, which would result in a larger difference, 1−0.8636  0.1364. F9 (10,000)  7/9 because the two claims above 10,000 are censored, so only seven of the nine claims are less than or equal to 10,000. 37.10. The inverse exponential density is f ( x | θ  2)  2e −2/x /x 2 and the distribution is F ( x | θ  2)  e −2/x . We compare fitted to actual at the five observations: xi 1 2 3 5 13

F5 ( x −i ) 0 0.2 0.4 0.6 0.8

F5 ( x i )

F∗ (xi )

Largest Difference

0.2 0.4 0.6 0.8 1.0

0.1353 0.3679 0.5134 0.6703 0.8574

0.1353 0.1679 0.1134 0.1297 0.1426

(D)

37.11. This Pareto density has distribution function F ( x )  1 − 1/ (1 + x ) 2 . We compare fitted to actual at the five observations:

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 37

xi 0.1 0.2 0.5 1.0 1.3

F5 ( x −i ) 0 0.2 0.4 0.6 0.8

743

F5 ( x i )

F∗ (xi )

Largest Difference

0.2 0.4 0.6 0.8 1.0

0.1736 0.3055 0.5556 0.7500 0.8109

0.1736 0.1055 0.1556 0.1500 0.1891

(E)

37.12. The average of the five observations is (29 + 64 + 90 + 135 + 182) /5  100, which is the method of moments estimate for the parameter of the exponential distribution, θ. The fitted distribution function is F ( x )  1 − e −x/100 . We compare the fitted to the actual at the five observations: Largest xi F5 ( x −i ) F5 ( x i ) F ∗ ( x i ) Difference 29 64 90 135 182

0 0.2 0.4 0.6 0.8

0.2 0.4 0.6 0.8 1.0

0.2517 0.4727 0.5934 0.7408 0.8380

0.2517 0.2727 0.1934 0.1408 0.1620

(E)

37.13. The distribution is a Pareto with α  4 and θ  1. The distribution function is F ( x )  1 − 1/ (1 + x ) 4 . We compare the fitted to the actual at the five observations: Largest xi F5 ( x −i ) F5 ( x i ) F ∗ ( x i ) Difference 0.1 0.2 0.5 0.7 1.3

0 0.2 0.4 0.6 0.8

0.2 0.4 0.6 0.8 1.0

0.3170 0.5177 0.8025 0.8803 0.9643

0.3170 0.3177 0.4025 0.2803 0.1643

(E)

37.14. The density function is a Pareto with α  4 and θ  1, so the distribution function is F ( x )  1 − 1/ (1 + x ) 4 . We compare the fitted to the actual at the five observations: xi 0.2 0.7 0.9 1.1 1.3

F5 ( x −i ) 0 0.2 0.4 0.6 0.8

F5 ( x i )

F∗ (xi )

Largest Difference

0.2 0.4 0.6 0.8 1.0

0.5177 0.8803 0.9233 0.9486 0.9643

0.5177 0.6803 0.5233 0.3486 0.1643

The Kolmogorov-Smirnov statistic, the largest difference, is 0.6803. Multiplying this by have (0.6803)(2.2361)  1.5211. This is in between 1.48 and 1.63, so the answer is (D).



n

√ 5, we

37.15. Let x be the fifth observation. If x ≤ 6, then F5 (6)  0.8, but F ∗ (6)  6/20  0.3, which would make the Kolmogorov-Smirnov statistic at least 0.8 − 0.3  0.5, contradicting that it is 0.3. If x ≥ 15, then F5 ( y )  0.6 for y ∈ [6, 15) while F ∗ ( y ) ranges from 0.3 to 0.75 in this range, so the absolute difference is always less than 0.3 in that range. Since F ∗ (15)  0.75 and F5 (15)  0.8 and both functions increase to no more than 1 after 15, the two functions never differ by more than 0.3 in [15, 20], so the Kolmogorov-Smirnov statistic is 0.3. We’re left with determining possible values of x in the [6, 15) range. We need to assure that F ∗ and F5 do not differ by more than 0.3 in the range (6, 15) . In the range 6 < y < x, F5 ( y )  0.6 and F ∗ ( y ) ranges C/4 Study Manual—17th edition Copyright ©2014 ASM

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

744

from 0.3 to 0.75, so the difference is never more than 0.3. In the range x ≤ y < 15, F5 ( y )  0.8, so we need F ∗ ( x ) ≥ 0.5. This will be true as long as x ≥ 10. In summary, x can have any value 10 or higher. x ≥ 10 37.16. The null hypothesis density function is a two-parameter Pareto with α  4 and θ  5000. We calculate the value of the fitted distribution function F ∗ ( x )  1 − 5000/ (5000 + x ) and at the limit.



4

at the 3 observations

F ∗ (200)  1 − (5000/5200) 4  0.1452

F ∗ (300)  1 − (5000/5300) 4  0.2079

F ∗ (600)  1 − (5000/5600) 4  0.3645

F ∗ (1000)  1 − (5000/6000) 4  0.5177 xi 200 300 600 1000

F5 ( x −i ) 0 0.2 0.4 0.6

F5 ( x i )

F∗ (xi )

Largest Difference

0.2 0.4 0.6

0.1452 0.2079 0.3645 0.5177

0.1452 0.1921 0.2355 0.0823

The test statistic is the maximum difference, or 0.2355 . 37.17. We calculate the value of the fitted distribution function F ∗ ( x )  1 − (5000/x ) 3 at the 4 unique observations and at the limit. F ∗ (5000)  0 5000 F (5500)  1 − 5500

!3

F ∗ (6000)  1 −

5000 6000

!3

F ∗ (7500)  1 −

5000 7500

!3



F ∗ (10,000)  1 −

5000 10,000

 0.2487  0.4213  0.7037

!3

 0.875

Since there are 8 observations, the empirical distribution function increases by 1/8 or 0.125 at each observation point, except it increases by 0.25 at 5500 for which there are two observations. xi 5000 5500 6000 7500 10000

C/4 Study Manual—17th edition Copyright ©2014 ASM

F8 ( x −i ) 0 0.125 0.375 0.500 0.625

F8 ( x i )

F∗ (xi )

Largest Difference

0.125 0.375 0.500 0.625

0 0.2487 0.4213 0.7037 0.875

0.125 0.1263 0.0787 0.2037 0.25

EXERCISE SOLUTIONS FOR LESSON 37

745

37.18. The fitted distribution is exponential with θ  2000. We need to adjust it for the deductible of 500. For an exponential, which is memoryless, the conditional distribution of payments is the same as the ground-up distribution of losses, exponential with mean 2000, so the fitted distribution after truncation is F ∗ ( x )  1 − e − ( x−500)/2000 , where x is the loss size (including deductible). We calculate F ∗ ( x ) at the five observations. F ∗ (600)  1 − e −100/2000  0.0488

F ∗ (1200)  1 − e −700/2000  0.2953

F ∗ (1500)  1 − e −1000/2000 − 0.3935

F ∗ (2500)  1 − e −2000/2000  0.6321 F ∗ (8000)  1 − e −7500/2000  0.9765

xi 600 1200 1500 2500 8000

Fn ( x −i ) 0 0.2 0.4 0.6 0.8

Fn ( x i )

F∗ (xi )

Largest Difference

0.2 0.4 0.6 0.8 1.0

0.0488 0.2953 0.3935 0.6321 0.9765

0.1512 0.1047 0.2065 0.1679 0.1765

37.19. The fitted distribution is inverse exponential with θ  1000. We need to adjust it for the deductible of 500. F ( x ) − F (500) e −1000/x − e −2 F∗ (x )    1.156518 ( e −1000/x − 0.135335) 1 − F (500) 1 − e −2 We now compute this at the four observations

F ∗ (700)  1.156518 ( e −1000/700 − 0.135335)  0.1206

F ∗ (1000)  1.156518 ( e −1000/1000 − 0.135335)  0.2689

F ∗ (2000)  1.156518 ( e −1000/2000 − 0.135335)  0.5449

F ∗ (4500)  1.156518 ( e −1000/4500 − 0.135335)  0.7695 xi 700 1000 2000 4500

Fn ( x −i ) 0 0.25 0.50 0.75

Fn ( x i )

F∗ (xi )

Largest Difference

0.25 0.50 0.75 1.00

0.1206 0.2689 0.5449 0.7695

0.1294 0.2311 0.2051 0.2305

37.20. The fitted distribution function adjusted for the deductible is uniform on (1000, 5000]. For example, F ∗ (1500)  (1500 − 1000) / (5000 − 1000)  0.125. The other values of F ∗ ( x ) are computed similarly. The empirical distribution function F10 ( x ) increases by 1/10 for each observation. F10 (3000)  0.6 since the sixth ordered observation is 3000, but F10 ( x ) is unknown above 3000.

C/4 Study Manual—17th edition Copyright ©2014 ASM

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

746

xi 1500 1700 2000 2500 3000

Fn ( x −i ) 0 0.2 0.3 0.4 0.5

Fn ( x i )

F∗ (xi )

Largest Difference

0.2 0.3 0.4 0.5 0.6

0.125 0.175 0.25 0.375 0.5

0.125 0.125 0.15 0.125 0.1

37.21. This is the same as the previous exercise except that there are two observations of 3000, so that Fn (3000)  0.7 instead of 0.6, and the largest difference occurs at 3000: Fn (3000) − F ∗ (3000)  0.7 − 0.5  0.2 . 37.22.

F ∗ ( x )  1 − 1/x. cj Fn ( c j ) 1 4/3 2 4 ∞

0 0.40 0.65 0.90 1.00

F (c j ) 0 0.25 0.50 0.75 1.00

|F ∗ ( c j ) − Fn ( c j ) | 0 0.15 0.15 0.15 0

|Fn ( c j ) − F ∗ ( c j−1 ) |

|F ( c j ) − Fn ( c j−1 ) |

0.40 0.40 0.40 0.25

0.25 0.10 0.10 0.10

(D)

37.23. The following table answers both this exercise and the next one. cj 0 1 2 3 4

Fn ( c j ) 0 0.325 0.7 0.7 1

F∗ (c j ) 0 0.25 0.5 0.75 1

|F ∗ ( c j ) − Fn ( c j ) | 0 0.075 0.2 0.05 0

|Fn ( c j ) − F ∗ ( c j−1 ) |

|F ∗ ( c j ) − Fn ( c j−1 ) |

0.325 0.45 0.2 0.25

0.25 0.175 0.05 0.3

37.24. Based on the table for the solution to the previous exercise, the answer is 0.45 . 37.25. The data are truncated at 3000. Therefore, the fitted distribution F ∗ is calculated from the Pareto F as follows: F∗ (x ) 

F ( x ) − F (3000) 1 − F (3000)









1−





− 1−

25,000 2 28,000 25,000 2 25,000  2 − 25,000+x 28,000 25,000 2 28,000 25,000  2 25,000+x 1− 25,000 2 28,000 !2

1−

C/4 Study Manual—17th edition Copyright ©2014 ASM

25,000  2 25,000+x

28,000 25,000 + x

25,000 2 28,000



EXERCISE SOLUTIONS FOR LESSON 37

747

Using this formula, we get 28,000 F (5000)  1 − 30,000

!2

28,000 F (10,000)  1 − 35,000

!2

28,000 F (25,000)  1 − 50,000

!2



∗ ∗

5000 0.1289 0.06 0.0689

cj F∗ (c j ) Fn ( c j ) Difference

 0.1289  0.3600  0.6864

10,000 0.3600 0.35 0.01

25,000 0.6864 0.74 0.0536

(C) 37.26. The smallest possible value takes into account only the known differences between fitted and actual, namely the differences at exact ages 63, and 64, and ignores differences at non-integral ages. The fitted survival function is S∗ (3)  (0.913)(0.98)(0.95)  0.85. We can calculate the maximum difference of the survival functions S instead of F; they are the same.

37.27.

xj

Sn ( x j )

S∗ ( x j )

1 2

0.91300 0.89474

0.95 0.90

|S n ( x j ) − S∗ ( x j ) | 0.03700 0.00526

F ∗ ( x )  1 − 5000/ (5000 + x ) . Fn ( c j )

cj 0 2,000 5,000 10,000 20,000 ∞

0 0.3 0.55 0.7 0.85 1

cj

Fn ( c j )

0 100 200 300 400

0 0.2 0.4 0.7 1.0

F∗ (c j ) 0 0.2857 0.5 0.6667 0.8 1

|Fn ( c j ) − F ∗ ( c j−1 ) |

|F ∗ ( c j ) − Fn ( c j−1 ) |

0.3 0.2643 0.2 0.1833 0.2

0.2857 0.2 0.1167 0.1 0.15

37.28. F∗ (c j ) 0 0.25 0.5 0.75 1.0

|Fn ( c j ) − F ∗ ( c j−1 ) |

|F ∗ ( c j ) − Fn ( c j−1 ) |

0.2 0.15 0.2 0.25

0.25 0.3 0.35 0.3

Unlike the previous exercise, where the maximum occurred in the first column, the maximum in the problem occurs in the last column. Strictly speaking, the Kolmogorov-Smirnov statistic can never equal 0.35, but can be made arbitrarily close to 0.35 by making the 3 claims in the 200–300 range equal to 300 − ε, ε ) and actual (0.40) then occurs at ε arbitrarily small. The maximum difference between fitted (0.75 − 400 300 − ε. C/4 Study Manual—17th edition Copyright ©2014 ASM

37. HYPOTHESIS TESTS: KOLMOGOROV-SMIRNOV

748

Quiz Solutions 37-1.

Here is the table: xj

F∗ (x j )

Fn ( x −j )

Fn ( x j )

Largest Difference

5 10 20 37 52 65 92 100

0.05 0.10 0.20 0.37 0.52 0.65 0.92 1.00

0 0.125 0.250 0.375 0.500 0.625 0.750 0.875

0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000

0.075 0.150 0.175 0.130 0.105 0.100 0.170 0.125

The largest difference is D  0.175 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 38

Hypothesis Tests: Anderson-Darling Reading: Loss Models Fourth Edition 16.4.2 The Anderson-Darling test was included in the pre-2000 Course 160 syllabus, but was not on the Course 4 syllabus during 2000–2002. It has been on the Course 4 or Course C syllabus since 2003. On released exams, it has only appeared in theory questions that are parts of multiple true/false questions. Students have reported that there have been some calculation questions involving Anderson-Darling on recent CBT exams. However, I believe such questions are not common, so I would give low priority to memorizing formula (38.1). In the following, we continue to use the notation F ∗ to indicate the fitted function adjusted for truncation; see the discussion of equations (36.1) on page 713 for details. The Kolmogorov-Smirnov statistic is crude in that it uses a single point, the point of maximal difference between Fn and F ∗ . Anderson-Darling, in contrast, integrates the difference over the entire range from the lower truncation point t to the upper censoring point u. It weights the difference by the reciprocal of the variance. The formula for the Anderson-Darling A2 statistic is: A2  n

u

Z



Fn ( x ) − F ∗ ( x )

2

F∗ (x ) 1 − F∗ (x )



t

 f ∗ ( x ) dx

In this formula, n includes censored observations. That formula is usually hard to evaluate, but if the fitted distribution is uniform it may not be too bad (see the exercises). The product in the denominator is small when S∗ ( x ) or F ∗ ( x ) is small, near t and u; thus heavier weight is put on the tails of the distribution. For individual data, this can be evaluated as a sum. Number the unique ordered non-censored data points1 from 1 to k (in other words, some observations may be tied; count these observations only once), so that they are t  y0 < y1 < · · · < y k < y k+1  u. Note that y0 is set equal to t and y k+1 is set equal to u in order to make the following formula work. A2  − nF ∗ ( u ) + n +n

k  X j1

k  X

Sn ( y j )

j0

Fn ( y j )

2 

2 

ln S∗ ( y j ) − ln S ∗ ( y j+1 )

ln F ∗ ( y j+1 ) − ln F ∗ ( y j )





(38.1)

As before, n includes censored observations. To help you memorize the formula, note the symmetry of the S’s and the F’s. The second factor in each sum is arranged in the order that makes the difference positive. The second sum could start from 0 too, but the summand corresponding to 0 would be 0 since Fn ( t )  0. In the first sum, if u  ∞ or if S∗ ( u )  0 for any other reason, skip the last term. If there are no censored observations, S n ( y k )  0, which will also allow skipping the last term. 1In this lesson, we’ll use the y j notation for order statistics. C/4 Study Manual—17th edition Copyright ©2014 ASM

749

38. HYPOTHESIS TESTS: ANDERSON-DARLING

750

Since this statistic is the integral of a square, it cannot be negative. If you get a negative answer, you know you made a mistake. Example 38A An insurer offers a coverage with a policy limit of 2000. The following three claims are observed on this coverage: 300, 300, 800. In addition, one claim is for an amount over 2000 and is censored at 2000. You model the ground-up losses using a uniform distribution on [0, 2500]. You test this model against the experience using the Anderson-Darling A2 statistic. Calculate A2 . Answer: We have that F ∗ (300)  0.12, F ∗ (800)  0.32, and F ∗ (2000)  0.80. Also, Fn (300)  12 and Fn (800)  43 . Since n  4, the first term of formula (38.1) is −nF ∗ ( u )  −4 (0.8)  −3.2. The first sum of the A2 formula is: 4 12 (0 − ln 0.88) +



1 2 2 (ln 0.88

− ln 0.68) +

1 2 4 (ln 0.68

− ln 0.2)  4 (0.268777)  1.07511.



The second sum of the A2 formula is: 4

 2 1 2

(ln 0.32 − ln 0.12) +

So A2 is:

3 2 4 (ln 0.8

− ln 0.32)  4 (0.760621)  3.04248.



A2  −3.2 + 1.07511 + 3.04248  0.91759



The critical values for the Anderson-Darling statistic are fixed; they do not vary with sample size n (unlike the Kolmogorov-Smirnov statistic). However, they should be smaller if parameters are estimated or if u < ∞, just like the Kolmogorov-Smirnov statistic. A summary of the main characteristics of the Anderson-Darling test is given in Table 39.1 on page 767. It is more likely you will be tested on these characteristics than on actually calculating the statistic.

Exercises 38.1.

You have observed one loss of 1. The assumed distribution of losses is uniform on (0, 2].

Calculate the Anderson-Darling A2 statistic for this fit. 38.2.

You have observed one loss of 0.5. The assumed distribution of losses is uniform on (0, 2].

Calculate the Anderson-Darling A2 statistic for this fit. 38.3.

Losses are assumed to follow an exponential distribution with mean 100.

For an insurance coverage with policy limit 50, you observe one loss of 25. Calculate the Anderson-Darling A2 statistic for this fit. 38.4.

Losses are assumed to follow a uniform distribution on (0, 100].

For an insurance coverage with policy limit 60, you observe one loss of 25 and one loss at the limit. Calculate the Anderson-Darling A2 statistic for this fit. Use the following information for questions 38.5 and 38.6: Losses are assumed to follow a uniform distribution on (0, 100]. For an insurance coverage with ordinary deductible 10, you observe one loss of 50 before deductible. 38.5.

Calculate the Anderson-Darling A2 statistic for this fit.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 38

38.5–6.

751

(Repeated for convenience) Use the following information for questions 38.5 and 38.6:

Losses are assumed to follow a uniform distribution on (0, 100]. For an insurance coverage with ordinary deductible 10, you observe one loss of 50 before deductible. 38.6. Suppose that there are five losses of 50 instead of one loss. How would the answer to the previous exercise change? 38.7. You have observed the following 3 losses: 200, 500, 2000. You fit these to a parametric distribution F. You are given the following table of values for F: x

F (x )

0 200 500 2000

0.0 0.3 0.6 0.9

You test the fit using the Anderson-Darling A2 statistic. Calculate A2 . 38.8. An insurance coverage has deductible of 250 and a maximum covered loss of 20,000. 22 observed losses are fitted to a lognormal distribution with µ  7 and σ  2. Of these 22 observed losses, 15 are below the limit. Each one is for a different amount. Let Fn and S n be the empirical distribution and survival functions, and F ∗ and S∗ be the fitted distribution and survival functions after taking truncation at 250 into account. Let 500  y1 < y2 < · · · < y15 be the 15 observed losses below the limit. You are given: (i)

P15

(ii)

P15

j1 j1

S 2n ( y j ) ln S∗ ( y j ) − ln S∗ ( y j+1 )

 





Fn2 ( y j ) ln F ∗ ( y j+1 ) − ln F ∗ ( y j )

 







 0.505



 0.423

Calculate the Anderson-Darling A2 test statistic for the fit. 38.9. An insurer offers a coverage with a policy limit of 2000. The following 4 claims are observed on this coverage: 300, 300, 800, 1500. You model these losses using a uniform distribution on [0, 2500]. You test this model against the experience using the Anderson-Darling A2 statistic. Calculate A2 . 38.10. A mortality study on a group of 3 results in failure times of 2, 5, and 5. You fit an exponential distribution to this data using maximum likelihood. You test this fit using the Anderson-Darling A2 statistic. Calculate A2 . 38.11. An insurer offers a coverage with a deductible of 500 and a maximum covered loss of 5000. Observed losses (including the deductible) are 1000 and 2000; in addition, there is one claim censored at 5000. You fit a Pareto with α  1, θ  1000 to this data and then test the fit using the Anderson-Darling A2 statistic. Calculate A2 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

38. HYPOTHESIS TESTS: ANDERSON-DARLING

752

Solutions 38.1.

Since there’s only one loss, it is not hard to calculate the statistic directly from the integral. We have

 0 Fn ( x )   1

x 3)  500 − 401.2594 − 88.2771 − 9.7105 − 0.7121  0.0409 The 4+ group must be considered, even though there are no observed claims in it. On the other hand, even though there are 5 observations of size 3, in order to get 5 expected observations, we must combine 2, 3, and 4+, leaving us with 3 groups. This group has 9.7105+0.7121+0.0409  10.4635 expected observations. The chi-square statistic is then Q

(410 − 401.2594) 2

+

(75 − 88.2771) 2

+

(15 − 10.4635) 2

401.2594 88.2771 10.4635  0.190396 + 1.996901 + 1.966786  4.154082

Since we estimated λ, we lose one degree of freedom. There is 3 − 1 − 1  1 degree of freedom. The hypothesis is accepted at 2.5% significance (critical value 5.024), but not at 5% (critical value 3.841). (C). A chi-square random variable has a gamma distribution with parameters θ  2 and α  d/2, where d is the number of degrees of freedom. If d  2, then it is exponential and you don’t need the tables to calculate critical values. C/4 Study Manual—17th edition Copyright ©2014 ASM

39.5. DATA FROM SEVERAL PERIODS

767

Table 39.1: Comparison of the three methods of testing goodness of fit

Kolmogorov-Smirnov Should be used only for individual data Only for continuous fits Should lower critical value if u 1. x Using Pearson’s goodness-of-fit statistic and the chi-square table shown below, determine which of the following statements is true.

(A) (B) (C) (D) (E)

Degrees of Freedom

0.10

2 3 4 5

4.61 6.25 7.78 9.24

Significance Level 0.05 0.02 0.01 5.99 7.81 9.49 11.07

7.82 9.84 11.67 13.39

9.21 11.34 13.28 15.09

H0 will be rejected at the 0.01 significance level. H0 will be rejected at the 0.02 significance level, but not at the 0.01 level. H0 will be rejected at the 0.05 significance level, but not at the 0.02 level. H0 will be rejected at the 0.10 significance level, but not at the 0.05 level. H0 will be accepted at the 0.10 significance level.

39.13. [4B-F99:30] (1 point) You wish to test the hypothesis that a set of data arises from a given parametric distribution with given parameters. (Thus no parameters are estimated from the data.) Which of the following statements is true? (A) (B) (C) (D) (E)

The value of Pearson’s chi-square statistic depends on the endpoints of the chosen classes. The value of Pearson’s chi-square statistic depends on the number of parameters of the distribution. The value of the Kolmogorov-Smirnov statistic depends on the endpoints of the chosen classes. The value of the Kolmogorov-Smirnov statistic depends on the number of parameters of the distribution. None of the above statements is true.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 39

775

Use the following information for questions 39.14 and 39.15: You are given the following: •

The observed number of claims for a group of 50 risks has been recorded as follows: Number of Claims Number of Risks 0 1 2 3 4



7 10 12 17 4

The null hypothesis, H0 , is that the number of claims per risk follows a uniform distribution on 0, 1, 2, 3, and 4.

39.14. [4B-S98:10] (2 points) A chi-square test is performed using Pearson’s goodness-of-fit statistic with five classes. Using the chi-square table shown below, determine which of the following statements is true.

(A) (B) (C) (D) (E)

Degrees of Freedom

0.10

2 3 4 5

4.61 6.25 7.78 9.24

Significance Level 0.05 0.02 0.01 5.99 7.81 9.49 11.07

7.82 9.84 11.67 13.39

9.21 11.34 13.28 15.09

H0 will be rejected at the 0.01 significance level. H0 will be rejected at the 0.02 significance level, but not at the 0.01 level. H0 will be rejected at the 0.05 significance level, but not at the 0.02 level. H0 will be rejected at the 0.10 significance level, but not at the 0.05 level. H0 will be accepted at the 0.10 significance level.

39.15. [4B-S98:11] (2 points) Two adjacent classes of the five classes above are combined, and a chi-square test is performed using Pearson’s goodness-of-fit statistic with four classes. Determine which of the following combinations will result in a conclusion different from the one reached in the previous question. (A) (B) (C) (D) (E)

Combining the risks with 0 claims and the risks with 1 claim Combining the risks with 1 claim and the risks with 2 claims Combining the risks with 2 claims and the risks with 3 claims Combining the risks with 3 claims and the risks with 4 claims None of the above combinations will result in a conclusion different from the one reached in the previous question.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

39. HYPOTHESIS TESTS: CHI-SQUARE

776

39.16. You are given the following claim frequency data: Frequency Number of Insureds

0 500

1 390

2 100

3 10

4+ 0

A Poisson distribution is fitted to the data, using maximum likelihood. Calculate the chi-square statistic for measuring goodness to fit, using 4 classes. 39.17. [4B-S99:11] (3 points) You are given the following: •

One hundred claims greater than 3,000 have been recorded as follows: Interval Number of Claims (3,000, 5,000] (5,000, 10,000] (10,000, 25,000] (25,000, ∞)

6 29 39 26



Claims of 3,000 or less have not been recorded.



The null hypothesis, H0 , is that claim sizes follow a Pareto distribution with parameters α  2 and θ  25, 000. A chi-square test is performed using Pearson’s goodness-of-fit statistic with four classes. Using the chi-square table shown below, determine which of the following statements is true.

(A) (B) (C) (D) (E)

Degrees of Freedom

0.10

2 3 4 5

4.61 6.25 7.78 9.24

Significance Level 0.05 0.02 0.01 5.99 7.81 9.49 11.07

7.82 9.84 11.67 13.39

9.21 11.34 13.28 15.09

H0 will be rejected at the 0.01 significance level. H0 will be rejected at the 0.02 significance level, but not at the 0.01 level. H0 will be rejected at the 0.05 significance level, but not at the 0.02 level. H0 will be rejected at the 0.10 significance level, but not at the 0.05 level. H0 will be accepted at the 0.10 significance level.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 39

777

39.18. [160-83-97:12] For a complete study of 150 patients diagnosed with a fatal disease, the number of deaths each duration after diagnosis is as follows: Time interval

Number of deaths

(0, 1] (1, 2] (2, 3] (3, 4]

21 27 39 63

The chi-square statistic Q is used to test the fit of the survival model S (t )  1 −

t ( t + 1) , 20

0 ≤ t ≤ 4.

The appropriate number of degrees of freedom is denoted by k. Calculate Q/k. (A) 0.5

(B) 0.6

(C) 0.7

(D) 0.9

(E) 1.2

39.19. [4B-F96:9] (3 points) You are given the following: •

The observed number of claims for a group of 1,000 risks has been recorded as follows: Number of Claims Number of Risks 0 1 2 3 or more

729 242 29 0



The null hypothesis, H0 , is that the number of claims per risk follows a Poisson distribution.



A chi-square test is performed using Pearson’s goodness-of-fit statistic with three classes. The first class contains those risks with 0 claims, the second contains those risks with 1 claim, and the third contains those risks with 2 or more claims.



The minimum chi-square estimate of the mean of the Poisson distribution is 0.3055. Using the chi-square table shown below, determine which of the following statements is true. Degrees of Freedom 1 2 3 4

(A) (B) (C) (D) (E)

Significance Level 0.10 0.05 0.02 0.01 2.71 4.61 6.25 7.78

3.84 5.99 7.81 9.49

5.41 7.82 9.84 11.67

6.63 9.21 11.34 13.28

H0 will be rejected at the 0.01 significance level. H0 will be rejected at the 0.02 significance level, but not at the 0.01 level. H0 will be rejected at the 0.05 significance level, but not at the 0.02 level. H0 will be rejected at the 0.10 significance level, but not at the 0.05 level. H0 will be accepted at the 0.10 significance level.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

39. HYPOTHESIS TESTS: CHI-SQUARE

778

39.20. [4-S01:19] During a one-year period, the number of accidents per day was distributed as follows: Number of Accidents 0 1 2 3 4 5

Days 209 111 33 7 3 2

You use a chi-square test to measure the fit of a Poisson distribution with mean 0.60. The minimum expected number of observations in any group should be 5. The maximum possible number of groups should be used. Determine the chi-square statistic. (A) 1

(B) 3

(C) 10

(D) 13

(E) 32

39.21. [4-F01:25] You are investigating insurance fraud that manifests itself through claimants who file claims with respect to auto accidents with which they were not involved. Your evidence consists of a distribution of the observed number of claimants per accident and a standard distribution for accidents on which fraud is known to be absent. The two distributions are summarized below: Number of Claimants per Accident

Standard Probability

Observed Number of Accidents

1 2 3 4 5 6+

0.25 0.35 0.24 0.11 0.04 0.01

235 335 250 111 47 22

Total

1.00

1000

Determine the result of a chi-square test of the null hypothesis that there is no fraud in the observed accidents. (A) (B) (C) (D) (E)

Reject at the 0.005 significance level. Reject at the 0.010 significance level, but not at the 0.005 level. Reject at the 0.025 significance level, but not at the 0.010 level. Reject at the 0.050 significance level, but not at the 0.025 level. Do not reject at the 0.050 significance level.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 39

779

39.22. [4-F03:16] A particular line of business has three types of claims. The historical probability and the number of claims for each type in the current year are: Historical Probability 0.2744 0.3512 0.3744

Type A B C

Number of Claims in Current Year 112 180 138

You test the null hypothesis that the probability of each type of claim in the current year is the same as the historical probability. Calculate the chi-square goodness-of-fit test statistic. (A) (B) (C) (D) (E)

Less than 9 At least 9, but less than 10 At least 10, but less than 11 At least 11, but less than 12 At least 12

39.23. [1999 C4 Sample:9] Summary statistics for a sample of 100 losses are: Interval

Number of Losses

Sum

Sum of Squares

(0, 2,000] (2,000, 4,000] (4,000, 8,000] (8,000, 15,000] (15, 000, ∞]

39 22 17 12 10

328,065 63,816 96,447 137,595 331,831

52,170,078 194,241,387 572,753,313 1,628,670,023 17,906,839,238

100

667,754

20,354,674,039

Total

A Pareto distribution is fit to this data. When a similar study was conducted on a different data set, the estimated parameters were αˆ  2.5 and θˆ  10,000. Determine the chi-square statistic and the number of degrees of freedom for a test (with five groups) to assess the acceptability of fit of the data above to these parameters.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

39. HYPOTHESIS TESTS: CHI-SQUARE

780

39.24. [4B-F92:22] (3 points) You are given the following information for 10,000 risks grouped by number of claims. •

A Poisson distribution was fit to the grouped risks.



Minimum chi-square estimation has been used to estimate the Poisson distribution parameter.



The results are as follows: Number of Claims

Actual Number Of Risks

Estimated Number Of Risks Using Poisson

0 1 2 3 or more

7,673 2,035 262 30

7,788 1,947 243 22

Total

10,000

10,000

You are given the following chi-square table: Degrees Of Freedom 2 3 4 5

Level Of significance α 0.050 0.025 0.010 0.005 5.99 7.81 9.49 11.10

7.38 9.35 11.10 12.80

9.21 11.30 13.30 15.10

10.60 12.80 14.90 16.70

You are to use Pearson’s chi-square statistic to test the hypothesis, H0 , that the Poisson provides an acceptable fit. Which of the following is true? (A) (B) (C) (D) (E)

Reject H0 at α  0.005 Accept H0 at α  0.005, but reject H0 at α  0.010 Accept H0 at α  0.010, but reject H0 at α  0.025 Accept H0 at α  0.025, but reject H0 at α  0.050 Accept H0 at α  0.050

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 39

781

39.25. [4B-F97:20] (3 points) You are given the following: •

The observed number of claims for a group of 100 risks has been recorded as follows: Number of Claims Number of Risks 0 1

80 20



The null hypothesis, H0 , is that the number of claims per risk follows a Bernoulli distribution with mean p.



A chi-square test is performed using Pearson’s goodness-of-fit statistic.

Using the chi-square table shown below, determine the smallest value of p for which H0 will be accepted at the 0.01 significance level.

(A) (B) (C) (D) (E)

Degrees of Freedom

Significance Level 0.01

1 2 3

6.63 9.21 11.34

Less than 0.08 At least 0.08, but less than 0.09 At least 0.09, but less than 0.10 At least 0.10, but less than 0.11 At least 0.11

39.26. [160-F87:18] In the following table, the values of t| q x are calculated from a fully specified survival model, and the values of d x+t are observed deaths from the complete mortality experience of 100 cancer patients age x at entry to the study. t

t| q x

d x+t

0 1 2 3 4 5

0.10 0.25 0.25 0.20 0.15 0.05

15 30 20 15 10 10

You hypothesize that the mortality of cancer patients is governed by the specified model. Let Q be the value of the chi-square statistic used to test the validity of this model, and let n be the degrees of freedom. Determine Q − n. (A) 6.4

(B) 7.4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 8.4

(D) 45.3

(E) 46.3

Exercises continue on the next page . . .

39. HYPOTHESIS TESTS: CHI-SQUARE

782

39.27. [4-S00:29 and 4-F02:28] You are given the following observed claim frequency data collected over a period of 365 days: Number of Claims per Day

Observed Number of Days

0 1 2 3 4+

50 122 101 92 0

Fit a Poisson distribution to the above data, using the method of maximum likelihood. Group the data by number of claims per day into four groups: 0

1

2

3 or more

Apply the chi-square goodness-of-fit test to evaluate the null hypothesis that the claims follow a Poisson distribution. Determine the result of the chi-square test. (A) (B) (C) (D) (E)

Reject at the 0.005 significance level. Reject at the 0.010 significance level, but not at the 0.005 level. Reject at the 0.025 significance level, but not at the 0.010 level. Reject at the 0.050 significance level, but not at the 0.025 level. Do not reject at the 0.050 significance level.

39.28. [4-F04:10] You are given the following random sample of 30 auto claims: 54 2,450 7,200

140 2,500 7,390

230 2,580 11,750

560 2,910 12,000

600 3,800 15,000

1,100 3,800 25,000

1,500 3,810 30,000

1,800 3,870 32,300

1,920 4,000 35,000

2,000 4,800 55,000

You test the hypothesis that auto claims follow a continuous distribution F ( x ) with the following percentiles: x F (x )

310 0.16

500 0.27

2,498 0.55

4,876 0.81

7,498 0.90

12,930 0.95

You group the data using the largest number of groups such that the expected number of claims in each group is at least 5. Calculate the chi-square goodness-of-fit statistic. (A) (B) (C) (D) (E)

Less than 7 At least 7, but less than 10 At least 10, but less than 13 At least 13, but less than 16 At least 16

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 39

783

39.29. [C-S05:19] Which of the following statements is true? (A) (B) (C) (D) (E)

For a null hypothesis that the population follows a particular distribution, using sample data to estimate the parameters of the distribution tends to decrease the probability of a Type II error. The Kolmogorov-Smirnov test can be used on individual or grouped data. The Anderson-Darling test tends to place more emphasis on a good fit in the middle rather than in the tails of the distribution. For a given number of cells, the critical value for the chi-square goodness-of-fit test becomes larger with increased sample size. None of (A), (B), (C) or (D) is true.

39.30. [C-S05:33] You test the hypothesis that a given set of data comes from a known distribution with distribution function F ( x ) . The following data were collected: Interval x 3.638)  e −3.638/2  0.1622



Example 40C You are given a sample with five observations: 0.5

5 5 X

10

20

30

ln x i  9.6158

i1

You are considering two models for the underlying distribution: I. Exponential fit with maximum likelihood. II. Weibull fit with maximum likelihood. The loglikelihood function at the optimal value of the parameters θ and τ is −17.8625. Calculate the likelihood ratio statistic, and using a chi-square table, determine the significance levels at which the first model is accepted. Answer: An exponential is a special case of a Weibull with τ  1, so Model I has 1 free parameter (θ) and Model II has two free parameters(θ, τ). The likelihood ratio test statistic will therefore have one degree of freedom. The exponential maximum likelihood fit sets θˆ  x¯  13.1. The density and loglikelihood for an exponential are e −x/θ θ P xi n x¯ l (θ)  − − n ln θ  − − n ln θ θ θ l ( x¯ )  −n (1 + ln x¯ )  −5 (1 + ln 13.1)  −17.8631

f ( x; θ ) 

The likelihood ratio statistic is 2 (−17.8625 + 17.8631)  0.0012. It has one degree of freedom. Model I is accepted at almost any degree of significance.  The likelihood ratio test can be used to decide whether to combine two data sets into one model or to model them separately. If the data sets are combined, then the number of free parameters of the overall model is the number of parameters in the single model. If they are not combined, the number of free parameters of the overall model is the sum of the number of parameters of the two models. Twice the logarithm of the likelihood ratio should exceed the chi-square distribution with degrees of freedom equal to the difference in the number of free parameters in the overall models. C/4 Study Manual—17th edition Copyright ©2014 ASM

40.1. LIKELIHOOD RATIO TEST AND ALGORITHM

799

Example 40D For a group covered by insurance, there are 200 members and 20 claims in 2001. There are 250 members and 30 claims in 2002. Two alternative models are proposed: 1. A Poisson distribution with parameter λ 1 to model 2001 claim frequency and a Poisson distribution with parameter λ2 to model 2002 claim frequency. 2. A Poisson distribution with parameter λ to model 2001 claim frequency and a Poisson distribution with parameter 1.1λ to model 2002 claim frequency. Calculate the likelihood ratio statistic to decide whether to prefer the first model to the second one. Answer: We must calculate the maximum likelihood of each model. Suppose that in a given year there are n members each having x1 , x 2 , . . . , x n claims, with the total P i number of claims ni1 x i  m. The density function for a Poisson with parameter λ is f ( i )  e −λ λi! , so the likelihood function and loglikelihood functions are L (λ)  e

−nλ

λ

Pn

i1

Qn

i1

xi

xi !

λm  e −nλ Qn

l ( λ )  −nλ + m ln λ − ln

i1 x i ! n Y

xi !

i1

In the first model, we know that the MLE of each Poisson distribution is the sample mean, or loglikelihood for each year j  1, 2 then becomes: l ( λ )  −m j + m j ln

mj nj

− ln

nj Y

m n.

The

xi j !

i1

with n1  200, m 1  20, n2  250, m 2  30. Adding these together for the two years, we get l ( λ1 , λ2 )  −20 + 20 ln

Y 30 20 xi j ! − 30 + 30 ln − ln 200 250

 −159.66 − ln

1≤ j≤2 1≤i≤n j

Y

xi j !

1≤ j≤2 1≤i≤n j

Let’s calculate the MLE of the second model. The likelihoods in the first year are e −λ λ x i /x i !, and in the second year they are e −1.1λ (1.1λ ) x i /x i !. Temporarily ignoring the constants x i !, we have L ( λ )  e −200λ−250 (1.1) λ ( λ 20 )(1.1) 30 ( λ 30 )  e −475λ λ50 (1.1) 30 l ( λ )  −475λ + 50 ln λ + 30 ln 1.1 dl 50  −475 + 0 dλ λ 50 λˆ  475 C/4 Study Manual—17th edition Copyright ©2014 ASM

40. LIKELIHOOD RATIO TEST AND ALGORITHM, SCHWARZ BAYESIAN CRITERION

800

At 50/475, the loglikelihood is l

50  475

50  −50 + 50 ln 475 + 30 ln 1.1 − ln

 −159.71 − ln

Y

Y

xi j !

1≤ j≤2 1≤i≤n j

xi j !

1≤ j≤2 1≤i≤n j

The likelihood ratio statistic is twice the difference. Notice that the term ln

Q

1≤ j≤2 1≤i≤n j

x i j ! is the same in

both models, since the product in each case is over the number of claims of all members in both years, so the same items are being multiplied. Therefore this term drops out when subtracting the second model’s statistic from the first model’s statistic. Therefore, twice the difference is 2 (−159.66 + 159.71)  0.1 . Note that the number of degrees of freedom for the likelihood ratio test is 1, the difference between the number of free parameters in the first model (1) and the second model (2). The second model would be selected at almost any level of significance, since 0.1 is such a low number. 

40.2

Schwarz Bayesian Criterion

Loglikelihood is proportional to n, the size of the sample. The likelihood ratio algorithm thresholds will therefore be easier to meet as n grows, making it easier to justify more complex models. An alternative to the likelihood ratio algorithm is the Schwarz Bayesian Criterion. The Schwarz Bayesian Criterion (SBC) takes into account the sample size. It subtracts a penalty of ( r/2) ln n from the loglikelihood of each model before comparing them. This means that each additional parameter must increase the loglikelihood by (ln n ) /2 to justify its inclusion. Example 40E As in Example 40A, you have derived maximum loglikelihoods for several models and they are the ones presented in the table in that example. The data consists of 10 points. You select the model by using the Schwarz Bayesian Criterion. Which model do you select? Answer: (ln 10) /2  1.15, so you will charge a penalty of 1.15 for each parameter. This means that the penalized values of the four models are: Number of parameters

Penalized value

1 2 3 4

−321.32 − 1.15  −322.47 −319.93 − 2.30  −322.23 −319.12 − 3.45  −322.57 −318.12 − 4.61  −322.73

(On the 4-parameter line, 4.61 rather than 4.60 resulted when multiplying a more precise value of

(ln 10) /2 by 4 and rounding to two places.)

We see in this table that the highest value, after the penalty, occurs for the 2-parameter model. Therefore the 2-parameter model is selected . 

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 40

801

Exercises 40.1. (A) (B) (C) (D) (E) 40.2.

Which of the following is true? A high p-value tends to imply rejection of the null hypothesis. The significance level of a hypothesis test is the probability of making a type I error given that the null hypothesis is false. If the test statistic lies within the rejection region, the alternative hypothesis is accepted. If T is the test statistic from a likelihood ratio test, the test rejects the null hypothesis if T > c, where T has a chi-square distribution with the number of degrees of freedom equal to the number of free parameters in the model, and c is such that Pr (T > c )  α, where α is the significance level. The critical values for a hypothesis test do not depend on the significance level of the test. Claim sizes are fitted to several models, with the following results: Model

Negative Loglikelihood

Exponential Pareto Loglogistic Weibull Burr Generalized Pareto

613.7 613.3 613.1 612.4 610.9 610.6

Using the likelihood ratio algorithm at 5% significance, which model is preferred? Use the following information for questions 40.3 and 40.4: You fit various models to 20 loss observations using maximum likelihood. The fits maximizing the likelihood for a given number of parameters have the following loglikelihoods: Number of parameters

Loglikelihood

1 2 3 4 5

−142.32 −140.75 −139.40 −138.30 −137.40

40.3. Using the likelihood ratio algorithm at 95% confidence, how many parameters are in the selected model? 40.4.

Using the Schwarz Bayesian Criterion, how many parameters are in the selected model?

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

40. LIKELIHOOD RATIO TEST AND ALGORITHM, SCHWARZ BAYESIAN CRITERION

802

40.5.

196 claim sizes are fitted to several models, with the following results: Model

Negative Loglikelihood

Exponential Lognormal Burr Weibull Inverse Gaussian

425.3 423.9 421.6 421.0 420.0

Using the Schwarz Bayesian Criterion, which model is selected? 40.6.

95 claim sizes are fitted to several models, with the following results: Model

Negative Loglikelihood

Exponential Inverse exponential Gamma Inverse gamma Burr

487.0 487.5 487.0 484.1 482.0

Using the Schwarz Bayesian Criterion, which model is selected? 40.7. Your company sells auto collision coverage with a choice of two deductibles: a 500 deductible and a 1000 deductible. Last year, your experience was as follows: Deductible

Number of Claims

Average Claim Size (after deductible)

500 1000

55 45

700 1100

It is suspected that policyholders with the higher deductible pad their claims so that they are above the deductible, resulting in higher average loss size. To investigate this, you assume that losses on the coverage have an exponential distribution. You test the hypothesis that the mean of this distribution is the same for each coverage using the likelihood ratio test. Which of the following statements is true? (A) (B) (C) (D) (E)

The hypothesis is rejected at 0.5% significance. The hypothesis is accepted at 0.5% significance but rejected at 1% significance. The hypothesis is accepted at 1% significance but rejected at 2.5% significance. The hypothesis is accepted at 2.5% significance but rejected at 5% significance. The hypothesis is accepted at 5% significance.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 40

803

40.8. For an insurance coverage, you are given two populations. Claim sizes for both are assumed to follow a lognormal distribution, possibly with different parameters. Parameters for each population, and for both combined, are estimated using maximum likelihood, with the following results: Population

Estimated Parameters

Negative Loglikelihood

First population only Second population only Both populations combined

µ  6, σ  2.0 µ  5, σ  2.7 µ  5.8, σ  2.4

723.5 722.7 1449.8

The null hypothesis is that parameters are the same for both populations. Which of the following statements is true? (A) (B) (C) (D) (E)

The null hypothesis is rejected at 1% significance. The null hypothesis is accepted at 1% significance, but rejected at 2.5% significance. The null hypothesis is accepted at 2.5% significance, but rejected at 5% significance. The null hypothesis is accepted at 5% significance, but rejected at 10% significance. The null hypothesis is accepted at 10% significance.

40.9. The underlying distribution for a random variable X is lognormal with σ  2.5. The parameter µ is estimated from a sample of 100 observations using maximum likelihood. The resulting estimate is µˆ  6. Let l ( µ ) be the loglikelihood of the 100 observations based on a lognormal distribution with parameters µ and σ  2.5. Determine l (4) − l (6) . 40.10. The underlying distribution for a random variable X is lognormal with σ  2.5. The parameter µ is estimated as µˆ based on a sample of 100 observations using maximum likelihood. The true value of µ is 4. Let l ( µ ) be the loglikelihood of the 100 observations based on a lognormal distribution with parameters µ and σ  2.5. Determine EX [l (4) − l ( µˆ ) ]. 40.11. [4-S01:20] During a one-year period, the number of accidents per day was distributed as follows: Number of Accidents 0 1 2 3 4 5

Days 209 111 33 7 3 2

For these data, the maximum likelihood estimate for the Poisson distribution is λˆ  0.60, and for the negative binomial distribution, it is rˆ  2.9 and βˆ  0.21. The Poisson has a negative loglikelihood value of 385.9, and the negative binomial has a negative loglikelihood value of 382.4. Determine the likelihood ratio test statistic, treating the Poisson distribution as the null hypothesis. (A) −1

(B) 1

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 3

(D) 5

(E) 7

Exercises continue on the next page . . .

40. LIKELIHOOD RATIO TEST AND ALGORITHM, SCHWARZ BAYESIAN CRITERION

804

40.12. You fit a two-parameter Pareto to a sample of 100 claim amounts x1 , . . . , x100 and use the likelihood ratio test to test the hypothesis H0 : θ  9 and α  3 against H1 : θ  9 and α , 3 You are given that

P100 i1

ln (9 + x i )  262.

Determine the result of the test. (A) (B) (C) (D) (E)

Reject at the 0.005 significance level. Reject at the 0.010 significance level, but not at the 0.005 level. Reject at the 0.025 significance level, but not at the 0.010 level. Reject at the 0.050 significance level, but not at the 0.025 level. Do not reject at the 0.050 significance level.

40.13. [4-F04:22] If the proposed model is appropriate, which of the following tends to zero as the sample size goes to infinity? (A) (B) (C) (D) (E)

Kolmogorov-Smirnov test statistic Anderson-Darling test statistic Chi-square goodness-of-fit test statistic Schwarz Bayesian adjustment None of (A), (B), (C) or (D)

40.14. You are given the following data for claim counts: Number of claims 0 1 2 3 or more

Number of policies 84 11 5 0

This data is fit to a Poisson distribution and to a negative binomial distribution using maximum likelihood. The negative binomial fit results in the parameters r  0.545165 and β  0.385204. Calculate the likelihood ratio statistic for determining whether to accept the negative binomial distribution fit in preference to the Poisson fit.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 40

805

40.15. You fit a Weibull distribution to a sample of 20 claim amounts. You test H0 : τ  2 against H1 : τ , 2 using the likelihood ratio statistic. You are given (i) ln x  73.6177 P 2 i (ii) x i  87,266 (iii) At the maximum likelihood estimate, the loglikelihood is −98.443

P

Determine the result of the test.

(A) (B) (C) (D) (E)

Reject at the 0.005 significance level. Reject at the 0.010 significance level, but not at the 0.005 level. Reject at the 0.025 significance level, but not at the 0.010 level. Reject at the 0.050 significance level, but not at the 0.025 level. Do not reject at the 0.050 significance level.

40.16. [4-F03:28] You fit a Pareto distribution to a sample of 200 claim amounts and use the likelihood ratio test to test the hypothesis that α  1.5 and θ  7.8. You are given: (i) The maximum likelihood estimates are αˆ  1.4 and θˆ  7.6. (ii) The natural logarithm of the likelihood function evaluated at the maximum likelihood estimates is −817.92. P (iii) ln ( x i + 7.8)  607.64 Determine the result of the test. (A) (B) (C) (D) (E)

Reject at the 0.005 significance level. Reject at the 0.010 significance level, but not at the 0.005 level. Reject at the 0.025 significance level, but not at the 0.010 level. Reject at the 0.050 significance level, but not at the 0.025 level. Do not reject at the 0.050 significance level.

Additional released exam questions: C-F05:25, C-F06:22, C-S07:14

Solutions 40.1. (A) would be correct if “high” is replaced with ”low”. (B) would be correct if “false” is replaced with “true”. (D) would be correct if the number of degrees of freedom is the number of free parameters of the model minus the number of free parameters of the null hypothesis. (E) would be correct if the words “do not” were removed. The answer is (C). 40.2. The lower the negative loglikelihood, the higher the loglikelihood and the better. Pareto, Weibull, and loglogistic are 2-parameter distributions. Weibull is the best 2-parameter distribution, the one with the lowest negative loglikelihood of the three, but 2 (613.7 − 612.4)  2.6 < 3.84, the C/4 Study Manual—17th edition Copyright ©2014 ASM

806

40. LIKELIHOOD RATIO TEST AND ALGORITHM, SCHWARZ BAYESIAN CRITERION

95% point with 1 degree of freedom, so it is not significantly better than exponential. Burr and generalized Pareto are 3-parameter distributions. The generalized Pareto is the better 3-parameter distribution, the one with the lowest negative loglikelihood of the two, and 2 (613.7 − 610.6)  6.2 > 5.99, the 95% point with 2 degrees of freedom, so it is better than the exponential. Generalized Pareto 40.3. At 95% confidence, the critical value of chi-square is 3.84 at 1 degree of freedom, 5.99 at 2 degrees of freedom, and 7.81 at 3 degrees of freedom. Dividing each of these by 2, we compare the likelihood ratio statistic against 1.92, 3.00, and 3.91 respectively. •

Comparing the 2-parameter model to the 1-parameter model, −140.75 + 142.32  1.57 < 1.92, so reject the 2-parameter model.



Comparing the 3-parameter model to the 1-parameter model, −139.40 + 142.32  2.92 < 3.00, so reject the 3-parameter model.



Comparing the 4-parameter model to the 1-parameter model, −138.30 + 142.32  4.02 ≥ 3.91, so select the 4-parameter model.



Comparing the 5-parameter model to the 4-parameter model, −137.40 + 138.30  0.90 < 1.92, so reject the 5-parameter model.

There are 4 parameters in the selected model. 40.4.

(ln 20) /2  1.50, so the adjusted loglikelihoods are: 1-parameter model 2-parameter model 3-parameter model 4-parameter model 5-parameter model

−142.32 − 1.5  −143.82 −140.75 − 3.0  −143.75 −139.40 − 4.5  −143.90 −138.30 − 6.0  −144.30 −137.40 − 7.5  −144.90

There are 2 parameters in the selected model. 40.5. Negative loglikelihoods are given, so we are minimizing and must add the penalty function, which is a multiple of (ln 196) /2  2.64. The only 1-parameter distribution is the exponential, for which 425.3 + 2.64  427.94. The 2-parameter distribution with lowest negative loglikelihood is the inverse Gaussian (whose negative loglikelihood is even lower than the 3-parameter Burr); 420.0 + 2 (2.64)  425.28. The inverse Gaussian is selected. . 40.6. The penalties are multiples of (ln 95) /2  2.3. The best 1-parameter distribution is the exponential; 487.0 + 2.3  489.3. The best 2-parameter distribution is the inverse gamma; 484.1 + 4.6  488.7. The 3-parameter Burr yields 482.0 + 6.8  488.8. The inverse gamma is selected. 40.7. For an exponential with mean θ, the conditional density function of claim sizes x i after applying the deductible, given that the loss is over a deductible d is f (xi + d )  1 − F (d )

1 − ( x i +d ) /θ θe e −d/θ



1 −x i /θ e θ

This is, of course, the memoryless property of the exponential distribution: f ( x i + d | x i > d )  f ( x i ) . If there are n claims of sizes x1 , . . . , x n , multiplying this results in likelihood and loglikelihood functions of 1 − Pn x i /θ 1 ¯ e i1  n e −n x/θ n θ θ Pn xi x¯ l ( θ )  −n ln θ − i1  −n ln θ − n θ θ

L (θ) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 40

807

The MLE for an exponential is the sample mean, as discussed in shortcut #1 on page 601. Plugging in θ  x¯ in the formula for l ( θ ) , we get: l ( x¯ )  −n ln x¯ − n

Thus, if the models are estimated separately, the MLE for the 500 deductible is 700 and the MLE for the 1000 deductible is 1100. The loglikelihood function for all the data would be the sum of the loglikelihoods of each deductible, or −55 ln 700 − 55 − 45 ln 1100 − 45. If the combined data is used to create a single model, since the density function is the same for both deductibles (due to the lack of memory), the likelihood function is the same too. The mean for the combined data is 55 (700) + 45 (1100)  880 x¯  45 + 55 so the loglikelihood is −100 ln 880 − 100. The loglikelihood of the separate models minus the loglikelihood of the combined model is −55 ln 700 − 45 ln 1100 + 100 ln 880  2.545.

Doubling this, we obtain as the loglikelihood statistic 2 (2.545)  5.090. The combined model has one parameter (the parameter of the exponential distribution), while the separate models have two parameters (each one has one parameter of the exponential distribution), so there is 1 degree of freedom in the statistic. Looking up the chi-square table, we see that 5.02 ≤ 5.09 < 6.64. We therefore accept at 1% significance but reject at 2.5%. (C) 40.8. The alternativehypothesis has two freeparameters additional to the null hypothesis. Using the likelihood ratio test, 2 1449.8 − (723.5 + 722.7)  7.2, with 2 degrees of freedom. Hence accept at 2.5%, reject at 5%. (C) 40.9.

The likelihood and loglikelihood are (note that 2 (2.52 )  12.5) L (µ) 

e−

P

(ln x i −µ ) 2 /2σ2

2

e − (ln x i −µ) /12.5  Q 2.5100 (2π ) 50 x i P

√ Q ( σ 2π ) n x i P Y (ln x i − µ ) 2 l (µ)  − − 100 ln 2.5 − 50 ln 2π − ln xi 12.5

In the difference l (4) − l (6) , only the first term survives: l (4) − l (6) 

P

P

12.5

P 

(ln x i − 6) 2 −

(ln x i −

6) 2



(ln x i − 4) 2

P

(ln x i − 6) + (6 − 4)

2

12.5 100 (6 − 4) 2 −  −32 12.5

In the second line, the cross term 2 (ln x i − 6)(6 − 4)  0, since µˆ is the maximum likelihood estimator, and we know that the maximum likelihood estimator for a lognormal is the average of the logs of the data, P so µˆ  ( ln x i ) /n  6. The final answer is negative, since the loglikelihood is maximized at 6.

P

40.10. The likelihood and loglikelihood are 2

2

e − (ln x i −µ) /2σ L ( µ )  100 Q σ (2π ) 50 x i P Y (ln x i − µ ) 2 l (µ)  − − 100 ln σ − 50 ln 2π − ln xi 2σ2 P

C/4 Study Manual—17th edition Copyright ©2014 ASM

808

40. LIKELIHOOD RATIO TEST AND ALGORITHM, SCHWARZ BAYESIAN CRITERION

In the difference l (4) − l ( µˆ ) , only the first term survives: l (4) − l ( µˆ ) 

P

(ln x i − µˆ ) 2 −

P

2σ2

(ln x i − 4) 2

Since x i is lognormal, ln x i follows a normal distribution with µ  4, so the expected value of (ln x i − 4) 2 is n times the true variance of the normal distribution, or nσ2  nσ 2 . The maximum likelihood estimator of the lognormal µ is the average of the ln x i ’s of the sample, as discussed in Subsection 33.3.1, page 604. P Therefore, (ln x i − µˆ ) 2 is the biased sample variance of the ln x i ’s, or n − 1 times the unbiased sample variance of the normal distribution. Its expected value is ( n − 1) σ2 . The difference is therefore −σ 2 , and

P

E[l (4) − l ( µˆ ) ] 

−σ2  −0.5 2σ2

40.11. The statistic is twice the difference of loglikelihoods, or 2 (385.9 − 382.4)  7 . (E) 40.12. The density and loglikelihood functions for a Pareto are f ( x; α, θ ) 

αθ α ( θ + x ) α+1

l ( α, θ )  n ln α + nα ln θ − ( α + 1)

X

ln ( θ + x i )

At ( α, θ )  (3, 9) , this is l (3, 9)  100 ln 3 + 100 (3) ln 9 − 4 (262)  −278.971 For θ  9, the optimal value of α using maximum likelihood is, based on the formula in Subsection 33.4.2, K  100 ln 9 − αˆ  −

X

ln (9 + x i )  219.7225 − 262  −42.2775

100  2.365322 −42.2775

The loglikelihood at (2.365322, 9) is

l (2.365322)  100 ln 2.365322 + 100 (2.365322) ln 9 − 3.365322 (262)  −275.909 The difference in the number of free parameters between the two models is 1: there are no free parameters in H0 since they’re all specified, while there is 1 free parameter in H1 , since one is not specified. The likelihood ratio statistic is 2 (−275.909 + 278.971)  6.124 which at one degree of freedom is between the 97.5th percentile (5.024) and the 99th percentile (6.635), so the answer is (C). 40.13. Only the Kolmogorov-Smirnov critical value gets smaller based on the size of the sample; the critical value is a constant over the square-root of the size of the sample. For an appropriate model, the statistic should be less than the critical value, and thus must go to zero as the sample size goes to infinity. The Anderson-Darling and chi-square statistics are independent of the size of the sample, and the Schwarz Bayesian adjustment increases with the size of the sample (logarithmically). (A) 40.14. The sample mean is 0.21, so the Poisson fit is λ  0.21. The loglikelihood for the Poisson fit is l ( λ )  −100λ + 11 + 5 (2) ln λ − 5 ln 2!





l (0.21)  −100 (0.21) + 21 ln 0.21 − 5 ln 2  −57.2393 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 40

809

The loglikelihood for the negative binomial fit is l ( r, β )  −100r ln (1 + β ) + 21 ln β − 21 ln (1 + β ) + 11 ln r + 5 ln r ( r + 1) − 5 ln 2

l (0.545164, 0.385204)  −54.5165 ln 1.385204 + 21 ln 0.385204 − 21 ln 1.385204 + 16 ln 0.545165 + 5 ln 1.545165 − 5 ln 2  −55.6372

The likelihood ratio statistic is 2 (−55.6372 + 57.2393)  3.2042 .

40.15. We need find the optimal θ for a Weibull with τ  2. Use formula (33.1) with τ  2, no censored data, and d i  0 for all i: s θˆ 

P

x 2i

20



p

87,266/20  66.0553

The density and loglikelihood functions are f ( x; τ, θ ) 

τx τ−1 − ( x/θ ) τ e θτ

l ( τ, θ )  n ln τ + ( τ − 1)

X

P ln x i −

x iτ

θτ

− nτ ln θ

For τ  2 and the corresponding optimal value of θ,

P

ˆ 2)  20 ln 2 + 73.6177 − P l ( θ,

x 2i

x 2i /20

− 20 (2) ln 66.0553

 20 ln 2 + 73.6177 − 20 − 40 ln 66.0553  −100.139 The likelihood ratio statistic is 2 (−98.443 + 100.139)  3.392. H0 has no free parameter and H1 has one free parameter, so the difference is 1 degree of freedom. The critical values of chi-square at 1 degree of freedom are 2.706 at 90% and 3.841 at 95%, so the answer is (E). 40.16. The Pareto density is

αθ α

( θ+x ) α+1

. The likelihood of 200 claim amounts with α  1.5 and θ  7.8 is 1.5200 7.8200 (1.5) L (1.5, 7.8)  Q200 2.5 i1 (7.8 + x i )

The loglikelihood is l (1.5, 7.8)  200 ln 1.5 + 300 ln 7.8 − 2.5

X

ln ( x i + 7.8)

 81.0930 + 616.2371 − 2.5 (607.64)  −821.77

Twice the difference between −821.77 and −817.92 is 7.70. You are restricting two parameters, so the critical value is the chi-square critical value at 2 degrees of freedom. 7.70 is greater than the 2.5% significance level (7.378) but less than the 1.0% significance level (9.210). (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

810

40. LIKELIHOOD RATIO TEST AND ALGORITHM, SCHWARZ BAYESIAN CRITERION

Quiz Solutions 40-1. The Burr model has 3 parameters and the exponential model has 1 parameter. The inverse Pareto model has 2 parameters and is inferior to the 2-parameter paralogistic model since its likelihood is lower, so it is rejected. In order to prefer the paralogistic to the exponential, 2 (−77.2 + 79.5)  4.6 must be at least as high as the chi-square statistic with 1 degree of freedom. It is greater at 90% and 95% but not at higher levels of confidence. In order to prefer the paralogistic to the Burr, 2 (−75.8 + 77.2)  2.8 must be less than the chi-square statistic with 1 degree of freedom. It is less than 3.841, at 95% confidence, but not less than 2.706, at 90% confidence. We conclude that the paralogistic model is preferred only at 95% confidence. (B) Although not requested, you can show that the exponential is preferred at 99% and 99.5%, while the Burr is preferred at 90% and 97.5%.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 41

Supplementary Questions: Parametric Models 41.1.

Claim counts on an insurance coverage are as follows: Number of Claims

Number of Policies

0 1 2 3 4 5+

325 325 225 100 25 0

Total

1000

A Poisson with mean 1.2 is fitted to the data. Calculate the chi-square statistic. (A) (B) (C) (D) (E) 41.2.

Less than 9 At least 9, but less than 11 At least 11, but less than 13 At least 13, but less than 15 At least 15 In a mortality study, deaths occur at the following times: 2

3

8

11

13

Survival time is fitted to a mixture of a uniform distribution on [0, 10] with weight w and a uniform distribution on [10, 15] with weight 1 − w, using maximum likelihood. Determine the fitted mean.

(A) 6.0

(B) 7.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 8.0

(D) 8.4

811

(E) 8.75

Exercises continue on the next page . . .

41. SUPPLEMENTARY QUESTIONS: PARAMETRIC MODELS

812

41.3. On a homeowners coverage with deductible 500, the following six payments on insurance losses are observed: 200,

400,

600,

1000,

2000,

5000

A Pareto with α  2 is fitted to the ground-up loss distribution to match the fitted median to the smoothed empirical median. Determine θ. (A) 224

(B) 724

(C) 1431

(D) 1931

(E) 3138

Claim sizes are 2, 5, 6, 9, 25. They are fitted to a lognormal distribution using maximum likelihood.

41.4.

Determine the mean of the fitted distribution. (A) 7.2

(B) 7.8

(C) 8.2

(D) 8.4

(E) 9.4

41.5. For a coverage with deductible 500 and maximum covered loss of 10,000, the following are loss sizes (including the deductible): 1000

2000

5000

7500

and 4 losses at the limit

The data are fitted to a two-parameter Pareto with α  1 and θ  2500. Calculate the Kolmogorov-Smirnov statistic. (A) 0.16

(B) 0.325

(C) 0.35

(D) 0.375

(E) 0.40

Experience for loss sizes in two classes is as follows:

41.6.

Class A: 3, 4, 6, 10 Class B: 4, 5, 8 The data are fitted to an inverse exponential with parameter θ subject to the constraint that θ for Class B is 2 more than θ for Class A, using maximum likelihood. Determine θ for Class A. (A) (B) (C) (D) (E)

Less than 5 At least 5, but less than 6 At least 6, but less than 7 At least 7, but less than 8 At least 8

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

41. SUPPLEMENTARY QUESTIONS: PARAMETRIC MODELS

813

41.7. A Pareto distribution is fitted to data. The estimated parameters are αˆ  4 and θˆ  30. The covariˆ θˆ is ance matrix for the estimators α, ! 0.01 −0.1 Σ −0.1 10 Estimate the variance of the estimated third quartile of the distribution. (A) (B) (C) (D) (E) 41.8.

Less than 2.5 At least 2.5, but less than 3.5 At least 3.5, but less than 4.5 At least 4.5, but less than 5.5 At least 5.5 The observations 4, 8, 18, 21, 49 are fitted to a distribution with density f ( x; θ, d ) 

1 − ( x−d )/θ e θ

x≥d

by matching the first and second moments. Determine the median of the fitted distribution. (A) 11

(B) 13

(C) 14

(D) 15

(E) 16

41.9. Loss sizes for Group I follow an exponential distribution with mean θ. Loss sizes for Group II follow a two-parameter Pareto distribution with parameters α  3 and θ. The parameter θ is the same for both groups. You have the following observations from Group I: 234,

302,

355

and the following observation from Group II: 260 Determine the θ that maximizes the likelihood of this sample. (A) 244

(B) 299

(C) 377

(D) 409

(E) 464

41.10. You are given the following observations of claim sizes: 2

3

5

8

13

18

24

30

45

The data are fitted to a parametric distribution and a p–p plot is drawn. The plot passes through the points (0.4, 0.3) , (0.5, 0.4) , and (0.6, 0.5) . Determine the 40th percentile of the fitted distribution. (A) 5

(B) 8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 10 92

(D) 13

(E) 18

Exercises continue on the next page . . .

41. SUPPLEMENTARY QUESTIONS: PARAMETRIC MODELS

814

41.11. You are given a sample of four loss sizes. These are fitted to a Weibull with τ  2. The parameter θ is estimated using maximum likelihood. Determine the asymptotic variance of the estimator for θ. (A) θ 2 /2

(B) θ 2 /4

(C) θ 2 /16

(D) θ 2 /20

(E) θ 2 /32

41.12. For an insurance coverage, claim size has the following probability density function: x f ( x )  30 θ

!3

θ−x θ

!2

1 x

0 1 Two-parameter Pareto with α > 2

For fixed probability and range parameters, rank the full credibility standards for these distributions from lowest to highest. (A) I < II < III

(B) I < III < II

(C) II < I < III

(D) II < III < I

(E) III < II < I

42.11. Claim counts follow a Poisson distribution. Claims size follows a Weibull distribution with τ  0.5. Using the methods of limited fluctuation credibility, a full credibility standard for aggregate losses of 2,500 expected claims has been established. Determine the full credibility standard for aggregate losses if the claim size distribution assumption is changed to an inverse gamma distribution with α  6. 42.12. [4B-F92:1] (2 points) You are given the following: •

The number of claims is Poisson distributed.



Number of claims and claim severity are independent.



Claim severity has the following distribution: Claim Size

Probability

1 2 10

0.50 0.30 0.20

Determine the number of claims needed so that the total cost of claims is within 10% of the expected cost with 90% probability. (A) (B) (C) (D) (E)

Less than 625 At least 625, but less than 825 At least 825, but less than 1025 At least 1025, but less than 1225 At least 1225

42.13. [4B-S96:13] (1 point) Using the methods of limited fluctuation credibility, a full credibility standard of 1,000 expected claims has been established using a range parameter of 0.05. Determine the number of expected claims that would be required for full credibility if the range parameter were changed to 0.01. (A) 40

(B) 200

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1,000

(D) 5,000

(E) 25,000

Exercises continue on the next page . . .

EXERCISES FOR LESSON 42

837

42.14. [4B-S93:10] (2 points) You are given the following: •

The number of claims for a single insured follows a Poisson distribution.



The coefficient of variation of the severity distribution is 2.



The number of claims and claim severity distributions are independent.



Claim size amounts are independent and identically distributed.



Based on the methods of limited fluctuation credibility, the standard for full credibility is 3415 claims. With this standard, the observed pure premium will be within k% of the expected pure premium 95% of the time. Determine k.

(A) (B) (C) (D) (E)

Less than 5.75% At least 5.75%, but less than 6.25% At least 6.25%, but less than 6.75% At least 6.75%, but less than 7.25% At least 7.25%

42.15. [4B-S97:2] (2 points) Claim counts follow a Poisson distribution. Using the methods of limited fluctuation credibility, a full credibility standard of 1,200 expected claims has been established for aggregate claim costs. Determine the number of expected claims that would be required for full credibility if the coefficient of variation of the claim size distribution were changed from 2 to 4 and the range parameter were doubled. (A) 500

(B) 1000

(C) 1020

(D) 1200

(E) 2040

42.16. [4B-F93:11] (3 points) You are given the following: •

Number of claims follows a Poisson distribution.



Claim severity is independent of the number of claims and has the following probability density distribution f ( x )  5x −6 , x > 1.

A full credibility standard has been determined so that the total cost of claims is within 5% of the expected cost with a probability of 90%. If the same number of claims for full credibility of total cost is applied to frequency only, the actual number of claims would be within 100k% of the expected number of claims with a probability of 95%. Using the normal approximation of the aggregate loss distribution, determine k. (A) (B) (C) (D) (E)

Less than 0.0545 At least 0.0545, but less than 0.0565 At least 0.0565, but less than 0.0585 At least 0.0585, but less than 0.0605 At least 0.0605

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

42. LIMITED FLUCTUATION CREDIBILITY: POISSON FREQUENCY

838

42.17. [4B-S94:13] (2 points) You are given the following: •

120,000 exposures are needed for full credibility.



The 120,000 exposures standard was selected so that the actual total cost of claims is within 5% of the expected total 95% of the time.



The number of claims per exposure follows a Poisson distribution with mean m.



m was estimated from the following observed data using the method of moments: Year

Exposures

Claims

1 2 3

18,467 26,531 20,002

1293 1592 1418

If mean claim severity is 5000, determine the standard deviation of the claim severity distribution. (A) (B) (C) (D) (E)

Less than 9000 At least 9000, but less than 12,000 At least 12,000, but less than 15,000 At least 15,000, but less than 18,000 At least 18,000

42.18. [4B-F94:11] (3 points) You are given the following: •

Number of claims follows a Poisson distribution with mean µ.



X is the random variable for claim severity, and has a Pareto distribution with parameters α  3.0 and θ  6000.



A standard of full credibility was developed so that the observed pure premium is within 10% of the expected pure premium 98% of the time.



Number of claims and claim severity are independent.

Using the methods of limited fluctuation credibility, determine the number of claims needed for full credibility for estimates of the pure premiums. (A) (B) (C) (D) (E)

Less than 600 At least 600, but less than 1200 At least 1200, but less than 1800 At least 1800, but less than 2400 At least 2400

42.19. Claim counts follow a Poisson distribution. Claim size follows a two-parameter Pareto distribution with α  5, θ unknown. Using the methods of limited fluctuation credibility, a full credibility standard of 2,000 expected claims has been established so that actual aggregate claim costs will be within 5% of expected aggregate claim costs p% of the time. Determine p.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 42

839

42.20. Claim counts follow a Poisson distribution. Claim size follows an inverse Gaussian distribution with parameters µ  1000, θ  4000. Using the methods of limited fluctuation credibility, a full credibility standard of 1500 expected claims has been established so that actual aggregate claim costs are within 5% of expected aggregate claim costs p% of the time. Determine p. 42.21. [4B-S95:10] (1 point) You are given the following: •

The number of claims follows a Poisson distribution.



The distribution of claim sizes has a mean of 5 and variance of 10.



The number of claims and claim sizes are independent.

According to the methods of limited fluctuation credibility, how many expected claims are needed to be 90% certain that actual claim costs will be within 10% of the expected claim costs? (A) (B) (C) (D) (E)

Less than 100 At least 100, but less than 300 At least 300, but less than 500 At least 500, but less than 700 At least 700

42.22. [4B-S95:26] (3 points) You are given the following: •

40,000 exposures are needed for full credibility



The 40,000 exposures standard was selected so that the actual total cost of claims is within 7.5% of the expected total 95% of the time.



The number of claims per exposure follows a Poisson distribution with mean m.



The claim size distribution is lognormal with parameters µ (unknown) and σ  1.5.



The lognormal distribution has the following moments: 2



mean: e µ+ ( σ /2) 2 2 variance: e 2µ+σ ( e σ − 1)

The number of claims per exposure and claim sizes are independent. Using the methods of limited fluctuation credibility, determine the value of m.

(A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

42. LIMITED FLUCTUATION CREDIBILITY: POISSON FREQUENCY

840

42.23. [4B-F96:2] (1 point) Using the methods of limited fluctuation credibility, a full credibility standard of 1,000 expected claims has been established so that actual claim costs will be within 100c% of expected claim costs 90% of the time. Determine the number of expected claims that would be required for full credibility if actual claim costs were to be within 100c% of expected claim costs 95% of the time. (A) (B) (C) (D) (E)

Less than 1,100 At least 1,100, but less than 1,300 At least 1,300, but less than 1,500 At least 1,500, but less than 1,700 At least 1,700

42.24. [4B-F97:24 and 1999 C4 Sample:15] (3 points) You are given the following: •

The number of claims per exposure follows a Poisson distribution with mean 0.01.



Claim sizes follow a lognormal distribution with parameters µ (unknown) and σ  1.



The number of claims per exposure and claim sizes are independent.



The full credibility standard has been selected so that actual aggregate claim costs will be within 10% of expected aggregate claim costs 95% of the time.

Using the methods of limited fluctuation credibility, determine the number of exposures required for full credibility. (A) (B) (C) (D) (E)

Less than 25,000 At least 25,000, but less than 50,000 At least 50,000, but less than 75,000 At least 75,000, but less than 100,000 At least 100,000

42.25. [4B-F98:5] (2 points) You are given the following: •

The number of claims follows a Poisson distribution.



The variance of the number of claims is 10.



The variance of the claim size distribution is 10.



The variance of aggregate claim costs is 500.



The number of claims and claim sizes are independent.



The full credibility standard has been selected so that actual aggregate claim costs will be within 5% of expected aggregate claim costs 95% of the time.

Using the methods of limited fluctuation credibility, determine the expected number of claims required for full credibility. (A) (B) (C) (D) (E)

Less than 2,000 At least 2,000, but less than 4,000 At least 4,000, but less than 6,000 At least 6,000, but less than 8,000 At least 8,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 42

841

42.26. Claim frequency has a Poisson distribution. Claim size has a lognormal distribution with µ  8, σ  1. A full credibility standard is established so that aggregate claim costs should be within 5% of expected aggregate claim costs 90% of the time. Determine the expected aggregate claim cost necessary to qualify for full credibility under this standard. 42.27. The claim size distribution has a mean of 2,500 and a variance of 8,000,000. Claim frequency has a Poisson distribution with a mean of 0.2. Using the methods of limited fluctuation credibility, the standard for full credibility of aggregate claim costs is expected aggregate claim costs of 6,200,000. Using the same standard, n exposures would be required for full credibility for the number of claims. Determine n. 42.28. [4B-F98:29] (3 points) You are given the following: (i) (ii) (iii) (iv) (v)

The number of claims follows a Poisson distribution. Claim sizes follow a Burr distribution with parameters θ (unkown), α  6, and γ  0.5. The number of claims and claim sizes are independent. 6,000 expected claims are needed for full credibility. The full credibility standard has been selected so that actual aggregate claim costs will be within 10% of expected aggregate claim costs P% of the time.

Using the methods of limited fluctuation credibility, determine the value of P. (A) (B) (C) (D) (E)

Less than 80 At least 80, but less than 85 At least 85, but less than 90 At least 90, but less than 95 At least 95

42.29. [4B-F99:2] (2 points) You are given the following: •

The number of claims follows a Poisson distribution.



Claim sizes follow a lognormal distribution with parameters µ and σ.



The number of claims and claim sizes are independent.



13,000 expected claims are needed for full credibility.



The full credibility standard has been selected so that actual aggregate claim costs will be within 5% of expected aggregate claim costs 90% of the time. Determine σ.

(A) (B) (C) (D) (E)

Less than 1.2 At least 1.2, but less than 1.4 At least 1.4, but less than 1.6 At least 1.6, but less than 1.8 At least 1.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

42. LIMITED FLUCTUATION CREDIBILITY: POISSON FREQUENCY

842

42.30. [4-F03:3] You are given: (i) The number of claims has a Poisson distribution. (ii) Claim sizes have a Pareto distribution with parameters θ  0.5 and α  6. (iii) The number of claims and claim sizes are independent. (iv) The observed pure premium should be within 2% of the expected pure premium 90% of the time. Determine the expected number of claims needed for full credibility. (A) (B) (C) (D) (E)

Less than 7,000 At least 7,000, but less than 10,000 At least 10,000, but less than 13,000 At least 13,000, but less than 16,000 At least 16,000

42.31. [SOA3-F03:4] The following question is a repeat of exercise 15.19. But this time, work it out using limited fluctuation credibility concepts. Computer maintenance costs for a department are modeled as follows: (i)

The distribution of the number of maintenance calls each machine will need in a year is Poisson with mean 3. (ii) The cost for a maintenance call has mean 80 and standard deviation 200. (iii) The number of maintenance calls and the costs of the maintenance calls are all mutually independent. The department must buy a maintenance contract to cover repairs if there is at least a 10% probability that aggregate maintenance costs in a given year will exceed 120% of the expected costs. Using the normal approximation for the distribution of the aggregate maintenance costs, calculate the minimum number of computers needed to avoid purchasing a maintenance contract. (A) 80

(B) 90

(C) 100

(D) 110

(E) 120

Additional released exam questions: C-F05:35, C-F06:30

Solutions 42.1.

√ Use formula (42.1). The coefficient of variation of aggregate claims is 3,000,000/2000, so 2.326 eF  0.1

42.2.

!2

3,000,000  405.8 20002

!

Use formula (42.1). 2θ 2 − E[X]2  θ 2 − E[X]2 (2)(1) θ 2 − ( θ/2) 2 CV2  3 ( θ/2) 2

Var ( S ) 

!2

1.96 eF  (3)  4610 0.05

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 42

843

42.3. We want credibility for aggregate losses using exposure as the basis, but have no separate information on frequency and severity experience. We must use the general formula (42.1). 1.96 0.05

eF  42.4.

!2

15,000 12,000

!2  2401

The appropriate formula is the one in Table 42.1 for expected number of claims √needed for cred-

ibility of pure premium, n F  n0 (1 + CV2s ) . µ s  50,000, σS2  1.96 0.05

2 

1+

4 12



100,0002 , 12

CVs 

100,000/ 12 50,000



√2 . 12

NF 

 2048.85 (C)

42.5. The appropriate formula is the one in Table 42.1 for expected number of claims needed for credibility of pure premium, n F  n0 (1 + CV2s ) .

!2

1.96 1000  (1 + CV2s ) ⇒ CV2s  1.60308 ⇒ CVs  1.2661 0.1

(B)

42.6. The appropriate formula is the one in Table 42.1 for expected number of claims needed for credibility of pure premium, n F  n0 (1 + CV2s ) . 3025  42.7.

!2

1.96 (1 + CV2s ) ⇒ CV2s  6.8743 ⇒ CVs  2.6219 0.1

The severity distribution is a single-parameter Pareto with α  αθ 5/2 5   α − 1 3/2 3 f g αθ2 5/2 E X2   5 α − 2 1/2 25 20  σs2  5 − 9 9 √ √ 20/9 20 CVs    0.8944 5/3 5

5 2

(B)

and θ  1.

E[X] 

(C)

42.8. The formula for number of claims/number of claims in Table 42.1 is n0  (2.326/0.05) 2 whereas 2 the formula for number of claims/pure premium is 1.96 1 + CV2s . In the previous exercise we got k CVs 



20 5 ,

so CV2s 

20 52

 45 , and we have 2.326 0.05

!2

2.326 0.05

1.96  nF  k

!2

4 1+ 5



5 1.962  9 k2

!

k2  k

C/4 Study Manual—17th edition Copyright ©2014 ASM

!2 

(0.052 )(9)(1.962 ) (2.3262 )(5) 0.05 (3)(1.96) √ 2.326 5

 0.05653

(C)

844

42. LIMITED FLUCTUATION CREDIBILITY: POISSON FREQUENCY

42.9. The appropriate formula is the one for number of claims/pure premium in Table 42.1, n F  n0 (1 + CV2s ) . !2 !2 7500 + 1.645 * 1+  19,544 (C) nF  0.06 1500

,

-

42.10. Using Table 42.2, 1 + CV2s is 2 for an exponential, which is a gamma with α  1, and less than 2 for a gamma with α > 1, while it is more than 2 for a two-parameter Pareto. So the answer is II < I < III . (C) 42.11. For Weibull, µ s  θΓ (3)  2θ and E[X 2 ]  θ 2 Γ (5)  24θ 2 , so 1 + CV2s  6. For inverse gamma 2 (we’ll use the letter Z for this) µ Z  θ5 and E[Z 2 ]  θ20 , so 1 + CV2Z  1.25. We therefore have: 2500  n0 (6) 2500 n0  6

2500 (1.25)  520.8 6 42.12. µ s  3.1 σs2  0.5 (2.12 ) + 0.3 (1.12 ) + 0.2 (6.92 )  12.09 1.645 nF  0.1

!2 

1+

12.09  611 (A) 9.61



42.13. Let p be the probability parameter and k the range parameter. Then n0  ( y p /k ) 2 . If k is divided by 5 (changed from 0.05 to 0.01), n 0 is multiplied by 25 The answer is 25 (1000)  25,000 . (E) 42.14. n F  3415 

!2

1.96 ( 1 + 22 ) k

!2

1.96 k 1.96 k√  0.0750 683

683 

(E)

42.15. 1 + 22  5 becomes 1 + 42  17, so the result is multiplied by 17 5  3.4. The range parameter is 2 multiplied by 2, so the result is divided by 2  4. The answer is (1200)(3.4)(0.25)  1020 . (C) 42.16. The severity distribution is single-parameter Pareto with α  5, θ  1. µ s  5/3 16 1 + CV2s  (5/4  15 . Setting the two standards equal: )2 1.645 0.05

!2

r k

C/4 Study Manual—17th edition Copyright ©2014 ASM

16 1.96  1154.57  15 k

!

1.962  0.05768 1154.57

!2

(C)

5 4

and E[X 2 ]  35 , so

EXERCISE SOLUTIONS FOR LESSON 42

845

42.17. If e F  120,000, then n F  e F m  120,000m. 1.96 n F  120,000m  0.05 m σs 1+ 5000

!2

!2

!2 *1 + σ s + 5000 , -

1293 + 1592 + 1418  0.0662 18467 + 26531 + 20002

 5.16972

(B)

σs  10,210

42.18. Calculate the mean and second moment of the Pareto θ 6000   3000 α−1 2 2θ 2 2 (60002 ) E[X 2 ]    36,000,000 ( α − 1)( α − 2) (2)(1) E[X] 

Then the full credibility standard is

E[X 2 ] 2.326  0.1 E[X]2

!

n F  n0

!2

36 · 106  2164.11 9 · 106

!

42.19. θ 4 2 µ2s 2µ s  E[X 2 ]  4·3 6 2 /6 µ 8 s 1 + CV2s  2  µ s /16 3 µs 

2000  y p2

yp 0.05

!2

8 3

!

 1.875

y p  1.369 p  100 2 (0.9147) − 1  82.94





42.20. µ s  1000 10003 4000 10003 CV2s   0.25 4000 · 10002 ! yp 2 1500  (1 + 0.25) 0.05 σs2 

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

42. LIMITED FLUCTUATION CREDIBILITY: POISSON FREQUENCY

846

y p2  3 y p  1.732 p  100 2 (0.9582) − 1 91.64





42.21. n F  n 0 (1 + 42.22.

CV2s )

1.645  0.1

!2 

1+

10  378.84 25



(C)

e F  40,000, so n F  40,000m. First we’ll calculate the coefficient of variation of severity. e 2µ+2.25 ( e 2.25 − 1)  e 2.25 − 1 e 2µ+2.25

CV2s  Now we can calculate n F and back out m.

!2

1.96 (1 + e 2.25 − 1)  6479.66 n F  40,000m  0.075 6479.66 (D) m  0.1620 40,000 42.23. The counterpart of the previous exercise. Now y p is changed from 1.645 to 1.96, so n0 is multiplied by (1.96/1.645) 2 . The answer is 1000 (1.96/1.645) 2  1419.65 . (C) 42.24.

2

CV2s  e σ − 1  1.71828

Let e F be the number of exposures needed for full credibility. We use the exposure units/pure premium formula from Table 42.1. !2 1 1.96 eF  (1 + 1.71828)  104,425 (E) 0.01 0.1 42.25. We use Var ( PP )  E[N] Var ( X ) + Var ( N ) E[X]2 to back out µ s : 500  10 (10 + µ2s ) µ2s  40

nF 

1.96 0.05

!2 

1+

10  1920.8 40



(A)

2

42.26. By Table 42.2, 1 + CV2  e σ  e. Let s F be the full credibility standard expressed in terms of aggregate claims. s F  1082.41 (2.71828)( e 8.5 )  14,460,700 42.27. 6,200,000  2500n0

8,000,000 1+ 2,5002

n0  1087.72 1087.12  5438.60 n 0.2 C/4 Study Manual—17th edition Copyright ©2014 ASM

!

EXERCISE SOLUTIONS FOR LESSON 42

847

42.28. The mean of the Burr distribution is θΓ (1 + 1/γ ) Γ (6 − 1/γ ) θΓ (3) Γ (4) 2 · 6 θ   θ Γ (6) Γ (6) 120 10 The second moment of the Burr distribution is θ 2 Γ (1 + 2/γ ) Γ (6 − 2/γ ) θ 2 Γ (5) Γ (2) 4! 2 θ 2   θ  Γ(α) Γ (6) 5! 5 Thus 1 + CV2X  102 /5  20. Full credibility requires z (1+0.01P )/2 6000  0.1

!2

(20)

Solving for P, z (1+0.01P )/2 0.1 z (1+0.01P )/2 1 + 0.01P 2 P



√ 300

 1.73  0.9582  91.6

(D)

42.29. The credibility standard is expressed in terms of expected number of claims, and credibility is for aggregate claim costs, so the appropriate formula from Table 42.1 is n0 (1 + CV2s ) . Since the probability 2 parameter is 90% (normal coefficient 1.645) and the range parameter is 5%, n0  1.645  1082.41. Using 0.05 2 Table 42.2 for a lognormal distribution, 1 + CV2s  e σ . We now set the full credibility standard equal to 13,000 and solve for σ. 1082.41 e σ



2



 13,000

13,000  12.0102 1082.41 σ2  ln 12.0102  2.4858 √ σ  2.4858  1.5766 2

eσ 

(C)

42.30. For the Pareto claim size X, 0.5  0.1 5 2 (0.52 )  0.025 E[X 2 ]  (5)(4) 0.025 1 + CV2s   2.5 0.12 E[X] 

!2

1.645 nF  (2.5)  16,913 0.02

C/4 Study Manual—17th edition Copyright ©2014 ASM

(E)

42. LIMITED FLUCTUATION CREDIBILITY: POISSON FREQUENCY

848

42.31. The question is asking for the full credibility standard, in terms of exposures, so that aggregate claim costs are not above 120% of expected 90% of the time. Unlike most credibility questions, the interval is a one-sided interval, so we use the 90th percentile of the standard normal distribution instead of the 2  41.0881. 95th percentile. So n0  1.282 0.2 The coefficient of variation of “severity”, the cost of a maintenance call, is 200/80; squaring that, (200/80) 2  6.25. We apply the Poisson limited fluctuation credibility formula. Number of exposures needed for full credibility is n0 (1 + CVs2 ) (41.0881)(1 + 6.25)   99.30 λ 3 Unlike other usual credibility questions, we want minimum number of machines to satisfy this, so instead of rounding to the nearest integer, we round up to 100 . (C)

Quiz Solutions 42-1.

In all cases, n0  (1.645/0.05) 2  1082.41.

1.

“Exposure units” row, “Number of claims” column: 1082.41/0.1  10,824 .

2.

“Number of claims” row, “Aggregate losses/pure premium” column: 1082.41 (1 + 22 )  5,412 .

3.

“Aggregate losses” row, “Number of claims” column: 1082.41 (1000)  1,082,410 .

42-2. n0 

2.576 0.1

!2

 663.58

E[X]  θΓ (1 + 5)  θ (5!) E[X 2 ]  θ 2 Γ (1 + 10)  θ 2 (10!) 10! n F  663.58 2  663.58 (252)  167,222 5!

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 43

Limited Fluctuation Credibility: Non-Poisson Frequency

Reading: Loss Models Fourth Edition 17.3 or SN C-21-01 2.1–2.5

As we saw in the previous lesson (formula (42.1)), the general formula for the standard for full credibility in exposure units is

e_F = n0 (σ/µ)²

where e_F is the standard measured in exposure units (not claims), µ is the mean, and σ is the standard deviation of the item whose credibility you are measuring. To obtain n_F, the standard measured in expected claims, we multiply e_F by the mean frequency of claims, µ_f. This general formula is used to determine the number of exposure units (e.g., policy years) needed for full credibility of the pure premium if you're only given the mean and variance of aggregate claims, but not the separate means and variances of frequency and severity.

If you are establishing a standard for full credibility of claim sizes (severity) in terms of exposure units, the exposure unit is a claim, so the standard expressed in exposure units is the same as the standard expressed as (actual) number of claims. Formula (42.1) translates into a standard for full credibility of

e_F = n0 CV_s²    (43.1)

where e_F means actual (not expected) number of claims, which is the exposure unit for severity.

If you are establishing a standard for full credibility of claim frequency in terms of the number of exposures, formula (42.1) translates into

e_F = n0 (σ_f²/µ_f²)    (43.2)

where µ_f is the mean of frequency and σ_f² is the variance of frequency. To express this standard in terms of number of expected claims, we multiply both sides by µ_f to obtain

n_F = n0 (σ_f²/µ_f)    (43.3)

Notice how this generalizes the formula of the previous lesson for the Poisson, where σ_f² = µ_f. This standard can be expressed in terms of exposure units by dividing n_F by µ_f, the mean claim frequency.

If you are establishing a standard for full credibility of pure premium, assuming that claim counts and claim sizes are independent, the formula for the (expected) number of claims needed for full credibility can be derived as follows. By the compound mean formula, equation (14.1):

E[S] = E[N] E[X] = µ_f µ_s

By the compound variance formula, equation (14.2):

Var(S) = E[N] Var(X) + Var(N) E[X]² = µ_f σ_s² + σ_f² µ_s²
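These conversions are easy to verify numerically. Below is a short Python sketch (my own illustration; the function names and the sample frequency are not from the text) that computes e_F from formula (42.1) and converts it to n_F:

    def n0(yp, k):
        # base full-credibility standard (y_p / k)^2
        return (yp / k) ** 2

    def e_full(yp, k, mu, var):
        # formula (42.1): exposure units needed for full credibility
        return n0(yp, k) * var / mu ** 2

    # hypothetical frequency with mean 0.2 and variance 0.3
    mu_f, var_f = 0.2, 0.3
    eF = e_full(1.645, 0.05, mu_f, var_f)  # standard in exposures, (43.2)
    nF = eF * mu_f                         # standard in expected claims, (43.3)
    print(round(eF, 1), round(nF, 1))      # 8118.1 1623.6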


                      Credibility for
Experience            Number of           Claim size              Aggregate losses/
expressed in          claims              (severity)              Pure premium
--------------------------------------------------------------------------------------------
Policyholder-years    n0 σ_f²/µ_f²        n0 σ_s²/(µ_s² µ_f)      n0 (σ_f²/µ_f² + σ_s²/(µ_s² µ_f))
Number of claims      n0 σ_f²/µ_f (*)     n0 σ_s²/µ_s²            n0 (σ_f²/µ_f + σ_s²/µ_s²) (*)
Aggregate losses      n0 µ_s σ_f²/µ_f     n0 σ_s²/µ_s             n0 µ_s (σ_f²/µ_f + σ_s²/µ_s²)

Table 43.1: Limited Fluctuation Credibility formulas

From formula (42.1), noting that n_F = µ_f e_F:

n_F = µ_f n0 (µ_f σ_s² + σ_f² µ_s²)/(µ_f² µ_s²) = n0 (σ_f²/µ_f + CV_s²)    (43.4)

Notice the asymmetry: the denominator for frequency is µ_f, not squared, whereas CV_s² = σ_s²/µ_s². This standard can be expressed in terms of exposure units by dividing n_F by µ_f.

Example 43A You are given:
(i) Claim counts follow a negative binomial distribution with r = 2 and β = 0.5.
(ii) Claim sizes have coefficient of variation equal to 2.5.
(iii) Claim counts and claim sizes are independent.
The standard for full credibility of aggregate losses is set so that actual aggregate losses are within 5% of expected 95% of the time. Determine the number of expected claims needed for full credibility.

Answer: The ratio of the variance of a negative binomial, rβ(1 + β), to its mean, rβ, is 1 + β, which is 1.5 here. Using formula (43.4),

n_F = (1.96/0.05)² (1.5 + 2.5²) = 11,909

A summary of the formulas for all possible combinations of experience units used and what the credibility is for is shown in Table 43.1. To make the formulas parallel, I've avoided using the coefficient of variation. The two most common formulas are starred.
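If you prefer to check the table entries by machine, here is a small Python helper (an illustration of formula (43.4) under my own naming) applied to Example 43A:

    def n_full_pure_premium(yp, k, mu_f, var_f, cv2_s):
        # formula (43.4): expected claims for full credibility of pure premium
        n0 = (yp / k) ** 2
        return n0 * (var_f / mu_f + cv2_s)

    r, beta = 2, 0.5                  # Example 43A frequency
    mu_f = r * beta
    var_f = r * beta * (1 + beta)     # negative binomial variance
    print(round(n_full_pure_premium(1.96, 0.05, mu_f, var_f, 2.5 ** 2)))  # 11909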

Which distribution to use for frequency


It is important to understand that in limited fluctuation credibility, the same distribution is assumed to apply to any individual randomly selected. If each individual's frequency or severity has a parametric distribution where the parameter varies by individual, the mixed distribution must be used in all formulas.

Example 43B For a group dental plan, each individual's number of claims follows a binomial distribution with parameters m = 3 and Q. Q varies by individual, and for the group as a whole has a uniform distribution on [0, 1]. Claim size follows an inverse Gaussian distribution with parameters µ = 1000 and θ = 5. Classical credibility techniques are used. The standard for full credibility of aggregate loss experience is set so that the probability of observed claims being within 5% of expected claims is 90%. Determine the number of expected claims required for full credibility.

Answer: The mean of the binomial distribution is 3Q, and the mean of Q is 0.5, so the overall mean is

µ_f = 3(0.5) = 1.5

This is the mean number of claims for a randomly selected insured. The variance for a randomly selected insured must be calculated using the conditional variance formula. The variance used for σ_f² is not the variance of a binomial distribution with parameters m = 3 and q = 0.5.

σ_f² = E[Var(N | Q)] + Var(E[N | Q])
     = E[3Q(1 − Q)] + Var(3Q)
     = 3(1/2) − 3(1/3) + 3²(1/12) = 1.25

There is nothing new about the calculation of µ_s and σ_s:

µ_s = 1000
σ_s² = 1000³/5
CV_s² = (1000³/5)/1000² = 200

n_F = 1082.41 (σ_f²/µ_f + CV_s²) = 1082.41 (1.25/1.5 + 200) = 217,384

Naturally, you would have no trouble with this problem. You'd be more likely to get confused when the individual distribution is Poisson, but the parameter varies among the group. In this case, the group distribution is not Poisson, and in fact the variance must be greater than the mean by the conditional variance formula. You must not use the formulas from the previous lesson; use the formulas from this lesson.

Example 43C For a group dental plan, each individual's number of claims follows a Poisson distribution with parameter Λ. Λ varies by individual in accordance with a gamma distribution with parameters α = 2, θ = 1. Claim sizes follow an inverse Gaussian distribution with parameters µ = 1000 and θ = 5. Classical credibility techniques are used. The standard for full credibility of aggregate loss experience is set so that the probability of observed claims being within 5% of expected claims is 90%. Determine the number of expected claims required for full credibility.


Answer: µ_f = 2. In Lesson 12, we learned that in this case, overall claim frequency is a negative binomial with the same parameters as the gamma: r = α and β = θ. However, even if you don't know that, you can use the conditional variance formula to calculate the variance: σ_f² = E[Λ] + Var(Λ) = 2 + 2 = 4. We already calculated µ_s = 1000 and σ_s² = 1000³/5 in the previous example, resulting in CV_s² = 200. Therefore

n_F = 1082.41 (4/2 + 200) = 218,647
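Both examples reduce to the same recipe: compute the mixed frequency's mean and variance with the conditional variance formula, then apply (43.4). A Python sketch of that recipe (again my own illustration) reproduces Example 43C:

    # Example 43C: Poisson(Lambda), Lambda ~ gamma(alpha=2, theta=1)
    alpha, theta = 2, 1
    mu_f = alpha * theta                        # E[Lambda] = 2
    var_f = alpha * theta + alpha * theta ** 2  # E[Lambda] + Var(Lambda) = 4

    mu_s = 1000
    var_s = 1000 ** 3 / 5                       # inverse Gaussian: mu^3/theta
    cv2_s = var_s / mu_s ** 2                   # 200

    n0 = (1.645 / 0.05) ** 2                    # 1082.41
    print(round(n0 * (var_f / mu_f + cv2_s)))   # 218647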

Coverage of this material in the three syllabus options

The material in this lesson is covered by two of the syllabus reading options: Klugman et al. and Mahler-Dean. Herzog never mentions the possibility of using a frequency distribution other than Poisson in limited fluctuation credibility. Thus, the only formulas he has are n_F = n0 for credibility of claim counts and

n_F = n0 (1 + CV_s²)    (43.5)

for credibility of aggregate claims. He never mentions any other formula, such as (42.1) or (43.4). So are you responsible for these formulas? On the Spring 2005 exam, the first exam since 2000 for which Herzog was an option, they asked a non-Poisson frequency question. Also, there are a few questions with non-Poisson frequency (#39 and #148) in the set of 289 sample questions. In addition, there have been student reports about questions on recent exams involving non-Poisson frequency, or at least requiring formula (42.1). So I recommend knowing this material. As an alternative, if they do ask a question with non-Poisson frequency, you can always use the general formula (42.1) in conjunction with compound variance. Herzog doesn't provide that formula either, but you should definitely know it, if only because it will be useful in simulation (Lesson 62).

Exercises

43.1. The methods of limited fluctuation credibility are used. The standard for full credibility is that the item measured should be within 100k% of the true mean with probability p. Order the following items from lowest to highest. In each case, assume the distribution is nondegenerate (in other words, that the random variable is not a constant).

I. Number of losses for full credibility of individual losses if losses follow an exponential distribution.
II. Number of expected claims for full credibility of claim counts if claim counts follow a binomial distribution with m = 1.
III. Number of expected claims for full credibility of aggregate losses if claim counts follow a Poisson distribution and loss sizes follow a two-parameter Pareto distribution.

(A) I < II < III
(B) I < III < II
(C) II < I < III
(D) II < III < I
(E) III < II < I


43.2. Claim size follows a two-parameter Pareto distribution with parameters α = 3.5 and θ. The full credibility standard for claim size is set so that actual average claim size is within 5% of expected claim size with probability 98%. Determine the number of claims needed for full credibility.

43.3. Claim size follows an inverse Gaussian distribution with parameters µ = 1000, θ = 500. You set a standard for full credibility of claim size, using the methods of limited fluctuation credibility, so that expected claim size is within 100k% of actual claim size with probability 90%. Under this standard, 1500 claims are needed for full credibility. Determine k.

43.4. You are given the following information for a risk.

Claim frequency:  mean = 0.2, variance = 0.3
Claim severity:   gamma distribution with α = 2, θ = 10,000

Using the methods of limited fluctuation credibility, determine the number of expected claims needed so that aggregate claims experienced are within 5% of expected claims with probability 90%.

(A) Less than 1800
(B) At least 1800, but less than 1900
(C) At least 1900, but less than 2000
(D) At least 2000, but less than 2100
(E) At least 2100

43.5. For a certain coverage, claim frequency has a negative binomial distribution with β = 0.25. The full credibility standard is set so that the actual number of claims is within 6% of the expected number with probability 95%. Determine the number of expected claims needed for full credibility.

43.6. For an insurance portfolio, you are given the following:
(i) Claim counts for each individual have a negative binomial distribution with parameters r and β.
(ii) r does not vary by insured.
(iii) β varies by insured. Its distribution is an exponential distribution with mean 2.
(iv) For full credibility of number of claims, expected number of claims must be within 10% of actual 95% of the time.
(v) 3073 expected claims are required to meet the standard of full credibility.
Determine r.

43.7. For an automobile liability coverage, claim frequency has a negative binomial distribution with β = 0.5. The full credibility standard for aggregate losses is 2,000 expected claims. Determine the full credibility standard for aggregate losses if the coefficient of variation of the claim size distribution is increased from 3 to 5.

43.8. Claim frequency follows a Bernoulli distribution with mean 0.3. Claim size has a single-parameter Pareto distribution with α = 4 and θ = 1. Using the methods of limited fluctuation credibility, the full credibility standard is set so that actual aggregate claims are within 8% of expected aggregate claims 90% of the time. Determine the number of exposures needed for full credibility.


43.9. For an insurance portfolio, you are given the following:

(i) Claim count for each individual follows a Poisson distribution.
(ii) Mean claim count for each individual varies. The distribution of the means is a Weibull distribution.
(iii) For 500 insureds, the following claim counts are observed (number of claims, then number of insureds):

0 401

1 72

2 16

3 11

For full credibility of claim count, expected number of claims must be within 5% of actual with probability 95%.

Estimate the number of exposures needed for full credibility of claim counts.

43.10. Claim frequency follows a binomial distribution with parameters m and q. Claim size has a lognormal distribution with parameters µ and σ = 1. A full credibility standard is established so that actual aggregate claims are within 5% of expected aggregate claims 90% of the time. 2800 expected claims are required for full credibility. Determine q.

43.11. For a group of insureds, claim frequency for each insured follows a Poisson distribution with mean Λ. Λ varies according to a Pareto distribution with α = 5 and θ = 4. Claim size follows a gamma distribution with α = 50, θ = 1, and is independent of claim frequency. The methods of limited fluctuation credibility are used to assign credibility to this group. The full credibility standard requires aggregate claims to be within 6% of expected with probability 95%. Determine the expected number of claims needed for full credibility.

43.12. [4B-F94:15] (3 points) You are given the following:

Y represents the number of independent homogeneous exposures in an insurance portfolio.



The claim frequency rate per exposure is a random variable with mean = 0.025 and variance = 0.0025.



A full credibility standard is devised that requires the observed sample frequency rate per exposure to be within 5% of the expected population frequency rate per exposure 90% of the time. Determine the value of Y needed to produce full credibility for the portfolio’s experience.

(A) (B) (C) (D) (E)

Less than 900 At least 900, but less than 1500 At least 1500, but less than 3000 At least 3000, but less than 4500 At least 4500

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 43

855

43.13. [4-F02:14] You are given the following information about a commercial auto liability book of business: (i)

(i) Each insured's claim count has a Poisson distribution with mean λ, where λ has a gamma distribution with α = 1.5 and θ = 0.2.
(ii) Individual claim size amounts are independent and exponentially distributed with mean 5000.
(iii) The full credibility standard is for aggregate losses to be within 5% of the expected with probability 0.90.

Using classical credibility, determine the expected number of claims required for full credibility.

(B) 2381

(C) 3514

(D) 7216

(E) 7938

43.14. [4-F04:21] You are given: (i)

The number of claims has probability function:

p(x) = (m choose x) q^x (1 − q)^(m−x),   x = 0, 1, 2, . . . , m

(ii)

The actual number of claims must be within 1% of the expected number of claims with probability 0.95. (iii) The expected number of claims for full credibility is 34,574. Determine q. (A) 0.05

(B) 0.10

(C) 0.20

(D) 0.40

(E) 0.80

43.15. [4-F00:14] For an insurance portfolio, you are given:

(i) For each individual insured, the number of claims follows a Poisson distribution.
(ii) The mean claim count varies by insured, and the distribution of mean claim counts follows a gamma distribution.
(iii) For a random sample of 1000 insureds, the observed claim counts are as follows:

Number of claims, n       0    1    2    3   4   5
Number of insureds, f_n   512  307  123  41  11  6

Σ n f_n = 750    Σ n² f_n = 1494

(iv) Claim sizes follow a Pareto distribution with mean 1500 and variance 6,750,000.
(v) Claim sizes and claim counts are independent.
(vi) The full credibility standard is to be within 5% of the expected aggregate loss 95% of the time.

Determine the minimum number of insureds needed for the aggregate loss to be fully credible.

(A) Less than 8300
(B) At least 8300, but less than 8400
(C) At least 8400, but less than 8500
(D) At least 8500, but less than 8600
(E) At least 8600


43.16. [C-S05:2] You are given:

(i) The number of claims follows a negative binomial distribution with parameters r and β = 3.
(ii) Claim severity has the following distribution:

Claim Size   Probability
1            0.4
10           0.4
100          0.2

(iii) The number of claims is independent of the severity of claims.

Determine the expected number of claims needed for aggregate losses to be within 10% of expected aggregate losses with 95% probability.

(A) Less than 1200
(B) At least 1200, but less than 1600
(C) At least 1600, but less than 2000
(D) At least 2000, but less than 2400
(E) At least 2400

Solutions

43.1. For exponential X with mean θ, the variance is θ², so the coefficient of variation squared is θ²/θ² = 1, and the standard for full credibility, using the general formula, is n0 CV² = n0. For a binomial with parameters m = 1 and q, the coefficient of variation squared is q(1 − q)/q² = (1 − q)/q. The number of exposures for full credibility, by the general formula, is n0 (1 − q)/q, and the number of expected claims per exposure is q, so the number of expected claims needed for full credibility is n0 (1 − q). Since q > 0, this is less than I. (If q = 0, the distribution is degenerate; the random variable is the constant 0.) For a two-parameter Pareto, the standard for full credibility in terms of expected claims is 2n0 (α − 1)/(α − 2), based on Table 42.2. This is greater than 2n0, hence greater than I. (C)

43.2. Since we want credibility for severity using number of claims (exposure), formula (43.1) is the appropriate one. We calculate CV_s:

µ_s = θ/2.5
σ_s² = 2θ²/((2.5)(1.5)) − (θ/2.5)² = (28/75) θ²
CV_s² = (σ/µ)² = (28/75)(25/4) = 7/3

Using the formula, we conclude

n_F = (2.326/0.05)² (7/3) = 5049.6


43.3. Since we want credibility for severity using number of claims (exposure), formula (43.1) is the appropriate one. To back out k, we must first calculate CV_s:

µ_s = 1000
σ_s² = 10⁹/500
CV_s² = (10⁹/500)/1000² = 2

Now we equate 1500 to e_F:

1500 = (1.645/k)² (2)
750k² = 1.645²
k = 1.645/√750 = 0.06

43.4. We want credibility for aggregate claims using number of expected claims as the basis. With separate information on frequency and severity, formula (43.4) applies. We must calculate CV_s:

µ_s = 2(10,000) = 20,000
σ_s² = 2(10,000²) = 2 · 10⁸
CV_s² = 2 · 10⁸/20,000² = 1/2

We now apply formula (43.4):

n_F = 1082.41 (0.3/0.2 + 1/2) = 2164.82   (E)

43.5. We want credibility for frequency using number of expected claims as the basis. Formula (43.3) applies.

σ_f²/µ_f = 1 + β = 1.25
n_F = (1.96/0.06)² (1.25) = 1333.89

43.6. We want credibility for frequency using number of claims as the basis. Formula (43.3) applies. We must calculate the variance of a randomly picked insured using the conditional variance formula, formula (4.2), conditioning on β.

E[N] = E[rβ] = 2r
Var(N) = E_β[Var(N | β)] + Var_β(E[N | β])
       = E[rβ(1 + β)] + Var(rβ)
       = r E[β] + r E[β²] + r² Var(β)




and since E[β] = 2, Var(β) = 2² = 4, and E[β²] = 2(2²) = 8,

Var(N) = r(2 + 8) + r²(4)
Var(N)/E[N] = (10r + 4r²)/(2r) = 5 + 2r

Now we are ready to equate n_F to 3073:

(1.96/0.1)² (5 + 2r) = 3073
5 + 2r = 3073/384.16 = 8
r = 1.5

43.7. We are using credibility for aggregate claims with number of claims as a basis, so formula (43.4) applies: 2000 = n0 (1.5 + 9). Changing 9 to 25, we have n0 (1.5 + 25) = 2000 (26.5/10.5) = 5047.62.

43.8. We want credibility for aggregate claims, but using exposures as a basis. We first calculate n_F using formula (43.4).

CV_s² = (4/(4 − 2) − 4²/3²)/(4²/3²) = 1/8
n_F = (1.645/0.08)² (0.21/0.3 + 1/8) = 348.82

Now we translate this into exposures needed, e_F, by dividing through by the mean claims per exposure:

e_F = 348.82/0.3 = 1162.75
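As a numerical check of 43.8, here is a Python sketch (mine, not from the official solution) using the single-parameter Pareto moments:

    alpha, theta = 4, 1
    ex = alpha * theta / (alpha - 1)         # E[X] = 4/3
    ex2 = alpha * theta ** 2 / (alpha - 2)   # E[X^2] = 2
    cv2_s = (ex2 - ex ** 2) / ex ** 2        # 1/8
    mu_f = 0.3
    var_f = 0.3 * 0.7                        # Bernoulli variance
    nF = (1.645 / 0.08) ** 2 * (var_f / mu_f + cv2_s)
    print(round(nF, 2), round(nF / mu_f, 2)) # 348.82 1162.75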

43.9. This question is modeled after a published old exam question. They tried to fool you into using a Poisson distribution for frequency. We want credibility for claim count using exposure as the basis, so we use formula (43.3) and adjust at the end from number of claims to exposures. We use the sample mean and the sample variance, since the true mean and variance are not given. First we calculate the empirical mean and variance:

x̄ = (72(1) + 16(2) + 11(3))/500 = 0.274
µ̂₂ = (72(1²) + 16(2²) + 11(3²))/500 = 0.470
σ̂² = 0.470 − 0.274² = 0.394924

We multiply the empirical variance by n/(n − 1) to generate the unbiased sample variance. This is what the official solution said to do when this type of problem appeared on an exam. If you decided not to do this adjustment on the exam, your answer would still have been in the correct range of the five choices. Here too, using the empirical variance makes the answer only slightly lower.


s² = (500/499)(0.394924) = 0.39572
n_F = (1.96/0.05)² (0.39572/0.274) = 1536.64 (1.444) = 2219.27


Now we divide through by mean claims per exposure to obtain e_F:

e_F = 2219.27/0.274 = 8100
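The empirical calculation is mechanical enough to script. Here is a Python sketch of it (my own; the answer differs from the text only in rounding):

    # Exercise 43.9: empirical mean/variance from the claim count table
    counts = {0: 401, 1: 72, 2: 16, 3: 11}
    n = sum(counts.values())                      # 500
    mean = sum(k * v for k, v in counts.items()) / n
    m2 = sum(k * k * v for k, v in counts.items()) / n
    s2 = (m2 - mean ** 2) * n / (n - 1)           # unbiased sample variance
    nF = (1.96 / 0.05) ** 2 * s2 / mean
    print(round(nF, 2), round(nF / mean))         # 2219.26 and about 8100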

43.10. We want credibility for aggregate losses using number of claims as the basis. Formula (43.4) applies. For the lognormal, CV_s² = e^(σ²) − 1 = e − 1 (see Table 42.2), and for the binomial frequency, σ_f²/µ_f = 1 − q. So

2800 = 1082.41 (1 − q + e − 1)
q = e − 2800/1082.41 = 0.1315

43.11. We want credibility for aggregate losses using number of claims as the basis. Formula (43.4) applies. But first, we must use conditional variance formula (4.2) to calculate the variance of frequency.

µ_f = 4/(5 − 1) = 1
σ_f² = E[Λ] + Var(Λ) = 1 + (2(4²)/((5 − 1)(5 − 2)) − 1) = 8/3
CV_s² = 1/α = 0.02

Now we are ready to calculate n_F:

n_F = (1.96/0.06)² (8/3 + 0.02) = 2866.97

43.12. We want credibility for frequency using exposures as the basis. Formula (43.3) applies.

n_F = 1082.41 (0.0025/0.025) = 108.241
Y = e_F = 108.241/0.025 = 4329.64   (D)

43.13. Overall, claim counts have a negative binomial distribution with parameters r = α = 1.5 and β = θ = 0.2, so the variance divided by the mean is 1 + β = 1.2. Alternatively, you can calculate the variance using the conditional variance formula as E[λ] + Var(λ) = 0.3 + 0.06 = 0.36, and the mean is 0.3, making the quotient 1.2. Claim sizes have variance 5000² and mean 5000, so the coefficient of variation is 1. Then

n_F = 1082.41 (1.2 + 1) = 2381.30   (B)

43.14. We have

λ_F = (1.96/0.01)² (σ²/µ)

and for a binomial, σ² = mq(1 − q) and µ = mq, so the quotient σ²/µ = 1 − q. Then

196² (1 − q) = 34,574
38,416 (1 − q) = 34,574
q = 1 − 34,574/38,416 = 0.1   (B)


43.15. The coefficient of variation for severity, squared, is 6,750,000/1500² = 3. For frequency, we use the summary statistics to estimate the variance over the mean. The estimated mean is 750/1000 = 0.75. The estimated variance is 1494/1000 − 0.75² = 0.9315. If you wish, you can multiply this by 1000/999 (so that the sample variance is divided by n − 1 instead of by n), but it hardly makes a difference. So we have

n_F = (1.96/0.05)² (0.9315/0.75 + 3) = 6518.43
e_F = 6518.43/0.75 = 8691.24   (E)

43.16. The variance of number of claims divided by the mean is 1 + β = 4. The mean of claim size is 0.4(1) + 0.4(10) + 0.2(100) = 24.4. The second moment is 0.4(1) + 0.4(100) + 0.2(10,000) = 2040.4. The variance is then 2040.4 − 24.4² = 1445.04, and the coefficient of variation squared is 1445.04/24.4² = 2.4272. The answer is then

n_F = (1.96/0.1)² (4 + 2.4272) = 2469   (E)
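The discrete severity moments are a one-liner each; here is a Python check (my own sketch) of 43.16:

    sizes = {1: 0.4, 10: 0.4, 100: 0.2}
    m1 = sum(x * p for x, p in sizes.items())        # 24.4
    m2 = sum(x * x * p for x, p in sizes.items())    # 2040.4
    cv2 = (m2 - m1 ** 2) / m1 ** 2                   # 2.4272
    print(round((1.96 / 0.1) ** 2 * (4 + cv2)))      # 2469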

Lesson 44

Limited Fluctuation Credibility: Partial Credibility

Reading: Loss Models Fourth Edition 17.4–17.6 or SN C-21-01 2.6 or Introduction to Credibility Theory 5.4–5.5

When there is inadequate experience for full credibility, we must determine Z, the credibility factor. This will be used to determine the credibility premium P_C:

P_C = Z X̄ + (1 − Z) M    (44.1)

where M is the manual premium, the premium initially assumed if there is no credibility. For calculator purposes, it is easier to use this formula in the form

P_C = M + Z (X̄ − M)    (44.2)

since you don't need any memory, and entering M twice is likely to be easier than entering Z, which is usually some sort of fraction, twice. This alternative form is also intuitive; you are modifying M by adding the difference between actual experience and M, multiplied by the credibility assigned to the experience.

We saw in the story of Ventnor Manufacturing at the beginning of Lesson 42 that to multiply the variance of the results by α, we must multiply the results by √α. Therefore, the credibility factor for n expected claims is

Z = √(n/n_F)    (44.3)

where n_F is the number of expected claims needed for full credibility. The corresponding square root rule would apply to expressing credibility in exposures in terms of e_F, or credibility in terms of aggregate claims in terms of the amount needed for full credibility.

The partial credibility function is concave down; it grows rapidly for small numbers, then slows down. Figure 44.1 illustrates the curve if we assume 1082.41 claims are needed for full credibility.

Let's see how the Ventnor case fits into this formula. We established on page 832 that 1125 expected claims were needed for full credibility. We have 160 claims. Therefore Z = √(160/1125) = 0.3771, which matches the result we initially computed on page 828.
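Formulas (44.2) and (44.3) are two lines of code. Here is a Python sketch (my own helper names) that reproduces the Ventnor factor and, looking ahead, the estimate in Example 44A up to rounding:

    from math import sqrt

    def partial_z(n, n_full):
        # square root rule, formula (44.3), capped at full credibility
        return min(1.0, sqrt(n / n_full))

    def cred_premium(manual, xbar, z):
        # formula (44.2): P_C = M + Z*(xbar - M)
        return manual + z * (xbar - manual)

    print(round(partial_z(160, 1125), 4))           # 0.3771, the Ventnor factor
    z = partial_z(6000, 19544)
    print(round(cred_premium(16.5e6, 15.6e6, z)))   # about 16,001,330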

Quiz 44-1 If 250 expected claims result in 50% credibility, how many expected claims are needed for 20% credibility?

Example 44A (Version of 4B-S91:23) (1 point) Claim counts for a group follow a Poisson distribution. The standard for full credibility is 19,544 expected claims. We observe 6000 claims and a total loss of 15,600,000 for a group of insureds. If our prior estimate of the total loss is 16,500,000, determine the limited fluctuation credibility estimate of the total loss for the group of insureds.

(A) Less than 15,780,000
(B) At least 15,780,000, but less than 15,870,000
(C) At least 15,870,000, but less than 15,960,000
(D) At least 15,960,000, but less than 16,050,000
(E) At least 16,050,000


[Figure 44.1: Partial credibility if n_F = 1082.41. The graph plots credibility Z (vertical axis, 0 to 1.0) against expected claims (horizontal axis, 0 to 1200); the square-root curve rises steeply at first, then flattens, reaching Z = 1 at 1082.41 expected claims.]

Answer: The credibility factor is Z = √(n/n_F), with n = 6,000 and n_F = 19,544, so Z = √(6,000/19,544) = 0.55408, and the estimate is:

P_C = 16,500,000 + 0.55408 (15,600,000 − 16,500,000) = 16,001,332   (D)

Coverage of this material in the three syllabus options

This material is required. It is covered in all three syllabus reading options.

Exercises

44.1. You are given the following:
(i) Number of claims follows a Poisson distribution.
(ii) Limited fluctuation credibility methods are used.
(iii) The standard for credibility is set so that the actual aggregate losses are within 5% of expected losses 90% of the time.
(iv) 605 expected claims are required for 50% credibility.

Determine the coefficient of variation for the claim size distribution.

(A) Less than 1.50
(B) At least 1.50, but less than 2.00
(C) At least 2.00, but less than 2.50
(D) At least 2.50, but less than 3.00
(E) At least 3.00


Use the following information for questions 44.2 and 44.3:
You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes follow a Pareto distribution with parameters θ = 3000 and α = 4.
• The number of claims and claim sizes are independent.
• 2000 expected claims are needed for full credibility.
• The full credibility standard has been selected so that actual claim costs will be within 5% of expected claim costs P% of the time.

44.2. [4B-F95:11] (2 points) Using the methods of limited fluctuation credibility, determine the value of P.
(A) Less than 82.5
(B) At least 82.5, but less than 87.5
(C) At least 87.5, but less than 92.5
(D) At least 92.5, but less than 97.5
(E) At least 97.5

44.3. [4B-F95:12] (1 point) Using the methods of limited fluctuation credibility, determine the number of expected claims needed for 60% credibility.
(A) Less than 700
(B) At least 700, but less than 900
(C) At least 900, but less than 1100
(D) At least 1100, but less than 1300
(E) At least 1300

44.4. [4B-S92:6] (1 point) You are given the following information for a group of insureds:

Prior estimate of expected total losses          20,000,000
Observed total losses                            25,000,000
Observed number of claims                        10,000
Required number of claims for full credibility   17,500

Using the methods of limited fluctuation credibility, determine the estimate for the group's expected total losses based upon the latest observation.
(A) Less than 21,000,000
(B) At least 21,000,000, but less than 22,000,000
(C) At least 22,000,000, but less than 23,000,000
(D) At least 23,000,000, but less than 24,000,000
(E) At least 24,000,000


Use the following information for questions 44.5 and 44.6:
You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes follow a gamma distribution with parameters α = 1 and θ (unknown).
• The number of claims and claim sizes are independent.

44.5. [4B-S96:27] (2 points) The full credibility standard has been selected so that actual claim costs will be within 5% of expected claim costs 90% of the time. Using the methods of limited fluctuation credibility, determine the expected number of claims required for full credibility.
(A) Less than 1000
(B) At least 1000, but less than 2000
(C) At least 2000, but less than 3000
(D) At least 3000
(E) Cannot be determined from the given information

44.6. [4B-S96:28] (1 point) The full credibility standard has been selected so that the actual number of claims will be within 5% of the expected number of claims 90% of the time. Using the methods of limited fluctuation credibility, determine the credibility to be given to the experience if 500 claims are expected.
(A) Less than 0.20
(B) At least 0.20, but less than 0.40
(C) At least 0.40, but less than 0.60
(D) At least 0.60, but less than 0.80
(E) At least 0.80

44.7. •

P  Prior estimate of pure premium for a particular class of business.



O  Observed pure premium during latest experience period for same class of business.



R  Revised estimate of pure premium for same class following observations.



F  Number of claims required for full credibility of pure premium.

Based on the methods of limited fluctuation credibility, determine the number of claims used as the basis for determining R. (A) F

R−P O−P

!

(B) F

C/4 Study Manual—17th edition Copyright ©2014 ASM

R−P O−P

!2

(C)



F

R−P O−P

!

(D)



F

R−P O−P

!2

(E) F 2

R−P O−P

!


Use the following information for questions 44.8 and 44.9:
You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes are discrete and follow a Poisson distribution with mean 4.
• The number of claims and claim sizes are independent.

44.8. [4B-F96:28] (2 points) The full credibility standard has been selected so that actual claim costs will be within 10% of expected claim costs 95% of the time. Using the methods of limited fluctuation credibility, determine the expected number of claims required for full credibility.
(A) Less than 400
(B) At least 400, but less than 600
(C) At least 600, but less than 800
(D) At least 800, but less than 1,000
(E) At least 1,000

44.9. [4B-F96:29] (1 point) The full credibility standard has been selected so that the actual number of claims will be within 10% of the expected number of claims 95% of the time. Using the methods of limited fluctuation credibility, determine the number of expected claims needed for 40% credibility.
(A) Less than 100
(B) At least 100, but less than 200
(C) At least 200, but less than 300
(D) At least 300, but less than 400
(E) At least 400

44.10. [4-F03:35] You are given:
(i) X_partial = pure premium calculated from partially credible data
(ii) µ = E[X_partial]
(iii) Fluctuations are limited to ±kµ of the mean with probability P
(iv) Z = credibility factor

Which of the following is equal to P?
(A) Pr(µ − kµ ≤ X_partial ≤ µ + kµ)
(B) Pr(Zµ − kµ ≤ ZX_partial ≤ Zµ + k)
(C) Pr(Zµ − µ ≤ ZX_partial ≤ Zµ + µ)
(D) Pr(1 − k ≤ ZX_partial + (1 − Z)µ ≤ 1 + k)
(E) Pr(µ − kµ ≤ ZX_partial + (1 − Z)µ ≤ µ + kµ)


44.11. [4B-F92:15] (2 points) You are given the following:
• X is the random variable for claim size.
• N is the random variable for number of claims and has a Poisson distribution.
• X and N are independent.
• n0 is the standard for full credibility based only on number of claims.
• n_f is the standard for full credibility based on total cost of claims.
• n is the observed number of claims.
• C is the random variable for total cost of claims.
• Z is the amount of credibility to be assigned to total cost of claims.

According to the methods of limited fluctuation credibility, which of the following are true?

1. Var(C) = E[N] · Var(X) + E[X] · Var(N)
2. n_f = n0 (E[X]² + Var(X))/E[X]²
3. Z = √(n/n_f)

(A) 1 only  (B) 2 only  (C) 1,2 only  (D) 2,3 only  (E) 1,2,3

Use the following information for questions 44.12 and 44.13:
You are given the following:
• The number of claims follows a Poisson distribution.
• The coefficient of variation of the claim size distribution is 2.
• The number of claims and claim sizes are independent.
• 1,000 expected claims are needed for full credibility.
• The full credibility standard has been selected so that the actual number of claims will be within k% of the expected number of claims P% of the time.

44.12. [4B-S99:18] (1 point) Using the methods of limited fluctuation credibility, determine the number of expected claims needed for 50% credibility.
(A) Less than 200
(B) At least 200, but less than 400
(C) At least 400, but less than 600
(D) At least 600, but less than 800
(E) At least 800

44.13. [4B-S99:19] (1 point) Using the methods of limited fluctuation credibility, determine the number of expected claims that would be needed for full credibility if the full credibility standard were selected so that actual aggregate claim costs will be within k% of expected aggregate claim costs P% of the time.

(B) 1,250

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2,000

(D) 2,500

(E) 5,000


44.14. The manual pure premium for an insurance coverage is 1,200. Average experience on a rate class is 1,000. Using the methods of limited fluctuation credibility, a revised manual pure premium rate of 1,124 is used. The full credibility standard requires the actual number of claims to be within 5% of the expected number of claims 95% of the time. Claim counts on the coverage follow a Poisson distribution. Determine the number of claims observed for the rate class.

44.15. A full credibility standard is established, using the methods of limited fluctuation credibility, so that actual aggregate claim costs are within 10% of expected aggregate claim costs 100p% of the time. Claim frequency has a negative binomial distribution with β = 0.4. Claim sizes have a coefficient of variation of 1.5. Partial credibility of 88% is given to 1,000 claims. Determine p.

44.16. [4-S00:26] You are given:
(i) Claim counts follow a Poisson distribution.
(ii) Claim sizes follow a lognormal distribution with coefficient of variation 3.
(iii) Claim sizes and claim counts are independent.
(iv) The number of claims in the first year was 1000.
(v) The aggregate loss in the first year was 6.75 million.
(vi) The manual premium for the first year was 5.00 million.
(vii) The exposure in the second year is identical to the exposure in the first year.
(viii) The full credibility standard is to be within 5% of the expected aggregate loss 95% of the time.

Determine the limited fluctuation credibility net premium (in millions) for the second year.
(A) Less than 5.5
(B) At least 5.5, but less than 5.7
(C) At least 5.7, but less than 5.9
(D) At least 5.9, but less than 6.1
(E) At least 6.1

44.17. [4-F01:15] You are given the following information about a general liability book of business comprised of 2500 insureds:

(i) X_i = Σ_{j=1}^{N_i} Y_ij is a random variable representing the annual loss of the i-th insured.
(ii) N_1, N_2, . . . , N_2500 are independent and identically distributed random variables following a negative binomial distribution with parameters r = 2 and β = 0.2.
(iii) Y_i1, Y_i2, . . . , Y_iN_i are independent and identically distributed random variables following a Pareto distribution with α = 3.0 and θ = 1000.
(iv) The full credibility standard is to be within 5% of the expected aggregate losses 90% of the time.

Using classical credibility theory, determine the partial credibility of the annual loss experience for this book of business. (A) 0.35

(B) 0.42

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.47

(D) 0.50

(E) 0.53


44.18. For an insurance portfolio, you are given the following:
(i) The number of claims for each insured follows a Poisson distribution.
(ii) The mean claim count for each insured varies. The distribution of mean claim counts is a gamma distribution with α = 0.5 and θ = 4.
(iii) The size of claims for each insured follows a Pareto distribution with parameters α = 3 and θ = 6000.
(iv) The credibility standard is that aggregate claims must be within 10% of expected p of the time.
(v) 1812 claims were observed, resulting in 80% credibility.
Determine p.

Solutions

44.1. n_F = n0 (1 + CV_s²) = 1082.41 (1 + CV_s²), and Z = 0.5 = √(605/n_F), so

0.5 = √(605/(1082.41 (1 + CV_s²)))
0.25 (1082.41)(1 + CV_s²) = 605
CV_s² = 1.23575
CV_s = 1.1116   (A)

(A)

The full credibility standard is n0 (1 + CV2s ) . By Table 42.2, 2 (4 − 1) 3 4−2 !2 y0.01P 2000  (3) 0.05

1 + CV2s 

2 y0.01P 

(2000)(0.052 )

y0.01P  1.29

3

 1.667

P  100 2 (0.9015) − 1  80.3



44.3. 44.4.



(A)

2000 (0.6) 2  720 . (B) √ Z  10,000/17,500, and PC  20,000,000 + Z (25,000,000 − 20,000,000)  23,779,645 . (D)

44.5. “Actual claim costs” is the same as aggregate losses. Therefore, the appropriate formula for full credibility is the one on the “Number of claims” row, last column, of Table 42.1. αθ 2 1 α2 θ2 n F  n0 (1 + CV2s )  1082.41 (1 + 1)  2164.82

CV2s 

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

EXERCISE SOLUTIONS FOR LESSON 44

869

44.6. Since we want credibility for number of claims, the appropriate formula for full credibility is the one on the “Number of claims” row, “Number of claims” column of Table 42.1. We have n F  n0  1082.41

r Z 44.7.

n  nF

r

500  0.6797 1082.41

(D)

Let n be the number of claims.

r R  P + Z (O − P )  P + R−P  O−P

r

n F

n (O − P ) F

n F R−P O−P

!2 (B)

44.8. Since we want credibility for actual claim costs, the appropriate formula for full credibility is the one on the “Number of claims” row, last column of Table 42.1. We have CV2s 

4  0.25 42

n F  n0 (1 + CV2s ) 

!2

1.96 (1 + 0.25)  480.2 0.1

(B)

44.9. Since we want credibility for number of claims the appropriate formula for full credibility is the one on the “Number of claims” row, “Number of claims” column of Table 42.1. We have 1.96 n F  n0  0.1 384.16 (0.42 )  61.47

!2

 384.16

(A)

44.10. The idea of partial credibility is that we only take Z times the observed mean (Xpartial ), and take (1 − Z ) times the prior mean (µ), which has no variance. Thus ZXpartial + (1 − Z ) µ is to be limited to being within kµ of the true mean µ. That’s exactly what (E) says. 44.11. 1 should have E[X]2 instead of E[X]. 2 and 3, however, are correct. (D) √ √ 44.12. The credibility factor is Z  n/n F . We want 0.5  n/1000, so n  1000 (0.52 )  250 . (B) 44.13. The full credibility standard for claim counts in terms of expected claims is n 0 (refer to Table 42.1, page 831), and we are given that  this is 1000. The full credibility standard for aggregate claims in terms of 2 expected claims is n0 1 + CV , and we’re given that CV  2, so the standard is 1000 (1 + 22 )  5000 . (E) √ 1200 − 1124 44.14. Z   0.38  n/n F , and n F  (1.96/0.05) 2  1536.64, so n  1536.64 (0.382 )  222 . 1200 − 1000 44.15. We use equation (43.4) and back out y p . nF 

yp 0.1

!2

(1.4 + 1.52 )  365y p2

1000  n F (0.882 ) C/4 Study Manual—17th edition Copyright ©2014 ASM

44. LIMITED FLUCTUATION CREDIBILITY: PARTIAL CREDIBILITY

870

n F  1291.32 1291.32 y p2   3.53 365 y p  1.88 p  2 (0.9699) − 1  0.9398 44.16. Expected claims needed for full credibility are:

!2

1.96 nF  (1 + 32 )  15,366.4 0.05 Credibility is therefore Z 



1000/15,366.4  0.2551. The credibility premium is (in millions) PC  5 + 0.2551 (6.75 − 5)  5.4464

(A)

44.17. The number of expected claims needed for full credibility is σ2

2

* f + σs +/ λ F  1082.41 . µf µ2s , and σ2f µf



rβ (1 + β )  1 + β  1.2 rβ

µ s  500 σs2  10002 − 5002  750,000

n F  1082.41 (1.2 + 3)  4546.122 Expected claims from 2500 insureds is 2500rβ  1000, so credibility is Z 



1000/4546.122  0.4690 . (C)

44.18. Claim size does not vary by insured, so the group’s claim size distribution is Pareto. However, claim frequency does vary. Let Λ be the parameter of the Poisson. We first need to calculate the variance of claim count. We can either use the fact that claim count is a negative binomial with parameters r  0.5 and β  4, so Var ( N )  (0.5)(4)(5)  10 (as discussed in Lesson 12) or use the conditional variance formula (4.2) as follows: µf  2 σ2f  E[Λ] + Var (Λ)  0.5 (4) + 0.5 (42 )  10 Now we use the credibility formula for full credibility of aggregate claims with a claim count base, (43.4). 6000  3000 2 σs2  60002 − 30002

µs 

CV2s  nF  C/4 Study Manual—17th edition Copyright ©2014 ASM

60002 − 30002 3 30002

yp 1812  2831.25  2 0.1 0.8

!2 

10 +3 2




y_p² = 2831.25 (0.1²)/8 = 3.539
y_p = 1.88 = Φ⁻¹(0.9699)
p = 2(0.9699) − 1 = 0.9398

Quiz Solutions

44-1. We want √(n/n_F) = 0.2 and are given that √(250/n_F) = 0.5, so n_F = 250/0.5² and n = 250 (0.2/0.5)² = 40.

C/4 Study Manual—17th edition Copyright ©2014 ASM

44. LIMITED FLUCTUATION CREDIBILITY: PARTIAL CREDIBILITY

Lesson 45

Bayesian Estimation and Credibility—Discrete Prior

Reading: Loss Models Fourth Edition 18.1–18.3 or SN C-21-01 4 or Introduction to Credibility Theory 4

Expect three questions on the exam on Bayesian methods, the topic of this and the next lesson.

The Bayesian statistician assumes that the universe follows a parametric model, with unknown parameters. The distribution of the model given the value of the parameters is called the model distribution. Unlike the frequentist who estimates the parameters from the data, the Bayesian assigns a prior probability distribution to the parameters—"I don't know what θ is, but it is equally likely to be 10 and 20", or, "it is distributed like an exponential with mean 100". After observing data, a new distribution, the posterior distribution, is developed for the parameters, using Bayes' Theorem, which for discrete priors and models is

Pr(A | B) = Pr(B | A) Pr(A) / Pr(B)    (45.1)

where the left side is the posterior probability, B is the observations, and A is the prior distribution.

The difference between Bayesian estimation of parameters and Bayesian credibility is subtle. In Bayesian estimation of parameters, the prior distribution is for the entire universe. In Bayesian credibility, the prior distribution is for the block of business that is being insured. This is not a mathematical distinction, and it's hard to classify exam problems as estimation versus credibility. So we'll study both together.

This lesson deals with discrete priors, for which you may use mechanical techniques if you wish. The next lesson will deal with continuous priors, where the sum representing Pr(B) becomes an integral.

A typical Bayesian credibility problem will start out with a breakdown of risks into several categories. The losses (or number of claims, or aggregate claims) for each risk follows some distribution. The distribution is not the same for each risk, but varies. Thus there are two levels of variation: the loss distribution for each insured varies, and even if you know the loss distribution for a given insured, you do not know what the insured's losses will be because the loss distribution itself introduces randomness.

Now, you select a risk at random and observe some loss experience for him. You are asked to deduce one of two things:

1. What is the probability that this risk belongs to some class?
2. What is the expected size of the next loss for this risk?

We will begin assuming that there are only a finite number of classes of risks. Most exam problems have only a finite number of classes, since problems with an infinite number of classes are difficult to work out. We will describe a mechanical approach for solving this sort of problem. After you develop some experience, you will be able to use shortcuts to speed up your work.

We will construct a 4-line table to solve the first type of problem (what is the probability), with 2 additional lines for solving the second type of problem (what is the expectation). The table will have one column for each type of risk.

1. In the first row, enter the prior probability that the risk is in each class. This is given in the problem.


2. In the second row, enter the likelihood of the experience given the class. This is also given in the problem.

3. The third row is the product of the first two rows. It is the probability of being in the class and having the observed experience, or the joint probability. Sum up the entries of the third row. Each entry of the third row is a numerator in the expression for the posterior probability of being in the class given the experience given by Bayes' Theorem, while the sum is the denominator in this expression.

4. The fourth row is the quotient of the third row over its sum. It is the posterior probability of being in each class given the experience—the answer to the first question.

5. In the fifth row, enter the expected value, given that the risk is in the class. These are known as the hypothetical means.

6. In the sixth row, enter the product of the fourth and fifth rows. Sum up the entries of the sixth row. This sum is the expected size of the next loss for this risk, given the experience, also known as the Bayesian premium¹—the answer to the second question.

Example 45A Your company sells an automobile collision coverage. The portfolio of insureds has good drivers and bad drivers. There are 3 times as many good drivers as bad drivers.

For good drivers, the number of claims follows a Poisson distribution with mean 0.1. The size of claims is 1000 with probability 0.9 and 5000 with probability 0.1.

For bad drivers, the number of claims follows a Poisson distribution with mean 0.3. The size of claims is 1000 with probability 0.8 and 5000 with probability 0.2.

For an insured selected at random, you have three years of experience. The insured made no claims in the first 2 years and one claim for 1000 in the third year. Determine aggregate claim costs for this insured in the next year.

Answer: The table has two columns: good drivers and bad drivers. In the first row, we insert the probabilities 3/4 and 1/4, since there are 3 times as many good drivers as there are bad drivers.

The probability that a good driver would have the three years of experience indicated by (0, 0, 1000) is (e^−0.1)(e^−0.1)(0.1e^−0.1)(0.9) = 0.06667, where parentheses have been placed around each year's probability. Similarly, the probability of a bad driver having the indicated experience is (e^−0.3)(e^−0.3)(0.3e^−0.3)(0.8) = 0.09758. These two numbers go in the second row.

The third row is the product of the first two rows, or 0.05001, 0.02439. The sum is 0.07440. The fourth row is the quotient of the third row over the row sum, or 0.6721, 0.3279. These are the conditional probabilities of being a good or bad driver, and must add up to 1.

In the fifth row, we enter the expected claim experience of the two classes. For good drivers, this is (0.1)(1400) = 140, and for bad drivers this is (0.3)(1800) = 540. In the sixth row, we enter (0.6721)(140) and (0.3279)(540) and sum these up. The final answer is then (0.6721)(140) + (0.3279)(540) = 271.15. The table then looks like this:

¹ Bayesian premium refers to the predicted expected value of the next trial. It is not necessarily a premium in the usual sense; it may refer to expected claim count, expected claim size, or expected aggregate loss. It is the expected value of the predictive distribution of whatever you're working with.


                           Good Drivers   Bad Drivers
Prior probabilities        0.75           0.25
Likelihood of experience   0.06667        0.09758
Joint probabilities        0.05001        0.02439      (sum: 0.07440)
Posterior probabilities    0.6721         0.3279
Hypothetical means         140            540
Bayesian premium           94.10          177.06       (sum: 271.15)
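The 6-line table is mechanical enough to code directly. Here is a Python sketch (my own helper, not part of the reading) that reproduces Example 45A:

    from math import exp

    def bayes_table(priors, likelihoods, hyp_means):
        # the 6-line table: posterior probabilities and the Bayesian premium
        joints = [p * l for p, l in zip(priors, likelihoods)]
        total = sum(joints)
        posteriors = [j / total for j in joints]
        premium = sum(z * m for z, m in zip(posteriors, hyp_means))
        return posteriors, premium

    # Example 45A: good vs. bad drivers, experience (0, 0, 1000)
    lik_good = exp(-0.1) * exp(-0.1) * (0.1 * exp(-0.1)) * 0.9
    lik_bad = exp(-0.3) * exp(-0.3) * (0.3 * exp(-0.3)) * 0.8
    post, prem = bayes_table([0.75, 0.25], [lik_good, lik_bad], [140, 540])
    print([round(z, 4) for z in post], round(prem, 2))  # [0.6721, 0.3279] 271.15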

Quiz 45-1 For a certain insurance coverage, only one claim per year can be submitted. There are two types of group. In a good group, the expected annual number of claims from each risk is 0.1. In a bad group, the expected annual number of claims from each risk is 0.2. The probability that a group is good is 70%. A group of 10 risks submits 2 claims in one year. Determine the expected number of claims submitted by this group in the following year.

If any distribution is continuous, use densities instead of probabilities.

Example 45B Your company sells an automobile collision coverage. The portfolio of insureds has good drivers and bad drivers. There are three times as many good drivers as bad drivers.

For good drivers, the number of claims follows a Poisson distribution with mean 0.1. The size of claims follows a Pareto distribution with parameters α = 2, θ = 1000.

For bad drivers, the number of claims follows a Poisson distribution with mean 0.3. The size of claims follows a Pareto distribution with parameters α = 2, θ = 2000.

For an insured selected at random, you have three years of experience. The insured made no claims in the first 2 years and one claim for 1000 in the third year. Determine aggregate claim costs for this insured in the next year.

Answer: The likelihood of a claim size of 1000 for a good driver is 2(1000²)/(1000 + 1000)³ = 0.00025, while for a bad driver it is 2(2000²)/(2000 + 1000)³ = 0.0002963. The likelihood of 2 years of no claims and 1 year of 1 claim for a good driver is (e^−0.1)(e^−0.1)(0.1e^−0.1) = 0.07408, while for a bad driver it is (e^−0.3)(e^−0.3)(0.3e^−0.3) = 0.1220. Therefore, the likelihood of 2 years with no claims and 1 year of a claim of 1000 is the product of the claim size likelihood and the claim frequency likelihood, or (0.00025)(0.07408) = 0.00001852 for a good driver and (0.0002963)(0.1220) = 0.00003614 for a bad driver. These numbers are therefore entered on the second row. On the fifth row, since the means of the Paretos are θ/(α − 1), or 1000 and 2000, the hypothetical means are (0.1)(1000) = 100 and (0.3)(2000) = 600. So the revised table looks like this:

                           Good Drivers   Bad Drivers
Prior probabilities        0.75           0.25
Likelihood of experience   0.00001852     0.00003614
Joint probabilities        0.00001389     0.00000903   (sum: 0.00002293)
Posterior probabilities    0.6059         0.3941
Hypothetical means         100            600
Bayesian premium           60.59          236.46       (sum: 297.05)

The final answer is 297.05.



Two shortcuts are available:

1. If the probabilities of being in each of the classes are equal, you can skip the first line and treat the second line as if it's the third line. This is because the first line is only used to weight the second line.


On an exam question, you are often told all classes have the same number of risks, which means that the probabilities of each of the classes are equal.

2. If you are not interested in the posterior probabilities but only in the predictive expected value, you can skip line 4. Simply weight the hypothetical means, the numbers on line 5, using line 3 as weights (instead of line 4). For example, in the previous example, the answer could have been expressed as

(0.00001389(100) + 0.00000903(600)) / (0.00001389 + 0.00000903) = 297.05

We will use the second shortcut in the next example.

Example 45C An automobile liability coverage is sold in three territories, A, B, and C. 50% of the business is sold in A, 20% in B, and 30% in C. Claim frequencies on this coverage are given in the following table:

             Number of Claims
Territory    0     1     2
A            0.6   0.3   0.1
B            0.7   0.2   0.1
C            0.2   0.7   0.1

An insured selected at random has no claims in one period.

1. Determine the probability of one claim from this insured in the next period.
2. Determine the expected number of claims from this insured in the next period.

Answer: 1. This question is a little different from the "expected value" question, but is solved in the same way. The probability of one claim from the insured will be the weighted average of the probability of one claim from any insured, with the weights proportionate to the posterior probabilities. The joint probabilities of no claims and being in a territory (line 3) are (0.5)(0.6) = 0.3 for A, (0.2)(0.7) = 0.14 for B, and (0.3)(0.2) = 0.06 for C. The sum of these is 0.3 + 0.14 + 0.06 = 0.5. Line 5, instead of having conditional expected values, will have conditional probabilities of 1 claim given the territory, which can be read off the table: 0.3 for A, 0.2 for B, and 0.7 for C. We now compute line 6, the weighted average of line 5:

P(N₂ = 1 | N₁ = 0) = (0.3(0.3) + 0.14(0.2) + 0.06(0.7)) / (0.3 + 0.14 + 0.06) = 0.32

2. The only difference between this and the first problem is that line 5 will have the hypothetical means. We calculate the hypothetical means: 0.6(0) + 0.3(1) + 0.1(2) = 0.5 for A, 0.7(0) + 0.2(1) + 0.1(2) = 0.4 for B, and 0.2(0) + 0.7(1) + 0.1(2) = 0.9 for C. Using the same weights as before, we have:

E[N₂ | N₁ = 0] = (0.3(0.5) + 0.14(0.4) + 0.06(0.9)) / (0.3 + 0.14 + 0.06) = 0.52
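The second shortcut is exactly a weighted average with the joint probabilities as weights. A Python sketch of Example 45C (my own illustration) computes both parts:

    # Example 45C by the shortcut: weight line 5 by line 3
    priors = {'A': 0.5, 'B': 0.2, 'C': 0.3}
    p_no_claim = {'A': 0.6, 'B': 0.7, 'C': 0.2}
    p_one_claim = {'A': 0.3, 'B': 0.2, 'C': 0.7}
    hyp_mean = {'A': 0.5, 'B': 0.4, 'C': 0.9}

    joint = {t: priors[t] * p_no_claim[t] for t in priors}  # line 3
    total = sum(joint.values())
    print(sum(joint[t] * p_one_claim[t] for t in joint) / total)  # 0.32
    print(sum(joint[t] * hyp_mean[t] for t in joint) / total)     # 0.52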

Coverage of this material in the three syllabus options

This material is required. It is covered in all three syllabus reading options.




EXERCISES FOR LESSON 45

877

Exercises [4B-S92:4] (2 points) You have selected a die at random from the two dice described below.

45.1.

Die A

Die B

2 sides labeled 1 2 sides labeled 2 2 sides labeled 3

4 sides labeled 1 1 side labeled 2 1 side labeled 3

The following outcomes from five tosses of the selected die are observed: 1, 1, 2, 3, 1. Determine the probability that you selected Die A. (A) (B) (C) (D) (E) 45.2.

Less than 0.20 At least 0.20, but less than 0.30 At least 0.30, but less than 0.40 At least 0.40, but less than 0.50 At least 0.50 [4B-S94:5] (2 points) Two honest, six-sided dice are rolled, and the results D1 and D2 are observed.

Let S  D1 + D2 . Which of the following are true concerning the conditional distribution of D1 given that S < 6? 1.

The mean is less than the median.

2.

The mode is less than the mean.

3.

The probability that D1  2 is 13 .

(A) 2 (B) 3 (C) 1,2 (E) The correct answer is not given by (A) , (B) , (C) , or (D) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D) 2,3

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

878

Use the following information for questions 45.3 and 45.4: Two dice, A1 and A2 , are used to determine the number of claims. Each side of both dice is marked with either a 0 or a 1, where 0 represents no claim and 1 represents a claim. The probability of a claim for each die is: Die

Probability of Claim

A1 A2

1/6 3/6

In addition, there are two spinners, B1 and B2 , representing claim severity. Each spinner has two areas marked 2 and 14. The probabilities for each claim size are: Claim Size Spinner

2

14

B1 B2

5/6 3/6

1/6 3/6

A die is randomly selected from A1 and A2 and a spinner is randomly selected from B1 and B2 . The selected die is rolled and if a claim occurs, the selected spinner is spun. 45.3. [4B-S93:13] (2 points) Determine E[X1 ], where X1 is the first observation from the selected die and spinner. (A)

2 3

(B)

4 3

(C) 2

(D) 4

(E) 8

45.4. [4B-S93:14] (2 points) For the same selected die and spinner, determine the limit of E[X n | X1  X2  · · ·  X n−1  0] as n goes to infinity. (A) (B) (C) (D) (E)

Less than 0.75 At least 0.75, but less than 1.50 At least 1.50, but less than 2.25 At least 2.25, but less than 3.00 At least 3.00

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 45

879

Use the following information for questions 45.5 and 45.6: Two dice, A and B, are used to determine the number of claims. The faces of each die are marked with either a 1 or 2, where 1 represents 1 claim and 2 represents 2 claims. The probabilities for each die are: Die A B

Probability of 1 Claim 2/3 1/3

Probability of 2 Claims 1/3 2/3

In addition, there are two spinners, X and Y, which are used to determine claim size. Each spinner has two areas marked 2 and 5. The probabilities for each spinner are: Spinner X Y

Probability that Claim Size = 2 2/3 1/3

Probability that Claim Size = 5 1/3 2/3

For the first trial, a die is randomly selected from A and B and rolled. If 1 claim occurs, spinner X is spun. If 2 claims occur, both spinner X and spinner Y are spun. For the second trial, the same die selected in the first trial is rolled again. If 1 claim occurs, spinner X is spun. If 2 claims occur, both spinner X and spinner Y are spun. 45.5. (A) (B) (C) (D) (E)

[4B-S96:18] (2 points) Determine the expected amount of total losses for the first trial. Less than 4.8 At least 4.8, but less than 5.1 At least 5.1, but less than 5.4 At least 5.4, but less than 5.7 At least 5.7

45.6. [4B-S96:19] (2 points) If the first trial yielded total losses of 5, determine the expected number of claims for the second trial. (A) (B) (C) (D) (E)

Less than 1.38 At least 1.38, but less than 1.46 At least 1.46, but less than 1.54 At least 1.54, but less than 1.62 At least 1.62

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

880

Use the following information for questions 45.7 and 45.8: Two dice, A and B, are used to determine the number of claims. The faces of each die are marked with either a 0 or a 1, where 0 represents 0 claims and 1 represents 1 claim. The probabilities for each die are: Die A B

Probability of 0 Claims 2/3 1/3

Probability of 1 Claim 1/3 2/3

In addition, there are 2 spinners, X and Y, which are used to determine claim size. Spinner X has two areas marked 2 and 8. Spinner Y has only one area marked 2. The probabilities for each spinner are: Spinner X Y

Probability that Claim Size = 2 1/3 1

Probability that Claim Size = 8 2/3 0

For the first trial, a die is randomly selected from A and B and rolled. If a claim occurs, a spinner is randomly selected from X and Y and spun. 45.7. (A) (B) (C) (D) (E)

[4B-F96:6] (1 point) Determine the expected amount of total losses for the first trial. Less than 1.4 At least 1.4, but less than 1.8 At least 1.8, but less than 2.2 At least 2.2, but less than 2.6 At least 2.6

45.8. [4B-F96:7] (2 points) For each subsequent trial, the same die selected in the first trial is rolled again. If a claim occurs, a spinner is again randomly selected from X and Y and spun. Determine the limit of the Bayesian analysis estimate of the expected amount of total losses for the n th trial as n goes to infinity if the first n − 1 trials each yielded total losses of 2. (A) (B) (C) (D) (E)

Less than 1.4 At least 1.4, but less than 1.8 At least 1.8, but less than 2.2 At least 2.2, but less than 2.6 At least 2.6

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 45

881

Use the following information for questions 45.9 and 45.10: You are given the following: •

Two urns each contain three marbles.



One urn contains two red marbles and one black marble.



The other urn contains one red marble and two black marbles.

An urn is randomly selected and designated Urn A. The other urn is designated Urn B. One marble is randomly drawn from Urn A. The selected marble is placed in Urn B. One marble is randomly drawn from Urn B. This selected marble is placed in Urn A. One marble is randomly drawn from Urn A. This third selected marble is placed in Urn B. One marble is randomly drawn from Urn B. This fourth selected marble is placed in Urn A. This process is continued indefinitely, with marbles alternatively drawn from Urn A and Urn B. [4B-F97:27] (1 point) The first two selected marbles are red.

45.9.

Determine the Bayesian analysis estimate of the probability that the third selected marble will be red. (A) 1/2

(B) 11/21

(C) 4/7

(D) 2/3

(E) 1

45.10. [4B-F97:28] (2 points) Determine the limit as n goes to infinity of the Bayesian analysis estimate of the probability that the (2n + 1) st selected marble will be red if the first 2n selected marbles are red (where n is an integer). (A) 1/2

(B) 11/21

(C) 4/7

(D) 2/3

(E) 1

45.11. [4B-S97:18] (3 points) You are given the following: •

12 urns each contain 10 marbles.



n of the urns contain 3 red marbles and 7 black marbles.



The remaining 12 − n urns contain 6 red marbles and 4 black marbles.

An urn is randomly selected, and one marble is randomly drawn from it. The selected marble is red. The marble is replaced, and a marble is again randomly drawn from the same urn. The Bayesian analysis estimate of the probability that the second selected marble is red is 0.54. Determine n. (A) 4

(B) 5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 6

(D) 7

(E) 8

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

882

45.12. [4B-S98:24] (3 points) You are given the following: •

A portfolio consists of 100 independent risks.



25 of the risks have a policy with a $5,000 per claim policy limit, 25 of the risks have a policy with a $10,000 per claim policy limit, and 50 of the risks have a policy with a $20,000 per claim policy limit.



The risks have identical claim count distributions.



Prior to censoring by policy limits, claim sizes for each risk follow a Pareto distribution with parameters θ  5,000 and α  2.



A claims report is available which shows the number of claims in various claim size ranges for each policy after censoring by policy limits, but does not identify the policy limit associated with each policy.

The claims report shows exactly one claim for a policy selected at random. This claim falls in the claim size range of $9,000–$11,000. Determine the probability that this policy has a $10,000 policy limit. (A) (B) (C) (D) (E)

Less than 0.35 At least 0.35, but less than 0.55 At least 0.55, but less than 0.75 At least 0.75, but less than 0.95 At least 0.95

45.13. [4B-F95:18, 1999 C4 Sample:11] You are given: •

A portfolio consists of 150 independent risks.



100 of the risks each have a policy with a $100,000 per claim policy limit, and 50 of the risks each have a policy with a $1,000,000 per claim policy limit.



The risks have identical claim count distributions.



Prior to censoring by policy limits, the claim size distribution for each risk is as follows:



Claim Size

Probability

$10,000 $50,000 $100,000 $1,000,000

1/2 1/4 1/5 1/20

A claims report is available that shows actual claim sizes incurred for each policy after censoring by policy limits, but does not identify the policy limit associated with each policy.

The claims report shows exactly three claims for a policy selected at random. Two of the claims are $100,000, but the amount of the third is illegible. Determine the expected value of this illegible number. (A) (B) (C) (D) (E)

Less than $45,000 At least $45,000, but less than $50,000 At least $50,000, but less than $55,000 At least $55,000, but less than $60,000 At least $60,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 45

883

45.14. [4B-F98:16] (2 points) You are given the following: •

A portfolio of automobile risks consists of 900 youthful drivers and 800 nonyouthful drivers.



A youthful driver is twice as likely as a nonyouthful driver to incur at least one claim during the next year.



The expected number of youthful drivers (n) who will be claim-free during the next year is equal to the expected number of nonyouthful drivers who will be claim-free during the next year. Determine n.

(A) (B) (C) (D) (E)

Less than 150 At least 150, but less than 350 At least 350, but less than 550 At least 550, but less than 750 At least 750

45.15. [4B-S99:2] (2 points) Each of two urns contains two fair, six-sided dice. Three of the four dice have faces marked with 1, 2, 3, 4, 5, and 6. The other die has faces marked with 1, 1, 1, 2, 2, and 2. One urn is randomly selected, and the dice in it are rolled. The total on the two dice is 3. Determine the Bayesian analysis estimate of the expected value of the total on the same two dice on the next roll. (A) 5.0

(B) 5.5

(C) 6.0

(D) 6.5

(E) 7.0

45.16. [4B-F99:16] (2 points) You are given the following: •

A red urn and a blue urn each contain 100 balls.



Each ball is labeled with both a letter and a number.



The distribution of letters and numbers on the balls is as follows: Red Urn Blue Urn



Letter A 90 60

Letter B 10 40

Number 1 90 10

Number 2 10 90

Within each urn, the appearance of the letter A on a ball is independent of the appearance of the number 1 on a ball.

One ball is drawn randomly from a randomly selected urn, observed to be labeled A-2, and then replaced. Determine the expected value of the number on another ball drawn randomly from the same urn. (A) (B) (C) (D) (E)

Less than 1.2 At least 1.2, but less than 1.4 At least 1.4, but less than 1.6 At least 1.6, but less than 1.8 At least 1.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

884

45.17. [4-S00:22] You are given: (i) A portfolio of independent risks is divided into two classes, Class A and Class B. (ii) There are twice as many risks in Class A as in Class B. (iii) The number of claims for each insured during a single year follows a Bernoulli distribution. (iv) Classes A and B have claim size distributions as follows: Claim Size 50,000 100,000 (v)

Class A 0.60 0.40

Class B 0.36 0.64

The expected number of claims per year is 0.22 for Class A and 0.11 for Class B.

One insured is chosen at random. The insured’s loss for two years combined is 100,000. Calculate the probability that the selected insured belongs to Class A. (A) 0.55

(B) 0.57

(C) 0.67

(D) 0.71

(E) 0.73

45.18. [4-F00:33] A car manufacturer is testing the ability of safety devices to limit damages in car accidents. You are given: (i) A test car has either front air bags or side air bags (but not both), each type being equally likely. (ii) The test car will be driven into either a wall or a lake, with each accident type being equally likely. (iii) The manufacturer randomly selects 1, 2, 3, or 4 crash test dummies to put into a car with front air bags. (iv) The manufacturer randomly selects 2 or 4 crash test dummies to put into a car with side air bags. (v) Each crash test dummy in a wall-impact accident suffers damage randomly equal to 0.5 or 1, with damage to each dummy being independent of damage to the others. (vi) Each crash test dummy in a lake-impact accident suffers damage randomly equal to either 1 or 2, with damage to each dummy being independent of damage to the others. One test car is selected at random, and a test accident produces total damage of 1. Determine the expected value of the total damage for the next test accident, given that the kind of safety device (front or side air bags) and accident type (wall or lake) remain the same. (A) 2.44

(B) 2.46

(C) 2.52

(D) 2.63

(E) 3.09

45.19. [4-F03:39] You are given: (i) Each risk has at most one claim each year. (ii) Type of Risk I II III

Prior Probability 0.7 0.2 0.1

Annual Claim Probability 0.1 0.2 0.4

One randomly chosen risk has three claims during Years 1–6. Determine the posterior probability of a claim for this risk in Year 7. (A) 0.22

(B) 0.28

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.33

(D) 0.40

(E) 0.46

Exercises continue on the next page . . .

EXERCISES FOR LESSON 45

885

45.20. [4-S01:10] (i)

The claim count and claim size distributions for risks of type A are: Number of Claims 0 1 2

(ii)

Probabilities

Claim Size 500 1235

4/9 4/9 1/9

Probabilities 1/3 2/3

The claim count and claim size distributions for risks of type B are: Number of Claims 0 1 2

Probabilities

Claim Size 250 328

1/9 4/9 4/9

Probabilities 2/3 1/3

(iii) Risks are equally likely to be type A or type B. (iv) Claim counts and claim sizes are independent within each risk type. (v) The variance of the total losses is 296,962. A randomly selected risk is observed to have total annual losses of 500. Determine the Bayesian premium for the next year for this same risk. (A) 493

(B) 500

(C) 510

(D) 513

(E) 514

45.21. [4-S01:28] Two eight-sided dice, A and B, are used to determine the number of claims for an insured. The faces of each die are marked with either 0 or 1, representing the number of claims for that insured for the year. Die

Pr(Claims=0)

Pr(Claims=1)

A B

1/4 3/4

3/4 1/4

Two spinners, X and Y, are used to determine claim cost. Spinner X has two areas marked 12 and c. Spinner Y has only one area marked 12. Spinner

Pr(Cost=12)

Pr(Cost=c)

X Y

1/2 1

1/2 0

To determine the losses for the year, a die is randomly selected from A and B and rolled. If a claim occurs, a spinner is randomly selected from X and Y and spun. For subsequent years, the same die and spinner are used to determine losses. Losses for the first year are 12. 10.

Based upon the results of the first year, you determine that the expected losses for the second year are Calculate c.

(A) 4

(B) 8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 12

(D) 24

(E) 36

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

886

45.22. [4-F01:7] You are given the following information about six coins: Coin 1–4 5 6

Probability of Heads 0.50 0.25 0.75

A coin is selected at random and then flipped repeatedly. X i denotes the outcome of the i th flip, where “1” indicates heads and “0” indicates tails. The following sequence is obtained: S  {X1 , X2 , X3 , X4 }  {1, 1, 0, 1} Determine E[X5 | S] using Bayesian analysis. (A) 0.52

(B) 0.54

(C) 0.56

(D) 0.59

(E) 0.63

45.23. [4-F02:39] You are given: Class

Number of Insureds

0

1 2 3

3000 2000 1000

1/3 0 0

Claim Count Probabilities 1 2 3 1/3 1/6 0

1/3 2/3 1/6

0 1/6 2/3

4 0 0 1/6

A randomly selected insured has one claim in Year 1. Determine the expected number of claims in Year 2 for that insured. (A) 1.00

(B) 1.25

(C) 1.33

(D) 1.67

(E) 1.75

45.24. [4-F04:5] You are given: (i)

Two classes of policyholders have the following severity distributions: Claim Amount 250 2,500 60,000

(ii)

Probability of Claim Amount for Class 1 0.5 0.3 0.2

Probability of Claim Amount for Class 2 0.7 0.2 0.1

Class 1 has twice as many claims as Class 2.

A claim of 250 is observed. Determine the Bayesian estimate of the expected value of a second claim from the same policyholder. (A) (B) (C) (D) (E)

Less than 10,200 At least 10,200, but less than 10,400 At least 10,400, but less than 10,600 At least 10,600, but less than 10,800 At least 10,800

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 45

887

45.25. [4B-S99:16] (2 points) You are given the following: •

The number of claims per year for Risk A follows a Poisson distribution with mean m.



The number of claims per year for Risk B follows a Poisson distribution with mean m + 1.



The probability of selecting Risk A is equal to the probability of selecting Risk B. One of the risks is randomly selected, and zero claims are observed for this risk during one year. Determine the posterior probability that the selected risk is risk A.

(A) (B) (C) (D) (E)

Less than 0.3 At least 0.3, but less than 0.5 At least 0.5, but less than 0.7 At least 0.7, but less than 0.9 At least 0.9

45.26. [4B-F99:28] (2 points) You are given the following: •

The number of claims per year for Risk A follows a Poisson distribution with mean m.



The number of claims per year for Risk B follows a Poisson distribution with mean 2m.



The probability of selecting Risk A is equal to the probability of selecting Risk B.

One of the risks is randomly selected, and zero claims are observed for this risk during one year. Determine the posterior probability that the selected risk will have at least one claim during the next year. (A) (B) (C) (D) (E)

1 − e −m 1 + e −m 1 − e −3m 1 + e −m 1 − e −m 1 − e −2m 1 − e −2m − e −4m

45.27. [4-F00:3] You are given the following for a dental insurer: (i) Claim counts for individual insureds follow a Poisson distribution. (ii) Half of the insureds are expected to have 2.0 claims per year. (iii) The other half of the insureds are expected to have 4.0 claims per year. A randomly selected insured has made 4 claims in each of the first two policy years. Determine the Bayesian estimate of this insured’s claim count in the next (third) policy year. (A) 3.2

(B) 3.4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 3.6

(D) 3.8

(E) 4.0

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

888

45.28. [C-S05:35] You are given: (i) The annual number of claims on a given policy has the geometric distribution with parameter β. (ii) One-third of the policies have β  2, and the remaining two-thirds have β  5. A randomly selected policy had two claims in Year 1. Calculate the Bayesian expected number of claims for the selected policy in Year 2. (A) 3.4

(B) 3.6

(C) 3.8

(D) 4.0

(E) 4.2

45.29. [4-S00:7] You are given the following information about two classes of risks: (i) (ii) (iii) (iv) (v) (vi)

Risks in Class A have a Poisson claim count distribution with a mean of 1.0 per year. Risks in Class B have a Poisson claim count distribution with a mean of 3.0 per year. Risks in Class A have an exponential severity distribution with a mean of 1.0. Risks in Class B have an exponential severity distribution with a mean of 3.0. Each class has the same number of risks. Within each class, severities and claim counts are independent.

A risk is randomly selected and observed to have two claims during one year. The observed claim amounts were 1.0 and 3.0. Calculate the posterior expected value of the aggregate loss for this risk during the next year. (A) (B) (C) (D) (E)

Less than 2.0 At least 2.0, but less than 4.0 At least 4.0, but less than 6.0 At least 6.0, but less than 8.0 At least 8.0

45.30. [4B-F92:24] (2 points) A portfolio of three risks exists with the following characteristics: •

The claim frequency for each risk is normally distributed with mean and standard deviations:

Risk A B C •

Distribution of Claim Frequency Mean Standard Deviation 0.10 0.03 0.50 0.05 0.90 0.01

A frequency of 0.12 is observed for an unknown risk in the portfolio. Determine the Bayesian estimate of the same risk’s expected claim frequency.

(A) 0.10

(B) 0.12

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.13

(D) 0.50

(E) 0.90

Exercises continue on the next page . . .

EXERCISES FOR LESSON 45

889

45.31. [4B-F96:12] (3 points) You are given the following: •

75% of claims are of Type A and the other 25% are of Type B.



Type A claim sizes follow a normal distribution with mean 3,000 and variance 1,000,000.



Type B claim sizes follow a normal distribution with mean 4,000 and variance 1,000,000.

A claim file exists for each of the claims, and one of them is randomly selected. The claim file selected is incomplete and indicates only that its associated claim size is greater than 5,000. Determine the posterior probability that a Type A claim was selected. (A) (B) (C) (D) (E)

Less than 0.15 At least 0.15, but less than 0.25 At least 0.25, but less than 0.35 At least 0.35, but less than 0.45 At least 0.45

45.32. [4B-S99:26] (3 points) A company invests in a newly offered stock that it judges will turn out to be one of three types with equal probability. •

The annual dividend for Type A stocks has a normal distribution with mean 10 and variance 1.



The annual dividend for Type B stocks has a normal distribution with mean 10 and variance 4.



The annual dividend for Type C stocks has a normal distribution with mean 10 and variance 16.

After the company has held the stock for one year, the stock pays a dividend of amount d. The company then determines that the posterior probability that the stock is of Type B is greater than either the posterior probability that the stock is of Type A or the posterior probability that the stock is of Type C. Determine all the values of d for which this would be true. Hint: The density function for a normal distribution is f ( x ) 

q

|d − 10| < 2

2 ln 2 3

(B)

q

|d − 10| < 4

2 ln 2 3

(C)

2

(A)

q

2 ln 2 3

q

< |d − 10| < 4

q q

(D)

|d − 10| > 2

2 ln 2 3

(E)

|d − 10| > 4

2 ln 2 3

C/4 Study Manual—17th edition Copyright ©2014 ASM

− √1 e σ 2π

( x−µ ) 2 2σ 2

.

2 ln 2 3

Exercises continue on the next page . . .

45. BAYESIAN METHODS—DISCRETE PRIOR

890

45.33. [4B-F98:9] (2 points) You are given the following: •

A portfolio consists of 75 liability risks and 25 property risks.



The risks have identical claim count distributions.



Loss sizes for liability risks follow a Pareto distribution with parameters θ  300 and α  4.



Loss sizes for property risks follow a Pareto distribution with parameters θ  1,000 and α  3. A risk is randomly selected from the portfolio and a claim of size k is observed. Determine the limit of the posterior probability that this risk is a liability risk as k goes to zero.

(A) 3/4

(B) 10/49

(C) 10/11

(D) 40/43

(E) 1

45.34. [4-F03:14] You are given: (i)

Losses on a company’s insurance policies follow a Pareto distribution with probability density function: θ , 0 5000 | A ) Pr ( A ) + Pr ( X > 5000 | B ) Pr ( B ) ! 5000 − 3000  1 − Φ (2)  0.0228 Pr ( X > 5000 | A )  1 − Φ 1000 Pr ( A | X > 5000) 

5000 − 4000 Pr ( X > 5000 | B )  1 − Φ  0.1587 1000 (0.0228)(0.75)  0.3012 Pr ( A | X > 5000)  (0.0228)(0.75) + (0.1587)(0.25)

!

(C)

The tabular form would be Type A

Type B

0.75

0.25

Likelihood of experience

0.0228

0.1587

Joint probabilities

0.0171

0.039675

Posterior probabilities

0.3012

0.6988

Prior probabilities

0.056775

45.32. Let X be the event of a dividend of d. P (B | d ) 

f (d | B) f (d | A) + f (d | B ) + f (d | C )

We omit P ( A ) , P ( B ) , and P ( C ) because they are all 13 and cancel out. Since the denominators of P ( A | d ) , P ( B | d ) , and P ( C | d ) are the same, it suffices to compare numerators f ( d | A ) , f ( d | B ) , and f ( d | C ) . 2 1 f ( d | A )  √ e − ( d−10) /2 2π 2 1 f ( d | B )  √ e − ( d−10) /8 2 2π

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 45

905

2 1 f ( d | C )  √ e − ( d−10) /32 4 2π

f ( d | B ) > f ( d | A ) ⇒ 12 e − ( d−10)

2 /8

> e − ( d−10)

2 /2

− ( d − 10) 2 − ( d − 10) 2 − ln 2 > 8 2 ⇒ ln 2 < 83 ( d − 10) 2 ⇒

⇒ ( d − 10) 2 > ⇒ |d − 10| >

f ( d | B ) > f ( d | C ) ⇒ 12 e

8 3

q

− ( d−10) 2 /8

ln 2

8 3

ln 2

> 41 e − ( d−10)

2 /32

− ( d − 10) 2 − ( d − 10) 2 > − ln 2 8 32 3 ( d − 10) 2 < ln 2 ⇒ 32 ⇒

⇒ ( d − 10) 2 < ⇒ |d − 10|
8) Bayesian premiums

θ3 0.5 3/82 3/128 27/43 3/11 0.17125

43/1152

0.2126

45.35. The Pareto pdf with θ  10, x  20, is f (20) 

α (10) α 30α+1

Plugging in α  1, 2, 3, the 3 likelihoods are 3 10 1   302 90 270 200 2  303 270 3000 1  4 270 30 Since each α is equally likely, the posterior distribution assigns probabilities of (after canceling out the 3 2 1 270’s) 3+2+1  36 , 3+2+1  62 , and 3+2+1  16 to the 3 α’s 1, 2, 3 respectively. The probabilities of a claim greater than 30 for the 3 α’s are 10 40

!1

10 40

!2

10 40

!3



1 4



1 16



1 64

Weighting these with the posterior probabilities of α  1, 2, 3, we get the posterior (in the sense of predictive) probability of a claim greater than 30: 3 The tabular form is

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 4

+2

1 16

6

+1

1 64

 0.1484375

(C)

QUIZ SOLUTIONS FOR LESSON 45

907

α1 Prior probabilities Likelihood of experience Joint probabilities Posterior probabilities

α2

1/3 10 302

1/3

 0.011111

0.0037037

α3

2 (102 ) 303

1/3

 0.007407

0.0024691

3 (103 ) 304

 0.003704

0.0012346

1/2

1/3

1/6

Pr ( X > 30)

1/4

(1/4) 2

(1/4) 3

Product

1/8

1/48

1/384

0.0074074

0.1484375

Quiz Solutions 2 8 45-1. The likelihood of 2 claims is 10 2 (0.1 )(0.9 )  0.19371 for the first group and 0.30199 for the second group. The following table results:



Prior probabilities Likelihood of experience Joint probabilities Posterior probabilities Hypothetical means Bayesian premium

Good Drivers 0.7 0.19371 0.135597 0.599473 0.1 0.05995

Bad Drivers 0.3 0.30199 0.090597 0.400527 0.2 0.08010

10 2 8 2 (0.2 )(0.8 )



0.22619 0.14005

That is the expected number of claims of one person. Multiplying by 10, the expected number of claims for the group is 1.4005 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

908

C/4 Study Manual—17th edition Copyright ©2014 ASM

45. BAYESIAN METHODS—DISCRETE PRIOR

Lesson 46

Bayesian Estimation and Credibility—Continuous Prior Reading: Loss Models Fourth Edition 15 plus one of the following three: Loss Models Fourth Edition 18.1– 18.3, or SN C-21-01 4, or Introduction to Credibility Theory 4, 8.2, 8.4.1 and optionally Introduction to Credibility Theory 3

46.1

Calculating posterior and predictive distributions

If the prior distribution is continuous, we adjust the equation we used in the discrete case: Pr ( A | B ) 

Pr ( B | A ) Pr ( A ) Pr ( B )

(45.1)

by replacing probabilities with density functions and sums with integrals. The result is π ( θ | x1 , . . . , x n ) 

π ( θ ) f ( x1 , . . . , x n | θ ) π ( θ ) f ( x1 , . . . , x n | θ ) R f ( x1 , . . . , x n ) π ( θ ) f ( x1 , . . . , x n | θ ) dθ

(46.1)

In this equation: π ( θ ) is the prior density, the initial density function for the parameter that varies in the model. It is permissible to use an improper prior density, one which is nonnegative but whose integral is infinite. For example, π ( θ )  θ1 , an improper prior, is a good prior for a scale parameter θ when you have no idea of the distribution of θ. f ( x | θ ) is the conditional probability density function of the model. It defines the model’s probability given the parameter θ. It does not appear in Bayes’ Theorem, but may be needed to calculate the next item. f ( x1 , . . . , x n | θ ) is the conditional joint probability density function of the data given θ. Typically the observations x i are assumed to be independent given θ (but not unconditionally independent), and then f ( x1 , . . . , x n | θ ) 

n Y i1

f (xi | θ)

f ( x1 , . . . , x n ) is the unconditional joint density function of the data x1 , . . . , x n . It can be calculated from the conditional joint density function by integrating over the prior density of θ:

Z f ( x1 , . . . x n )  C/4 Study Manual—17th edition Copyright ©2014 ASM

f ( x1 , . . . x n | θ ) π ( θ ) dθ

909

46. BAYESIAN METHODS—CONTINUOUS PRIOR

910

π ( θ | x 1 , . . . , x n ) is the posterior density, the revised density function for the parameter θ based on the observations x 1 , . . . , x n . Even though we use the same letter π as for the prior, it is distinguished from π ( θ ) by the fact that it is conditional on x1 , . . . , x n . In the textbook, πΘ is used for the prior and πΘ|x is used for the posterior, where x is the vector x1 , . . . x n . Here is another important function not included in equation (46.1): f ( x n+1 | x1 , . . . , x n ) is the predictive density, the revised unconditional f ( x ) based on x1 , . . . , x n . It can be calculated by integrating the conditional probability density function over the posterior density of θ: Z f ( x n+1 | x1 , . . . , x n ) 

f ( x n+1 | θ ) π ( θ | x 1 , . . . , x n ) dθ

(46.2)

Notice that the prior and posterior density functions are functions of the parameter (usually called θ), whereas the model and predictive density functions are functions of x. If you are asked questions about θ, you will use the posterior density. If you are asked questions about x, like expected number of claims/claim sizes/probability claim size is less than 10000, you will use the predictive density. Equation (46.1) calculates the posterior, which is a function of θ. In the numerator, we are multiplying π ( θ ) , the prior distribution, by f ( x1 , . . . , x n | θ ) , the likelihood function of the data. We already encountered the likelihood function when studying maximum likelihood estimation. In the denominator, we integrate this product over the range of θ. The denominator is a constant, and θ is a dummy variable; it is not the same θ as the argument of the posterior function on the left side. We use the same letter θ in the denominator to suggest the method of calculating it: whatever the numerator is, integrate it over θ. To calculate the posterior distribution, multiply the likelihood of the data by the prior distribution. Divide this product by the integral, over the prior parameter, of the product. Since the product is being divided by its integral, you may drop multiplicative constants from the product. To calculate the predictive distribution, integrate the model over the posterior distribution. Typical Bayesian credibility questions are: 1. Calculate the posterior expected value of the parameter. This can be calculated by integrating the parameter θ over the posterior function. Often there will be easier ways to obtain it. For example, if the posterior distribution happens to be a distribution listed in the tables, look up the expected value (the first moment) in the tables. 2. Calculate the expected value of the next claim, or the expected value of the predictive distribution, or the Bayesian premium. This can be calculated by integrating the claim variable x over the predictive distribution. However, it is usually easier to integrate E[X | θ] over the posterior distribution, where E[X | θ] is the expected value of the model given the parameter θ. For some models (e.g. Poisson, exponential) E[X | θ]  θ so that the expected value of the predictive equals the expected value of the posterior. 3. Calculate the posterior probability that the parameter is in a certain range. This is calculated by integrating the posterior density function over that range. 4. Calculate the probability that the next claim is in a certain range. This is calculated by integrating the predictive density function over that range. The following example illustrates a Bayesian credibility calculation. Example 46A You are given: (i) Given λ, the number of claims follows a Poisson distribution with parameter λ (ii) λ has the following density function: π (λ)  C/4 Study Manual—17th edition Copyright ©2014 ASM

1 λ2

λ>1

46.1. CALCULATING POSTERIOR AND PREDICTIVE DISTRIBUTIONS

911

(iii) From a randomly selected insured, you observe 2 claims in year 1, 0 claims in year 2, and 0 claims in year 3. Calculate the expected value of the next claim. Answer: This question is asking for the predictive expected value. Since the model is Poisson, E[X | λ]  λ, so the predictive expected value equals the posterior expected value. We only need to calculate the posterior density. The likelihood of the three observations, the product of three Poisson probabilities, is L ( x; λ )  e −3λ

λ2 2!0!0!

and we can drop the 1/ (2!0!0!) multiplicative constant. The product of the likelihood and the prior after dropping the constant is π ( λ ) f ( x1 , x2 , x3 | λ )  The integral of this product is



Z 1

1 −3λ 2 e λ  e −3λ λ2

e −3λ dλ 

λ>1

e −3 3

and the posterior function is π ( λ | x1 , x2 , x3 ) 

e −3λ  3e −3 ( λ−1) e −3 /3

λ>1

We could integrate λ times this function. However, it is faster to recognize this density function as a shifted exponential, an exponential with parameter θ  1/3 that has been shifted by 1. Its mean is the shift plus θ, or E[λ]  1 + 1/3  4/3. Therefore the predictive mean is also 4/3 .  Calculating the integrals for Bayesian credibility can be difficult. Typical exam questions will use easyto-integrate distributions, like exponential, uniform, or single-parameter Pareto. The next example illustrates how special care is needed when the model distribution has finite support; for example, when it is uniform or a single-parameter Pareto. For example, if the model distribution is uniform on [a, θ] and there is a claim for 5, you immediately know that θ must be at least 5. Example 46B Claim size follows a single-parameter Pareto distribution with parameters α  3 and Θ. Over all insureds, Θ has a uniform distribution on [1, 4]. An insured selected at random submits 4 claims of sizes 2, 3, 5, and 7. Calculate (i) (ii) (iii) (iv)

The posterior mean of Θ. The expected size of the next claim. The probability that the next claim will be greater than 3. The probability that the next claim will be less than 1.5.

Answer: Notice that nowhere in this question does it say to use the Bayesian method. The Bayesian method is a method to obtain exact answers, based on the assumptions provided; use it to calculate posterior or predictive results unless you’re told otherwise. The prior density is π ( θ )  13 1≤θ≤4 C/4 Study Manual—17th edition Copyright ©2014 ASM

46. BAYESIAN METHODS—CONTINUOUS PRIOR

912

and the model density is

So the likelihood of the four claims is

f (x | θ) 

3θ 3 x4

x≥θ

 34 θ 12    (2 · 3 · 5 · 7) 4 f (2, 3, 5, 7 | θ )     0 

θ≤2 otherwise

Note that if θ > 2, we could not possibly observe 2, so the likelihood would be zero. The posterior density is then: 13θ 12 θ 12 1≤θ≤2  13 π ( θ | x1 , x2 , x3 , x4 )  R 2 θ 12 dθ 2 − 1 1

since all the constants in the numerator and denominator (34 , (2 · 3 · 5 · 7) 4 ) are the same and cancel. Note that the posterior density is only a function of the number of observations and their minimum; the answer would have been the same if the observations were 2, 325, 1000, and 100,000. This is characteristic of the uniform prior. We will write x for {x1 , x2 , x3 , x4 } The posterior mean is 2

Z E[Θ | x]   

1

13θ 13 dθ 213 − 1

2

13 14 θ 14 (213 − 1) 1 13 14

!

214 − 1  1.85726 213 − 1

!

We can calculate the expected size f of the nextgclaim without calculating the predictive distribution by using E[X n1 | θ], since E[X n+1 ]  E E[X n+1 | θ] , and E[X n+1 | θ], the model distribution’s expectation (which never changes with more data) is the single-parameter Pareto’s mean αθ/ ( α − 1)  1.5θ. So the expected size of the next claim is E[1.5θ]  2.78589 . For the predictive probabilities, we will have to calculate the predictive distribution. The predictive distribution’s density function, by formula (46.2), is

Z f ( x | x) 

f ( x | θ ) π ( θ | x1 , . . . , x n ) dθ 

min (2,x )

Z 1

3θ 3 x4

!

13 θ 12 dθ 13 2 −1



Notice the upper limit of the integral; under all circumstances, θ ≤ x or else the model density is zero.

Z x       1 f ( x | x)   Z 2       1

(3)(13) θ15 213

−1

x4

(3)(13) θ15 213 − 1 x 4

dθ  dθ 

(213

39 x 16 − 1 − 1)(16) x 4

39 216 − 1 (213 − 1)(16) x 4

1≤x≤2 x≥2

To calculate the predictive probability of a claim greater than 3 (question (iii)), we use the second integral: Pr ( X n+1

C/4 Study Manual—17th edition Copyright ©2014 ASM

∞ 1 39 (216 − 1) > 3 | x)  13 dx (2 − 1)(16) 3 x 4 19.50208   0.240766 3 (33 )

Z

46.1. CALCULATING POSTERIOR AND PREDICTIVE DISTRIBUTIONS

7

1.4

6

1.2

5

1

4

0.8

3

0.6

2

0.4

1

0.2 1

1.2

1.4

1.6

(a) Posterior density

1.8

2

1

2

913

3

(b) Predictive density

4

5

Figure 46.1: Posterior and predictive density functions in Example 46B

To calculate the predictive probability of a claim less than 1.5 (question (iv)), we use the first integral: Pr ( X n+1

39 ≤ 1.5 | x)  13 (2 − 1)(16)

1.5

Z 1.5

1

x 16 − 1 dx x4

1  0.0002975 x − 4 dx x 1 ! 1.513 1 1 1  0.0002975 − + − 13 13 3 (1.53 ) 3

Z





12

 0.004362



Figure 46.1 graphs the posterior and predictive densities. Example 46C X follows a binomial distribution with parameters m  3 and Q. The parameter Q is distributed as follows: (1 − q ) 3 0 ≤ q ≤ 0.4 π (q )  0.2176 Two observations of X are 0 and 1. Calculate the predictive mean of X. Answer: Drop the 1/0.2176 constant from the prior and the 31 constant from the second observation,   and write the prior times the likelihood as (1 − q ) 3 (1 − q ) 3 q (1 − q ) 2  q (1 − q ) 8 . You can integrate this by parts or use the substitution u  1 − q, which is what we’ll do.



0.4

Z 0

q (1 − q ) 8 dq 

Z

1

0.6

(1 − u ) u 8 du 1

u 9 u 10 − 9 10 0.6 1 − 0.69 1 − 0.610  −  0.010596 9 10

!



C/4 Study Manual—17th edition Copyright ©2014 ASM

46. BAYESIAN METHODS—CONTINUOUS PRIOR

914

The posterior is therefore

q (1 − q ) 8 0 ≤ q ≤ 0.4 0.010596 The mean of the posterior is obtained by integrating this times q, and we’ll once again substitute u  1 − q to integrate the numerator. π ( q | x) 

1

Z

0.6

1

u 9 2u 10 u 11  0.00177997 − + 9 10 11 0.6

!

2 8

(1 − u ) u du 

So the posterior mean of Q is 0.00177997/0.010596  0.167984, and since E[X]  3Q, the predictive mean  of X is 3 (0.167984)  0.50395 .

?

Quiz 46-1 Annual claim counts follow a geometric distribution with parameter β. The parameter β follows the improper prior 1 0 a )  0.95, so a is the 5th percentile: a

Z 1

a

p 13

13θ 12 dθ  0.05 213 − 1 a 13 − 1  0.05 213 − 1

0.05 (213 − 1) + 1  1.58865

so the interval [1.58865, 2] is a 95% HPD credibility set. Typically the posterior has a mode someplace in the middle, and is not necessarily symmetric, so it’s harder to calculate the HPD credibility set. You must select a and b such that a is less than the mode, b is greater than the mode, π ( a | x)  π ( b | x) , and Pr (Θ ∈ [a, b] | x)  1 − α. These are two equations in the two unknowns a and b, but they may be difficult to solve. As an alternative, you may use the Bayesian central limit theorem, which says that the posterior converges to the normal distribution. Then you would use the posterior’s mean and add and subtract a standard normal coefficient corresponding to significance level α (like 1.96 for α  0.05) times the standard deviation of the posterior to get an approximate HPD credibility set. In example 46B, this would require calculating the posterior’s variance (we’ve already calculated the mean). We calculate the second moment first: 2 13 E[Θ | x]  13 θ 14 dθ 2 −1 1 13 (215 − 1)  3.46698  15 (213 − 1) 2

Z

So Var (Θ | x)  3.46698 − 1.857262  0.01758, and the 95% credibility interval is √ 1.85726 ± 1.96 0.01758  (1.5974, 2.1172) ,

a questionable interval since it goes beyond 2, and Θ ≤ 2. One could also calculate a credibility interval for the predictive distribution. The Bayesian central limit theorem would not apply, since the predictive distribution has only one member (the next loss). Since the predictive distribution is calculated using the posterior distribution, the posterior distribution could be replaced with a normal distribution using the Bayesian central limit theorem. In Example 46B, calculating the HPD credibility interval for the predictive distribution would require finding points a and b such that f ( a | x)  f ( b | x) and the integral of f from a to b would be the credibility level (e.g., 95%). This would require numerical techniques. C/4 Study Manual—17th edition Copyright ©2014 ASM

46.5. THE LINEAR EXPONENTIAL FAMILY AND CONJUGATE PRIORS

46.5

917

The linear exponential family and conjugate priors

I doubt you will be asked any theoretical question on material discussed in Loss Models 15.3. However, you should know what a conjugate prior distribution is. Bayesian estimation is easy if the posterior distribution is in the same family as the prior distribution, just with different parameters. For example, the prior distribution could be a gamma with parameters α and β, and the posterior a gamma with parameters α0 and β0. This can happen for certain models! When it happens, the prior distribution is called the conjugate prior for the model. Here’s an example: Example 46F X follows an inverse exponential distribution: f (x ) 

θe −θ/x x2

The parameter θ follows a gamma distribution with parameters α  3 and scale parameter 10: π (θ) 

θ 2 e −θ/10 2000

Five observations of X are 10, 20, 40, 50, 100. Determine the posterior distribution of θ. Answer: Prior times likelihood, dropping constants, is π ( θ ) f (10, 20, 40, 50, 100 | θ ) ∼ θ 2 e −θ/10 θ 5 e −θ (0.1+0.05+0.025+0.02+0.01)  θ 7 e −0.305θ which we recognize as a gamma with parameters α  8 and shape parameter 1/0.305. So the gamma is a conjugate prior for the inverse exponential.  As discussed in Section 2.4, a parametrized distribution f ( x; θ ) belongs to the linear exponential family if it can be parametrized in the form p ( x ) e −xθ f ( x; θ )  (46.3) q (θ) This definition applies to both discrete and continuous distributions. For models in the linear exponential family, conjugate prior distributions can be found. The textbook derives the formula, which you don’t have to know. Even nicer, if the model is in the linear exponential family and its conjugate prior is used as the prior distribution, and if the prior mean exists, then the Bayesian credibility estimate—the mean of the posterior distribution—can be expressed as a weighted sum of the mean of the observations and the prior mean. This is like limited fluctuation credibility; a factor Z is multiplied by the mean of the observations and 1 − Z is multiplied by the prior mean, and the two are added up. The next few lessons will discuss important examples of this phenomenon.

Coverage of this material in the three syllabus options This material is required. The material in the first section is part of Bayesian credibility, and is covered by all three syllabus reading options. The other sections’ material is part of parametric estimation, a topic for which you have no option other than the Loss Models textbook. This being said, there haven’t been any released questions on the material in Section 46.3 since before 2000, and there have never been any released questions on the material in Section 46.4.

C/4 Study Manual—17th edition Copyright ©2014 ASM

46. BAYESIAN METHODS—CONTINUOUS PRIOR

918

Exercises Posterior and predictive distributions Use the following information for questions 46.1 through 46.3: The number of claims has a Poisson distribution with mean λ. The parameter λ varies by insured, and is uniformly distributed on [0, 1]. 46.1.

Determine the prior probability of 1 claim in the first year for a randomly selected insured.

46.2.

A certain insured selected at random has no claims in each of three years of experience.

Determine the expected number of claims for the next year for the selected insured. A certain insured selected at random has no claims in each of three years of experience.

46.3.

Determine the probability of 1 claim in the next year for the selected insured. [4B-S98:13] (3 points) You are given the following:

46.4. •

The number of claims for a given risk follows a distribution with probability function p ( n )  ( e λ − 1) −1



λn , n!

n  1, 2, . . . ,

λ>0

Claim sizes for this risk follow a distribution with density function f ( x )  e −x ,



0 < x < ∞.

For this risk, the number of claims and claim sizes are independent. Determine the probability that the largest claim for this risk is less than k.

(A)

eλ − 1

(B)

    ( e λ − 1) exp λ (1 − e −k ) − 1    ( e λ − 1) exp λ (1 − e −k )     ( e λ − 1) −1 exp λ (1 − e −k ) − 1    ( e λ − 1) −1 exp λ (1 − e −k )

(C) (D) (E) 46.5.

[4-S01:37] You are given the following information about workers’ compensation coverage:

(i)

The number of claims for an employee during the year follows a Poisson distribution with mean

(100 − p ) /100 (ii)

where p is the salary (in thousands) for the employee. The distribution of p is uniform on the interval (0, 100].

An employee is selected at random. No claims were observed for this employee during the year. Determine the posterior probability that the selected employee has salary greater than 50 thousand. (A) 0.5

(B) 0.6

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.7

(D) 0.8

(E) 0.9 Exercises continue on the next page . . .

EXERCISES FOR LESSON 46

919

[4-F01:34] You are given:

46.6.

(i) The annual number of claims for each policyholder follows a Poisson distribution with mean θ . (ii) The distribution of θ across all policyholders has probability density function: f ( θ )  θe −θ , ∞

Z

(iii)

0

θe −nθ dθ 

θ>0

1 n2

A randomly selected policyholder is known to have had at least one claim last year. Determine the posterior probability that this same policyholder will have at least one claim this year. (A) 0.70 46.7.

(B) 0.75

(C) 0.78

(D) 0.81

(E) 0.86

[4-F04:13] You are given:

(i) The number of claims observed in a 1-year period has a Poisson distribution with mean θ. (ii) The prior density is: e −θ π (θ)  , 0 0.

A claim of $5 is observed in the current year. Determine the posterior distribution of t.

(A) t 2 e −5t

(B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

125 2 −5t 2 t e

(C) t 2 e −6t

(D) 108t 2 e −6t

(E) 36t 2 e −6t

Exercises continue on the next page . . .

46. BAYESIAN METHODS—CONTINUOUS PRIOR

920

[4B-S98:8] (2 points) You are given the following:

46.9. •

The number of claims during one exposure period follows a Bernoulli distribution with mean p.



The prior density function of p is assumed to be f (p ) 

πp π sin , 2 2

0 < p < 1.

The claims experience is observed for one exposure period and no claims are observed. Determine the posterior density function of p. Hint: (A) (B) (C) (D) (E)

1

Z 0

πp πp 2 sin dp  and 2 2 π

1

Z 0

πp 2 πp 4 sin dp  2 ( π − 2) . 2 2 π

πp π sin , 0 0)  Pr ( X2 > 0 | X1  0) Pr ( X1  0) + Pr ( X2 > 0 | X1 > 0) Pr ( X1 > 0) The left hand side, the prior probability of X > 0 (since it is not conditioned on anything), is a negative binomial with the parameters of the prior gamma, r  α  2 and β  1, so 1 Pr ( X2  0)  Pr ( X1  0)  2

C/4 Study Manual—17th edition Copyright ©2014 ASM

!2 

1 4

46. BAYESIAN METHODS—CONTINUOUS PRIOR

928

The first term on the right hand side is based on a posterior gamma with α ∗  α  2, γ∗  γ + 1  1 + 1  2, 2 so the negative binomial parameters are r  α∗  2, β  γ1∗  12 , and Pr ( X2  0 | X1  0)  23  49 . Plugging this into the equation for Pr ( X2 > 0) ,



1 4 1−  1− 4 9





1 3 + Pr ( X2 > 0 | X1 > 0) 4 4

!



!

and solving for Pr ( X2 > 0 | X1 > 0) gets the value 22/27.

46.7. The conditional probability of zero claims given θ is p0 for the Poisson, or e −θ . We integrate this over all values of θ to obtain the unconditional probability. 0.575 

k

Z

e −2θ dθ 1 − e −k

0

−

k

e −2θ 2 (1 − e −k ) 0

1 − e −2k 2 (1 − e −k ) 1 + e −k  2  2 (0.575) − 1  0.15 

e −k

k  − ln (0.15)  1.8971

(C)

46.8. πT |X ( t | x )  R πT |X ( t | x  5)  R

te −t · te −tx

∞ 0

te −tx te −t dt

t 2 e −6t

∞ 2 −6t t e dt 0

To evaluate the integral of the denominator, one method is to recognize the general form as the second moment of an exponential with parameter θ  1/6. Such an exponential would require the constant 6 in its probability density function. In other words E[X 2 ] 



Z 0

6t 2 e −6t dt

According to the distribution tables, the second moment of an exponential is 2θ 2 , so E[X 2 ]  so



Z 0

and

C/4 Study Manual—17th edition Copyright ©2014 ASM



Z 0

6t 2 e −6t dt  2

2 −6t

t e

1 dt  6

!

1 6

!2 

2 36

2 1  36 108

!

πT |X ( t | x  5)  108t 2 e −6t

(D)

EXERCISE SOLUTIONS FOR LESSON 46

929

46.9. Let X be the number of claims. This question uses f for the prior instead of π, so we’ll distinguish model density from prior density only by the subscripts.

 1 − p f X|P ( x | p )   p

1

Z 0

x0 x1    πp π (1 − p ) sin fP ( p ) f0|P (0 | p )  2 2 ( π/2)(1 − p ) sin ( πp/2) fP|X ( p | 0)  R 1 ( π/2)(1 − p ) sin ( πp/2) dp 0

πp π (1 − p ) sin dp  2 2

1

Z 0

πp π sin dp − 2 2

1

Z 0

πp πp sin dp 2 2

The first integral is the integral of 1 over a density function. The integral of any density function over its entire range is 1, so that integral is 1. The second integral is evaluated in the hint as 2/π. So 1

Z 0

π (1 − p ) πp 2 sin dp  1 − 2 2 π

!

So the posterior is fP|X ( p | 0)  

( π/2)(1 − p ) sin ( πp/2) π 2 (1

1 − 2/π

πp

− p)

2 ( π − 2)

sin

(E) 2

46.10. We’re going to try to recognize the posterior, as discussed in Section 46.2. Therefore, we multiply the likelihood f X|A (x | α ) by the prior π ( α ) , grouping all constants into a single constant C. α6 (1 + x i ) α+1 Y  ! α 6 e −α 6  Cα exp −α 1 + ln ( 1 + x ) f X|A (x | α ) π ( α )  Q i (1 + x i ) α+1 f X|A (x | α )  Q

where C is a constant. This is a gamma distribution with parameters β  7 and θ 

1+ln

Q1 (1+x i )

 0.215441.

(We use β, since α is already in use.) The mode is (from the appendix) θ ( β − 1)  0.215441 (6)  1.2926 .

46.11. The posterior density must have θ ≥ 6 to make the likelihood of the claim greater than zero. It is π (θ | x )  R since the prior density

1 5

6 ≤ θ ≤ 10

in the numerator and denominator cancels. 

The posterior mean of Θ is

1 1  θ (ln 10 − ln 6) θ ln 5/3

10

Z 6 C/4 Study Manual—17th edition Copyright ©2014 ASM

1/θ

10 (1/θ ) dθ 6

1 4 dθ  ln 5/3 ln 5/3

!

46. BAYESIAN METHODS—CONTINUOUS PRIOR

930

The predictive expectation, since the uniform model has E[X | Θ]  Θ/2, is 2  3.9152 ln 5/3

46.12. The model has Pr ( X > 5 | Θ)  1 − 5/Θ. We integrate this over the posterior: 10

Z

Pr ( X2 > 5 | X1 ) 



6

1  ln 5/3

5 θ

1−

10

Z 6

 

1 dθ θ ln 5/3

!

1 5 − 2 dθ θ θ



1 1 + 1 * . ln 5/3 + 5 − / ln 5/3 10 6







,

-

1 1−  0.347462 3 ln 5/3

46.13. This is very similar to Example 46B. The prior density of a claim of 200 given θ is, using the formula for a single parameter Pareto, 2θ 2 f (200 | θ )  . 2003 2

1 2θ The joint probability of being in a class and having this experience is 50 , since the uniform distri2003 1 . We integrate this over the range [50, 100]. The posterior distribution is then the bution has density 50 quotient of the joint probability over this integral. As we discussed, we can drop all constants like 1/50 and 2/2003 .



π ( θ ) f ( θ | 200) ∼ θ2

Z

100

50

θ 2 dθ 

πΘ|X ( θ | 200) 



100

θ 3 875,000  13 (1003 − 503 )  3 50 3

3θ 2 875,000

50 ≤ θ ≤ 100

The hypothetical mean of the single parameter Pareto is αθ/ ( α − 1)  2θ. The final answer is then obtained by integrating 2θ with the weights given by the posterior distribution: E[X2 | X1  200] 

Z

100 50

θ 3 dθ 

Z

100 50

(2θ )

3θ 2 dθ 875,000

100

θ 4 4 50

 14 (1004 − 504 )  23,437,500 2·3 E[X2 | X1  200]  (23,437,500)  160 57 875,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 46

931

46.14. This is the counterpart to Example 46B; the roles of the single-parameter Pareto and the uniform distribution have been reversed. The likelihood of the two claims is 1/θ 2 for θ > 600, but otherwise zero since there was a claim for 600 and no claim may be higher than θ. The product of the prior and the likelihood, the numerator of the posterior, is ! ! ! 500 500 1  θ2 θ2 θ4 and the denominator of the posterior is the integral of this from 600 to ∞. The 500’s will cancel; the integral without the 500 is: Z ∞ 1 dθ  3) 4 3 ( 600 θ 600 The posterior density function is therefore: 3 (6003 ) θ4

π ( θ | x) 

θ ≥ 600

The probability of a claim greater than 550 is 1 − 550 θ . To save a little work, we will instead calculate the 550 probability of a claim less than 550, θ , and take the complement at the end. Integrating 550 θ over the posterior function:

Z

∞ 600

550 θ

!

3 (6003 ) dθ  θ4

!

 

Z



(550)(3)(6003 ) dθ θ5

600

(550)(3)(6003 ) (4)(6004 )

3 550  0.6875 4 600

The answer is then 1 − 0.6875  0.3125 . (E)

46.15. The setup is the same as in exercise 46.14. Unlike in that exercise, where you had to calculate the posterior, you are given the posterior function here: π ( θ | x) 

3 (6003 ) θ4

θ ≥ 600

This is a single-parameter Pareto with α  3 and θ  600, so using the tables, E[Θ] 

αθ (3)(600)   900 α−1 2

If you didn’t recognize the distribution, you would have to integrate

Z

∞ 600

3 (6003 ) θ dθ θ4

to get the expectation of the posterior distribution. The Bayesian premium is the expected value of the next claim. Since the model is uniform on [0, θ], the model’s mean is θ/2, and the expected value of the next claim is E[θ]/2  450 . (A) The term “Bayesian premium” is unlikely to be used in the future, since only Loss Models uses it, not the other two syllabus options. The other options were not available at the time of this exam. C/4 Study Manual—17th edition Copyright ©2014 ASM

46. BAYESIAN METHODS—CONTINUOUS PRIOR

932

46.16. The model is a two-parameter Pareto. Any time a Pareto is specified, it means a two-parameter Pareto. The prior, however, is a single-parameter Pareto. The two-parameter likelihood of 3 is the density f (3 | Θ) 

2θ 2 . ( θ + 3) 3

Multiplying this by the prior we get the numerator of the posterior, 2θ 2 ( θ + 3) 3

1 2  2 θ ( θ + 3) 3

!

!

θ>1

You can integrate this to get the denominator and then integrate the posterior from 2 to ∞ to calculate the probability of Θ exceeding 2, but here’s a shortcut. Notice that by making the substitution y  θ + 3, this is a constant times a single-parameter Pareto with α  2 and y  4 (since θ  1). The probability that a single parameter Pareto with parameters α  2 and y  4 is greater than 5 (when θ  2, θ + 3  5) is 4 Pr (Θ > 2 | X1 )  Pr ( Y > 5 | X1 )  5

!2  0.64

(E)

46.17. Because the prior is uniform on [0, 0.5] rather than [0, 1], the Bernoulli-beta conjugate prior (Lesson 49) cannot be used. However, the integrals aren’t hard. Also note that the word “posterior” in the problem is being used in the sense of predictive, since the distribution of the claim, not the parameter, is being requested. The likelihood is p 8 . The numerator of the posterior is 2p 8 since the prior density is 2. Integrating this from 0 to 0.5, we get Z 0.5 0.5 2p 9 2 (0.59 )  2p 8 dp  9 0 9 0 So the posterior is

9p 8 0 ≤ p ≤ 0.5 (0.59 ) The predictive probability is calculated by integrating p, the probability of at least 1 loss, over this posterior, or ! 10 ! 0.5 ! ! Z 0.5 9 9p dp p 9 0.510 9    0.9 (0.5)  0.450 (A) 10 0 10 0.59 0.59 0.59 0 π ( p | x) 

46.18. The conditional probability of one claim and zero claims is q (1 − q ) , so the posterior density is q 3 ( q )(1 − q )  q 4 − q 5 divided by a constant. We integrate this from 0.6 to 0.8 to get the constant, and from 0.7 to 0.8 to get the posterior probability; the answer is 0.8 4 (q 0.7 R 0.8 (q4 0.6

R

The integral is

− q 5 ) dq − q 5 ) dq b

q 5 q 6 ( q − q ) dq  − 5 6 a a so we calculate the expression at 0.6, 0.7, and 0.8.

Z

b

4

5

0.85 0.86 −  0.021845 5 6 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 46

933

0.75 0.76 −  0.014006 5 6 0.65 0.66 −  0.007776 5 6 The answer is

0.021845 − 0.014006  0.55721 0.021845 − 0.007776

(D)

46.19. The posterior’s numerator is the prior times the likelihood of 200, or 2θ 1 2θ 2  3 θ 200 2003

0 < θ ≤ 200

Note that θ ≤ 200 or else the likelihood of a claim of 200 is zero. You may recognize this as a beta distribution with a  2, b  1, β (can’t use θ)  200, so that the constant of the posterior density is Γ(a + b ) Γ (3) 2   β a+b−1 Γ ( a ) Γ ( b ) 2002 Γ (2) Γ (1) 2002 If you didn’t recognize it, eliminate the constant 2/2003 from the numerator and calculate the denominator by integrating the numerator. Z 200 2002 θdθ   20,000 2 0 so the posterior density is π ( θ | x )  θ/20,000, 0 < θ ≤ 200. Mean claims given θ is αθ/ ( α − 1)  2θ. If you recognized that the posterior is a beta distribution with θ  200, a  2, b  1, you could evaluate the predictive mean by looking up the tables for beta: E[2θ]  2 E[θ] 

2θa 2 (200)(2)   266 32 a+b 2+1

Otherwise, the expected size of the next claim can be evaluated by integrating 2θ over the posterior: 200

Z 0



1 θ dθ  20,000 10,000

!

200

Z 0

θ 2 dθ 

2003  266 23 30,000

46.20. We’re going to try to recognize the posterior, as discussed in Section 46.2. Constants will be grouped into a single constant C. The prior is π (θ)  and the likelihood is f (x | θ )  C Multiplying the two, we have π ( θ | x)  C

1 θ e−

P

x i /θ

θ 15

e−

P

x i /θ

θ16

This is an inverse gamma distribution with parameters α  15, β (can’t use the letter θ)  β 760 mean is   54.2857 . α−1 14 C/4 Study Manual—17th edition Copyright ©2014 ASM

P

x i  760. The

46. BAYESIAN METHODS—CONTINUOUS PRIOR

934

46.21. The likelihood of no claims in 3 years is e −3λ . The posterior distribution is π ( λ | x)  R

e −3λ

∞ −3λ e dλ 0

which (just looking at the numerator) is an exponential with mean 13 . The predictive distribution is ∞

Z f ( x 4 | x) 

0

e −λ

Z ∞ 3 λ x4  −3λ  3e dλ  λ x4 e −4λ dλ x4 ! x4 ! 0

x4  0, 1, 2, . . .

and we recognize the integrand as a gamma integrand with α  x4 + 1, θ  therefore it is equal to the reciprocal of that constant. The constant is 1 Γ ( α ) θ  x4 ! 4 α

1 4

without the constant and

! x4 +1

so the predictive distribution is f ( x 4 | x) 

3x4 !

1 x4 +1 ! 4

x4 !



3 4x4 +1

x4  0, 1, 2, . . .

You should recognize this as a geometric distribution. Since p n  family with b  0 and a 

1 4



β 1+β ,

so β 

1 3

1 4 p n−1

and the variance is β (1 + β ) 

for n > 0, it is in the ( a, b, 0) 1 4 3 3



4 9

.

While it was interesting deriving the predictive distribution, it was not necessary. You can calculate the variance using the conditional variance formula in conjunction with the posterior. Let N be the number of claims in year 4. Then Var ( N )  VarΛ EN [N | λ] + EΛ VarN ( N | λ )  VarΛ (Λ) + EΛ [Λ]





f

g

since for a Poisson the mean and variance are both λ. Since Λ is exponential with mean 13 , its mean is and its variance is

1 2 3

, so Var ( N ) 

1 2 3

+

1 3



4 9

1 3

.

46.22. The model distribution for number of dollars in the envelope you selected is

 θ f (x | θ)    2θ

Probability 0.5 Probability 0.5

 The likelihood of x is therefore 0.5 for x  θ and 0.5 for x  2θ, 0 otherwise. Multiplying by the prior, we 1 0.5 1 1 1 3 get 0.5 θ  2x for θ  x and θ  x for θ  x/2, 0 otherwise. The denominator of the posterior is 2x + x  2x . Thus the posterior is 1/2x 1    θx    3/2x 3    1/x 2 x π ( θ | x1 )     θ   3/2x 3 2     otherwise 0 If θ  x and you exchange envelopes you’ll have 2x, while if θ  x/2 and you exchange envelopes you’ll have x/2. The expected amount you’ll have is 2 x 1 (2x ) +  x 3 3 2

!

So there is never any increase or decrease in expected value by changing envelopes. C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 46

935

46.23. The exponential likelihood is π ( θ )  e −θ/100 /100, θ > 0. The numerator of the posterior is now u1  0.5e −x/100 /100 for θ  x and u2  0.5e −x/200 /100 for θ  x/2. The denominator of the posterior is the sum of these two expressions. The expected amount in the other envelope is u1 (2x ) + u2 ( x/2) 2xe −x/100 + 0.5xe −x/200  u1 + u2 e −x/100 + e −x/200 This will be greater than x when 2e −x/100 + 0.5e −x/200 > e −x/100 + e −x/200 e −x/100 x − 100 x 200 x

> 0.5e −x/200 x > ln 0.5 − 200 < ln 2 < 200 ln 2

or about 138.63. 46.24. The likelihood of no claims is e −λ . The joint distribution is then obtained by multiplying the prior by this, or π ( λ ) f (0 | λ )  (0.5) 5e −6λ + (0.5) 15 e −6λ/5

We recognize this as a mixture of exponentials with parameters 1/6 and 5/6. Let’s write it that way, with the appropriate constants for the exponentials. π ( λ ) f (0 | λ )  0.5

  1 6 −6λ/5 5  −6λ  6e + 0.5 e 6 6 5

The posterior is proportional to this. However, the weights of the mixture must add up to 1. The weights in the above expression, 0.5 (5/6) and 0.5 (1/6) , add up to 0.5, so they must be doubled in order to obtain the posterior density. The posterior density is then π ( λ | X1 ) 

5 1 exponential (1/6) + exponential (5/6) 6 6

where exponential ( λ )  e −x/λ /λ. The expected number of claims, by double expectation, is

f

g

E[X]  E E[X | λ]  E[λ], the posterior mean. The posterior is a mixture of two exponentials. Its mean is the weighted average of the exponential means: ! ! ! ! 1 5 5 5 1 +   0.2778 (A) 6 6 6 6 18 46.25. The probability of 10 claims given λ is e −λ

λ 10 . The joint distribution function is then 10!

1 * 1 1 10 −13λ/12 + .0.4 λ10 e −7λ/6 + 0.6 / π ( λ ) Pr (10 | λ )  λ e 10! 6 12

!

!

, C/4 Study Manual—17th edition Copyright ©2014 ASM

!

-

46. BAYESIAN METHODS—CONTINUOUS PRIOR

936

To simplify the expression, we’ll eliminate the multiplicative constant 1/10! and multiply through by 120 to get rid of the denominators: 8λ 10 e −7λ/6 + 6λ 10 e −13λ/12 We must integrate this expression from 0 to ∞ to obtain the normalizing constant which we divide by to make the expression integrate to 1. This is the posterior density. Then we must integrate the expected number of claims in the second year, λ, over this expression to obtain the answer. In other words, the answer will be  R∞ 8λ 11 e −7λ/6 + 6λ 11 e −13λ/12 dλ 0

R

∞ 0

8λ 10 e −7λ/6 + 6λ 10 e −13λ/12 dλ



We integrate by recognizing that each summand in the mixture is a gamma density function times a constant, and we determine the constant. The reciprocal of this constant is the integral, since a gamma density integrates to 1. The parameters of the gamma are always α  the exponent on the variable plus 1 and θ  the reciprocal of the power e is raised to. The constant is Γ ( α1) θ α , so the integral is equal to Γ ( α ) θ α . In the denominator, the first summand is a gamma with α  11 and θ  67 , and the second summand 12 is a gamma with α  11 and θ  13 . Thus the denominator is equal to 6 Γ (11) *8 7 ,

! 11

12 +6 13

! 11 + -

In the numerator, the first summand is a gamma with α  12 and θ  gamma with α  12 and θ  12 13 . Thus the numerator is equal to 6 Γ (12) *8 7 ,

! 12

12 +6 13

6 7

and the second summand is a

! 12 + -

Note that Γ (12) /Γ (11)  11. Thus the answer is

 12 8 67 + 6 * 11 .  11 6 ,8 7 + 6

12 12 +/ 13 12 11 13 -

3.5543  9.8847  11 3.9554

!

(D)

46.26. If the loss function is absolute loss, the point estimate of the parameter is the median, which for a lognormal distribution is e µ  e 6.2  492.7490 . 46.27. The loss function is absolute loss, so the point estimate is the median. Using the table to obtain VaR0.5 ( X ) : α  VaR0.5 ( X )  θ (0.5−1/τ − 1) −1

 1000 (0.5−1/3 − 1) −1  3847

46.28. For the zero-one loss function, select the mode. (E) 46.29. For absolute value, the median minimizes the loss function. The distribution function is 1 − e −x . Setting it equal to 0.5, we get x  ln 2 . (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

QUIZ SOLUTIONS FOR LESSON 46

937

1 0.8 0.6 0.4 0.2 a

0.5

1

1.5

b

2

Figure 46.2: Graph of posterior in exercise 46.30

46.30. Figure 46.2 graphs the posterior density. We need to find a and b such that the areas of the hatched triangles add up to 0.05 and π ( a | x)  π ( b | x) . Let c  2 − b. Then 2 3a

 2c

from π ( a | x)  π ( b | x)

a  3c

a2

+ c 2  0.05 3 3c 2 + c 2  0.05 1 c2  80

from sum of areas = 0.05

1  0.1118 80 a  3c  0.3354

r

c

b  2 − c  1.8882 The interval is (0.3354, 1.8882) .

Quiz Solutions 46-1.

The probability function of the geometric is px 

βx (1 + β ) x+1

The likelihood of the four years’ experience is f (2, 0, 1, 1 | β ) 

β4 (1 + β ) 8

The product of the likelihood and the prior is π ( β ) f (2, 0, 1, 1 | β )  C/4 Study Manual—17th edition Copyright ©2014 ASM

1

(1 + β ) 8

0 0. You are given that three claims arose in the first year. Determine the posterior distribution of λ. (A)

1 − 32 λ 2e

1 3 − 21 λ 12 λ e

(B)

(C)

1 3 − 12 λ 4λ e

(D)

27 3 − 32 λ 32 λ e

(E)

1 2 − 32 λ 12 λ e

47.3. [4B-S91:49] (2 points) The parameter µ is the mean of a Poisson distribution. If µ has a prior gamma distribution with parameters α and θ, and a sample x1 , x2 , . . . , x n from the Poisson distribution is available, which of the following is the formula for the Bayes estimator of µ (i.e., the mean of the posterior distribution)? (A) (B) (C) (D) (E)

* n ,n +

Pn

i1 ln ( x i )

+· 1 θ-

* n +· 1 , n + θ2  

Pn

α+

n

n i1 + θ1

!

n

Pn

i1

xi

n

Pn

* +· 1 , n +Pθ -

xi

i1

i1

n

xi

xi

1 θ

+*

,n +

n

α·n · α·n+1 1 θ

!

!

1 θ2

+*

,n +

!

+

+*

1 θ

+ · ( αθ ) -

+ · ( αθ )  n · ( αθ )

1 θ2



α·n+1

n

,n +

1 θ

+ · ( αθ ) -

47.4.

[4B-S92:11] (2 points) You are given the following information:



Number of claims follows a Poisson distribution with parameter λ.



The claim frequency rate, λ, has a gamma distribution with mean 0.14 and variance 0.0004.



During the latest two-year period, 110 claims have been observed.



In each of the two years, 310 policies were in force.

Determine the Bayesian estimate of the posterior claim frequency rate based upon the latest observations. (A) (B) (C) (D) (E)

Less than 0.14 At least 0.14, but less than 0.15 At least 0.15, but less than 0.16 At least 0.16, but less than 0.17 At least 0.17

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

47. BAYESIAN CREDIBILITY: POISSON/GAMMA

942

[4B-F92:9] (2 points) You are given the following:

47.5. •

Number of claims for a single insured follows a Poisson distribution with mean λ.



The claim frequency rate, λ, has a gamma distribution with mean 0.10 and variance 0.0003.



During the last three-year period 150 claims have occurred.



In each of the three years, 200 policies were in force.

Determine the Bayesian estimate of the posterior claim frequency rate based upon the latest observations. (A) (B) (C) (D) (E)

Less than 0.100 At least 0.100, but less than 0.130 At least 0.130, but less than 0.160 At least 0.160, but less than 0.190 At least 0.190 [4B-S93:32] (2 points) You are given the following:

47.6. •

The number of claims for a class of business follows a Poisson distribution.



The prior distribution for the expected claim frequency rate of individuals belonging to this class of business is a gamma distribution with mean  0.10 and variance  0.0025.



During the next year, 6 claims are sustained by the 20 risks in the class.

Determine the variance of the posterior distribution for the expected claim frequency rate of individuals belonging to this class of business. (A) (B) (C) (D) (E)

Less than 0.0005 At least 0.0005, but less than 0.0015 At least 0.0015, but less than 0.0025 At least 0.0025, but less than 0.0050 At least 0.0050 [4B-F93:2] (1 point) You are given the following:

47.7. •

Number of claims follows a Poisson distribution with parameter µ.



Prior to the first year of coverage, µ is considered to have the gamma distribution f (µ) 

1000150 149 −1000µ µ e , Γ (150)

µ>0



In the first year, 300 claims are observed on 1,500 exposures.



In the second year, 525 claims are observed on 2,500 exposures. After two years, what is the Bayesian probability estimate of E[µ]?

(A) (B) (C) (D) (E)

Less than 0.17 At least 0.17, but less than 0.18 At least 0.18, but less than 0.19 At least 0.19, but less than 0.20 At least 0.20

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 47

47.8.

943

[4B-S94:25] (3 points) You are given the following:



For an individual risk in a population, the number of claims for a single exposure period follows a Poisson distribution with mean µ.



For the population, µ is distributed according to an exponential distribution with mean 0.1; g ( µ )  10e −10µ ,

µ > 0.



An individual risk is selected at random from the population.



After one exposure period, one claim has been observed. Determine the density function of the posterior distribution of µ for the selected risk.

(A) 11e −11µ 47.9.

(B) 10µe −11µ

(C) 121µe −11µ

(D)

1 −9µ 10 e

(E)

11e −11µ µ2

[4B-S96:21] (2 points) You are given the following:



The number of claims per year for a given risk follows a Poisson distribution with mean θ.



The prior distribution of θ is assumed to be a gamma distribution with mean 1/2 and variance 1/8.

Determine the variance of the posterior distribution of θ after a total of 4 claims have been observed for this risk in a 2-year period. (A) 1/16

(B) 1/8

(C) 1/6

(D) 1/2

(E) 1

47.10. [1999 C4 Sample:4] An individual automobile insured has a claim count distribution per policy period that follows a Poisson distribution with parameter λ. For the overall population of automobile insureds, the parameter λ follows a distribution with density function f ( λ )  5 exp (−5λ ) , λ > 0. One insured is selected at random from the population and is observed to have a total of one claim during two policy periods. Determine the expected number of claims that this same insured will have during the third policy period.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

47. BAYESIAN CREDIBILITY: POISSON/GAMMA

944

47.11. [4B-F94:24] (2 points) You are given the following: •

r is a random variable that represents the number of claims for an individual risk and has the Poisson density function t r e −t f (r )  , r  0, 1, 2, . . . r!



The parameter t has a prior gamma distribution with density function h ( t )  5e −5t ,

t > 0.



A portfolio consists of 100 independent risks, each having identical density functions.



In one year, 10 claims are experienced by the portfolio.

Determine the Bayesian credibility estimate of the expected number of claims in the second year for the portfolio. (A) (B) (C) (D) (E)

Less than 6 At least 6, but less than 8 At least 8, but less than 10 At least 10, but less than 12 At least 12

47.12. [4B-F95:7] (2 points) You are given the following: •

The number of claims per year for a given risk follows a Poisson distribution with mean θ.



The prior distribution of θ is assumed to be a gamma distribution with coefficient of variation 16 .

Determine the coefficient of variation of the posterior distribution of θ after 160 claims have been observed for this risk. (A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.15 At least 0.15 Cannot be determined from given information

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 47

945

Use the following information for questions 47.13 and 47.14: You are given the following: •

A portfolio of insurance risks consists of two classes, 1 and 2, that are equal in size.



For a Class 1 risk, the number of claims follows a Poisson distribution with mean θ1 .



θ1 varies by insured and follows an exponential distribution with mean 0.3.



For a Class 2 risk, the number of claims follows a Poisson distribution with mean θ2 .



θ2 varies by insured and follows an exponential distribution with mean 0.7.

47.13. [4B-S95:7] (2 points) Two risks are randomly selected, one from each class. What is the total variance of the number of claims observed for both risks combined? (A) (B) (C) (D) (E)

Less than 0.70 At least 0.70, but less than 0.95 At least 0.95, but less than 1.20 At least 1.20, but less than 1.45 At least 1.45

47.14. [4B-S95:8] (2 points) Of the risks that have no claims during a single exposure period, what proportion can be expected to be from Class 1? (A) (B) (C) (D) (E)

Less than 0.53 At least 0.53, but less than 0.58 At least 0.58, but less than 0.63 At least 0.63, but less than 0.68 At least 0.68

47.15. [4B-S98:4] (2 points) You are given the following: •

A portfolio consists of 10 identical and independent risks.



The number of claims per year for each risk follows a Poisson distribution with mean λ.



The prior distribution of λ is assumed to be a gamma distribution with mean 0.05 and variance 0.01.



During the latest year, a total of n claims are observed for the entire portfolio.



The variance of the posterior distribution of λ is equal to the variance of the prior distribution of λ. Determine n.

(A) 0

(B) 1

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2

(D) 3

(E) 4

Exercises continue on the next page . . .

47. BAYESIAN CREDIBILITY: POISSON/GAMMA

946

47.16. [4B-S95:12] (2 points) You are given the following: •

A portfolio consists of 1000 identical and independent risks.



The number of claims for each risk follows a Poisson distribution with mean λ.



Prior to the latest exposure period, λ is assumed to have a gamma distribution, with parameters α  250 and θ  0.0005.



During the latest exposure period, the following loss experience is observed: Number of Claims Number of Risks 0 1 2 3

906 89 4 1 1000

Determine the mean of the posterior distribution of λ. (A) (B) (C) (D) (E)

Less than 0.11 At least 0.11, but less than 0.12 At least 0.12, but less than 0.13 At least 0.13, but less than 0.14 At least 0.14

47.17. [4B-F97:2] (2 points) You are given the following: •

A portfolio consists of 100 identical and independent risks.



The number of claims per year for each risk follows a Poisson distribution with mean λ.



The prior distribution of λ is assumed to be a gamma distribution with mean 0.25 and variance 0.0025.



During the latest year, the following loss experience is observed: Number of Claims 0 1 2

Number of Risks 80 17 3

Determine the variance of the posterior distribution of λ. (A) (B) (C) (D) (E)

Less than 0.00075 At least 0.00075, but less than 0.00125 At least 0.00125, but less than 0.00175 At least 0.00175, but less than 0.00225 At least 0.00225

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 47

947

Use the following information for questions 47.18 and 47.19: You are given the following: •

The number of claims for a particular insured in any given year follows a Poisson distribution with mean θ.



θ does not vary by year.



The prior distribution of θ is assumed to follow a distribution with mean 10/m, variance 10/m 2 , and density function e −mθ m 10 θ 9 , 0 < θ < ∞, f (θ)  Γ (10) where m is a positive integer.

47.18. [4B-F99:23] (2 points) The insured is observed for m years, after which the posterior distribution of θ has the same variance as the prior distribution. Determine the number of claims that were observed for the insured during these m years. (A) 10

(B) 20

(C) 30

(D) 40

(E) 50

47.19. [4B-F99:24] (2 points) As the number of years of observation becomes larger and larger, the ratio of the variance of the predictive (negative binomial) distribution to the mean of the predictive (negative binomial) distribution approaches what value? (A) 0

(B) 1

(C) 2

(D) 4

(E) ∞

47.20. [4-S00:30] You are given: (i) (ii) (iii) (iv) (v) (vi)

An individual automobile insured has an annual claim frequency distribution that follows a Poisson distribution with mean λ. λ follows a gamma distribution with parameters α and θ. The first actuary assumes that α  1 and θ  1/6. The second actuary assumes the same mean for the gamma distribution, but only half the variance. A total of one claim is observed for the insured over a three year period. Both actuaries determine the Bayesian premium for the expected number of claims in the next year using their model assumptions.

Determine the ratio of the Bayesian premium that the first actuary calculates to the Bayesian premium that the second actuary calculates. (A) 3/4

(B) 9/11

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 10/9

(D) 11/9

(E) 4/3

Exercises continue on the next page . . .

47. BAYESIAN CREDIBILITY: POISSON/GAMMA

948

47.21. [4-S01:2] You are given: (i) Annual claim counts follow a Poisson distribution with mean λ. (ii) The parameter λ has a prior distribution with probability density function: f (λ) 

1 −λ/3 e , 3

λ>0

Two claims were observed during the first year. Determine the variance of the posterior distribution of λ. (A) 9/16

(B) 27/16

(C) 9/4

(D) 16/3

(E) 27/4

47.22. [4-F01:3] You are given: (i) The number of claims per auto insured follows a Poisson distribution with mean λ. (ii) The prior distribution for λ has the following probability density function: f (λ)  (iii)

(500λ ) 50 e −500λ λΓ (50)

A company observes the following claims experience: Number of claims Number of autos insured

Year 1 75 600

Year 2 210 900

The company expects to insure 1100 autos in Year 3. Determine the expected number of claims in Year 3. (A) 178

(B) 184

(C) 193

(D) 209

(E) 224

47.23. [C-S05:21] You are given: (i) The annual number of claims for a policyholder follows a Poisson distribution with mean Λ. (ii) The prior distribution of Λ is gamma with probability density function: f (λ) 

(2λ ) 5 e −2λ 24λ

λ>0

,

An insured is selected at random and observed to have x1  5 claims during Year 1 and x2  3 claims during Year 2. Determine E[Λ | x 1  5, x2  3]. (A) 3.00

(B) 3.25

(C) 3.50

(D) 3.75

(E) 4.00

Additional released exam questions: C-F06:10

Solutions 47.1.

The exponent is 1, so α  2. The denominator of the exponent is missing, or 1, so θ  1 and γ  1.

Then α ∗  2 + 1  3, γ∗  1 + 1  2. The posterior distribution is 23 /Γ (3) h 2 e −2h  4h 2 e −2h . (C)



C/4 Study Manual—17th edition Copyright ©2014 ASM



EXERCISE SOLUTIONS FOR LESSON 47

949

47.2. The variable is missing, or raised to the 0 power, so α  1. The denominator of the exponent is 2, so θ  2 and γ  1/2. Then α∗  1 + 3  4, γ∗  1/2 + 1  3/2. The posterior distribution is



 (3/2) 4 ) /Γ (4) λ3 e −(3/2) λ 

. (D)

27 3 − (3/2) λ λ e 32

47.3. We must add the number of claims, ni1 x i , to α to obtain the posterior parameter α ∗ , and the number of exposures, n, to γ  θ1 , to obtain the posterior parameter γ∗ . The posterior mean is than

P

n α∗ α + i1 x i α∗ θ∗   γ∗ n + θ1

P

(E)

47.4. Let the parameters of the gamma distribution be α and θ. The mean is 0.14  αθ. The variance is 2 700 0.142 0.0004  αθ2 . So θ  0.0004 0.14  700 , γ  2  350, α  0.0004  49. We add 110 claims to α to obtain α ∗ : α ∗  49+110  159. We add 310×2 exposures (310 policies for 2 years) to γ to obtain γ∗ : γ∗  350+620  970. The posterior mean is therefore αγ∗∗  159 970  0.1639 . (D) 47.5.

Let the parameters of the gamma distribution be α and θ. αθ  0.1 αθ 2  0.0003 θ  0.003

γ

1 0.003

1 0.03 1 α∗  + 150  183 31 0.03 1 γ∗  + 600  933 31 0.003 α

The posterior mean is 183 13 /933 13  0.1964 . (E) 47.6.

Let the parameters of the gamma distribution be α and θ. αθ  0.10 αθ2  0.0025 θ  0.025

γ  40

α4 α ∗  4 + 6  10

γ∗  40 + 20  60

The variance of the posterior is α ∗ /γ∗2  10/602  0.00278 . (D)

47.7. α  150, γ  1000. α∗  150 + 300 + 525  975. γ∗  1000 + 1500 + 2500  5000. Posterior mean is 975 5000  0.195 . (D) 47.8.

α  1, γ  10. α ∗  1 + 1  2, γ∗  10 + 1  11. The constant of the posterior gamma is 1 112   121 Γ ( α ) θ α Γ (2)

Posterior distribution is 121µe −11µ . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

47. BAYESIAN CREDIBILITY: POISSON/GAMMA

950

47.9.

Let the parameters of the gamma distribution be α and θ. αθ  12 , αθ2  18 , θ  14 , α  2, γ  4.

α ∗  2 + 4  6, γ∗  4 + 2  6. The posterior variance is 47.10.

6 62



1 6

. (C)

α  1 and γ  5, so we have

α∗  1 + 1  2 γ∗  5 + 2  7

and the expected number of claims is

α0 γ0



2 7

.

47.11. α  1, γ  5. α∗  1 + 10  11, γ∗  5 + 100  105. Posterior mean is 11  the expected number of claims is (100) 105  10.48 . (D) 47.12. The coefficient of variation is coefficient of variation is

√1 196



1 14

√1 α

11 105 ,

and for the portfolio of 100

(see our Example 47A). α  36, α∗  36 + 160  196. The new

 0.0714 . (B)

47.13. An exponential distribution is a gamma distribution in which α  1. Class 1 risks follow a negative binomial with parameters r  1, β  0.3. The variance is 0.3 (1.3)  0.39. Class 2 risks follow a negative binomial with parameters r  1, β  0.7. The variance is 0.7 (1.7)  1.19. We assume that all risks are independent. The total variance is 0.39 + 1.19  1.58 . (E) 47.14. We use the negative binomial distributions we developed in the preceding problem. For Class 1, 10/13 10 1 17 17 1  13 . For Class 2, p0  1.7  10 p0  1.3 17 . The proportion is therefore 10/13+10/17  17+13  30  0.567 . (B) 47.15. Let the parameters of the gamma distribution be α and θ. αθ  0.05, αθ2  0.01, θ  51 , α  41 , γ  5. We have: 1 4

+n

(5 + 10) 2

 0.01

1 + n  2.25 4 n 2

47.16. (B)

(C)

α  250, γ  2000. α ∗  250 + 100  350. γ∗  2000 + 1000  3000. Posterior mean is

47.17. Let the parameters of the gamma distribution be α and θ. Then αθ  0.25 αθ 2  0.0025 θ  0.01

γ  100

α  25 α∗  25 + 17 (1) + 2 (3)  48

γ∗  100 + 80 + 17 + 3  200

The posterior variance is 48/2002  0.0012 . (B) 47.18. Let x be the number of claims. Then γm γ∗  m + m  2m C/4 Study Manual—17th edition Copyright ©2014 ASM

α  10 α∗  10 + x

350 3000

 0.1167 .

EXERCISE SOLUTIONS FOR LESSON 47

951 10 + x 10  2 4m 2 m x  30

(C)

47.19. Let n be the years of observation. The ratio of variance to mean of a negative binomial is 1 + β, 1 which is the same as 1 + θ∗  1 + m+n . As n goes to infinity, this goes to 1 . (B) 47.20. The first actuary’s mean is αθ  1/6 and variance is αθ 2  1/36. Then for the second actuary (we’ll use subscripts 2 for the parameters of the second actuary): 1 6 1 α 2 θ22  72 α2 θ2 

α2  2

θ2 

1 12

Using primes for the revised parameters, α0  1 + 1  2 γ0  6 + 3  9 α02  2 + 1  3

γ20  12 + 3  15

and the ratio of Bayesian premiums is α0/γ0 10 2/9   α02 /γ20 3/15 9 47.21. α∗ γ∗ 2

3

47.22.

α  1 and γ 

3 2 4

 27/16 . (B)

1 3.

Then α ∗  1 + 2  3 and γ∗ 

(C) 1 3

+1 

4 3,

so the posterior variance of λ is

α  50 and θ  1/500, so γ  500. Then α∗  50 + 75 + 210  335

γ∗  500 + 600 + 900  2000

Note that the number of years (2) is irrelevant, since the exposure unit is a single auto insured for one year. The final answer is ! 335 1100  184.25 (B) 2000 47.23. The parameters of the gamma are α  5 and θ  0.5 or γ  2, as we can tell from the exponent on λ and the power of e. So α → 5 + 3 + 5  13 and γ → 2 + 2  4. The expected value is 13/4  3.25 . (B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

47. BAYESIAN CREDIBILITY: POISSON/GAMMA

952

Quiz Solutions 47-1. For the gamma, αθ  0.2 and αθ2  1, so θ  5 and α  0.04. Then the posterior parameters are α ∗  0.04 + 0  0.04 and γ∗  1/5 + 3  3.2, making θ∗  1/3.2  0.3125. The parameters of the predictive negative binomial are r  0.04 and β  0.3125, and the predictive probability of 0 claims is Pr ( N  0 | n) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 1+β

!r 

1 1.3125

! 0.04  0.9892

Lesson 48

Bayesian Credibility: Normal/Normal Reading: Loss Models Fourth Edition Example 5.5, exercise 18.19 or Introduction to Credibility Theory 8.2.3, 8.3.3 The normal distribution as a prior distribution is the conjugate prior of a model having the normal distribution with a fixed variance. While the Herzog textbook discusses the normal/normal conjugate prior pair, Loss Models only mentions it in an example and an exercise, and Mahler-Dean briefly mentions it (page 8-104) with no discussion. No released exam questions have appeared on it. However, students have reported questions on this topic on recent exams. Therefore, while this lesson has low priority, it cannot be ignored completely. The model has a normal distribution with mean θ and fixed variance v. The prior hypothesis is that θ has a normal distribution with mean µ and variance a. As in the last lesson, posterior variables will be denoted with asterisk subscripts. The posterior mean is a weighted average of the prior mean and the ¯ Since the model is a continuous distribution, it can be used as a model for loss sizes, or sample mean x. maybe for aggregate losses. If there are n exposures, which means n losses (or n person-years if this is a ¯ In other words, the formula is: model for aggregate losses) the weights are v on µ and na on x. µ∗ 

v ( µ ) + na ( x¯ ) v + na

This formula for the posterior mean can also be expressed in terms of a credibility factor. The credibility factor for the experience is Z  na/ ( na + v ) or Z  n/ ( n + v/a ) , a form which you will better appreciate when studying Bühlmann credibility. Then na v µ∗  Z x¯ + (1 − Z ) µ  x¯ + µ na + v na + v

!

!

The formula is intuitively appealing. The higher v, the less you can trust your experience and the more weight you put on the prior mean µ. If process variance (i.e, the conditional variance given θ) is high, experience is subject to large random fluctuations. On the other hand, the higher n, the more weight you can put on the experience, and the higher a, the more differentiated different classes are and the more weight you can put on experience. The formula for posterior variance has the same denominator as the formula for the mean. It is a∗ 

va v + na

One way to remember these formulas1 is that the information given by each of the prior and the sample mean is the reciprocal of their variances. The variance of the prior is a and the variance of the sample mean is v/n. We weight the prior with the information 1/a and the sample mean with the information n/v, and get ( n/v ) x¯ + (1/a ) µ na x¯ + vµ  µ∗  ( n/v ) + (1/a ) na + v 1shown to me by Ying Xue C/4 Study Manual—17th edition Copyright ©2014 ASM

953

48. BAYESIAN CREDIBILITY: NORMAL/NORMAL

954

and the information of the posterior is the sum of the informations 1/a + n/v so that the variance is the reciprocal of this, or 1 va a∗   1/a + n/v v + na The posterior distribution can be derived by Bayesian techniques. The messy part is that after multiplying the prior and likelihood, you will have an expression that looks like a normal distribution—e raised to negative a square power—but pulling out the parameters requires completing the square. As usual, you don’t have to keep track of multiplicative constants, since the posterior must be a proper distribution, so once you get the normal parameters, you know the multiplicative constant must be the right √ one, 1/σ 2π. The derivation is shown in Table 48.1, for the curious. By the conditional variance formula, the unconditional variance of the original distribution is the sum of v and a. The variance of the predictive distribution is v + a ∗ . The predictive distribution is normal, since a normal mixture of normal distributions is normal. Example 48A A group insurance coverage is assumed to have aggregate losses having a normal distribution with mean θ and variance 640,000. The parameter θ varies by group. It is normally distributed with a mean of 12,000 and a variance of 90,000. You have observed 4 years of experience on a group, and average aggregate losses was 10,000. Determine the probability that this group will generate no more than 11,000 of losses in the fifth year. Answer: This is the hardest type of question, since you must calculate both the posterior mean and variance. Notice the high v and the low a, which means low credibility. The posterior mean is µ∗  The posterior variance is

vµ + na x¯ 640,000 (12,000) + 4 · 90,000 (10,000)   11,280. v + na 640,000 + 4 · 90,000 a∗ 

va (640,000)(90,000)   57,600. v + na 640,000 + 4 · 90,000



The predictive variance is therefore 640,000 + 57,600  697,600. The probability of losses being less than √ 11,000 is Φ (−280/ 697,600)  Φ (−0.34)  1 − 0.6331  0.3669 . Figure 48.1 graphs the prior and posterior density functions.

?

Quiz 48-1 Claim sizes are normally distributed with mean Θ and variance 100. The parameter Θ is normally distributed with mean 8 and variance 16. A policyholder submits three claims with sum 20. Calculate the posterior expected value of Θ. Although a normal distribution may not seem like a practical model due to its assumption of negative values, one could adapt the method described here for a model with a lognormal distribution. If X is lognormal with parameters θ and v, then Y  ln X is normal with the same parameters. If Y’s conjugate prior is normal, then the posterior will also be normal. Example 48B You are given: (i) Losses on an insurance coverage follow a lognormal distribution with density function 2 1 f ( x | θ )  √ e − (ln x−θ ) /2 x 2π

(ii) The parameter θ varies by policyholder in accordance with a normal distribution with density function 2 1 π ( θ )  √ e − ( θ−7) /8 2 2π C/4 Study Manual—17th edition Copyright ©2014 ASM

48. BAYESIAN CREDIBILITY: NORMAL/NORMAL

955

Table 48.1: Derivation of posterior distribution for normal/normal conjugate prior

The model is n (Θ, v ) , meaning that its density function is 2

f ( x | θ, v ) 

e − ( x−θ ) /2v √ 2πv

The prior distribution for Θ is n ( µ, a ) , meaning that its density function is 2

e − ( θ−µ) /2a π (θ)  √ 2πa √ In calculating the posterior, we can drop constants such as 1/ 2πv. Anything not involving θ is a constant. If x i , i  1, . . . n, are the observations, the posterior is then proportional to exp −

(θ − µ)2

P

(xi − θ)2

2a



U

(θ − µ)2

where

2v

+

P

!

 exp (−U/2)

(xi − θ)2

a v Let’s rearrange U by multiplying the first fraction by v and the second fraction by a and expanding the squares, collecting and ignoring additive constants which would become multiplicative constants after exponentiation. U

vθ 2 − 2vθµ + vµ2 + a

P

x 2i − 2aθ

P

x i + naθ 2

av P θ 2 ( v + na ) − 2θ ( µv + a x i ) + constants  av θ 2 − 2θ ( µv + na x¯ ) / ( v + na ) + constants  av/ ( v + na )

 

θ − ( µv + na x¯ ) / ( v + na )

2

+ constants

av/ ( v + na )

where the last line was obtained by completing the square in the numerator. The posterior is proportional exp (−U/2) , and looking at U, we see that the posterior is normal with parameters µ∗  ( µv + na x¯ ) / ( v + na ) and a∗  av/ ( v + na ) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

48. BAYESIAN CREDIBILITY: NORMAL/NORMAL

956

0.08

LEGEND Prior Posterior

0.06

0.04

0.02

10,500

11,000

11,500

12,000

12,500

13,000

Figure 48.1: Prior and posterior density functions in Example 48A

(iii) A policyholder submits the following claims in the first year: 4000 9000 5000 Calculate the expected size of the next claim submitted by this policyholder. Answer: The phrase “the first year” is extraneous, since we’re modeling claim sizes, not aggregate losses. Losses X | θ are lognormal with parameters µ  θ and σ  1, as can be seen by inspecting the density function f ( x | θ ) . Therefore ln X is n ( θ, 1) , where the second parameter is shown as σ 2 rather than as σ, in accordance with tradition.2 We’ll also log the three claims: ln 4000 + ln 9000 + ln 5000  8.63874 3 The prior distribution is n (7, 4) , as can be seen by inspecting the density function. The posterior distribution has parameters 1 (7) + (3)(4)(8.6387)  8.51268 1 + 3 (4) (1)(4) σ∗2   0.30769 1 + 3 (4) µ∗ 

The expected size of a lognormal claim is 2

E[X | θ]  E[e θ+0.5 (1 ) ]  e 0.5 E[e θ ]

Since θ is normally distributed, e θ is lognormally distributed, so 2

E[e θ ]  e µ∗ +0.5σ∗  e 8.51268+0.5 (0.30769)  5805.3 Expected claim size is therefore 5805.3e 0.5  9,571 . 2When we say a distribution is n ( µ, σ2 ) , we mean that it is normal with mean µ and variance σ 2 . C/4 Study Manual—17th edition Copyright ©2014 ASM



EXERCISES FOR LESSON 48

957

Coverage of this material in the three syllabus options This material is directly covered only in Herzog, and in an example and exercise in Loss Models. However, the exam writers expect that you can work out normal/normal problems from basic principles even if you didn’t encounter it in the text option you chose.

Exercises 48.1. For a given policyholder, claim sizes are normally distributed with a mean of θ and a variance of 50,000. The parameter θ varies by risk class, and is normally distributed with mean 2,000 and variance 20,000. Determine the probability that a claim will be greater than 2,500. 48.2. For a given policyholder, claim sizes are normally distributed with mean θ and variance 1,000,000. The parameter θ varies by risk class, and is normally distributed with mean 2,500 and variance 4,000,000. In a given risk class, the following claims are observed: 3,000

5,000

2,000

2,200

3,300

Determine the posterior expected claim size for this risk class. 48.3. For a given policyholder, claim sizes are normally distributed with mean θ and standard deviation 500. The parameter θ varies by risk, and is normally distributed with mean 5,000 and standard deviation 400. For a certain risk, the following claims are observed: 3,000

5,000

4,500

6,000

5,500

3,000

Determine the posterior expected claim size for this risk. 48.4. For a given policyholder, claim sizes are normally distributed with mean θ and variance 100,000. The parameter θ varies by risk, and is normally distributed with mean 4,000 and variance 10,000. A certain risk class has experienced an average claim size of 4,600. The posterior expected claim size for this class is 4,450. Determine the number of claims observed for this risk class. 48.5. For a given policyholder, claim sizes are normally distributed with mean Θ and variance 100,000. The parameter Θ varies by risk, and is normally distributed with mean 2,000 and variance 1,000,000. For a certain risk class, 10 claims summing to 20,000 are observed. Determine the probability that a future claim from this risk class will be greater than 2,500. 48.6. For a given policyholder, claim sizes are normally distributed with mean Θ and variance 20,000. The parameter Θ varies by risk class, and is normally distributed with mean 1,000 and variance 10,000. After n claims are observed, the posterior variance of Θ is 1,000. Determine n. 48.7. For a given policyholder, claim sizes are normally distributed with mean Θ and variance 400,000. The parameter Θ varies by risk, and is normally distributed with mean 5,000 and variance 100,000. For a certain risk, 20 claims averaging 4,500 are observed. Determine the probability that a future claim from this risk will be less than 5,000. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

48. BAYESIAN CREDIBILITY: NORMAL/NORMAL

958

48.8. For a given policyholder, claim sizes are normally distributed with mean Θ and variance 100,000. The parameter Θ varies by risk, and is normally distributed with mean 1,500 and variance 1,000,000. For a certain risk, 10 claims averaging 1,800 are observed. Determine the posterior probability that θ is less than 1,500. 48.9. For a given policyholder, claim sizes are lognormally distributed with parameters µ  Θ and σ  2.5. The parameter Θ varies by risk, and is normally distributed with mean 6 and variance 5. For a certain risk, ten claims with geometric average 5000 are observed. Determine the posterior probability that Θ is greater than 7. 48.10. For a given policyholder, claim sizes are normally distributed with parameters µ  Θ and σ  30. The parameter Θ varies by risk, and is normally distributed with mean 4000 and variance 1500. After a specific policyholder submits n claims, the predictive variance of future claims from this policyholder is 1000 or less. Determine the smallest possible value for n. 48.11. For a given policyholder, loss sizes follow a lognormal distribution with parameters µ  Θ and σ2  4. The parameter Θ varies by risk, and is normally distributed with mean 6 and variance 3. You observe the following loss sizes from an insured policyholder: 3,000

10,000

2,718

Determine the expected size of the next loss from this policyholder. Additional released exam questions: SOA M-S05:10

Solutions 48.1. The overall mean is 2000 and the overall variance is 20,000+50,000  70,000. The overall distribution is normal. ! 2500 − 2000  1 − Φ (1.89)  1 − 0.9706  0.0294 Pr ( X > 2500)  1 − Φ √ 70,000 48.2.

The sample mean is x¯  3,100. The parameters are µ  2,500, v  106 , and a  4 · 106 . µ∗ 

48.3.

. 48.4.

106 (2,500) + 5 · 4 · 106 (3,100)  3071.43 106 + 5 · 4 · 106

The sample mean is x¯  4,500. The parameters are µ  5,000, v  5002 , and a  4002 . µ∗ 

5002 (5,000) + 6 · 4002 (4,500) 5,570,000,000   4603.31 1,210,000 5002 + 6 · 4002

Let n be the number of claims.

4000 (100,000) + 4600 (10,000n ) 100,000 + 10,000n 4450 (10 + n )  40,000 + 4600n 4450 

4500  150n n  30 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 48

959

48.5. The parameters are µ  2000, v  100,000, and a  106 . In the following, predictive variance is v + a ∗ , as we discussed in the text. 100,000 (2000) + 106 · 10 (2000)  2000 100,000 + 106 · 10 100,000 · 106 1011 106 a∗    100,000 + 106 · 10 105 (101) 101 106 Predictive variance  + 100,000  109,901 101 ! 2500 − 2000 Pr ( X n+1 > 2500 | X)  1 − Φ √ 109,901 µ∗ 

 1 − Φ (1.51)  1 − 0.9345  0.0655 48.6. av (20,000)(10,000)  v + an 20,000 + 10,000n 2000 + 1000n  20,000

1000 

n  18 48.7.

The parameters are µ  5,000, v  400,000, and a  100,000. 400,000 (5000) + (20)(100,000)(4500) 11,000   4583.33 400,000 + (20)(100,000) 2.4 (100,000)(400,000) 40,000   16,666.67 a∗  400,000 + (20)(100,000) 2.4 40,000 Predictive variance  400,000 +  416,666.67 2.4 ! 5000 − 4583.33  Φ (0.65)  0.7422 Pr ( X < 5000)  Φ √ 416,666.67 µ∗ 

48.8.

The parameters are µ  1,500, v  100,000, and a  106 . 1500 (105 ) + 1800 (107 ) 181500   1797.03 101 105 + 107 (105 )(106 ) 106 a∗    9900.99 101 105 + 107 ! 1500 − 1797.03 Pr (Θ < 1500)  Φ  Φ (−2.99)  1 − 0.9986  0.0014 √ 9900.99 µ∗ 

48.9. Let the claims be denoted by x i , i  1, . . . , 10. Since the geometric average, the 10th root of the product of the claims, is 5000, it follows that 10 Y

Xi1

x i  500010

ln x i  ln

C/4 Study Manual—17th edition Copyright ©2014 ASM

Y

x i  ln 500010  10 ln 5000

48. BAYESIAN CREDIBILITY: NORMAL/NORMAL

960 ln x i  ln 5000 10 The parameters of the posterior normal are

P

(2.52 )(6) + (10)(5)(ln 5000)  8.23751 2.52 + (10)(5) (2.52 )(5) σ∗2   0.55556 2.52 + (10)(5) µ∗ 

The posterior probability is 7 − 8.23751 Pr (Θ > 7)  1 − Φ √  Φ (1.66)  0.9515 0.55556

!

48.10. The predictive variance is a ∗ + v and v  302  900, so we need a∗ ≤ 100. av ≤ 100 v + na (1500)(900) ≤ 100 900 + 1500n 135 ≤ 9 + 15n 126 n≥ 15 and since n must be an integer, n ≥ 9 .

48.11. We calculate the sum of the logarithms of the losses: ln 3,000 + ln 10,000 + ln 2,718  25.1244 The parameters are µ  6, v  4, and a  3. In our formulas, n x¯  25.1244. The posterior parameters of the normal distribution are ¯ vµ + n xa 4 (6) + 25.1244 (3)   7.6441 µ∗  v + na 4 + 3 (3) va (4)(3) a∗    0.92308 v + na 4 + 3 (3) Thus the normal distribution associated with the predictive lognormal distribution has µ  7.6441 and variance 4 +0.92308  4.92308. The expected size of the next loss, using the standard formula for the mean of a lognormal distribution, is e 7.6441+0.5 (4.92308)  24,480 . Notice how strange this answer is. The original expected loss size, based on the original associated normal distribution with mean 6 and variance 4+3  7 had mean e 6+0.5 (7)  13,360, and arithmetic average observed loss size was 5239, yet the predictive mean is higher than the original mean! Bayesian credibility for a lognormal model with a prior normal cannot be expressed as a credibility factor Z.

Quiz Solutions 48-1.

The parameters are µ  8, v  100, and a  16. Also, n x¯  20. µ∗ 

C/4 Study Manual—17th edition Copyright ©2014 ASM

100 (8) + 16 (20)  7.5676 100 + 3 (16)

Lesson 49

Bayesian Credibility: Bernoulli/Beta Reading: Loss Models Fourth Edition exercise 15.5 with K j  1 or SN C-21-01 exercises 4.3.1–4.3.5 or Introduction to Credibility Theory 8.2.1, 8.3.1

49.1

Bernoulli/beta

The pre-2000 CAS 4B syllabus gave as much weight to the binomial model with the beta distribution as the conjugate prior as it did to the Poisson/gamma pair. Loss Models, on the other hand, skips it, and discusses a generalized version only in an exercise. The Mahler-Dean study note also does not discuss it directly, but places it in the exercises, and expects you to solve these exercises from basic principles, which admittedly isn’t hard. Still, understanding how to use conjugate priors to work out exam questions on this material. Here’s an example. We will first work it out from basic principles, without using the simplified formulas taking advantage of the relationship of the distributions. Example 49A An insurance coverage allows only one claim per year. The probability of a claim is q. The parameter q varies by insured, and is uniformly distributed on (0, 1) . A certain insured makes 2 claims in 3 years. Determine the posterior probability of a claim in the next year. Answer: The uniform distribution is a special case of a beta distribution with a  b  1, θ  1, but we won’t use that fact. The prior hypothesis is π ( q )  1, 0 ≤ q ≤ 1. The posterior hypothesis is: πQ|X ( q | x)  R

3 2 2 q (1

1 3 2 q (1 0 2

− q)

0 ≤ q ≤ 1.

− q ) dq

The integral in the denominator is: 1

Z 0

so

3 2

R

1 0

1

q 3 q 4 1  − 3 4 0 12

!

2

q (1 − q ) dq 

q 2 (1 − q ) dq  14 . The posterior distribution is 12q 2 (1 − q ) , 0 ≤ q ≤ 1. The mean of this is 1

Z 0

12q 3 (1 − q ) dq  12

1

q 4 q 5 12  −  0.6 . 4 5 0 20

!



Now lets discuss the shortcut. The beta distributions we deal with here are the ones in the Loss Models appendix (which you get on the exam) with parameters a and b, but for which the third parameter θ  1. Thus, they have the density function Γ ( a + b ) a−1 f (x )  x (1 − x ) b−1 . Γ(a )Γ(b ) C/4 Study Manual—17th edition Copyright ©2014 ASM

961

49. BAYESIAN CREDIBILITY: BERNOULLI/BETA

962

This beta distribution is the conjugate prior for a Bernoulli model. If the prior distribution is a beta with parameters a and b, and you observe n Bernoulli trials with k 1s, the posterior distribution is a beta with parameters a∗  a + k and b ∗  b + n − k. Another way to put it is: Prior: Beta (a,b), θ  1 Model: Binomial Observations: x1 ,. . . ,x n , with all x i  0 or 1 Posterior: a → a + number of x i equal to 1 b → b + number of x i equal to 0 Only the “losers”, the number of x i s not equal to 1, are added to b, not the number of observations. This is unlike the Poisson/gamma pair, where the total number of observations n is added to γ. You can easily determine the prior a and b, since a is 1 more than the exponent on x and b is 1 more than the exponent on 1 − x. The mean of a beta distribution with parameters a and b is a/ ( a + b ) . The new mean is a weighted average of the old mean and the experience. The credibility factor is n/ ( n + a + b ) : a∗ a+k a E[θ | x]    a∗ + b∗ n + a + b a+b

!

a+b k + n+a+b n

!

!

n n+a+b

!

The predictive distribution for the next claim is also Bernoulli. It must be, since any distribution that has positive probability at only the two values 0 and 1 must be Bernoulli. (And any distribution that only has positive probability at two values is a Bernoulli that is possibly scaled and shifted.) If the model is a binomial with m > 1, it can be treated as a series of m Bernoullis, and the same shortcut applies. The predictive distribution is more complicated (it is not binomial). However, we can determine the predictive mean. The mean of the binomial model given q is mq, and the posterior mean of q is a∗ / ( a∗ + b∗ ) . Let’s redo the above example using the fact that a uniform distribution is a beta distribution with a  1, b  1. The posterior hypothesis is then a beta with a ∗  1 + 2  3, b∗  1 + 3 − 2  2. The mean is therefore 3 3+2  0.6. As you can see, with the binomial/beta machinery, the problem is very easy. Figure 49.1 illustrates the prior and posterior beta density functions. The posterior clearly has less variance. 3

LEGEND Prior Posterior

2.5 2 1.5 1 0.5 0.2

0.4

0.6

0.8

Figure 49.1: Prior and posterior density functions in Example 49A

C/4 Study Manual—17th edition Copyright ©2014 ASM

1

49.1. BERNOULLI/BETA

963

Let’s now work out the five exercises in Mahler-Dean (4.3.1–4.3.5) using the conjugate prior. Work them out yourself first, then check your solutions against the following. 4.3.1 For a uniform distribution, a  1 and b  1. After 2 claims in 2 years, a ∗  1 + 2  3 and b ∗  1 + 0  1. The posterior density is a beta distribution with parameters 3 and 1, or π ( θ | X) 

Γ (3 + 1) 2 θ (1 − θ ) 0  3θ 2 . Γ (3) Γ (1)

π ( θ | X) 

Γ (4 + 1) 3 θ (1 − θ ) 0  4θ 3 . Γ (4) Γ (1)

4.3.2 Now a∗  1 + 3  4, b∗  1.

4.3.3 a∗  1 + 1  2, b ∗  1 + 2  3 since there were 2 failures. The posterior is π ( θ | X) 

Γ (2 + 3) θ (1 − θ ) 2 . Γ (2) Γ (3)

It is easier to integrate this over [0.3, 0.4] by changing the variable, setting θ0  1 − θ, so the density is 12θ02 − 12θ03 . We integrate this over θ0  [0.6, 0.7].

Z

0.7

Z

0.6 0.7 0.6

12θ02 dθ0  4 (0.7) 3 − 4 (0.6) 3  0.508 12θ03 dθ0  3 (0.7) 4 − 3 (0.6) 4  0.3315

0.508 − 0.3315  0.1765 a∗ 2   0.4 . a∗ + b∗ 5

4.3.4

4.3.5 This is taken from the Fall 1995 CAS 4B exam, question 5. See exercise 49.3 below.

?

Quiz 49-1 For an insurance, annual claim counts follow a Poisson distribution. The probability of at least one claim is θ. The distribution of θ over all policyholders has density function f ( θ )  20θ (1 − θ ) 3

0≤θ≤1

A policyholder submits claims in 1 year out of 4. Calculate the posterior probability that the policyholder submits at least one claim. This shortcut is widely applicable to exam problems. But you must recognize the beta density function to use it. The beta density function is a variable to a power (a − 1), times 1 minus the same variable to a power (b − 1), times a constant. All of the following densities are betas. 1

(a  b  1—the uniform distribution)

2x

(a  2, b  1)

2 (1 − x )

(a  1, b  2)

6x (1 − x ) √ 3 2 x 1 √ 2 x

(a  2, b  2) (a  1.5, b  1) (a  0.5, b  1)

C/4 Study Manual—17th edition Copyright ©2014 ASM

49. BAYESIAN CREDIBILITY: BERNOULLI/BETA

964

In all cases, the density is non-zero only for 0 ≤ x ≤ 1. Look for this pattern! Here is an application to a binomial model with m > 1: Example 49B Annual claim counts follow a binomial distribution with parameters m  3 and Q. The distribution of Q over all policyholders is beta with a  2, b  8, and θ  1. A policyholder submits 4 claims in 2 years. Calculate the posterior expected annual number of claims submitted by this policyholder. Answer: Treat 2 years as 2m  6 Bernoullis. Then a∗  2 + 4  6

b∗  8 + 6 − 4  10 The posterior expectation of annual number of claims is ma∗ / ( a∗ + b ∗ )  (3 · 6) / (6 + 10)  1.125 .

49.2



Negative binomial/beta

The negative binomial/beta pair has not been used on exams as far as I know, and you may skip this section. It is mentioned in Loss Models exercise 15.84, but not in the other two syllabus options. However, it will give you a little practice in applying Bayesian principles to continuous priors. Parametrize the negative binomial distribution in the traditional “k, p” manner (except that we’ll use r instead of k: ! r+x−1 r px  p (1 − p ) x x  0, 1, 2, . . . x With this parametrization, with p  1/ (1 + β ) , the mean is r (1 − p ) /p. Let r be fixed. Assume that the distribution of p is beta with parameters a, b, and θ  1. You have n ¯ Then the product of the prior and the likelihood, omitting constants, observations x 1 ,. . . ,x n with mean x. is π ( p ) f (x | p ) ∼ p a−1 (1 − p ) b−1 p nr (1 − p ) n x¯  p a−1+nr (1 − p ) b−1+n x¯ ¯ So the posterior distribution is a beta with which is the form of a beta with a∗  a + nr and b ∗  b + n x. a ∗  a + nr b ∗  b + n x¯ Unlike for other conjugate priors, the mean of the predictive is not the posterior parameter or a multiple of it, so we must integrate the conditional mean of the model over the posterior to obtain the overall mean. The initial overall mean is (assuming a > 1) Γ(a + b ) E[X]  Γ(a )Γ(b )

Z

Γ(a + b ) Γ(a )Γ(b )

Z



1 0

1

0

r (1 − p ) a−1 p (1 − p ) b−1 dp p

!

rp a−2 (1 − p ) b dp

The integrand is of the form of a beta with parameters a − 1 and b + 1, so the integral is the reciprocal of the normalizing constant for that beta. Γ(a + b ) E[X]  r Γ(a )Γ(b ) C/4 Study Manual—17th edition Copyright ©2014 ASM

!

Γ ( a − 1) Γ ( b + 1) rb  Γ(a + b ) a−1

!

EXERCISES FOR LESSON 49

965

The predictive mean, obtained by using a∗ and b ∗ instead of a and b, can be expressed as a linear function of the original mean and the sample mean: a−1 nr rb r ( b + n x¯ ) + x¯  a + nr − 1 a − 1 a + nr − 1 a + nr − 1

!

!

with credibility factor Z  nr/ ( nr + a − 1) . This assumes that the original mean exists, in other words a > 1. If the prior were uniform, for example, the original mean wouldn’t exist although the predictive mean would. Example 49C Annual claim counts for each policyholder have the following probability function: p x  p (1 − p ) x

x  0, 1, 2, . . .

The parameter p varies by policyholder. The distribution of p has density function π ( p )  6p (1 − p )

0≤p≤1

One policyholder submits 1 claim in 3 years. Calculate the posterior mean of p and the posterior expected annual number of claims from this policyholder. Answer: The model is geometric; r  1. The prior is a beta with parameters a  2, b  2. The sample mean is 1/3. The posterior is a beta with parameters a∗  2 + 3  5, b∗  2 + 1  3. The posterior mean is 5/ (5 + 3)  0.625 . The posterior expected annual number of claims, or the predictive mean, is rb ∗ / ( a∗ − 1)  3/4  0.75 . 

Coverage of this material in the three syllabus options The material on Bernoulli/beta is directly covered only by Herzog. However, exams frequently pose these questions. Without knowing this lesson, you would have to work out Bernoulli/beta problems from first principles. It is to your advantage to know the material in this lesson. The material on negative binomial/beta is only mentioned in an exercise in Loss Models. It was presented mainly to show you an example of how Bayesian credibility works for a distribution with an identifiable posterior.

Exercises 49.1. [4B-S91:32] (2 points) Assume that the number of claims, r, made by an individual insured in one year follows the binomial distribution 3 r p (r )  θ (1 − θ ) 3−r , for r  0, 1, 2, 3. r

!

Also assume that the parameter, θ, has the p.d.f. g ( θ )  6 ( θ − θ 2 ) , for 0 < θ < 1. Given an observation of one claim in a one-year period, what is the posterior distribution of θ? (A) 30θ 2 (1 − θ ) 2

(B) 10θ 2 (1 − θ ) 3

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 6θ 2 (1 − θ ) 2

(D) 60θ 2 (1 − θ ) 3

(E) 105θ 2 (1 − θ ) 4

Exercises continue on the next page . . .

49. BAYESIAN CREDIBILITY: BERNOULLI/BETA

966

[4B-F92:27] (2 points) You are given the following:

49.2. •

The distribution for number of claims is Bernoulli with parameter θ. (3 + 4 + 1) ! 3 The prior distribution of θ is the beta distribution f ( θ )  θ (1 − θ ) 4 , 0 ≤ θ ≤ 1, with 3!4! 4 20 mean and variance . 2 4+5 (4 + 5) (4 + 5 + 1) 2 claims are observed in 3 trials.





Determine the mean of the posterior distribution of θ. (A) (B) (C) (D) (E)

Less than 0.45 At least 0.45, but less than 0.55 At least 0.55, but less than 0.65 At least 0.65, but less than 0.75 At least 0.75

49.3. [4B-F95:5] (2 points) A number x is randomly selected from a uniform distribution of the interval [0,1]. Three independent Bernoulli trials are performed with probability of success x on each trial. All three are successes. What is the posterior probability that x is less than 0.9? (A) (B) (C) (D) (E) 49.4. • •

Less than 0.6 At least 0.6, but less than 0.7 At least 0.7, but less than 0.8 At least 0.8, but less than 0.9 At least 0.9 [4B-F95:24] (3 points) You are given the following:

The probability that a single insured will produce exactly one claim during one exposure period is p, while the probability of no claim is 1 − p. p varies by insured and follows a beta distribution with density function f ( p )  6p (1 − p ) ,

0 ≤ p ≤ 1.

Two insureds are randomly selected. During the first two exposure periods, one insured produces a total of two claims (one in each exposure period) and the other insured does not produce any claims. Determine the probability that each of the two insureds will produce one claim during the third exposure period. (A) 2/9

(B) 1/4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 4/9

(D) 1/2

(E) 2/3

Exercises continue on the next page . . .

EXERCISES FOR LESSON 49

967

[4B-F98:14] (2 points) You are given the following:

49.5. •

The probability that a risk has at least one loss during any given month is p.



p does not vary by month.



The prior distribution of p is assumed to be uniform on the interval (0, 1).



The risk is observed for n months.



At least one loss is observed during each of these n months.



After this period of observation, the mean of the posterior distribution of p for this risk is 0.95. Determine n.

(A) 8

(B) 9

(C) 10

(D) 18

(E) 19

49.6. [4B-S96:30] (3 points) A number x is randomly selected from a uniform distribution on the interval [0,1]. Four Bernoulli trials are to be performed with probability of success x. The first three are successes. What is the probability that a success will occur on the fourth trial? (A) (B) (C) (D) (E)

Less than 0.675 At least 0.675, but less than 0.725 At least 0.725, but less than 0.775 At least 0.775, but less than 0.825 At least 0.825

Use the following information for questions 49.7 and 49.8: On an insurance coverage, the probability of one claim in a year is q; no more than one claim is possible in a year. q varies by insured and has the distribution π ( q )  2 (1 − q )

0 ≤ q ≤ 1.

A randomly selected insured has 2 claims in 5 years. 49.7.

Determine the probability that the insured submits a claim in the next year.

49.8.

Determine the variance in the number of claims for this insured in the following year.

49.9.

[4-F00:11] For a risk, you are given:

(i) The number of claims during a single year follows a Bernoulli distribution with mean p. (ii) The prior distribution for p is uniform on the interval [0, 1]. (iii) The claims experience is observed for a number of years. (iv) The Bayesian premium is calculated as 1/5 based on the observed claims. Which of the following observed claims data could have yielded this calculation? (A) (B) (C) (D) (E)

0 claims during 3 years 0 claims during 4 years 0 claims during 5 years 1 claim during 4 years 1 claim during 5 years

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

49. BAYESIAN CREDIBILITY: BERNOULLI/BETA

968

49.10. [4-F03:31] You are given: (i) The probability that an insured will have exactly one claim is θ. (ii) The prior distribution of θ has probability density function: π (θ) 

3√ θ, 2

0 0.

Determine the mean of the claim severity distribution. (A) 10

(B) 20

(C) 200

(D) 2000

(E) 4000

50.2. For an automobile property damage liability coverage, claim sizes follow an exponential distribution with mean Θ. Θ follows a distribution given by the density function: π (θ) 

10005 e −1000/θ 24 θ6

θ > 0.

A particular insured submits 10 claims. Given these 10 claims, the posterior mean claim size for this insured is 100. Determine the sum of the 10 claim sizes. 50.3.

For an insurance portfolio with 2000 exposures, you are given:

(i) The number of claims for each exposure follows a Poisson distribution. (ii) The mean claim count varies by exposure. The distribution of mean claim counts is a gamma distribution with parameters α  0.75, θ  4. (iii) The size of claims for each exposure follows an exponential distribution. (iv) The mean claim size varies by exposure. The distribution of mean claim sizes is an inverse gamma distribution with parameters α  3, θ  4000. (v) The standard for full credibility of aggregate claims is that aggregate claims must be within 5% of expected 90% of the time. Determine the credibility assigned to this portfolio.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

50. BAYESIAN CREDIBILITY: EXPONENTIAL/INVERSE GAMMA

976

Use the following information for questions 50.4 and 50.5: You are given the following: •

Claim sizes for a given risk follow a distribution with density function f (x | λ) 



(A) (B) (C) (D) (E)

0 < x < ∞,

λ > 0.

The prior distribution of λ is assumed to follow a distribution with mean 50 and density function π (λ) 

50.4.

1 −x/λ , λe

500, 000 −100/λ e , λ4

0 < λ < ∞.

[4B-S98:28] (2 points) Determine the variance of the prior distribution. Less than 2,000 At least 2,000, but less than 4,000 At least 4,000, but less than 6,000 At least 6,000, but less than 8,000 At least 8,000

50.5. [4B-S98:29] (2 points) Determine the density function of the posterior distribution of λ after 1 claim of size 50 has been observed for this risk. 62, 500 −50/λ e (A) λ4 500, 000 −100/λ (B) e λ4 1, 687, 500 −150/λ e (C) λ4 50, 000, 000 −100/λ e (D) 3λ5 84, 375, 000 −150/λ e (E) λ5 50.6.

Claim sizes follow an inverse exponential distribution with parameter Θ: f (x | θ) 

θe −θ/x x2

x > 0.

Θ varies by insured according to a gamma distribution with mean 1000, variance 100,000. A particular insured submits 5 claims in the amounts of 1000, 2000, 1000, 500, 1000. Determine the mean of the posterior distribution for this insured.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 50

977

[4B-F93:26] (3 points) You are given the following:

50.7. •

The amount of an individual claim, Y, follows an exponential distribution function with probability density function 1 f ( y | δ )  e −y/δ , y, δ > 0. δ



The conditional mean and variance of Y given δ are



Var ( Y | δ )  δ2 .

and

E[Y | δ]  δ

The mean claim amount, δ, follows an inverse gamma distribution with density function p (δ) 

4e −2/δ , δ4

δ > 0.

Determine the unconditional density of Y at y  3. (A) (B) (C) (D) (E)

Less than 0.01 At least 0.01, but less than 0.02 At least 0.02, but less than 0.04 At least 0.04, but less than 0.08 At least 0.08

50.8. [4-S00:10] The size of a claim for an individual insured follows an inverse exponential distribution with the following probability density function: f (x | θ) 

θe −θ/x , x2

x>0

The parameter θ has a prior distribution with the following probability density function: g (θ) 

e −θ/4 , 4

θ>0

One claim of size 2 has been observed for a particular insured. Which of the following is proportional to the posterior distribution of θ? (A) θe −θ/2

(B) θe −3θ/4

(D) θ 2 e −θ/2

(C) θe −θ

(E) θ 2 e −9θ/4

50.9. Loss sizes follow a gamma distribution with parameters α  3.9 and Θ. The parameter Θ varies according to a distribution with the following probability density function: π (θ) 

(1000/θ ) 2 e −1000/θ θ

An insured submits ten losses with average size 2500. Determine the expected size of the next loss from the same insured.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

50. BAYESIAN CREDIBILITY: EXPONENTIAL/INVERSE GAMMA

978

50.10. [4-F00:23] You are given: (i)

The parameter Λ has an inverse gamma distribution with probability density function: g ( λ )  500λ −4 e −10/λ , λ > 0

(ii)

The size of a claim has an exponential distribution with probability density function: f ( x | Λ  λ )  λ −1 e −x/λ , x > 0, λ > 0

For a single insured, two claims were observed that totaled 50. Determine the expected value of the next claim from the same insured. (A) 5

(B) 12

(C) 15

(D) 20

(E) 25

Additional released exam questions: C-S07:30

Solutions 50.1. The mean of the claim severity distribution is E[λ], the same as the mean of the prior hypothesis, which is an inverse gamma with α  2 and θ  20. E[λ]  20/ (2 − 1)  20 . (B) 50.2. The prior hypothesis is an inverse gamma with α  5, θ  1000. α ∗  5 + 10  15. 100  (1000 + P P x i ) /14, so x i  400 .

50.3. µ f  3. Using the fact that the aggregate claim count distribution is negative binomial with the parameters of the gamma, σ2f  0.75 (4)(5)  15. µS  2000. We could use the fact that the aggregate claim size distribution is Pareto with the same parameters as the inverse gamma, but instead we’ll calculate σS2 directly. Let θ be the mean of the exponential; then its variance is θ 2 . σs2  E[θ 2 ] + Var ( θ )  CV2s 

40002 40002 + − 20002  40002 − 20002 2 2

40002 − 20002 3 20002

!2 

1.645 15 nF  + 3  8659.28 0.05 3 8659.28  2886.43 eF  3

r Z



2000  0.8324 2886.43

50.4. The prior distribution is an inverse gamma distribution with α  3, θ  100. E[Λ2 ]  1002 / (2 · 1)  5000, so the variance of Λ is Var (Λ)  5000 − 502  2500 . (B) 50.5. (E).

α ∗  3 + 1  4. θ∗  100 + 50  150. The constant is 1504 /Γ (4)  84,375,000. The answer is therefore

50.6. For the gamma distribution, E[Θ]  αβ  1000 and Var (Θ)  αβ2  100,000, so α  10, β  100. To solve this problem from first principles, you would first calculate the product of the prior density and the likelihood function as a function of θ, ignoring constants. The likelihood of the 5 observations is θ 5 e −θ C/4 Study Manual—17th edition Copyright ©2014 ASM

P

(1/x i )

EXERCISE SOLUTIONS FOR LESSON 50

979

with x i  1000, 2000, 1000, 500, 1000. The prior gamma density is π ( θ ) ∝ θ 10−1 e −θ/100 . Multiplying these two together yields 14 −θ

θ e



P 1 1 xi 100 +



.

This is proportionate to a gamma function with parameters α  15 and β 



1 1 1 1 1 1  0.0155. + + + + + 100 1000 2000 1000 500 1000



The posterior mean is 15/0.0155  967.74 . You could also do this exercise using formulas (50.3) and (50.4). Then P you would1 calculate α∗  1 + x1i (since 100 is the original 10 + 5  15 (since 10 is the original α and there were 5 claims) and γ∗  100 γ), obtaining the same result. If you’ve already read the following lessons on Bühlmann credibility, note that this is less than both the prior mean (1000) and the experience mean (1100), so it is certainly not the Bühlmann credibility estimate. 50.7. fY (3) 



Z 0

Z 



0

4e −2/λ 1 −3/λ e dλ λ4 λ 4e −5/λ dλ λ5

The integrand would be an inverse gamma density if the constant were

θα 54  instead of 4, so the Γ(α) Γ (4)

24 4Γ (4)   0.0384 . (C) 4 625 5 Alternatively, the parameters of the inverse gamma are α  3 and β  2 (using β for the table’s θ to avoid confusion), and the parameters of the unconditional distribution, the Pareto, are the same. The density of a Pareto with parameters α  3 and β  2 at y  3 is value of the integral is

f (3) 

αβ α 3 ( 23 )   0.0384 α+1 (β + y ) 54

50.8. The posterior density is proportional to the product of the prior distribution g and the likelihood f (2 | θ ) , which is, after dropping constants:



θe −θ/2



e −θ/4  θe −3θ/4



Thus the answer is (B) 50.9. The prior is an inverse gamma with parameters α  2 and β  1000. We transform these using the observed losses: α ∗  2 + 3.9 (10)  41

β ∗  1000 + 10 (2500)  26,000

The expected value of Θ, the mean of the inverse gamma, is 26,000/40  650. The expected size of the next claim is 3.9 (650)  2535 . C/4 Study Manual—17th edition Copyright ©2014 ASM

50. BAYESIAN CREDIBILITY: EXPONENTIAL/INVERSE GAMMA

980

50.10. The prior is an inverse gamma with parameters α  3 (one less than the negative exponent on λ) and θ  10. This goes to α ∗  3 + 2  5 and θ∗  10 + 50  60 in the posterior. The mean of this posterior is θ/ ( α − 1)  15 , which is the parameter, and therefore the mean, of the exponential claim size. (C)

Quiz Solutions 50-1.

An inverse exponential is an inverse gamma with α  1. The posterior parameters are α∗  1 + 2  3

β∗  100 + 2000  2100

The posterior mean is 2100/ (3 − 1)  1050 .

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 51

Bühlmann Credibility: Basics Reading: Loss Models Fourth Edition 18.4–18.5 or SN C-21-01 3.1–3.2 or Introduction to Credibility Theory 6.2–6.3 This lesson and the next two will deal with the basic Bühlmann method. Expect 2–3 questions on the material covered in these three lessons. The Bayesian method is hard to apply in practice, since it requires not only a hypothesis for the loss distribution of a specific risk class, but also a hypothesis for the distribution of that hypothesis over risk classes. And even with that, you end up with integrals which usually require numerical techniques to integrate. Usually, the best you can do is estimate mean and variance of losses. The Bühlmann method is a linear approximation of the Bayesian method. The line is picked to be the weighted least squares approximation of the Bayesian result. The derivation of the Bühlmann method is found in the textbooks on the syllabus, but you are not responsible for it. Let’s return to the Ventnor Manufacturing example at the beginning of Lesson 42. Using the Bühlmann method, you would reason as follows. You assume that there are many different classes of risks. Ventnor may have a disproportionate share of the good risks. Based on both Ventnor’s experience and other experience you have observed, you conclude that the average variance in experience for an insured in a specific risk class is 1,000,000. The average experience of each risk class itself varies from class to class. (Of course—otherwise, how can we talk about “good risks” and “bad risks”?) The variance of this average is 10,000. (We will discuss in Lesson 57 how you may estimate the “average variance” and the “variance of the average experience”.) The average experience of each risk class doesn’t vary that much—10,000 is small relative to the average variance within the class, 1,000,000—leading to lower credibility. Still, you have 1000 exposures. Therefore, you multiply 1000 by 10,000. You divide this by 1000 × 10,000 + 1,000,000, where 1,000,000 is the variance within each class. The higher this variance, the less credibility you assign. 1000×10,000 10 This quotient is 1000×10,000+1,000,000  10 11 . You assign 11 credibility, and therefore set the pure premium equal to 800 (10/11) + 900 (1/11)  809.09. Let’s discuss what we have done. Let Θ represent the hypothesis as to the risk class to which an exposure belongs. We have already defined the hypothetical mean, which is the conditional mean conditioned on the value of the hypothesis Θ. This is denoted in Loss Models by µ (Θ) . In the solutions to the exercises, we will frequently use shorthand such as µ1 for the hypothetical mean of the first group. The overall mean, or µ  EΘ [µ (Θ) ], is  the first  of three important numbers. The variance of the hypothetical means over all hypotheses, VarΘ µ (Θ) , is denoted in Loss Models by a; it is the second of three important numbers. In the Mahler-Dean study note, it is denoted by VHM.1 In the Ventnor example, this was 10,000. The higher this is, the more credibility gets assigned. The process variance is the conditional variance conditioned on the value of the hypothesis Θ. It could have been called the hypothetical variance, and the hypothetical mean could have been called the process mean. It is denoted in Loss Models by v (Θ) . The expected value of the process variances over all hypotheses, EΘ [v (Θ) ], is the third of three important numbers, denoted v. In the Mahler-Dean study note, it is denoted by EPV. In the Ventnor example, this was 1,000,000. 
The higher it is, the less credibility gets assigned. To summarize µ, or EHM, is the expected value of the hypothetical mean, or the overall mean. 1Herzog avoids using any notation for variance of hypothetical means or expected process variance, until he gets to empirical Bayesian estimation; at that point, he uses a and v. C/4 Study Manual—17th edition Copyright ©2014 ASM

981

51. BÜHLMANN CREDIBILITY: BASICS

982

a, or VHM, is the variance of the hypothetical mean. v, or EPV, is the expected value of the process variance. The following are then defined: • Bühlmann’s k:2 k  v/a • Bühlmann’s credibility factor: Z  n/ ( n + k ) , where n is the number of observations: the number of periods when studying frequency or aggregate losses, the number of claims when studying severity. Combining the two formulas, Bühlmann’s Z can be expressed as na/ ( na + v ) . If you read Lesson 48, this should look familiar: it is the credibility factor for the normal/normal model. To clarify these concepts, let’s apply them to an example. Example 51A (Same data as Example 46B) Claim size follows a single-parameter Pareto distribution with parameters α  3 and Θ. Over all insureds, Θ has a uniform distribution on [1, 4]. An insured selected at random submits 4 claims of sizes 2, 3, 5, and 7. Calculate the following Bühlmann parameters: µ, v, a, k, Z. Answer: The model, or “process”, or “hypothesis”, is single-parameter Pareto, with mean µ (Θ )  and variance

3Θ αΘ2 − v (Θ )  α−2 2

αΘ  1.5Θ α−1

!2

 3Θ2 − 2.25Θ2  0.75Θ2

The overall mean, averaged over the uniform distribution on [1, 4], is µ  E[µ (Θ) ]  E[1.5Θ]  1.5 E[Θ]  1.5 (2.5)  3.75 The expected process variance is 32 v  E[v (Θ) ]  E[0.75Θ ]  0.75 E[Θ ]  0.75 + 2.52  5.25 12 2

!

2

The variance of the hypothetical means is 32 a  Var µ (Θ)  Var (1.5Θ)  2.25 Var (Θ)  2.25  1.6875 12



!



The Bühlmann k and Z are 5.25  3 19 1.6875 4 9 Z  1 16 4 + 39 k



2This concept has no relationship to the k of limited fluctuation credibility, the range parameter. Mahler-Dean uses capital K for the Bühlmann k, but the other two syllabus options use k. The other two options use a different letter for the range parameter. C/4 Study Manual—17th edition Copyright ©2014 ASM

51. BÜHLMANN CREDIBILITY: BASICS

?

983

Quiz 51-1 Claim counts follow a binomial distribution with parameters M and q  0.2. The distribution of M over all policyholders is a shifted Poisson with parameter λ  1: pm  Calculate Bühlmann’s k.

e −1 ( m − 1) !

m  1, 2, 3, . . .

Once you have Z, the credibility expectation is determined the same way as it was in limited fluctuation credibility: PC  Z x¯ + (1 − Z ) µ  µ + Z ( x¯ − µ )

where PC is the Bühlmann credibility expectation and µ is the overall mean, the expectation initially assumed if there is no credibility. PC is called the credibility premium in Loss Models, in contrast to the Bayesian premium, which is the predictive expected value. The other two syllabus options do not use this terminology, so don’t expect to see it on your exam. Instead, PC will probably be called the Bühlmann credibility estimate. It is interesting comparing Bühlmann credibility with limited fluctuation credibility. Limited fluctuation credibility ignores the variance between hypothetical means, and uses a square root rule which allows full credibility. Bühlmann methods never assign full credibility. The credibility factor converges to 1 but never reaches it. Here is an example comparing the two. Example 51B For an insurance risk with multiple risk classes, you are given: (i) Each policyholder is a group of insureds. (ii) Claim frequency for each insured in a group is Poisson. The Poisson mean varies by group, but not for insureds within a policyholder group. The overall mean is 0.2. (iii) The standard for full credibility of number of claims using limited fluctuation credibility methods requires that actual claims be within 5% of expected claims 90% of the time. (iv) The variance of the hypothetical means of claim counts—in other words, the variance of the Poisson parameter over all policyholders—is 0.0002. You are to evaluate the credibility of a policyholder’s experience. For what range of group sizes will Bühlmann credibility methods result in higher credibility than limited fluctuation credibility methods? Answer: The number of exposures is the size of the group. Classical credibility sets n F  n0  1082.41. That is the number of expected claims needed. To express credibility in terms of exposures, we divide by average claims per exposure, 0.2, to get e F  1082.41/0.2  5412.05 √ The limited fluctuation partial credibility factor is Z C  n/5412.05. For Bühlmann credibility, we need a and v. We are given a  0.0002. The process variance—the variance of claim counts given the policyholder—is the same as the hypothetical mean, since for a Poisson distribution, the variance equals the mean. Therefore, the expected process variance equals the expected hypothetical mean, or v  0.2. Bühlmann’s k is 0.2/0.0002  1000, and the Bühlmann credibility factor is n/ ( n + 1000) . Figure 51.1 compares the two partial credibility curves. We see that the limited fluctuation credibility curve increases more rapidly initially, then the Bühlmann curve overtakes it, but ultimately, the limited fluctuation credibility curve is higher since it has to get to 1. C/4 Study Manual—17th edition Copyright ©2014 ASM

51. BÜHLMANN CREDIBILITY: BASICS

984

Credibility Z

1 0.8 0.6 0.4 LEGEND Classical Bühlmann

0.2 1000

2000

3000 Exposures

4000

5000

6000

Figure 51.1: Comparison of Bühlmann and limited fluctuation partial credibility in Example 51B

Let’s carry out the calculation of the intersections. We equate the Z B and Z C . x  x + 1000

r

x 5412.05

Squaring both sides and dividing by x (Of course they are equal for x  0): x 1  x 2 + 2000x + 10002 5412.05 x 2 + (2000 − 5412.05) x + 10002  0 x 2 − 3412.05x + 10002  0

3412.05 ± 3412.052 − 4 (10002 ) x 2 x  323.81, 3088.24

p

So we conclude (assuming the number of exposures is integral) that Bühlmann credibility is higher in the  range of [324, 3088] exposures . If you just want to calculate k or Z but not the credibility premium, you will need a and v, but will not have to calculate µ. However, to calculate a, you will often calculate Var µ (Θ)  E[µ (Θ) 2 ] − E[µ (Θ) ]2  E[µ (Θ) 2 ] − µ2





in which case you will calculate µ. The Bernoulli shortcut which we learned in Section 3.3 will often come in handy for calculating v or a, as is illustrated by the next example. Example 51C There are two classes of risk for an insurance coverage. Loss amounts for each class are as follows: Class A Class B 90 risks 10 risks Loss Size Probability Loss Size Probability 100 0.8 100 0.7 300 0.2 400 0.3 C/4 Study Manual—17th edition Copyright ©2014 ASM

51. BÜHLMANN CREDIBILITY: BASICS

985

Determine Bühlmann’s k for a randomly selected risk. Answer: When all you need is k, you don’t have to calculate µ; you only need to calculate v and a. A randomly selected risk has a probability of 0.9 of being in Class A. The process variances for the two classes are: v A  (300 − 100) 2 (0.8)(0.2)  6,400

v B  (400 − 100) 2 (0.7)(0.3)  18,900

So v  0.9 (6,400) + 0.1 (18,900)  7,650. The hypothetical means for the two classes are:

µA  0.8 (100) + 0.2 (300)  140 µ B  0.7 (100) + 0.3 (400)  190 So a  (190 − 140) 2 (0.9)(0.1)  225. Bühlmann’s k 

v a



7,650 225

 34 .



By conditional variance, the overall variance is equal to the variance of the hypothetical means plus the expected value of the process variance. If X is the variable and Θ is the hypothesis: Var ( X )  VarΘ (EX [X | Θ]) + EΘ [VarX ( X | Θ) ] a+v

Example 51D For a block of business, annual losses have mean 50 and variance 537.5. The block of business is split into three segments: A, B, and C. Segments A and B each comprise 25% of the block, and Segment C comprises the remaining 50% of the block. The following are the hypothetical means and process variances of annual losses for segments A and B: Segment

A

B

Mean Variance

30 300

40 600

Determine the hypothetical mean and process variance of annual losses for Segment C. Answer: Let X be annual losses and I the indicator for the segment. Since the overall mean is the expected value of the hypothetical means of the three segments, we have

f

E[X]  E E[X | I]

g

50  0.25 (30) + 0.25 (40) + 0.50 E[X | C]

E[X | C]  2 (50 − 0.25 (30 + 40)  65





The variance of the hypothetical means 30, 40, and 65 is

a  0.25 (30 − 50) 2 + 0.25 (40 − 50) 2 + 0.5 (65 − 50) 2  237.5

Since the overall variance is 537.5, the expected value of the process variance is 537.5 − 237.5  300. From the given values of the process variance for A and B: 300  0.25 (300) + 0.25 (600) + 0.50v ( C ) v ( C )  2 300 − 0.25 (300) − 0.25 (600)  150



C/4 Study Manual—17th edition Copyright ©2014 ASM





51. BÜHLMANN CREDIBILITY: BASICS

986

Coverage of this material in the three syllabus options This material is required. It is covered in all three syllabus reading options.

Exercises 51.1.

Aggregate losses for various risks in a portfolio have the following probability distribution:

Risk 1 Risk 2 Risk 3 Risk 4 Risk 5

Number of Risks 60 30 15 10 5

Loss Amount 100 200 200 300 300

Loss Amount 500 800 1000 600 700

Probability 0.9 0.9 0.8 0.6 0.4

Probability 0.1 0.1 0.2 0.4 0.6

Determine the expected value of the process variance for this portfolio of risks. 51.2.

Two urns have balls with numbers. The number of balls with each number is as follows: Number on ball Urn 1 Urn 2

0 5 2

1 3 2

2 1 4

x 1 4

x is a positive number. Urn 1 is twice as likely to be selected as urn 2. The variance of the hypothetical means of the values of the balls in the urns is 0.5. Determine x. 51.3. [4B-S90:35] (1 point) The underlying expected loss for each individual insured is assumed to be constant over time. The Bühlmann credibility assigned to the pure premium for an insured observed for one year is 12 . Determine the Bühlmann credibility to be assigned to the pure premium for an insured observed for 3 years.

(A) 1/2 (B) 2/3 (E) Cannot be determined

(C) 3/4

(D) 6/7

51.4. [4B-S91:25] (2 points) Assume that the expected pure premium for an individual insured is constant over time. The Bühlmann credibility for two years of experience is equal to 0.40. Determine the Bühlmann credibility for three years of experience. (A) (B) (C) (D) (E)

Less than 0.500 At least 0.500, but less than 0.525 At least 0.525, but less than 0.550 At least 0.550, but less than 0.575 At least 0.575

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 51

987

[4B-F92:19] (1 point) You are given the following:

51.5. •

The Bühlmann credibility of an individual risk’s experience is



The risk’s underlying expected loss is constant.

1 3

based upon 1 observation.

Determine the Bühlmann credibility for the risk’s experience after four observations. (A) 1/4 (B) 1/2 (E) Cannot be determined

(C) 2/3

(D) 3/4

[4B-S93:3] (1 point) You are given the following:

51.6.

X is a random variable with mean m and variance v. m is a random variable with mean 2 and variance 4. v is a random variable with mean 8 and variance 32. Determine the value of the Bühlmann credibility factor Z after three observations of X. (A) (B) (C) (D) (E)

Less than 0.25 At least 0.25, but less than 0.50 At least 0.50, but less than 0.75 At least 0.75, but less than 0.90 At least 0.90 [4B-F93:25] (1 point) You are given the following:

51.7.

(i) A random sample of losses taken from policy year 1992 has sample variance s 2  16. (ii) The losses are sorted into 3 classes, A, B, and C, of equal size. (iii) The sample variances for each of the classes are: s B2  5

2 4 sA

s C2  6

Estimate the variance of the hypothetical means. (A) (B) (C) (D) (E)

Less than 4 At least 4, but less than 8 At least 8, but less than 12 At least 12, but less than 16 At least 16 [4B-S95:2] (2 points) You are given the following:

51.8. •

The Bühlmann credibility of three observations is twice the credibility of one observation.



The expected value of the process variance is 9. Determine the variance of the hypothetical means.

(A) 3

(B) 4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 6

(D) 8

(E) 9

Exercises continue on the next page . . .

51. BÜHLMANN CREDIBILITY: BASICS

988

51.9. [4B-S95:16] (1 point) Which of the following will DECREASE the Bühlmann credibility of the current observations? 1.

Decrease in the number of observations

2.

Decrease in the variance of the hypothetical means

3.

Decrease in the expected value of the process variance

(A) 1

(B) 2

(C) 3

(D) 1,2

(E) 1,3

51.10. [4B-F95:2] (1 point) The Bühlmann credibility of five observations of the loss experience of a single risk is 0.29. Determine the Bühlmann credibility of two observations of the loss experience of this risk. (A) (B) (C) (D) (E)

Less than 0.100 At least 0.100, but less than 0.125 At least 0.125, but less than 0.150 At least 0.150, but less than 0.175 At least 0.175

51.11. [4B-S96:3] (1 point) Given a first observation with a value of 2, the Bühlmann credibility estimate for the expected value of the second observation would be 1. Given a first observation with a value of 5, the Bühlmann credibility estimate for the expected value of the second observation would be 2. Determine the Bühlmann credibility of the first observation. (A) 1/3

(B) 2/5

(C) 1/2

(D) 3/5

(E) 2/3

51.12. [4B-F96:10] (2 points) The Bühlmann credibility of n observations of the loss experience of a single risk is 1/3. The Bühlmann credibility of n + 1 observations of the loss experience of this risk is 2/5. Determine the Bühlmann credibility of n + 2 observations of the loss experience of this risk. (A) 4/9

(B) 5/11

(C) 1/2

(D) 6/11

(E) 5/9

51.13. [4B-F96:20] (3 points) You are given the following: •

The number of claims for a single risk follows a Poisson distribution with mean θµ.



θ and µ have a prior probability distribution with joint density function f ( θ, µ )  1,

0 < θ < 1, 0 < µ < 1.

Determine the value of Bühlmann’s k. (A) (B) (C) (D) (E)

Less than 5.5 At least 5.5, but less than 6.5 At least 6.5, but less than 7.5 At least 7.5, but less than 8.5 At least 8.5

51.14. 50 observations of experience are made under an insurance coverage. Bühlmann credibility of 75% is given to this experience. Determine the least number of observations needed to obtain 90% credibility. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 51

989

51.15. 50 observations of experience are made under an insurance coverage. Initially, 80% credibility is given to this experience. You then change the assumptions for this coverage so that the variance of the hypothetical means is doubled. Determine the revised Bühlmann credibility assigned to the experience. 51.16. [4B-F97:5] (2 points) You are given the following: •

A portfolio of independent risks is divided into two classes.



Each class contains the same number of risks.



The claim count probabilities for each risk for a single exposure period are as follows: Class 1 2

Probability of 0 Claims 1/4 3/4



All claims incurred by risks in Class 1 are of size u.



All claims incurred by risks in Class 2 are of size 2u.

Probability of 1 Claim 3/4 1/4

A risk is selected at random from the portfolio. Determine the Bühlmann credibility for the pure premium of one exposure period of loss experience for this risk. (A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.15 At least 0.15, but less than 0.25 At least 0.25, but less than 0.35 At least 0.35

51.17. [4B-S98:2] (1 point) You are given the following: •

The number of claims for a single insured follows a Poisson distribution with mean λ.



λ varies by insured and follows a Poisson distribution with mean µ. Determine the value of Bühlmann’s k.

(A) 1

(B) λ

(C) µ

(D) λ/µ

(E) µ/λ

51.18. [4B-F98:19] (2 points) You are given the following: •

Claim sizes follow a gamma distribution with parameters α and θ  0.5.



The prior distribution of α is assumed to be uniform on the interval (0, 4). Determine the value of Bühlmann’s k for estimating the expected value of a claim.

(A) 2/3

(B) 1

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 4/3

(D) 3/2

(E) 2

Exercises continue on the next page . . .

51. BÜHLMANN CREDIBILITY: BASICS

990

Use the following information for questions 51.19 and 51.20: You are given the following: •

Claim sizes follow a Pareto distribution with parameters θ and α  3.



The prior distribution of θ has density function f ( θ )  e −θ ,

0 < θ < ∞.

51.19. [4B-S99:5] (2 points) Determine the expected value of the process variance. (A) 3/8

(B) 3/4

(C) 3/2

(D) 3

(E) 6

51.20. [4B-S99:6] (2 points) Determine the variance of the hypothetical means. (A) 1/4

(B) 1/2

(C) 1

(D) 2

(E) 4

51.21. [4B-F99:18] (2 points) You are given the following: •

Partial Credibility Formula A is based on the methods of limited fluctuation credibility, with 1,600 expected claims needed for full credibility.



Partial Credibility Formula B is based on Bühlmann’s credibility formula with a k of 391.



One claim is expected during each period of observation.

Determine the largest number of periods of observation for which Partial Credibility Formula B yields a larger credibility value than Partial Credibility Formula A. (A) (B) (C) (D) (E)

Less than 400 At least 400, but less than 800 At least 800, but less than 1,200 At least 1,200, but less than 1,600 At least 1,600

51.22. For a disability coverage, there are 3 rating classes. The number of insureds in each class, and the mean frequency of claims in each class, is as follows: Rating Class

Number of Insureds

Mean Frequency of Claims

A B C

400 300 300

0.20 0.30 0.35

Claim frequency for each insured follows a Poisson distribution, with a mean equal to the mean frequency of claims for the class. An insured is selected at random. Determine the Bühlmann credibility given to 10 observations from this insured.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 51

991

51.23. [4-S00:3] You are given the following information about two classes of business, where X is the loss for an individual insured: Number of insureds E[X] E[X 2 ]

Class 1 25 380 365,000

Class 2 50 23 —

You are also given than an analysis has resulted in a Bühlmann k value of 2.65. Calculate the process variance for Class 2. (A) 2,280

(B) 2,810

(C) 7,280

(D) 28,320

(E) 75,050

51.24. [4-S01:6] You are given: (i) The full credibility standard is 100 expected claims. (ii) The square-root rule is used for partial credibility. You approximate the partial credibility formula with a Bühlmann credibility formula by selecting a Bühlmann k value that matches the partial credibility formula when 25 claims are expected. Determine the credibility factor for the Bühlmann credibility formula when 100 claims are expected. (A) 0.44

(B) 0.50

(C) 0.80

(D) 0.95

(E) 1.00

51.25. [C-S05:11] You are given: (i) (ii) (iii) (iv) (v) (vi)

The number of claims in a year for a selected risk follows a Poisson distribution with mean λ. The severity of claims for the selected risk follows an exponential distribution with mean θ. The number of claims is independent of the severity of claims. The prior distribution of λ is exponential with mean 1. The prior distribution of θ is Poisson with mean 1. A priori, λ and θ are independent.

Using Bühlmann’s credibility for aggregate losses, determine k. (A) 1

(B) 4/3

(C) 2

(D) 3

(E) 4

51.26. You are given the following: (i)

Losses for a given policyholder follow a gamma distribution with parameters α and θ. θ does not vary by policyholder. (ii) α is a random variable with mean 10. (iii) The Bühlmann credibility of one observation is 0.25. Determine Var ( α ) . (A) 5/3

(B) 20/9

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 5/2

(D) 3

(E) 10/3

Exercises continue on the next page . . .

51. BÜHLMANN CREDIBILITY: BASICS

992

Use the following information for questions 51.27 and 51.28: You are given the following: •

For an individual risk in a population, the number of claims for a single exposure period follows a Poisson distribution with parameter λ.



For the population, λ is distributed according to an exponential distribution: h ( λ )  5e −5λ

λ > 0.



An individual risk is randomly selected from the population.



After two exposure periods, one claim has been observed.

51.27. [4B-F94:25] (2 points) For the selected risk, determine the expected value of the process variance. (A) 0.04

(B) 0.20

(C) 0.29

(D) 5.00

(E) 25.00

51.28. [4B-F94:26] (3 points) Determine the density function of the posterior distribution of λ for the selected risk. (A) 7e −7λ

(B) 5λe −7λ

(C) 49λe −7λ

(D) 108λ 2 e −6λ

(E) 270λ 2 e −6λ

Additional released exam questions: C-S07:6,21

Solutions 51.1.

This exercise is mostly a showcase of the Bernoulli shortcut. v1  (0.9)(0.1)(500 − 100) 2  14,400

v2  (0.9)(0.1)(800 − 200) 2  32,400

v3  (0.8)(0.2)(1000 − 200) 2  102,400

v4  (0.6)(0.4)(600 − 300) 2  21,600

v5  (0.4)(0.6)(700 − 300) 2  38,400 v

51.2.

1 120



60 (14,400) + 30 (32,400) + 15 (102,400) + 10 (21,600) + 5 (38,400)  31,500



Let µ i be the hypothetical mean of urn i. Then µ1  0.5 (0) + 0.3 (1) + 0.1 (2) + 0.1x  0.5 + 0.1x µ2  16 (0) + 16 (1) + 26 (2) + 26 ( x ) 

5 6

+ 13 x

The variance of these two hypothetical means is expressed using the Bernoulli shortcut. 1 2 1 7 + x a  ( µ1 − µ2 ) ( )( )  3 3 3 30



2

Now we solve for x.

 C/4 Study Manual—17th edition Copyright ©2014 ASM

1 3

+

2 7 30 x

 2.25

2

2  0.5 9

!

EXERCISE SOLUTIONS FOR LESSON 51

993

1 3

+

7 30 x

 1.5

10 + 7x  45 x 51.3.

First we back out the Bühlmann k from Z 

1 2

35 7

 5

when n  1:

n 1  n+k 2 1 1  1+k 2 k1 Z

Then we calculate Z when n  3. Z 51.4.

n 3 3   n+k 3+1 4

The Bühlmann credibility factor Z 

n n+k .

(C)

Here we are given that n  2 and Z  0.40, so

2  0.4 2+k k3 If we now set n  3, then Z  3/ (3 + 3)  0.5 . (B) One of the rare cases where the answer is at the bound of a range. 51.5.

The Bühlmann credibility factor Z 

n n+k .

Here we ar given that n  1 and Z  13 , so

1 1  1+k 3 k2 If we now set n  4, then

4 4+2



2 3

. (C)

51.6. m is the hypothetical mean of X and v is the process variance. The expected value of the process variance is the expected value of v, or 8. The variance of the hypothetical means is the variance of m, or 4. Bühlmann’s k  va  48  2. Then Z

n 3   0.6 n+k 3+2

(C)

51.7. The total variance is 16 and the expected value of the process variances is (1/3)(4 + 5 + 6)  5. The expected value of the process variances plus the variance of the hypothetical means adds up to the total variance, by the conditional variance formula: Var ( X )  E[Var ( X | I ) ] + Var (E[X | I])  EPV + VHM So the variance of the hypothetical means is 16 − 5  11 . (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

51. BÜHLMANN CREDIBILITY: BASICS

994

51.8.

Since Z  n/ ( n + k ) , the first condition implies 3 1 2 3+k 1+k

!

3 + 3k  6 + 2k k3 Since k  v/a, the second condition implies 9 a a 3

3

(A)

n k  1− . Decreasing n will decrease the denominator, increase the fraction, and n+k n+k decrease 1 minus the fraction, so the first statement is true. (It is also intuitively obvious.) n Z  . Decreasing a will increase k, increase the denominator, and decrease the fraction, so n + v/a the second statement is true. Decreasing v will decrease k, decrease the denominator, and increase the fraction, so the third statement is false. (D) 51.9.

Z 

51.10. As usual, we back out k from Z  n/ ( n + k ) . Here we are given that Z  0.29 and n  5. 5  0.29 5+k 5 − 1.45 3.55 k  0.29 0.29 0.58 2   0.140 Z 2 + 3.55/0.29 4.13

(C)

51.11. Bühlmann credibility is a linear function of the mean of the observations; in fact PC  Z X¯ + (1 − Z ) µ Here we are given 1  2Z + (1 − Z ) µ 2  5Z + (1 − Z ) µ Subtracting the first equation from the second, we see that 3Z  1, so Z  1/3 . (A) 51.12. From n observations,

n 1  ⇒ n + k  3n n+k 3

From n + 1 observations,

n+1 2  n+1+k 5

Substituting n + k  3n and solving for n, n+1 2  3n + 1 5 5n + 5  6n + 2 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 51

995

n3 k6 Calculating the credibility factor for n + 2  5 observations, 5 5  5+6 11

(B)

51.13. In this exercise, the classes are continuous rather than discrete. Since the number of claims is Poisson, the hypothetical mean is θµ and process variance is also v ( θ, µ )  θµ. Both θ and µ are uniformly distributed on [0, 1] and are independent, so the expected value of the process variance is the product of the expected values of θ and µ: 1 v  E[v ( θ, µ ) ]  E[θ] E[µ]  2

!

1 1  2 4

!

The variance of the hypothetical means can be calculated by calculating the expected value of the squares of the hypothetical means and then subtracting the square of the overall expected value: a  E[θ 2 µ2 ] − E[θµ]2

For a uniform distribution, the second moment is 31 , and by independence the expected value of the product is the product of the expected values, so E[θµ]  E[θ] E[µ] 

1 2

1 E[θ µ ]  E[θ ] E[µ ]  3 2 2

2

1 1  2 4

!

2

!

!

1 1  3 9

!

!2

51.14. First we back out k. Z

1 7 1 a −  9 4 144 v 144 36 k    5.14 a 4·7 7

(A)

50 12.5 50  0.75 ⇒ 37.5 + 0.75k  50 ⇒ k   50 + k 0.75 3

Now we solve for n such that Z  0.9. n Z  0.9 ⇒ 0.9n + 15  n ⇒ n  150 n + 50 3 51.15. 50 50 + v/a v a v 2a 50 50 + 6.25

C/4 Study Manual—17th edition Copyright ©2014 ASM

 0.8  12.5  6.25 

8 9

51. BÜHLMANN CREDIBILITY: BASICS

996

51.16. Let µ i be the hypothetical mean of Class i and v i the process variance of Class i. µ1  34 u

µ2  24 u We use the Bernoulli shortcut to calculate the variance of the hypothetical means. 1 a 2

!

1 2

!

3 2 u− u 4 4

2



u2 64

We also use the Bernoulli shortcut to calculate the process variance of each class. 1 v1  4

!

3 2 3 2 u  u 4 16

!

3 1 12 2 4u 2  u 4 4 16   1 3 2 12 2 15 2 v u + u  u 2 16 16 32 v 15 (64)  30 k  a 32 1 1 Z   0.032 (A) 1 + 30 31

!

!

v2 

51.17. λ is both the hypothetical mean and the process variance, since for a Poisson the two are the same. Therefore, a  Var ( λ )  µ v  E[λ]  µ µ k  1 µ

(A)

51.18. 1 a  Var (0.5α )  Var ( α )  4 1 v  E[0.25α]  0.25 E[α]  2 1/2 3 k  (D) 1/3 2 1 4

!

16 1  12 3

!

51.19. The process variance is the variance of a Pareto, or Var ( X | θ )  E[X 2 | θ] − E[X | θ]2

2θ 2 θ  − ( α − 1)( α − 2) α−1  θ2 −

!2

θ 2 3θ 2  4 4

The prior distribution of θ is exponential with mean 1, whose second moment is 2, so the expected value of 3θ 2 /4 is 3 (2) /4  3/2 . (C) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 51

997

51.20. The hypothetical mean is θ/ ( α − 1)  θ/2. The variance of θ, which is exponential with mean 1, is 1, so the variance of θ/2 is 1/4 . (A) √ 51.21. By formula A, Z  n/1600, where n is the number of expected claims. By formula B, Z  n/ ( n + 391) . n > n + 391

r

n 1600

n2 n > ( n + 391) 2 1600 1600n > ( n + 391) 2  n 2 + 782n + 3912

n



n 2 − 818n + 3912 < 0

818 + 8182 − 4 · 3912 818 + 240   529 2 2

(B)

Note that the lower solution for n, 289, is the beginning of the range for which formula B gives larger credibility than formula A. Formula B gives less credibility below 289, more between 289 and 529, and less above 529. 51.22. v  0.4 (0.2) + 0.3 (0.3) + 0.3 (0.35)  0.275 µ  0.275 2

E[µ (Θ) ]  0.4 (0.22 ) + 0.3 (0.32 ) + 0.3 (0.352 )  0.07975 a  0.07975 − 0.2752  0.004125 0.275 v k  a 0.004125 10 Z  0.1304 10 + 0.275/0.004125 51.23. There is a 1/3 probability of Class 1 and a 2/3 probability of Class 2. The variance of the hypothetical means is ! ! 1 2 a (380 − 23) 2  28,322 3 3 and k  v/a, so v 28,322 v  75,053.3

2.65 

However, the process variance for Class 1 is 365,000 − 3802  220,600, so 1 2 (220,600) + PV2  75,053.3 3 3 PV2  2279.95

C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

51. BÜHLMANN CREDIBILITY: BASICS

998

51.24. The square-root rule gives Z 

q

25 100

 12 . Then

25 1  25 + k 2 k  25 100  0.8 100 + k

(C)

51.25. Let S be the random variable for aggregate losses, and as usual N will be the frequency random variable and X the severity random variable. For this compound process, by the formulas for mean and variance of a compound Poisson distribution (see equation (14.4) for the latter) µ ( λ, θ )  E[S | λ, θ]  E[N | λ, θ] E[X | λ, θ]  λθ v ( λ, θ )  Var ( S | λ, θ )  λ E[X 2 ]  2λθ 2

To calculate the variance of the hypothetical mean λθ, we will calculate the first and second moments of λθ. Since λ and θ are independent, the moment of the product is the product of the moments. The distribution of λ is exponential with mean 1, so the second moment of λ is twice its mean squared (see the distribution tables) or 2 (12 )  2. The distribution of θ is Poisson, so its variance equals its mean, which is 1, and its second moment is the sum of its variance and the square of its mean, or 1 + 12  2. Therefore E[λθ]  E[λ] E[θ]  1 E ( λθ ) 2  E λ 2 E θ 2  (2)(2)  4

f

g

f

g

f

g

a  Var ( λθ )  4 − 12  3

The expected value of the process variance v ( λ, θ ) is

v  E[2λθ2 ]  2 E[λ] E[θ 2 ]  (2)(1)(2)  4 Bühlmann’s k is 4/3 . (B) 51.26. The hypothetical mean is αθ and the process variance is αθ 2 . We back out Var ( α ) . v  E[αθ 2 ]  10θ 2 a  Var ( αθ )  θ 2 Var ( α ) 1 0.25  Z  1+k 10θ 2 k3 2 θ Var ( α ) Var ( α ) 

10 3

(E)

51.27. The process variance is λ, and its expectation is the mean of the exponential distribution, or 0.20 . (B) 51.28.

α  1, γ  5. α ∗  1 + 1  2, γ∗  5 + 2  7. The constant of the posterior gamma is 1 72   49 Γ ( α ) θ α Γ (2)

Posterior density is 49λe −7λ . (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

QUIZ SOLUTIONS FOR LESSON 51

999

Quiz Solutions 51-1. The hypothetical mean is µ ( M )  0.2M and the process variance is v ( M )  0.2 (0.8) M  0.16M. The Poisson is shifted by 1, so its mean is E[M]  λ + 1  2 and its variance is Var ( M )  λ  1, since shifting doesn’t affect variance. Then v  E[0.16M]  0.32 a  Var (0.2M )  0.04 v 0.32 k   8 a 0.04

C/4 Study Manual—17th edition Copyright ©2014 ASM

1000

C/4 Study Manual—17th edition Copyright ©2014 ASM

51. BÜHLMANN CREDIBILITY: BASICS

Lesson 52

Bühlmann Credibility: Discrete Prior Reading: Loss Models Fourth Edition 18.4–18.5 or SN C-21-01 3.1–3.2 or Introduction to Credibility Theory 6.4–6.5 In this lesson, we will work on problems in which the prior hypothesis is discrete. The model will usually be discrete too. You are asked to calculate the credibility premium, the Bühlmann credibility estimate of the next claim count/claim size/aggregate loss. On pre-2000 exams, these questions came in pairs: for a single set of data, you are asked to calculate the Bayesian premium and the credibility premium. Current CBT exams do not have such grouped questions. Still, these questions will help you review Bayesian methods as you learn Bühlmann methods. Example 52A Two urns contain balls each marked with 0, 1, or 2 in the proportions described below: Percentage of Balls in Urn Urn A Urn B

Marked 0 0.20 0.70

Marked 1 0.40 0.20

Marked 2 0.40 0.10

An urn is selected at random and two balls are selected, with replacement, from the urn. The sum of values on the selected balls is 2. Two more balls are selected from the same urn. 1. [4B-S92:8] (3 points) Determine the expected total of the two balls using Bayes’ Theorem. (A) (B) (C) (D) (E)

Less than 1.6 At least 1.6, but less than 1.7 At least 1.7, but less than 1.8 At least 1.8, but less than 1.9 At least 1.9

2. [4B-S92:9] (3 points) Determine the expected total of the two balls using Bühlmann’s credibility formula. (A) (B) (C) (D) (E)

Less than 1.6 At least 1.6, but less than 1.7 At least 1.7, but less than 1.8 At least 1.8, but less than 1.9 At least 1.9 1. The probability of selecting Urn A and observing 2 is 0.5 0.42 + 2 (0.2)(0.4)  0.16. The



Answer:



probability of selecting Urn B and observing 2 is 0.5 0.22 +2 (0.7)(0.1)  0.09. The expected value of 2 balls





from Urn A is 2 0.4 (1) + 0.4 (2)  2.4. The expected value of 2 balls from Urn B is 2 0.2 (1) + 0.1 (2)  0.8. So taking the weighted average of these expected values, the Bayesian premium is







(0.16)(2.4) + (0.09)(0.8) 0.16 + 0.09

C/4 Study Manual—17th edition Copyright ©2014 ASM

1001

 1.824 .

(D)



52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1002

2. We calculate µ, v, and a, using 2 balls as a single exposure unit: µ  0.5 (2.4) + 0.5 (0.8)  1.6 v A  2 0.4 (12 ) + 0.4 (22 ) − 2 (1.22 )  1.12





v B  2 0.2 (12 ) + 0.1 (22 ) − 2 (0.42 )  0.88





v  0.5 (1.12) + 0.5 (0.88)  1

a  (0.5)(0.5)(2.4 − 0.8) 2  0.64 The calculation of a used the Bernoulli shortcut. v 1   1.5625 a 0.64 1 1  since there’s only one exposure unit of 2 balls Z 1 + 1.5625 2.5625 1 PC  1.6 + (2 − 1.6)  1.756 . (C) 2.5625 k

You could also calculate the parameters using each ball separately as an exposure unit and set n  2 when calculating Z.  In this question, you were asked to calculate the expected value in two ways—Bayes and Bühlmann. However, the Bayes method calculates the true expected value, whereas the Bühlmann method is only an approximation of the expected value.

Whenever a credibility question asks you to calculate the expected value of something and doesn’t specify how, use the Bayesian method. The Bayesian method calculates the true expected value. The Bühlmann method is only an approximation.

?

Quiz 52-1 For a certain insurance coverage, only one claim per year can be submitted. The expected annual number of claims from good risks is 0.1 and the expected annual number of claims from bad risks is 0.2. Good risks comprise 70% of the population. A group of 10 homogeneous risks (in other words, they’re all good or they’re all bad) submits 2 claims in one year. Calculate the Bühlmann prediction for the number of claims submitted by this group in the following year. In many of these problems, you are better off not calculating k, but going directly to Z using the formula Z  na/ ( na + v ) . There are two things you should be careful about: 1. The exposure unit. 2. The unit you calculating credibility for.

The exposure unit The exposure unit is based on the item for which the credibility premium is charged. In other words, if you are calculating number of claims per . . . or claim size per . . . , the exposure unit is what you replace the ellipsis with. Normally, this is “insured” in the first case and ”claim” in the second case. You have C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1003

the option of making the unit any number of insureds in the first case and any number of claims in the second case. But you may not use “insured” as the unit in the second case. In the previous example, we treated 2 balls as an exposure unit and had 1 observation. We could’ve also treated 1 ball as an exposure unit and had 2 observations. Example 52B Number of claims for each member of a group follows a Poisson distribution with mean λ. λ varies by insured according to a uniform distribution on (0, 0.1). You are given three years of experience for the group: Year

Number of members

Number of claims

1 2 3

120 150 170

3 4 4

The group will have 200 members in year 4. Calculate the Bühlmann credibility premium for the group in year 4. Answer: The hypothetical mean and the process variance are both λ. The expected value of λ is the expected value of the uniform distribution, or 0.05; hence µ  v  0.05. The variance of the hypothetical 0.05  60. means is the variance of the uniform distribution, or 0.12 /12. Hence k  0.01/12 The exposure unit is 1 member-year, not 1 year. The random variable that is being measured is number of claims per member-year, not number of claims per year. It would be unusual that the number of members would be treated as a random variable to be estimated using Bühlmann credibility. Hence there are 120 + 150 + 170  440 exposures, and the credibility factor is Z  440/ (440 + 60)  0.88. The experience mean is x¯  (3 + 4 + 4) /440  0.025. Hence the premium for the group in year 4 is 200 0.12 (0.05) + 0.88 (0.025)  200 (0.028)  5.6







The unit you are calculating credibility for Occasionally you will encounter a situation where you are given heterogeneous groups. However, you will be asked for the Bühlmann credibility premium for an insured selected from the group. You must calculate the variance of the hypothetical means taking into account the means of each insured, not the means of the groups. Example 52C There are two groups of insureds, A and B. Each group is equally large. The number of claims for each member of either group follows a Poisson distribution. You are given the following information on mean number of claims for members of each group. Group

Average hypothetical mean

Variance of hypothetical means

A B

0.1 0.2

0.04 0.09

Calculate the Bühlmann credibility to assign to one observation of one member. Answer: The overall mean µ is 0.5 (0.1 + 0.2)  0.15, and since number of claims for each member is Poisson, v  µ is also 0.15. Note that the hypothetical means vary within each group. It is therefore incorrect to calculate a as the variance of 0.1 and 0.2, which by the Bernoulli shortcut would be 41 (0.12 )  0.0025. This would ignore the variation of the hypothetical means within each group. Instead, • Let Λ be the hypothetical mean of each individual, the expected number of claims. C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1004

• Let I be an indicator variable which is equal to A or B when a member is from Group A or B. Using the conditional variance formula, a  Var (Λ)  E Var (Λ | I ) + Var E[Λ | I]

f

g





The expected value of Λ | I is 0.1 in Group A and 0.2 in Group B, so Var E[Λ | I]  0.0025, as calculated above. The variance of Λ | I is given as 0.04 in Group A and 0.09 in Group B, so the expected value f g of the variance of hypothetical means within each group, is the average of 0.04 and 0.09, or E Var (Λ | I )  0.065. Therefore, a  0.065 + 0.0025  0.0675. Then



0.15 20 v   a 0.0675 9 1 9 Z  1 + 20/9 29



k



It is important to take into account every class in the model. You should not use the variances or means of combinations of classes, as the next example illustrates. Example 52D You sell individual health coverage. Aggregate claim costs vary for each insured, based on the insured’s diet and exercise habits. The following table lists the mean and variance of annual aggregate claim costs per insured. Annual aggregate claim costs Exercise Bad diet Good diet Habit Expected claims Claim variance Expected claims Claim variance Sedentary 8 20 4 12 Active 6 10 2 8 Total 7 16 3 11 40% of insureds have a bad diet and 60% have a good diet. Calculate the Bühlmann credibility factor for one year of experience. Wrong answer: The hypothetical means are 7 and 3, and using the Bernoulli shortcut, the variance of the hypothetical means is a  (0.4)(0.6)(7 − 3) 2  3.84. The process variances are 16 and 11 and their expected value is v  (0.4)(16) + (0.6)(11)  13. The credibility factor is Z

a 3.84   0.228029 a + v 3.84 + 13



Whoa! Just because there are summary statistics labeled “Total” doesn’t mean you should use them! A lot of students got this wrong when a similar question, with different numbers, was asked on the exam. You must consider all four possibilities. Any individual can be sedentary or active. When calculating credibility for an individual, you cannot ignore different possibilities for classes that are part of your model. Let’s do it the right way. Answer: The overall mean is µ  0.4 (7) + 0.6 (3)  4.6. No problem with that, since the mean of the means is the mean, so this is the mean of all four classes. µ will be useful for calculating the variance of the hypothetical means. Since the mean of 8 and 6 is 7 and the mean of 4 and 2 is 3, we deduce that in each diet class, half the group is sedentary and half is active. This means that the probabilities of the four classes are C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1005

Bad diet 0.2 0.2

Sedentary Active

Good diet 0.3 0.3

We have the process variances for each of the four classes. Then the expected process variance is v  0.2 (20) + 0.2 (10) + 0.3 (12) + 0.3 (8)  12 The hypothetical means are 8, 6, 4, 2. The second moment of these means is 0.2 (82 ) + 0.2 (62 ) + 0.3 (42 ) + 0.3 (22 )  26 so the variance of the hypothetical means is a  26 − µ2  26 − 4.62  4.84. The credibility factor is Z

?

a 4.84   0.287411 a + v 4.84 + 12



Quiz 52-2 For auto comprehensive insurance:
(i) Each insured submits either 0 or 1 claims per year.
(ii) The probability of submitting 1 claim is Q.
(iii) For an urban insured, Q is uniformly distributed on [0.2, 0.5].
(iv) For a rural insured, Q is uniformly distributed on [0.1, 0.28].
(v) 50% of all insureds are urban and 50% are rural.
An insured of unknown type submits 0 claims in a year. Calculate the Bühlmann prediction for the expected number of claims submitted by this insured in the following year.

To further emphasize the importance of calculating credibility at the right level, we’ll do an example both ways.

Example 52E You are given:
(i) Annual claim counts follow a Poisson distribution with mean λ.
(ii) Within a group, λ = θ or 2θ, with each possibility equally likely.
(iii) Over all groups, θ = 0.2 with probability 0.75 and θ = 0.6 with probability 0.25.
1. An individual is selected. Calculate the Bühlmann credibility factor to assign to one year of experience from that individual for computing future claim counts from that individual.
2. A group is selected. Calculate the Bühlmann credibility factor to assign to one year of experience from one individual of that group for computing future claim counts from an individual randomly selected from that group.

Answer: 1. The hypothesis and process are based on one individual, making the hypothetical mean and process variance both equal to λ. The mean of λ is

E[λ] = E[E[λ | θ]] = E[1.5θ]
E[θ] = 0.75(0.2) + 0.25(0.6) = 0.3
E[1.5θ] = 0.45

So the expected process variance is 0.45. The second moment of λ is

E[λ²] = E[E[λ² | θ]] = E[2.5θ²]
E[θ²] = 0.75(0.2²) + 0.25(0.6²) = 0.12
E[2.5θ²] = 0.3

So the variance of the hypothetical means is a = 0.3 − 0.45² = 0.0975. We could’ve also calculated this using conditional variance, conditioning on θ. The Bühlmann credibility factor is a/(a + v) = 0.0975/(0.0975 + 0.45) = 0.178082.

2. The hypothesis and process are a mixed distribution. The mean given θ is E[λ | θ] = 1.5θ and the variance, computed with conditional variance, is

E[Var(X | λ) | θ] + Var(E[X | λ] | θ) = E[λ | θ] + Var(λ | θ) = 1.5θ + 0.25θ²

where we used the Bernoulli shortcut to compute the variance of λ. The variance of the hypothetical means is

a = Var(1.5θ) = 1.5² Var(θ) = 1.5²(0.4²)(0.25)(0.75) = 0.0675

and the expected process variance is

v = E[1.5θ + 0.25θ²] = 1.5(0.3) + 0.25(0.75(0.2²) + 0.25(0.6²)) = 0.48

The Bühlmann credibility factor is a/(a + v) = 0.0675/(0.0675 + 0.48) = 0.123288.

It is not a coincidence that a + v is the same in both cases, since a + v represents the overall variance of claim counts, as we discussed at the end of the previous lesson.
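A quick numeric check of both parts of Example 52E (our sketch, not part of the original answer) makes the contrast between the two levels explicit:

    # Example 52E: individual-level versus group-level credibility.
    thetas, tprobs = [0.2, 0.6], [0.75, 0.25]
    E_t  = sum(p * t for t, p in zip(thetas, tprobs))        # 0.3
    E_t2 = sum(p * t * t for t, p in zip(thetas, tprobs))    # 0.12

    # Part 1: hypothetical mean and process variance are both lambda.
    mu = 1.5 * E_t                                           # 0.45
    v1 = mu                                                  # Poisson: variance = mean
    a1 = 2.5 * E_t2 - mu ** 2                                # 0.0975
    print(a1 / (a1 + v1))                                    # 0.178082...

    # Part 2: condition on theta; Var(lambda | theta) = 0.25 theta^2 (Bernoulli shortcut).
    var_t = E_t2 - E_t ** 2                                  # 0.03
    a2 = 1.5 ** 2 * var_t                                    # 0.0675
    v2 = 1.5 * E_t + 0.25 * E_t2                             # 0.48
    print(a2 / (a2 + v2))                                    # 0.123288...

Note that a1 + v1 and a2 + v2 both print 0.5475, illustrating the closing remark above.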

Coverage of this material in the three syllabus options

This material is required. It is covered in all three syllabus reading options.

Exercises

52.1. [4B-S96:24] (2 points) A die is randomly selected from a pair of fair, six-sided dice, A and B. Die A has its faces marked with 1, 2, 3, 4, 5, and 6. Die B has its faces marked with 6, 7, 8, 9, 10, and 11. The selected die is rolled four times. The results of the first three rolls are 1, 2, and 3.
Determine the Bühlmann credibility estimate of the expected value of the result of the fourth roll.
(A) Less than 1.75
(B) At least 1.75, but less than 2.25
(C) At least 2.25, but less than 2.75
(D) At least 2.75, but less than 3.25
(E) At least 3.25


Use the following information for questions 52.2 and 52.3:
Three urns contain balls marked with either 0 or 1 in the proportions described below.

           Marked 0   Marked 1
Urn A        10%         90%
Urn B        60          40
Urn C        80          20

An urn is selected at random and three balls are selected, with replacement, from the urn. The total of the values is 1. Three more balls are selected from the same urn.

52.2. [4B-S90:39] (2 points) Calculate the expected total of the three balls using Bayes’ Theorem.
(A) Less than 1.05
(B) At least 1.05, but less than 1.10
(C) At least 1.10, but less than 1.15
(D) At least 1.15, but less than 1.20
(E) At least 1.20

52.3. [4B-S90:40] (2 points) Calculate the expected total of the three balls using Bühlmann’s credibility formula.
(A) Less than 1.05
(B) At least 1.05, but less than 1.10
(C) At least 1.10, but less than 1.15
(D) At least 1.15, but less than 1.20
(E) At least 1.20

52.4. [4B-S91:45] (3 points) A population of insureds consists of two classifications each with 50% of the total insureds. The Bühlmann credibility for the experience of a single insured within a classification is calculated below.

                 Mean        Variance of           Expected Value of    Bühlmann
Classification   Frequency   Hypothetical Means    Process Variance     Credibility
A                0.09        0.01                  0.09                 0.10
B                0.27        0.03                  0.27                 0.10

Calculate the Bühlmann credibility for the experience of a single insured selected at random from the population if its classification is unknown.
(A) Less than 0.08
(B) At least 0.08, but less than 0.10
(C) At least 0.10, but less than 0.12
(D) At least 0.12, but less than 0.14
(E) At least 0.14


Use the following information for questions 52.5 and 52.6:
One spinner is selected at random from a group of three different spinners. Each of the spinners is divided into six equally likely sectors marked as described below.

                  Number of Sectors
Spinner   Marked 0   Marked 12   Marked 48
A            2           2           2
B            3           2           1
C            4           1           1

52.5. [4B-S91:37] (2 points) Assume a spinner is selected and a zero is obtained on the first spin. What is the Bühlmann credibility estimate of the expected value of the second spin using the same spinner?
(A) Less than 12.5
(B) At least 12.5, but less than 13.0
(C) At least 13.0, but less than 13.5
(D) At least 13.5, but less than 14.0
(E) At least 14.0

52.6. [4B-S91:38] (3 points) Assume a spinner is selected and a 12 was obtained on the first spin. Use Bayes’ theorem to calculate the expected value of the second spin using the same spinner.
(A) Less than 12.5
(B) At least 12.5, but less than 13.0
(C) At least 13.0, but less than 13.5
(D) At least 13.5, but less than 14.0
(E) At least 14.0

52.7. [4B-F96:4] (2 points) You are given the following:
• A portfolio of independent risks is divided into three classes.
• Each class contains the same number of risks.
• For each risk in Classes 1 and 2, the probability of exactly one claim during one exposure period is 1/3, while the probability of no claim is 2/3.
• For each risk in Class 3, the probability of exactly one claim during one exposure period is 2/3, while the probability of no claim is 1/3.
A risk is selected at random from the portfolio. During the first two exposure periods, two claims are observed for this risk (one in each exposure period).
Determine the Bühlmann credibility estimate of the probability that a claim will be observed for this same risk during the third exposure period.
(A) 4/9   (B) 1/2   (C) 6/11   (D) 5/9   (E) 3/5


52.8. [4B-S96:16] (3 points) You are given the following:
• Two urns contain balls.
• In Urn A, half of the balls are marked 0 and half of the balls are marked 2.
• In Urn B, half of the balls are marked 0 and half of the balls are marked t.
An urn is randomly selected. A ball is then randomly selected from this urn, observed, and replaced. You wish to estimate the expected value of the number on the second ball randomly selected from this same urn.
For which of the following values of t would the Bühlmann credibility of the first observation be greater than 1/10?
(A) 1   (B) 2   (C) 3   (D) 4   (E) 5

Use the following information for questions 52.9 and 52.10:
Four urns contain balls marked with either 0 or 1 in the proportions described below.

          Marked 0   Marked 1
Urn A       70%         30%
Urn B       70          30
Urn C       30          70
Urn D       20          80

An urn is selected at random and four balls are selected from the urn with replacement. The total of the values is 2. Four more balls are selected from the same urn.

52.9. [4B-S91:50] (2 points) Calculate the expected total of the four balls using Bayes’ theorem.
(A) Less than 1.96
(B) At least 1.96, but less than 1.99
(C) At least 1.99, but less than 2.02
(D) At least 2.02, but less than 2.05
(E) At least 2.05

52.10. [4B-S91:51] (3 points) Calculate the expected total of the four balls using Bühlmann’s credibility formula.
(A) Less than 1.96
(B) At least 1.96, but less than 1.99
(C) At least 1.99, but less than 2.02
(D) At least 2.02, but less than 2.05
(E) At least 2.05


52.11. [4B-S97:23] (3 points) You are given the following:
• Two urns contain balls.
• In Urn A, half of the balls are marked 0 and half of the balls are marked 2.
• In Urn B, half of the balls are marked 0 and half of the balls are marked t.
An urn is randomly selected. A ball is then randomly selected from this urn, observed, and replaced. An estimate is to be made of the expected value of the number on the second ball randomly selected from this same urn.
Determine the limit of the Bühlmann credibility of the first observation as t goes to infinity.
(A) 0   (B) 1/3   (C) 1/2   (D) 2/3   (E) 1

Use the following information for questions 52.12 and 52.13:

           Number of Claims       Size of Loss
Class      Mean      Variance     Mean      Variance
A          1/6       5/36         4         20
B          5/6       5/36         2         5

Classes A and B have the same number of risks. A risk is randomly selected from one of the two classes and four observations are made of the risk.

52.12. [4B-S92:18] (3 points) Determine the value for the Bühlmann credibility, Z, that can be applied to the observed pure premium.
(A) Less than 0.05
(B) At least 0.05, but less than 0.10
(C) At least 0.10, but less than 0.15
(D) At least 0.15, but less than 0.20
(E) At least 0.20

52.13. [4B-S92:19] (1 point) The pure premium calculated from the four observations is 0.25. Determine the Bühlmann credibility estimate for the risk’s pure premium.
(A) Less than 0.25
(B) At least 0.25, but less than 0.50
(C) At least 0.50, but less than 0.75
(D) At least 0.75, but less than 1.00
(E) At least 1.00

52.14. A portfolio of insurance risks has two classes, A and B. The number of claims for risks in class A follows a geometric distribution with parameter β  0.1. The number of claims for risks in class B follows a geometric distribution with parameter β  0.5. There are three times as many risks in class A as in class B. A given risk is selected at random. No claims are observed on this risk for m years. Determine m such that the Bühlmann credibility expectation of the number of claims per year for this risk is 0.15.


Use the following information for questions 52.15 and 52.16:
• An insurance portfolio consists of two classes, A and B.
• The number of claims distribution for each class is:

           Probability of Number of Claims
Class      0      1      2      3
A          0.7    0.1    0.1    0.1
B          0.5    0.2    0.1    0.2

• Class A has three times as many insureds as Class B.
• A randomly selected risk from the portfolio generates 1 claim over the most recent policy period.

52.15. [4B-S93:26] (2 points) Determine the Bayesian analysis estimate of the claims frequency rate for the observed risk.
(A) Less than 0.72
(B) At least 0.72, but less than 0.78
(C) At least 0.78, but less than 0.84
(D) At least 0.84, but less than 0.90
(E) At least 0.90

52.16. [4B-S93:27] (2 points) Determine the Bühlmann credibility estimate of the claims frequency rate for the observed risk.
(A) Less than 0.72
(B) At least 0.72, but less than 0.78
(C) At least 0.78, but less than 0.84
(D) At least 0.84, but less than 0.90
(E) At least 0.90


Use the following information for questions 52.17 and 52.18:
Two spinners, A1 and A2, are used to determine number of claims. Each spinner is divided into regions marked 0 and 1, where 0 represents no claims and 1 represents a claim. The probability of a claim for each spinner is:

Spinner   Probability of Claim
A1        0.15
A2        0.05

A second set of spinners, B1 and B2, represents claim severity. Each spinner has two areas marked 20 and 40. The probabilities for each claim size are:

              Claim Size
Spinner     20      40
B1          0.80    0.20
B2          0.30    0.70

A spinner is selected randomly from A1 and A2 and a second from B1 and B2. Three observations from the selected spinners yield the following claim amounts:
0, 20, 0

52.17. [4B-F92:6] (3 points) Use Bühlmann credibility to separately estimate the expected number of claims and expected severity. Use these estimates to calculate the expected value of the next observation from the same pair of spinners.
(A) Less than 2.9
(B) At least 2.9, but less than 3.0
(C) At least 3.0, but less than 3.1
(D) At least 3.1, but less than 3.2
(E) At least 3.2

52.18. [4B-F92:7] (3 points) Determine the Bayesian estimate of the expected value of the next observation from the same pair of spinners.
(A) Less than 2.9
(B) At least 2.9, but less than 3.0
(C) At least 3.0, but less than 3.1
(D) At least 3.1, but less than 3.2
(E) At least 3.2


Use the following information for questions 52.19 and 52.20:
You are given the following:
• Two risks have the following severity distribution:

                      Probability of Claim Amount
Amount of Claim       Risk 1      Risk 2
    100               0.50        0.70
  1,000               0.30        0.20
 20,000               0.20        0.10

• Risk 1 is twice as likely as Risk 2 to be observed.
• A claim of 100 is observed, but the observed risk is unknown.

52.19. [4B-F93:17] (2 points) Determine the Bayesian analysis estimate of the expected value of a second claim amount from the same risk.
(A) Less than 3,500
(B) At least 3,500, but less than 3,650
(C) At least 3,650, but less than 3,800
(D) At least 3,800, but less than 3,950
(E) At least 3,950

52.20. [4B-F93:18] (3 points) Determine the Bühlmann credibility estimate of the expected value of a second claim amount from the same risk.
(A) Less than 3,500
(B) At least 3,500, but less than 3,650
(C) At least 3,650, but less than 3,800
(D) At least 3,800, but less than 3,950
(E) At least 3,950

52.21. [4-F00:19] For a portfolio of independent risks, you are given:
(i) The risks are divided into two classes, Class A and Class B.
(ii) Equal numbers of risks are in Class A and Class B.
(iii) For each risk, the probability of having exactly 1 claim during the year is 20% and the probability of having 0 claims is 80%.
(iv) All claims for Class A are of size 2.
(v) All claims for Class B are of size c, an unknown but fixed quantity.
One risk is chosen at random, and the total loss for one year for that risk is observed. You wish to estimate the expected loss for that same risk in the following year.
Determine the limit of the Bühlmann credibility factor as c goes to infinity.
(A) 0   (B) 1/9   (C) 4/5   (D) 8/9   (E) 1


Use the following information for questions 52.22 and 52.23:
The aggregate loss distributions for two risks for one exposure period are as follows:

             Aggregate Losses
Risk      $0      $50     $1000
A         0.80    0.16    0.04
B         0.60    0.24    0.16

A unit is selected at random and observed to have 0 losses in the first two exposure periods.

52.22. [4B-S94:8] (2 points) Determine the Bayesian analysis estimate of the expected value of the aggregate losses for the same risk’s third exposure period.
(A) Less than 90
(B) At least 90, but less than 95
(C) At least 95, but less than 100
(D) At least 100, but less than 105
(E) At least 105

52.23. [4B-S94:9] (3 points) Determine the Bühlmann credibility estimate of the expected value of the aggregate losses for the same risk’s third exposure period.
(A) Less than 90
(B) At least 90, but less than 95
(C) At least 95, but less than 100
(D) At least 100, but less than 105
(E) At least 105

52.24. An insurance portfolio has two classes of risk, A and B. You are given:
(i) For class A, the number of claims per year for each insured has a binomial distribution with parameters m = 2, q = 0.25. The size of each claim is 1000.
(ii) For class B, the number of claims per year for each insured has a binomial distribution with parameters m = 4, q = 0.25. The size of each claim is 1000.
(iii) Each class is equally large.
(iv) Claim counts and sizes are independent.
An insured is selected at random. After 5 years of experience, the Bühlmann credibility estimate for aggregate losses for that insured is 700. Determine the average aggregate losses per year for that risk.
(A) Less than 575
(B) At least 575, but less than 625
(C) At least 625, but less than 675
(D) At least 675, but less than 725
(E) At least 725


Use the following information for questions 52.25 and 52.26:
A portfolio of 200 independent insureds is subdivided into two classes as follows:

         Number of   Expected Number of   Variance of Number of   Expected Severity   Variance of Severity
Class    Insureds    Claims Per Insured   Claims Per Insured      Per Claim           Per Claim
1        50          0.25                 0.75                    4                   20
2        150         0.50                 0.75                    8                   36

• Claim count and severity for each insured are independent.
• A risk is selected at random from the portfolio, and its pure premium, P1, for one exposure period is observed.

52.25. [4B-S94:17] (3 points) Use the Bühlmann credibility method to estimate the expected value of the pure premium for the second exposure period for the same selected risk.
(A) 3.25
(B) 0.03P1 + 3.15
(C) 0.05P1 + 3.09
(D) 0.08P1 + 3.00
(E) The correct answer is not given by (A), (B), (C), or (D).

52.26. [4B-S94:18] (1 point) After three exposure periods, the observed pure premium for the selected risk is P. The selected risk is returned to the portfolio. Then, a second risk is selected at random from the portfolio. Use the Bühlmann credibility method to estimate the expected pure premium for the next exposure period for the newly selected risk.
(A) 3.25
(B) 0.09P + 2.97
(C) 0.14P + 2.80
(D) 0.20P + 2.59
(E) 0.99P + 0.03

52.27. [Sample C4:24] Type A risks have each year’s losses uniformly distributed on the interval [0, 1]. Type B risks have each year’s losses uniformly distributed on the interval [0, 2]. A risk is selected at random, with each type being equally likely. The first year’s losses equal L. Let X be the Bühlmann credibility estimate of the second year’s losses. Let Y be the Bayesian estimate of the second year’s losses. Which of the following statements is true?
(A) If L < 1 then X > Y.
(B) If L > 1 then X < Y.
(C) If L = 1/2 then X < Y.
(D) There are no values of L such that X = Y.
(E) There are exactly two values of L such that X = Y.


Use the following information for questions 52.28 through 52.30:
Two urns contain balls with each ball marked 0 or 1 in the proportions described below:

          Percentage of Balls in Urn
          Marked 0    Marked 1
Urn A       20%         80%
Urn B       70          30

An urn is randomly selected and two balls are drawn from the urn. The sum of the values on the selected balls is 1. Two more balls are selected from the same urn.
NOTE: Assume that each selected ball has been returned to the urn before the next ball is drawn.

52.28. [4B-F94:5] (3 points) Determine the Bayesian analysis estimate of the expected value of the sum of the values on the second pair of selected balls.
(A) Less than 1.035
(B) At least 1.035, but less than 1.055
(C) At least 1.055, but less than 1.075
(D) At least 1.075, but less than 1.095
(E) At least 1.095

52.29. [4B-F94:6] (3 points) Determine the Bühlmann credibility estimate of the expected value of the sum of the values on the second pair of selected balls.
(A) Less than 1.035
(B) At least 1.035, but less than 1.055
(C) At least 1.055, but less than 1.075
(D) At least 1.075, but less than 1.095
(E) At least 1.095

52.30. [4B-F94:7] (1 point) The sum of the values of the second pair of selected balls was 2. One of the two urns is then randomly selected and two balls are drawn from the urn. Determine the Bühlmann credibility estimate of the expected value of the sum of the values on the third pair of selected balls.
(A) Less than 1.07
(B) At least 1.07, but less than 1.17
(C) At least 1.17, but less than 1.27
(D) At least 1.27, but less than 1.37
(E) At least 1.37


Use the following information for questions 52.31 and 52.32:
The aggregate loss distributions for three risks for one exposure period are as follows:

            Aggregate Losses
Risk      0       50      2000
A         0.80    0.16    0.04
B         0.60    0.24    0.16
C         0.40    0.32    0.28

A risk is selected at random and is observed to have aggregate loss of 50 in the first exposure period.

52.31. [4B-S95:19] (2 points) Determine the Bayesian analysis estimate of the expected value of the aggregate losses for the same risk’s second exposure period.
(A) Less than 300
(B) At least 300, but less than 325
(C) At least 325, but less than 350
(D) At least 350, but less than 375
(E) At least 375

52.32. [4B-S95:20] (3 points) Determine the Bühlmann credibility estimate of the expected value of the aggregate losses for the same risk’s second exposure period.
(A) Less than 300
(B) At least 300, but less than 325
(C) At least 325, but less than 350
(D) At least 350, but less than 375
(E) At least 375

52.33. [4-F01:11] An insurer writes a large book of home warranty policies. You are given the following information regarding claims filed by insureds against these policies:
(i) A maximum of one claim may be filed per year.
(ii) The probability of a claim varies by insured, and the claims experience for each insured is independent of every other insured.
(iii) The probability of a claim for each insured remains constant over time.
(iv) The overall probability of a claim being filed by a randomly selected insured in a year is 0.10.
(v) The variance of the individual insured claim probabilities is 0.01.
An insured selected at random is found to have filed 0 claims over the past 10 years. Determine the Bühlmann credibility estimate for the expected number of claims the selected insured will file over the next 5 years.
(A) 0.04   (B) 0.08   (C) 0.17   (D) 0.22   (E) 0.25


Use the following information for questions 52.34 and 52.35:
You are given the following:
• A portfolio of independent risks is divided into two classes of equal size.
• All of the risks in Class 1 have identical claim count and claim size distributions as follows:

           CLASS 1
Number of Claims   Probability      Claim Size   Probability
1                  1/2              50           2/3
2                  1/2              100          1/3

• All of the risks in Class 2 have identical claim count and claim size distributions as follows:

           CLASS 2
Number of Claims   Probability      Claim Size   Probability
1                  2/3              50           1/2
2                  1/3              100          1/2

• The number of claims and claim size(s) for each risk are independent.
• A risk is selected at random from the portfolio, and a pure premium of 100 is observed for the first exposure period.

52.34. [4B-F95:14 and 1999 C4 Sample:19] (3 points) Determine the Bayesian analysis estimate of the expected number of claims for this same risk for the second exposure period.
(A) 4/3   (B) 25/18   (C) 41/29   (D) 17/12   (E) 3/2

52.35. [4B-F95:15 and 1999 C4 Sample:20] (2 points) A pure premium of 150 is observed for this risk for the second exposure period. Determine the Bühlmann credibility estimate of the expected pure premium for this same risk for the third exposure period.
(A) Less than 110
(B) At least 110, but less than 120
(C) At least 120, but less than 130
(D) At least 130, but less than 140
(E) At least 140


Use the following information for questions 52.36 and 52.37:
You are given the following:
• A portfolio of independent risks is divided into three classes.
• Each class contains the same number of risks.
• For all of the risks in Class 1, claim sizes follow a uniform distribution on the interval from 0 to 400.
• For all of the risks in Class 2, claim sizes follow a uniform distribution on the interval from 0 to 600.
• For all of the risks in Class 3, claim sizes follow a uniform distribution on the interval from 0 to 800.
A risk is selected at random from the portfolio. The first claim observed for this risk is 340.

52.36. [4B-S97:11] (2 points) Determine the Bayesian analysis estimate of the expected value of the second claim observed for this same risk.
(A) Less than 270
(B) At least 270, but less than 290
(C) At least 290, but less than 310
(D) At least 310, but less than 330
(E) At least 330

52.37. [4B-S97:12] (2 points) Determine the Bühlmann credibility estimate of the expected value of the second claim observed for this same risk.
(A) Less than 270
(B) At least 270, but less than 290
(C) At least 290, but less than 310
(D) At least 310, but less than 330
(E) At least 330

52.38. [4B-F99:14] (3 points) You are given the following:
• A portfolio of independent risks is divided into two classes.
• Each class contains the same number of risks.
• The claim count distribution for each risk in Class A is a mixture of a Poisson distribution with mean 1/6 and a Poisson distribution with mean 1/3, with each distribution in the mixture having a weight of 0.5.
• The claim count distribution for each risk in Class B is a mixture of a Poisson distribution with mean 2/3 and a Poisson distribution with mean 5/6, with each distribution in the mixture having a weight of 0.5.
A risk is selected at random from the portfolio. Determine the Bühlmann credibility of one observation for this risk.
(A) 9/83   (B) 9/82   (C) 1/9   (D) 10/83   (E) 5/41


Use the following information for questions 52.39 and 52.40:
Ten urns contain five balls each, numbered as follows:
Urn 1: 1, 2, 3, 4, 5
Urn 2: 1, 2, 3, 4, 5
Urn 3: 1, 2, 3, 4, 5
Urn 4: 1, 2, 3, 4, 5
Urn 5: 1, 2, 3, 4, 5
Urn 6: 1, 1, 1, 1, 1
Urn 7: 2, 2, 2, 2, 2
Urn 8: 3, 3, 3, 3, 3
Urn 9: 4, 4, 4, 4, 4
Urn 10: 5, 5, 5, 5, 5
An urn is randomly selected. A ball is then randomly selected from this urn. The selected ball has the number 2 on it. This ball is replaced, and another ball is randomly selected from the same urn. The second selected ball has the number 3 on it. This ball is then replaced, and another ball is randomly selected from the same urn.

52.39. [4B-F95:20] (2 points) Determine the Bayesian analysis estimate of the expected value of the number on this third selected ball.
(A) Less than 2.2
(B) At least 2.2, but less than 2.4
(C) At least 2.4, but less than 2.6
(D) At least 2.6, but less than 2.8
(E) At least 2.8

52.40. [4B-F95:21] (3 points) Determine the Bühlmann credibility estimate of the expected value of the number on this third selected ball.
(A) Less than 2.2
(B) At least 2.2, but less than 2.4
(C) At least 2.4, but less than 2.6
(D) At least 2.6, but less than 2.8
(E) At least 2.8

52.41. [4-F01:23] You are given the following information on claim frequency of automobile accidents for individual drivers:

            Business Use                Pleasure Use
            Expected    Claim           Expected    Claim
            Claims      Variance        Claims      Variance
Rural       1.0         0.5             1.5         0.8
Urban       2.0         1.0             2.5         1.0
Total       1.8         1.06            2.3         1.12

You are also given:
(i) Each driver’s claims experience is independent of every other driver’s.
(ii) There are an equal number of business and pleasure use drivers.
Determine the Bühlmann credibility factor for a single driver.
(A) 0.05   (B) 0.09   (C) 0.17   (D) 0.19   (E) 0.27


52.42. [4B-F97:17] (2 points) You are given the following:
• The number of claims follows a Poisson distribution with mean λ.
• Claim sizes follow the following distribution:

Claim Size   Probability
2λ           1/3
8λ           2/3

• The prior distribution for λ is:

λ    Probability
1    1/3
2    1/3
3    1/3

• Given λ, the number of claims and claim sizes are independent.
Determine the expected value of the process variance of the aggregate losses.
(A) Less than 150
(B) At least 150, but less than 300
(C) At least 300, but less than 450
(D) At least 450, but less than 600
(E) At least 600

52.43. [4-F02:29] You are given the following joint distribution:

            Θ
X        0      1
0        0.4    0.1
1        0.1    0.2
2        0.1    0.1

For a given value of Θ and a sample of size 10 for X:
x1 + x2 + · · · + x10 = 10
Determine the Bühlmann credibility premium.
(A) 0.75   (B) 0.79   (C) 0.82   (D) 0.86   (E) 0.89


Use the following information for questions 52.44 and 52.45:
You are given the following:
• A portfolio of independent risks is divided into two classes.
• Each class contains the same number of risks.
• For each risk in Class 1, the number of claims for a single exposure period follows a Poisson distribution with mean 1.
• For each risk in Class 2, the number of claims for a single exposure period follows a Poisson distribution with mean 2.
A risk is selected at random from the portfolio. During the first exposure period, 2 claims are observed for this risk. During the second exposure period, 0 claims are observed for this same risk.

52.44. [4B-S96:5] (2 points) Determine the posterior probability that the risk selected came from Class 1.
(A) Less than 0.53
(B) At least 0.53, but less than 0.58
(C) At least 0.58, but less than 0.63
(D) At least 0.63, but less than 0.68
(E) At least 0.68

52.45. [4B-S96:6] (2 points) Determine the Bühlmann credibility estimate of the expected number of claims for this same risk for the third exposure period.
(A) Less than 1.32
(B) At least 1.32, but less than 1.34
(C) At least 1.34, but less than 1.36
(D) At least 1.36, but less than 1.38
(E) At least 1.38

52.46. [4-F04:25] You are given:
(i) A portfolio of independent risks is divided into two classes.
(ii) Each class contains the same number of risks.
(iii) For each risk in Class 1, the number of claims per year follows a Poisson distribution with mean 5.
(iv) For each risk in Class 2, the number of claims per year follows a binomial distribution with m = 8 and q = 0.55.
(v) A randomly selected risk has three claims in Year 1, r claims in Year 2 and four claims in Year 3.
The Bühlmann credibility estimate for the number of claims in Year 4 for this risk is 4.6019. Determine r.
(A) 1   (B) 2   (C) 3   (D) 4   (E) 5


Use the following information for questions 52.47 and 52.48:
You are given the following:
• An urn contains six dice.
• Three of the dice have two sides marked 1, two sides marked 2, and two sides marked 3.
• Two of the dice have two sides marked 1, two sides marked 3, and two sides marked 5.
• One die has all six sides marked 6.
One die is randomly selected from the urn and rolled. A 6 is observed.

52.47. [4B-S98:14] (2 points) Determine the Bühlmann credibility estimate of the expected value of the second roll of this same die.
(A) Less than 4.5
(B) At least 4.5, but less than 5.0
(C) At least 5.0, but less than 5.5
(D) At least 5.5, but less than 6.0
(E) At least 6.0

52.48. [4B-S98:15] (3 points) The selected die is placed back into the urn. A seventh die is then added to the urn. The seventh die is one of the following three types:
1. Two sides marked 1, two sides marked 3, and two sides marked 5
2. All six sides marked 3
3. All six sides marked 6
One die is again randomly selected from the urn and rolled. An estimate is to be made of the expected value of the second roll of this same die. Determine which of the three types for the seventh die would increase the Bühlmann credibility of the first roll of the selected die (compared to the Bühlmann credibility used in the previous question).
(A) 1   (B) 2   (C) 3   (D) 1,3   (E) 2,3

52.49. [C-S05:20] For a particular policy, the conditional probability of the annual number of claims given Θ = θ, and the probability distribution of Θ, are as follows:

Number of claims    0      1      2
Probability         2θ     θ      1 − 3θ

θ              0.05    0.30
Probability    0.80    0.20

Two claims are observed in Year 1. Calculate the Bühlmann credibility estimate of the number of claims in Year 2.
(A) Less than 1.68
(B) At least 1.68, but less than 1.70
(C) At least 1.70, but less than 1.72
(D) At least 1.72, but less than 1.74
(E) At least 1.74


52.50. [4-S01:11]
(i) The claim count and claim size distributions for risks of type A are:

Number of Claims   Probabilities      Claim Size   Probabilities
0                  4/9                500          1/3
1                  4/9                1235         2/3
2                  1/9

(ii) The claim count and claim size distributions for risks of type B are:

Number of Claims   Probabilities      Claim Size   Probabilities
0                  1/9                250          2/3
1                  4/9                328          1/3
2                  4/9

(iii) Risks are equally likely to be type A or type B.
(iv) Claim counts and claim sizes are independent within each risk type.
(v) The variance of the total losses is 296,962.
A randomly selected risk is observed to have total annual losses of 500. Determine the Bühlmann credibility premium for the next year for this same risk.
(A) 493   (B) 500   (C) 510   (D) 513   (E) 514

52.51. [4-F03:23] You are given:
(i) Two risks have the following severity distributions:

                     Probability of Claim    Probability of Claim
Amount of Claim      Amount for Risk 1       Amount for Risk 2
    250              0.5                     0.7
  2,500              0.3                     0.2
 60,000              0.2                     0.1

(ii) Risk 1 is twice as likely to be observed as Risk 2.
A claim of 250 is observed. Determine the Bühlmann credibility estimate of the second claim amount from the same risk.
(A) Less than 10,200
(B) At least 10,200, but less than 10,400
(C) At least 10,400, but less than 10,600
(D) At least 10,600, but less than 10,800
(E) At least 10,800

Additional released exam questions: C-F05:19, C-F06:6,23


Solutions

52.1. A fair six-sided die X has mean and variance
E[X] = (1 + 2 + · · · + 6)/6 = 7/2 = 3.5
E[X²] = (1² + 2² + · · · + 6²)/6 = 91/6
Var(X) = 91/6 − (7/2)² = 35/12
Therefore, the hypothetical mean for die A is 3.5, and the hypothetical mean for die B, whose faces are all 5 more than the faces of a standard die, is 3.5 + 5 = 8.5. The process variance is 35/12 for both dice since adding a constant doesn’t affect variance; hence v = 35/12. The variance of the hypothetical means is
a = (1/4)(5²) = 25/4
We are ready to compute credibility.
Z = 3a/(3a + v) = (3)(25/4)/((3)(25/4) + 35/12) = 225/(225 + 35) = 45/52
PC = 6 − (45/52)(4) = 2.538 (C)
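If you prefer to check this with exact arithmetic, here is a minimal Python sketch (ours, not the manual’s) using fractions:

    from fractions import Fraction as F
    mu_A, mu_B = F(7, 2), F(7, 2) + 5   # hypothetical means of dice A and B
    v = F(35, 12)                       # common process variance
    mu = (mu_A + mu_B) / 2              # overall mean: 6
    a = (mu_B - mu_A) ** 2 / 4          # Bernoulli shortcut: 25/4
    Z = 3 * a / (3 * a + v)             # 45/52 for three rolls
    print(float(mu + Z * (2 - mu)))     # 2.5384...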

52.2.
Pr(X1 + X2 + X3 = 1 | A) = 3(0.1²)(0.9) = 0.027
Pr(X1 + X2 + X3 = 1 | B) = 3(0.6²)(0.4) = 0.432
Pr(X1 + X2 + X3 = 1 | C) = 3(0.8²)(0.2) = 0.384
E[X4 + X5 + X6 | X] = 3 · [0.027(0.9) + 0.432(0.4) + 0.384(0.2)]/(0.027 + 0.432 + 0.384) = 0.9747 (A)
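A short numeric check of this Bayesian weighting (our sketch; variable names are ours):

    from math import comb
    p1 = [0.9, 0.4, 0.2]                                  # P(ball marked 1), urns A, B, C
    like = [comb(3, 1) * p * (1 - p) ** 2 for p in p1]    # P(total of three balls = 1)
    post = [l / sum(like) for l in like]                  # posterior urn probabilities
    print(3 * sum(w * p for w, p in zip(post, p1)))       # expected next total: 0.9747...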

52.3. We will treat the 3 balls as a single exposure unit. The formula for PC is the alternative formula (44.2) on page 861.
µ = 0.9 + 0.4 + 0.2 = 1.5
v = 0.9(0.1) + 0.6(0.4) + 0.8(0.2) = 0.49
E[µ(Θ)²] = (1/3)(2.7² + 1.2² + 0.6²) = 3.03
a = 3.03 − 1.5² = 0.78
k = 0.49/0.78
Z = 1/(1 + 0.49/0.78) = 78/(49 + 78)
PC = 1.5 − (78/127)(0.5) = 1.1929 (D)

52.4. The expected value of the process variance is
v = (1/2)(0.09 + 0.27) = 0.18
A common mistake made in calculating the variance of the hypothetical means is to assume that 0.09 and 0.27 are the hypothetical means to calculate the variance of. As discussed in Example 52C, this is incorrect. It would be correct if the problem were worded as follows: A classification is selected at random and the experience of a single insured is observed. An insured is then selected at random from the same classification. Calculate the Bühlmann credibility factor to be applied to calculate the expected frequency for this insured.
However, that is not the situation. The situation is that credibility is being calculated for each insured, not for each classification. Therefore, the hypothetical mean to be used is the hypothetical mean for each insured, not for the classification. The hypothetical means vary within each classification; the variance of the hypothetical means within each classification is not zero. Therefore, we calculate the variance of the hypothetical means by conditioning on the classification (conditional variance formula):
a = Var(µ(Θ)) = Var(0.09, 0.27) + E[0.01, 0.03] = (0.18²)(0.5²) + 0.02 = 0.0281
k = 0.18/0.0281
Z = 0.0281/(0.18 + 0.0281) = 0.1350 (D)
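The conditional variance step is easy to mistype, so here is a two-line machine check (our sketch):

    v = 0.5 * (0.09 + 0.27)               # expected process variance: 0.18
    between = 0.25 * (0.27 - 0.09) ** 2   # variance of the class means (Bernoulli shortcut)
    a = between + 0.5 * (0.01 + 0.03)     # insured-level VHM: 0.0281
    print(a / (a + v))                    # 0.13503...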

52.5. We first calculate the hypothetical means.
µA = 20, µB = 12, µC = 10
Then we calculate the expected hypothetical mean.
µ = (1/3)(20 + 12 + 10) = 14
The process variances are calculated directly. An alternative would be to calculate second moments and subtract the squares of the means.
vA = (1/3)[(0 − 20)² + (12 − 20)² + (48 − 20)²] = 416
vB = (1/6)[3(0 − 12)² + (48 − 12)²] = 288
vC = (1/6)[4(0 − 10)² + (12 − 10)² + (48 − 10)²] = 308
We calculate the expected value of the process variance by averaging these three process variances.
v = (1/3)(416 + 288 + 308) = 1012/3
We calculate the variance of the hypothetical means directly, using the three hypothetical means (20, 12, 10), which are equally weighted and have a mean of 14.
a = (1/3)[(20 − 14)² + (12 − 14)² + (10 − 14)²] = 56/3
Now we calculate the credibility factor and the credibility premium.
k = 1012/56
Z = 56/(1012 + 56) = 56/1068
PC = 14 + (56/1068)(−14) = 13.27 (C)

52.6. This is easy, since the relative probabilities of 12 are 2, 2, and 1, and we already calculated the hypothetical means. So
E[X2 | X1] = [2(20) + 2(12) + 1(10)]/5 = 14.8 (E)
Why was this simple computation worth 3 points but the last problem only 2 points?

The process variance in each class is q (1 − q )  Z

2 2

2 81

2 81

+

2 9



 29 , so v  29 .

4 2  4 + 18 11

4 5 4 10 54 6 PC  + Z  +   9 9 9 99 99 11

!

(C)

52.8. An easy way to solve this is to realize that the greater the difference between classes, the greater the credibility. Thus we want to make t as different from 2 as possible. In the answer choices, 5 is furthest away from 2. Calculate Z with t  5, and you see it’s greater than 1/10. The following calculation shows how to get the range of t for which Z > 1/10. We use the Bernoulli variable shortcut several times. µA  1

µB 

t 2

2 1 t 1 − 1  16 ( t − 2) 2 4 2 v A  41 (4)  1 v B  14 t 2



a

v Z The denominator of Z is

1 2





1 2 4t

+ 1  18 ( t 2 + 4)



a ( t − 2) 2  a + v ( t − 2) 2 + 2 ( t 2 + 4)

t 2 − 4t + 4 + 2t 2 + 8  3t 2 − 4t + 12 C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1028

Z equals

1 10

when 10 ( t − 2) 2  3t 2 − 4t + 12

10t 2 − 40t + 40  3t 2 − 4t + 12

7t 2 − 36t + 28  0 p √ 36 ± 362 − 4 (7)(28) 36 ± 512 t   0.96, 4.19 14 14 The quadratic is concave up. For Z > 1/10, either t < 0.96 or t > 4.19 is necessary. (E) 52.9. P (2 | A )  P (2 | B )  P (2 | C ) 

4 (0.72 )(0.32 )  0.2646 2

!

4 P (2 | D )  (0.82 )(0.22 )  0.1536 2 0.2646 (1.2 + 1.2 + 2.8) + 0.1536 (3.2) E[X5 + · · · X8 | X]   1.9711 3 (0.2646) + 0.1536

!

(B)

52.10. We will treat 4 balls as one exposure unit. µ  0.3 + 0.3 + 0.7 + 0.8  2.1 v  3 (0.7)(0.3) + (0.2)(0.8)  0.79 We calculate the variance of the hypothetical means by taking each hypothetical mean (1.2, 1.2, 2.8, 3.2) and subtracting the overall mean 2.1, then squaring. a  41 (0.92 + 0.92 + 0.72 + 1.12 )  0.83 0.79 k 0.83 1 83 Z  1 + 79/83 162 83 PC  2.1 + (D) (2 − 2.1)  2.0488 162 52.11. Refer to the solution to exercise 52.8, where we compute Z 

( t − 2) 2 . As t → ∞, ( t − 2) 2 + 2 ( t 2 + 4)

the square terms dominate, and they have coefficients 1 in the numerator and 3 in the denominator, so Z→

1 3

. (B)

52.12. Let S be aggregate losses. The hypothetical means are 1 2 E[S | A]  ( )(4)  6 3 5 5 E[S | B]  ( )(2)  6 3 We use the Bernoulli shortcut to calculate the variance of the above two hypothetical means. a C/4 Study Manual—17th edition Copyright ©2014 ASM

1 2

!

1 2

!

5 2 − 3 3

2



1 4

EXERCISE SOLUTIONS FOR LESSON 52

1029

To calculate the process variances, we will use the compound variance formula Var[S]  E[N] Var ( X ) + Var ( N ) E[X]2 1 5 200 Var ( S | A )  ( )(20) + ( )(42 )  6! 36 36 ! 5 170 5 (5) + (22 )  Var ( S | B )  6 36 36 The expected value of the process variance is v

1 200 170 185 +  2 36 36 36





The credibility factor for 4 observations is Z

36 4 4    0.1629 4 + k 4 + 185/36 221 1/4

(D)

52.13. We have already calculated the credibility factor Z  36/221 in the previous exercise. We just need the overall mean. 2 1 (4)  µA  6 3

!

5 5 (2)  6 3   1 2 5 7 µ +  2 3 3 6

!

µB 

The credibility premium is 185 PC  221

!

7 36 + (0.25)  1.0173 6 221

!

!

(E)

52.14. We’ll calculate the credibility premium as a function of m. For a geometric distribution, the mean is β and the variance is β (1 + β ) . Therefore, the hypothetical means and process variances are µA  0.1

v A  (0.1)(1.1)  0.11

µ B  0.5

v B  (0.5)(1.5)  0.75

The overall mean, expected process variance, and variance of hypothetical means are (variance of hypothetical means calculated using Bernoulli shortcut) µ  43 (0.1) + 14 (0.5)  0.2

v  43 (0.11) + 41 (0.75)  0.27

a  ( 43 )( 14 )(0.5 − 0.1) 2  0.03

The credibility factor for m years is Z  ma/ ( ma + v ) . Setting the credibility premium equal to 0.15, Z C/4 Study Manual—17th edition Copyright ©2014 ASM

0.03m 0.03m + 0.27

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1030

0.054  0.15 0.03m + 0.27 0.054  0.0045m + 0.0405 PC  0.2 (1 − Z )  m 3 52.15. µA  0.6 and µ B  1. The joint probabilities are Pr ( A&X1  1)  0.75 (0.1)  0.075 and Pr ( B&X1  1)  0.25 (0.2)  0.05. The Bayesian premium is: E[X2 | X1 ] 

0.075 (0.6) + 0.05 (1)  0.76 . 0.075 + 0.05

(B)

52.16. We first calculate the expected hypothetical mean. µ  0.75 (0.6) + 0.25 (1)  0.7 We calculate the process variance for each class by calculating the second conditional moment and subtracting the square of the hypothetical mean. E[µ (Θ) 2 | A]  0.1 (1) + 0.1 (4) + 0.1 (9)  1.4 v A  1.4 − 0.62  1.04

E[µ (Θ) 2 | B]  0.2 (1) + 0.1 (4) + 0.2 (9)  2.4 v B  2.4 − 12  1.40

We average the two process variances to obtain the expected process variance. v  0.75 (1.04) + 0.25 (1.40)  1.13 We use the Bernoulli shortcut on the two hypothetical means 1 and 0.6 to calculate the variance of the hypothetical means. a  (0.25)(0.75)(1 − 0.6) 2  0.03 We calculate credibility.

1.13  37 32 0.03 1 3 Z  2 116 1 + 37 3 3 (0.3)  0.7078 PC  0.7 + 116 k

(A)

52.17. The number of claims for each spinner is a Bernoulli distribution with mean 0.15 or 0.05. These are the hypothetical means. The process variance for a Bernoulli is q (1 − q ) , or (0.15)(0.85) for the first spinner and (0.05)(0.95) for the second spinner. The expected value of the process variance is v The expected hypothetical mean is

1 2



 (0.15)(0.85) + (0.05)(0.95)  0.0875 µ  0.1

and using the Bernoulli shortcut, the variance of the hypothetical means 0.15 and 0.05 is a  41 (0.15 − 0.05) 2  0.0025 C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 52

1031

We now calculate the credibility factor v  35 a 3 3  Z 3 + 35 38   3 1 1 PC  0.1 +  0.11842 − 38 3 10 k

For claim size, we use the Bernoulli shortcut repeatedly to obtain the process variance in each class, and average the two process variances to get the expected process variance: v1  (40 − 20) 2 (0.2)(0.8)  64

v2  (40 − 20) 2 (0.7)(0.3)  84 v  12 (64 + 84)  74

The hypothetical means are 24 and 34. The expected hypothetical mean is µ  12 (24 + 34)  29 and using the Bernoulli shortcut on the two hypothetical means, the variance of the hypothetical means is a  41 (34 − 24) 2  25

We now calculate the credibility factor

v 74   2.96 a 25 1 Z 3.96 1 PC  29 − (9)  26.727 3.96 k

The credibility premium for aggregate claims is (0.11842)(26.727)  3.1650 . (D) 52.18. Let S be aggregate losses. Pr (0, 20, 0 | A1 B1 )  (0.85) 2 (0.15)(0.8)  0.0867

Pr (0, 20, 0 | A1 B2 )  (0.85) 2 (0.15)(0.3)  0.0325125 Pr (0, 20, 0 | A2 B1 )  (0.952 )(0.05)(0.8)  0.0361

Pr (0, 20, 0 | A2 B2 )  (0.952 )(0.05)(0.3)  0.0135375

The hypothetical means of severity are µ ( B1 )  0.8 (20) + 0.2 (40)  24 µ ( B2 )  0.3 (20) + 0.7 (40)  34 The hypothetical means of aggregate loss are µ ( A1 , B1 )  0.15 (24)  3.6 µ ( A1 , B2 )  0.15 (34)  5.1 C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1032

µ ( A2 , B1 )  0.05 (24)  1.2 µ ( A2 , B2 )  0.05 (34)  1.7 The expected value of the next observation from the same pair of spinners is 0.0867 (3.6) + 0.0325125 (5.1) + 0.0361 (1.2) + 0.0135375 (1.7)  3.2233 0.0867 + 0.0325125 + 0.0361 + 0.0135375

E[S | 0, 20, 0] 

(E)

52.19. Let µ1 be the hypothetical mean of Risk 1 and µ2 the hypothetical mean of Risk 2. µ1  4350

µ2  2270

Pr (Risk 1&X1  100) E[X2 | X1 ] 

2 3 (0.5)

  13 Pr (Risk 2&X1  1 0.7 3 (4350) + 3 (2270)  3493.53 1 0.7 3 + 3

100)  13 (0.7) (A)

52.20. Let v1 be the process variance of Risk 1 and v2 the process variance of Risk 2. E[µ1 (Θ) 2 ]  80,305,000

v 1  61,382,500

2

E[µ2 (Θ) ]  40,207,000 v a

v2  35,054,100

1 2 3 (61,382,500) + 3 (35,054,100)  ( 31 )( 23 )(4350 − 2270) 2  961,422

52,606,367

a  0.017948 a+v µ  32 (4350) + 31 (2270)  3657

Z

PC  3657 − 3557Z  3593

(B)

52.21. The hypothetical means are 0.2 (2)  0.4 for A and 0.2c for B. Using the Bernoulli shortcut, a  1 1 2 2 2 4 (0.4 − 0.2c )  (0.2 − 0.1c ) . The process variance is 0.16 (4)  0.64 for A and 0.16c for B. v  2 (0.64 + 0.16c 2 )  0.32 + 0.08c 2 . Then Z

a (0.2 − 0.1c ) 2  a + v (0.2 − 0.1c ) 2 + 0.32 + 0.08c 2

As c → ∞, the highest exponents of c, c 2 , dominate. In the numerator, the coefficient of c 2 is 0.01, while in the denominator it is 0.01 + 0.08  0.09. The answer is

0.01 0.09



1 9

. (B)

52.22. The Bayesian estimate is the weighted average of the hypothetical means, weighted with the joint probabilities of the observations of 0. Since the two classes are equally likely, we only need to weight the hypothetical means with the likelihoods of 0. The hypothetical means are µA  48

µ B  172

The likelihoods of 0 for the two classes are Pr ( A | X1  0, X2  0)  0.82  0.64

The Bayesian estimate is

E[X3 | X1  0, X2  0]  C/4 Study Manual—17th edition Copyright ©2014 ASM

Pr ( B | X1  0, X2  0)  0.62  0.36 0.64 (48) + 0.36 (172)  92.64 0.64 + 0.36

(B)

EXERCISE SOLUTIONS FOR LESSON 52

1033

52.23. For the Bühlmann estimate, first calculate the overall mean and the variance of the hypothetical means. We calculated the hypothetical means in the previous exercise. µ  21 (48 + 172)  110

a  41 (172 − 48) 2  3844

Calculate the process variances as the second moment of the conditional means of the classes minus the hypothetical means squared. E[µA (Θ) 2 ]  0.16 (2500) + 0.04 (1,000,000)  40,400 v A  40,400 − 482  38,096

E[µ B (Θ) 2 ]  0.24 (2500) + 0.16 (1,000,000)  160,600 v B  160,600 − 1722  131,016

v  12 (38,096 + 131,016)  84,556

Now calculate the credibility factor and premium. Z

na 2 (3844)   0.083344 na + v 2 (3844) + 84,556

PC  (1 − Z ) µ  (1 − 0.083344)(110)  100.83

(D)

52.24. The hypothetical means of the two classes are µA  1000mq  (1000)(2)(0.25)  500 µ B  1000mq  (1000)(4)(0.25)  1000 Therefore, the overall mean and the variance of hypothetical means are (the latter by the Bernoulli shortcut): µ  0.5 (1000) + 0.5 (500)  750 a  14 (5002 )  62,500 The process variances of the two classes are v A  106 Var ( N )  106 mq (1 − q )  106 (2)(0.25)(0.75)  375,000

v2  106 Var ( N )  106 mq (1 − q )  106 (4)(0.25)(0.75)  750,000

The expected value of the process variance is

v  0.5 (375,000 + 750,000)  562,500 The credibility factor is

na 62,500 (5) 5   na + v 62,500 (5) + 562,500 14 ¯ Therefore, backing out the average aggregate losses x: Z

9 5 + x¯  700 14 14 700 − 750 (9/14) x¯   610 5/14 750

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

!

(B)

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1034

52.25. µ1  1 µ2  4 1 3 µ  (1) + (4)  3.25 4 ! !4 1 3 27 a (4 − 1) 2  4 4 16 v1  0.25 (20) + 0.75 (42 )  17 v2  0.50 (36) + 0.75 (82 )  66 1 3 215 v  (17) + (66)  4 4 4 27 a 27 Z  215 16 27   0.0304 v+a 887 4 + 16 PC  0.0304P1 + (1 − 0.0304) 3.25  0.0304P1 + 3.15

(B)

52.26. If the risk is selected at random, there is no credibility and we must use the overall mean of 3.25 . (A) 52.27. The hypothetical means are 1 2

1 2

1 is therefore a  2 (1 − 21 ) 2  16 . 1 The process variances are 12 and a 3 credibility is a+v  13 . So

and 1, with probabilities 21 . The variance of the hypothetical means 4 12 ,

so the expected value is v  0.5



1 12

+

4 12





5 24 .

Then Bühlmann

3 3 3 + L− 4 13 4 The Bayesian estimate if L > 1 is 1, since then it is definitely a Type B risk. If L < 1, the densities are 1  23 probability, and the Bayesian expectation is 1 for A and 12 for B, so Bayesian analysis gives A a 1+1/2 therefore ! 2 1 1 2 Y + (1)  3 2 3 3



X



Looking through the five answer choices, we see that only (E) is correct; we can solve for L such that X is exactly 1 and L > 1, and we can solve for L such that X is exactly 23 and L < 1, and those will be the only two values such that X  Y. 52.28. E[X3 | X1 + X2  1] 

(0.2)(0.8)(0.8) + (0.7)(0.3)(0.3) 0.191  (0.2)(0.8) + (0.7)(0.3) 0.37 !

E[X3 + X4 | X1 + X2  1]  2

0.191  1.0324 0.37

52.29. We will treat each ball as a separate exposure. µ  12 (0.8) + 12 (0.3)  0.55 a  14 (0.5) 2 

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 16

(A)

EXERCISE SOLUTIONS FOR LESSON 52

1035

v A  (0.8)(0.2)  0.16 v B  (0.7)(0.3)  0.21 v  21 (0.16) + 12 (0.21)  0.185

k  16 (0.185)  2.96 2 2 Z  2 + 2.96 4.96 PC  2 .0.55 − 0.05

*

2 + /  1.0597 4.96

!

(C)

-

,

52.30. If an urn is randomly selected, there is no credibility, so the expected value is 2 (0.55)  1.1 . (B) 52.31.

µA  88, µ B  332, µ C  576. E[X2 | X1 ] 

0.16 (88) + 0.24 (332) + 0.32 (576)  386.22 . 0.16 + 0.24 + 0.32

(E)

52.32. µ  13 (88 + 332 + 576)  332

E[µA (Θ) 2 ]  0.16 (502 ) + 0.04 (20002 )  160,400 E[µ B (Θ) 2 ]  0.24 (502 ) + 0.16 (20002 )  640,600

E[µ C (Θ) 2 ]  0.32 (502 ) + 0.28 (20002 )  1,120,800

v A  160,400 − 882  152,656

v B  640,600 − 3322  530,376

v C  1,120,800 − 5762  789,024

v  13 (152,656 + 530,376 + 789,024)  490,685 31

E[µ (Θ) 2 ]  13 (882 + 3322 + 5762 )  149,914 23 a  149,914 32 − 3322  39,690 32

Z

39,690 32 a   0.074835 a+v 530,376

PC  332 − 282Z  310.897

(B)

52.33. The variable indicating that a claim was made is Bernoulli. If q is the probability of a claim in one year, we are given µ  E[q]  0.1 and a  Var ( q )  0.01. Then E[q 2 ] − E[q]2  0.01, so E[q 2 ]  0.02 and since the process variance is q (1 − q ) , v  E q (1 − q )  E[q] − E[q 2 ]  0.1 − 0.02  0.08





Then Z

10a 5  10a + v 9

PC  (1 − Z )(0.1) 

4 90

and over 5 years, expected number of claims is 5PC  2/9  0.2222 . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1036

52.34.

µ1  32 , µ2  43 . Pr (100 | class 1) 

1 2

!

Pr (100 | class 2) 

2 3

!

E[N2 | S1  100] 

1 1 + 3 2

!

1 1 + 2 3

!

!

!

7 3 18 2 7 18

5 4 12 3 5 12

+ +

2 3

!2

1 2

!2





7 18



5 12

21 + 20 41  14 + 15 29

(C)

52.35. Calculate the hypothetical means: 3 µ1  2

200  100 3

!

!

4 µ2  (75)  100 3

!

a0 When a  0, there is no credibility. (k is infinite). So the credibility premium is the overall mean, 100 . (A) It is not surprising that no credibility is possible if the hypothetical means are all equal. This is true in Bayesian analysis as well. 52.36. The relative probabilities of observing 340 from each class are the density functions, which are 1 1 1 400 , 600 , and 800 . We therefore weight the expected values of each class with these weights. E[X2 | X1  340]  

1 1 600 (300) + 800 (400) 1 1 1 400 + 600 + 800

1 400 (200)

1.5 13 2400



+

3600  276.92 13

(B)

52.37. To calculate the variance of the hypothetical means, we calculate the mean of the hypothetical means, then the second moment of the hypothetical means, and then subtract the square of the mean from the second moment to obtain the variance. µ  13 (200 + 300 + 400)  300

E[µ ( θ ) 2 ]  31 (2002 + 3002 + 4002 )  a

4 29 3 (10 )

− 3002  32 (104 )

4 29 3 (10 )

The process variance for each class is the square of the range over 12. We average the three process variances to get v. 1 1 1,160,000 (4002 + 6002 + 8002 )  v 3 12 36

!

Z

2 4 3 (10 ) 2 116 4 4 3 (10 ) + 36 (10 )



PC  300 + 40Z  306.86

C/4 Study Manual—17th edition Copyright ©2014 ASM

24 24 6   24 + 116 140 35 (C)

EXERCISE SOLUTIONS FOR LESSON 52

1037

52.38. In order to calculate the process variance of each class, we must calculate the variance of a mixture of distributions. To calculate the variance of a mixture, we must calculate the second moment of the mixture (which is the weighted sum of the second moments of the mixed distributions) and then subtract the square of the first moment of the mixture (which is the weighted sum of the first moments of the mixed distributions). For class A, the mean of each Poisson distribution is the Poisson parameter λ and the second moment of each Poisson distribution is the square of the mean plus the variance, λ 2 + λ. Let µA be the hypothetical mean for class A. Then   1 1 1 1  + µA  2 6 3 4 The second moment for class A is: 1* 1 E[µA (Θ) ]  . 2 6 2

!2

1 1 + + 6 3

!2

1 + 23 + / 3 72

,

-

The process variance for class A, v A , is then: 1 37 23 −  72 16 144

vA  We go through the same calculation for class B. µB 

1 2 5 3  + 2 3 6 4





1* 2 E[µ B (Θ) ]  . 2 3 2

!2

2 5 + + 3 6

!2

5 + 95 + / 6 72

,

-

95 9 109 vB  −  72 16 144 Then the expected value of the process variance, v, is v  0.5



109 73 37 +  144 144 144



The variance of the hypothetical means is obtained using the Bernoulli shortcut with the two values, the two hypothetical means ( 14 and 34 ) each having probability 1/2. a

1 1 4 2

!2 

1 16

Finally we calculate the credibility factor. Z

1 16

1 16

+

73 144



9 82

(B)

52.39. You must have selected one of urns 1–5. For them, the mean is 3 . (E)

C/4 Study Manual—17th edition Copyright ©2014 ASM

52. BÜHLMANN CREDIBILITY: DISCRETE PRIOR

1038

52.40. µ  3. The process variances are 2 for urns 1–5 and 0 for urns 6–10, so v  1. The variance of the hypothetical means is best calculated directly, subtracting the overall mean (3) from the hypothetical means, squaring, and summing up. a

(−2) 2 + (−1) 2 + 12 + 22 + 6 (02 ) 10

v 1  1 a 1 2 2 2 Z   2+k 2+1 3

1

k

PC  µ + Z ( x¯ − µ )  3 +

2 (2.5 − 3)  2 23 3

(D)

52.41. Note that there are 4 types of drivers (all combinations of business and pleasure use and rural and urban), and thus 4 hypothetical means and process variances. If you thought that there were only 2 classes, review Example 52D. From the total expected claims in business and pleasure use, we can deduce that 80% are urban and 20% rural. For example, in the business use column, from 1.0x + 2.0 (1 − x )  1.8

it follows that x  0.2. The same applies to pleasure use. Thus the joint probabilities of business use/rural, business use/urban, pleasure use/rural, and pleasure use/urban are 0.1, 0.4, 0.1, 0.4 respectively. Expected hypothetical mean is 0.5 (1.8) + 0.5 (2.3)  2.05. Expected square hypothetical mean is 0.1 (12 ) + 0.4 (22 ) + 0.1 (1.52 ) + 0.4 (2.52 )  4.425 Then a  4.425 − 2.052  0.2225. The expected process variance is

v  0.1 (0.5) + 0.4 (1) + 0.1 (0.8) + 0.4 (1)  0.93

a 0.2225   0.1931 . (D) a + v 0.2225 + 0.93 Note that total claim variance was put there to confuse you—this is the overall variance for business use and pleasure use, not the expected process variance. If you wanted to use this information, you would have to back out the square moment, then combine business and pleasure use, and in this way you would get the overall variance, or a + v. Naturally you would get the same answer if you did it this way. So Z 

52.42. To calculate v ( λ ) , we use the usual formula for a compound Poisson distri the process variance  bution Var ( S )  λ E[X]2 + Var ( X ) : E[X]  13 (2λ ) + 23 (8λ )  6λ. For Var ( X ) , we use the usual shortcut for the variance of a Bernoulli variable: So

Var ( X )  (8λ − 2λ ) 2 ( 13 )( 23 )  29 (36λ 2 ) v ( λ )  λ 36λ 2 + 92 (36λ 2 )  44λ 3





The expected value of the process variance is then: v  E 44λ 3  44 E[λ 3 ]

f

g

E[λ 3 ]  31 (1 + 8 + 27)  12 v  44 (12)  528 C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

EXERCISE SOLUTIONS FOR LESSON 52

1039

52.43. There are two classes, Θ  0 and Θ  1. The hypothetical means are 0.1 (1) + 0.1 (2) 1  0.6 2 0.2 (1) + 0.1 (2) µ (1)  1 !0.4 1 + 0.4 (1)  0.7 µ  0.6 2 µ (0) 



a  (0.6)(0.4) 1 −

1 2

2

 0.06

The second moments are µ (0) 2 

0.1 (1) + 0.1 (4) 5  0.6 6

!2

7 1 5  − 6 2 12 0.2 (1) + 0.1 (4) µ (1) 2   1.5 0.4 v (1)  1.5 − 12  0.5 v (0) 

6.6 7 + 0.4 (0.5)   0.55 v  0.6 12 12

!

Thus the credibility factor is Z 

10a 6  . x¯  1. The credibility premium is 10a + v 11.5 PC  0.7 +

6 (1 − 0.7)  0.8565 11.5

(D)

52.44. 12  0.067668 Pr (2, 0 | class 1)  ( e ) 2

!

−1 2

22 Pr (2, 0 | class 2)  ( e )  0.036631 2 0.067668  0.64879 Pr (class 1 | 2, 0)  0.067668 + 0.036631

!

−2 2

52.45.

µ  v  23 , a  14 , k  6, Z 

2 2+6

 14 . PC 

3 2

+ 41 (− 12 ) 

11 8

(D)

 1.375 . (D)

52.46. For Class 1, µ(1) = v(1) = 5. For Class 2, µ(2) = 8(0.55) = 4.4 and v(2) = 8(0.55)(0.45) = 1.98. Thus

$$\mu = \frac{5 + 4.4}{2} = 4.7 \qquad a = \left(\frac{1}{2}\right)^2(5 - 4.4)^2 = 0.09$$

$$v = \frac{5 + 1.98}{2} = 3.49 \qquad Z = \frac{3a}{3a + v} = 0.071809$$

Now, if x̄ is average claims in 3 years, then 4.6019 = 4.7 + Z(x̄ − 4.7), so

$$\bar{x} - 4.7 = \frac{4.6019 - 4.7}{0.071809} = -1.3661$$

$$\bar{x} = 3.3333 = \frac{3 + r + 4}{3} \implies 3 + r + 4 = 10 \implies r = 3 \quad \text{(C)}$$

52.47. We will subscript 1 for 1/2/3 dice, 2 for 1/3/5 dice, and 3 for all-sides-6 dice.

$$\mu_1 = 2 \quad v_1 = 2/3 \qquad \mu_2 = 3 \quad v_2 = 8/3 \qquad \mu_3 = 6 \quad v_3 = 0$$

$$\mu = \frac{1}{2}(2) + \frac{1}{3}(3) + \frac{1}{6}(6) = 3$$

$$v = \frac{1}{2}\left(\frac{2}{3}\right) + \frac{1}{3}\left(\frac{8}{3}\right) = \frac{22}{18} = \frac{11}{9}$$

$$a = \frac{1}{2}(1^2) + \frac{1}{6}(3^2) = 2$$

$$Z = \frac{a}{a + v} = \frac{2}{2 + 11/9} = \frac{18}{29}$$

$$P_C = 3 + 3\left(\frac{18}{29}\right) = 4.862 \quad \text{(B)}$$

52.48. I think the exam setters wanted you to use insight to solve this problem. Bühlmann credibility is increased by decreasing the expected process variance and increasing the variance of the hypothetical means. Of the three changes recommended, which one accomplishes this the most?
Comparing 1 and 2, they both add a hypothetical mean of 3, so they both give rise to the same a. However, 2 adds a 0 process variance die to the set, while 1 adds a high process variance die. So 2 is better (in terms of increasing credibility) than 1.
Comparing 3 and 2, they both add a 0 process variance die, so they both have the same effect on the expected process variance. However, 3's mean is further out of range than 2's, so it will increase the variance of the hypothetical means more. So 3 is better than 2.
We now know the answer has to be (C) or (E). So we only have to determine whether 2 increases or decreases the credibility. But in 2, all we're doing is adding a die with 0 variance (which multiplies the expected process variance by 6/7) and the same mean as the mean of the first 6 (which multiplies the variance of the hypothetical means by 6/7). a and v are both multiplied by 6/7, so the credibility is unchanged. The answer is (C).
It would take an extraordinary student to reason as above. More likely, you would recalculate the credibility with all three changes:

1. $v = \frac{3}{7}\left(\frac{2}{3}\right) + \frac{3}{7}\left(\frac{8}{3}\right) = \frac{10}{7}$. µ is still 3. $a = \frac{3}{7}(1^2) + \frac{1}{7}(3^2) = \frac{12}{7}$. $Z = \frac{12/7}{12/7 + 10/7} = \frac{6}{11} < \frac{18}{29}$.

2. $v = \frac{3}{7}\left(\frac{2}{3}\right) + \frac{2}{7}\left(\frac{8}{3}\right) = \frac{22}{21}$. µ = 3. a is the same as in 1, $\frac{12}{7}$. $Z = \frac{12/7}{12/7 + 22/21} = \frac{18}{29}$, the same as before.

3. v is the same as in 2. $\mu = \frac{1}{7}\bigl(3(2) + 2(3) + 2(6)\bigr) = \frac{24}{7}$.

$$a = \frac{1}{7}\left(3\left(\frac{10}{7}\right)^2 + 2\left(\frac{3}{7}\right)^2 + 2\left(\frac{18}{7}\right)^2\right) = \frac{138}{49}$$

$$Z = \frac{138/49}{138/49 + 22/21} = \frac{414}{414 + 154} = \frac{414}{568} = \frac{207}{284} > \frac{18}{29}$$

52.49. Let's use subscripts of 1 for the class with θ = 0.05 and 2 for the class with θ = 0.30. Then

$$\mu_1 = 0.05(1) + 0.85(2) = 1.75 \qquad \mu_2 = 0.30(1) + 0.10(2) = 0.5$$

$$\mu = 0.8(1.75) + 0.2(0.5) = 1.5$$

By the Bernoulli shortcut,

$$a = (1.75 - 0.5)^2(0.8)(0.2) = 0.25$$

The process variances are calculated as second moment minus first moment squared.

$$\operatorname{E}[\mu(\theta)^2 \mid \theta = 0.05] = 0.05 + 0.85(2^2) = 3.45 \qquad v_1 = 3.45 - 1.75^2 = 0.3875$$

$$\operatorname{E}[\mu(\theta)^2 \mid \theta = 0.30] = 0.30 + 0.10(2^2) = 0.7 \qquad v_2 = 0.7 - 0.5^2 = 0.45$$

$$v = 0.8(0.3875) + 0.2(0.45) = 0.4$$

The credibility factor after n = 1 year is Z = a/(a + v) = 0.25/(0.25 + 0.4) = 5/13. The credibility estimate is

$$P_C = \frac{8}{13}(1.5) + \frac{5}{13}(2) = 1.6923 \quad \text{(B)}$$

52.50. The hypothetical mean of A is

$$\left(\frac{4}{9}(1) + \frac{1}{9}(2)\right)\left(\frac{1}{3}(500) + \frac{2}{3}(1235)\right) = \frac{2}{3}(990) = 660$$

The hypothetical mean of B is

$$\left(\frac{4}{9}(1) + \frac{4}{9}(2)\right)\left(\frac{2}{3}(250) + \frac{1}{3}(328)\right) = \frac{4}{3}(276) = 368$$

The expected value of the hypothetical means is µ = ½(660) + ½(368) = 514. The variance of the hypothetical means, using the Bernoulli shortcut, is a = (½)²(660 − 368)² = 21,316. The variance of total losses is a + v, as mentioned right before Example 51D. Thus the credibility factor is

$$Z = \frac{a}{a + v} = \frac{21{,}316}{296{,}962} = 0.07178$$

The Bühlmann credibility premium is

$$P_C = 514 + 0.07178(500 - 514) = 513.00 \quad \text{(D)}$$


52.51. Let µ(i) and v(i) be the hypothetical mean and process variance of Risk i respectively.

$$\mu(1) = 0.5(250) + 0.3(2{,}500) + 0.2(60{,}000) = 12{,}875$$
$$\mu(2) = 0.7(250) + 0.2(2{,}500) + 0.1(60{,}000) = 6{,}675$$
$$\mu = \frac{2(12{,}875) + 6{,}675}{3} = 10{,}808.33$$
$$a = \left(\frac{2}{3}\right)\left(\frac{1}{3}\right)(12{,}875 - 6{,}675)^2 = 8{,}542{,}222$$
$$\mu_2(1) = 0.5(250^2) + 0.3(2{,}500^2) + 0.2(60{,}000^2) = 721{,}906{,}250$$
$$v(1) = 721{,}906{,}250 - 12{,}875^2 = 556{,}140{,}625$$
$$\mu_2(2) = 0.7(250^2) + 0.2(2{,}500^2) + 0.1(60{,}000^2) = 361{,}293{,}750$$
$$v(2) = 361{,}293{,}750 - 6{,}675^2 = 316{,}738{,}125$$
$$v = \frac{2(556{,}140{,}625) + 316{,}738{,}125}{3} = 476{,}339{,}792$$
$$Z = \frac{8{,}542{,}222}{8{,}542{,}222 + 476{,}339{,}792} = 0.017617$$
$$P_C = 10{,}808.33 + 0.017617(250 - 10{,}808.33) = 10{,}622.33 \quad \text{(D)}$$
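As a quick numeric check of this solution, here is a short self-contained Python sketch (mine, not the manual's); the inputs are simply the figures computed above.

```python
# Sketch verifying solution 52.51 with a two-class discrete prior.
mu1, mu2 = 12_875, 6_675
v1, v2 = 556_140_625, 316_738_125

mu = (2 * mu1 + mu2) / 3                 # overall mean, weights 2/3 and 1/3
a = (2 / 3) * (1 / 3) * (mu1 - mu2) ** 2 # Bernoulli shortcut
v = (2 * v1 + v2) / 3                    # expected process variance
z = a / (a + v)                          # one observation
print(mu + z * (250 - mu))               # ~10,622.33
```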

Quiz Solutions

52-1. The hypothetical means are 0.1 and 0.2, so µ = 0.7(0.1) + 0.3(0.2) = 0.13 and a = (0.7)(0.3)(0.2 − 0.1)² = 0.0021. The process variances for the conditional Bernoulli distributions of good and bad risks are (0.1)(0.9) = 0.09 and (0.2)(0.8) = 0.16, with mean v = 0.7(0.09) + 0.3(0.16) = 0.111. Therefore, the Bühlmann k = 0.111/0.0021 and the credibility factor is

$$Z = \frac{10}{10 + 0.111/0.0021} = 0.159091$$

The Bühlmann prediction for the number of claims is 2Z + 1.3(1 − Z) = 1.4114.

52-2. The distribution of number of claims is Bernoulli with hypothetical mean Q and process variance Q(1 − Q). Let I be the indicator of whether the insured is urban or rural. Then

$$\mu = \operatorname{E}[Q] = \operatorname{E}\bigl[\operatorname{E}[Q \mid I]\bigr] = \operatorname{E}[0.35, 0.19] = 0.27$$

$$a = \operatorname{Var}(Q) = \operatorname{E}[\operatorname{Var}(Q \mid I)] + \operatorname{Var}(\operatorname{E}[Q \mid I]) = \operatorname{E}\left[\frac{0.3^2}{12}, \frac{0.18^2}{12}\right] + \operatorname{Var}(0.35, 0.19)$$
$$= 0.5(0.0075 + 0.0027) + (0.5)(0.5)(0.16^2) = 0.0115$$

$$\operatorname{E}[Q^2] = \operatorname{E}\bigl[\operatorname{E}[Q^2 \mid I]\bigr] = 0.5\left(\frac{0.3^2}{12} + 0.35^2 + \frac{0.18^2}{12} + 0.19^2\right) = 0.0844$$

$$v = \operatorname{E}[Q(1 - Q)] = \operatorname{E}[Q] - \operatorname{E}[Q^2] = 0.27 - 0.0844 = 0.1856$$

$$Z = \frac{0.0115}{0.0115 + 0.1856} = 0.058346$$

$$P_C = 0.058346(0) + (1 - 0.058346)(0.27) = 0.2542$$


Lesson 53

Bühlmann Credibility: Continuous Prior

Reading: Loss Models Fourth Edition 18.4–18.5 or SN C-21-01 3.1–3.2 or Introduction to Credibility Theory 6.4–6.5

Bühlmann credibility with a continuous prior is no different in principle from Bühlmann credibility with a discrete prior. Your task is to identify the hypothetical mean and process variance, then to calculate the mean and variance of the former (µ and a) and the mean of the latter (v). From there, you calculate k, Z, and the credibility premium. However, since the prior is continuous, the means and variances of the hypothetical mean and process variance may require integration rather than summation.
We can classify these questions in the following categories:
1. Hypothetical mean and process variance are sums of powers of the parameter(s), and the prior distribution of the parameter(s) is in the tables. In that case, the tables provide the moments you need to calculate expected values and variances. An important subcategory is when the model is a compound distribution.
2. Hypothetical mean and process variance are sums of powers of the parameter(s), and the prior distribution is uniform. Usually the parameters only appear to the first or second power, and you use the fact that for a uniform distribution, the mean is the midpoint and the variance is the range squared divided by 12.
3. Other situations, when you'll have to integrate to calculate a and v.

Prior Distribution in Tables

This is the most common type of question on exams.

Example 53A You are given
(i) Annual aggregate losses follow a normal distribution with parameters µ and σ².
(ii) The prior distribution function of µ is single-parameter Pareto with α = 3 and θ = 100.
(iii) The prior distribution function of σ is a gamma distribution with α = 10 and θ = 50.
(iv) µ and σ are independent.
Calculate Bühlmann's k for annual aggregate losses.

Answer: The hypothetical mean is µ, and its variance is obtained by looking up the moments of a single-parameter Pareto.

$$\operatorname{E}[\mu] = \frac{\alpha\theta}{\alpha - 1} = \frac{3(100)}{2} = 150$$

$$\operatorname{E}[\mu^2] = \frac{\alpha\theta^2}{\alpha - 2} = \frac{3(10{,}000)}{1} = 30{,}000$$

$$\operatorname{Var}(\mu) = 30{,}000 - 150^2 = 7500$$

The process variance is σ², and its expected value is the second moment of the gamma distribution:

$$\operatorname{E}[\sigma^2] = \theta^2(\alpha)(\alpha + 1) = 50^2(10)(11) = 275{,}000$$

The Bühlmann k is 275,000/7500 = 36⅔.
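The whole example fits in a few lines of Python, in case you want a sanity check (a sketch of mine, with the moments hard-coded from the distribution tables):

```python
# Numeric check of Example 53A.
alpha_p, theta_p = 3, 100                        # prior of mu: single-parameter Pareto
alpha_g, theta_g = 10, 50                        # prior of sigma: gamma

e_mu = alpha_p * theta_p / (alpha_p - 1)         # 150
e_mu2 = alpha_p * theta_p ** 2 / (alpha_p - 2)   # 30,000
a = e_mu2 - e_mu ** 2                            # Var(mu) = 7,500
v = theta_g ** 2 * alpha_g * (alpha_g + 1)       # E[sigma^2] = 275,000
print(v / a)                                     # k = 36.67
```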


Sometimes the model is a compound distribution. There are many possibilities: one of the two components (frequency and severity) may depend on the parameter, they may both depend on the same parameter, or they may both depend on different parameters. Here's an example of the first possibility.

Example 53B You are given
(i) Annual claim counts follow a geometric distribution with β = 0.1.
(ii) Claim sizes follow a two-parameter Pareto distribution with parameters α = 3 and Θ.
(iii) Claim counts and claim sizes are independent, and claim sizes are independent of each other.
(iv) The prior distribution of Θ is Weibull with τ = 0.25 and θ = 10.
Calculate Bühlmann's k for annual aggregate losses.

Answer: The hypothetical mean of the compound distribution is the product of the means of claim counts and claim sizes. The mean of the geometric is β = 0.1 and the mean of the Pareto is θ/(α − 1) = Θ/2, so the hypothetical mean is 0.05Θ. The variance of the hypothetical mean is

$$a = 0.05^2\operatorname{Var}(\Theta) = 0.05^2\left(\theta^2\Gamma(1 + 2/\tau) - \bigl(\theta\,\Gamma(1 + 1/\tau)\bigr)^2\right) = 0.05^2\left(10^2\Gamma(9) - \bigl(10\,\Gamma(5)\bigr)^2\right) = 0.05^2(3{,}974{,}400) = 9936$$

The process variance, by the compound variance formula, is

$$v(\Theta) = \operatorname{E}[N]\operatorname{Var}(X) + \operatorname{Var}(N)\operatorname{E}[X]^2 = \beta\left(\frac{2\Theta^2}{2} - \left(\frac{\Theta}{2}\right)^2\right) + \beta(1 + \beta)\left(\frac{\Theta}{2}\right)^2 = 0.1(0.75\Theta^2) + 0.11(0.25\Theta^2) = 0.1025\Theta^2$$

The expected value of the process variance is

$$v = 0.1025\operatorname{E}[\Theta^2] = 0.1025\,\theta^2\Gamma(1 + 2/\tau) = 0.1025(100)\Gamma(9) = 413{,}280$$

The Bühlmann k is 413,280/9936 = 41.5942.

Quiz 53-1 Annual claim counts follow a geometric distribution with parameter β. The parameter β follows a two-parameter Pareto distribution with density function:

$$\pi(\beta) = \frac{3}{(1 + \beta)^4}, \qquad \beta > 0$$

No claims were observed for 4 years. Determine the Bühlmann prediction of the number of claims in the following year from the same individual.
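If you want to confirm the arithmetic in Example 53B, the following Python sketch (my own check, not the manual's) recomputes k from the Weibull table moments; the quiz above can be verified the same way once you identify its hypothetical mean and process variance.

```python
# Numeric check of Example 53B.
from math import gamma

tau, theta = 0.25, 10                        # Weibull prior of the Pareto's Theta
e_T = theta * gamma(1 + 1 / tau)             # E[Theta]   = 10 * 4! = 240
e_T2 = theta ** 2 * gamma(1 + 2 / tau)       # E[Theta^2] = 100 * 8! = 4,032,000

a = 0.05 ** 2 * (e_T2 - e_T ** 2)            # 9,936
v = 0.1025 * e_T2                            # 413,280
print(v / a)                                 # k = 41.5942
```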


Prior Distribution is Uniform

Exam questions commonly use a uniform distribution as prior. The mean of a uniform is the midpoint of the range, and the variance is the range squared over 12. Higher moments are computed using integration. Especially common are questions in which the model is Poisson, or compound Poisson.

Example 53C You are given:
(i) Claim counts follow a Poisson distribution with mean Λ.
(ii) The prior distribution of Λ is uniform on [0.1, 0.15].
Calculate Bühlmann's k for claim counts.

Answer: The hypothetical mean and process variance are both Λ. The mean of Λ is v = 0.125 and the variance is a = 0.05²/12. It follows that k = 0.125(12)/0.05² = 600.

Quiz 53-2 You are given:
(i) Claim counts follow a Poisson distribution with mean Λ².
(ii) The prior distribution of Λ is uniform on [0.1, 0.7].
Calculate Bühlmann's k for claim counts.

Questions with compound processes are also possible.

Example 53D You are given:
(i) Claim counts follow a negative binomial with parameters R and β = 1.
(ii) Claim sizes follow a single-parameter Pareto distribution with α = 4 and θ = 100R.
(iii) Given R, claim counts and claim sizes are independent, and claim sizes are independent of each other.
(iv) The distribution of R is uniform on (1,3).
Calculate Bühlmann's k for aggregate losses.

Answer: The hypothetical mean is Rβαθ/(α − 1) = (R)(4)(100R)/3 = (400/3)R². Since E[R⁴] = ∫₁³ r⁴(1/2) dr = (3⁵ − 1⁵)/10 = 24.2 and E[R²] = 2² + 2²/12 = 13/3, the variance of the hypothetical means is

$$a = \operatorname{Var}\left(\frac{400}{3}R^2\right) = \left(\frac{400}{3}\right)^2\left(\operatorname{E}[R^4] - \operatorname{E}[R^2]^2\right) = \frac{160{,}000}{9}\left(24.2 - \frac{169}{9}\right) = 96{,}395.062$$

The process variance is

$$v(R) = \operatorname{E}[N \mid R]\operatorname{Var}(X \mid R) + \operatorname{Var}(N \mid R)\operatorname{E}[X \mid R]^2 = R\left(\frac{4(100R)^2}{2} - \left(\frac{4(100R)}{3}\right)^2\right) + 2R\left(\frac{4(100R)}{3}\right)^2 = \frac{20{,}000}{9}R^3 + \frac{320{,}000}{9}R^3 = \frac{340{,}000}{9}R^3$$

The expected value of the process variance is evaluated by integrating r³ times the uniform density 1/2.

$$v = \frac{340{,}000}{9}\int_1^3 \frac{r^3\,dr}{2} = \frac{340{,}000}{9}\left(\frac{3^4 - 1^4}{8}\right) = 377{,}777\tfrac{7}{9}$$

The Bühlmann k is 377,777 7/9 / 96,395.062 = 3.9191.
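When the prior is a simple density like this uniform, the moments can also be confirmed by brute-force numerical integration. Here is a sketch (mine, assuming scipy is available) that recomputes Example 53D:

```python
# Numeric check of Example 53D via quadrature.
from scipy.integrate import quad

hm = lambda r: (400 / 3) * r ** 2            # hypothetical mean
pv = lambda r: (340_000 / 9) * r ** 3        # process variance
pdf = lambda r: 0.5                          # uniform density on (1, 3)

e = lambda f: quad(lambda r: f(r) * pdf(r), 1, 3)[0]
a = e(lambda r: hm(r) ** 2) - e(hm) ** 2     # 96,395.06
v = e(pv)                                    # 377,777.78
print(v / a)                                 # k = 3.9191
```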


Priors Requiring Integration

Questions requiring integration are rare, and harder than the other types. An example would be a prior not in the tables. In some older questions, which you'll see in the exercises, you would be given evaluated integrals to help you out, but that doesn't seem to be the current style. Other questions in this category have a model with means and variances which are not polynomial functions of the parameters. For example, if the model is lognormal with varying µ and/or σ, the hypothetical mean and process variance are exponentials of the parameters. Sometimes moment generating functions in the tables will help, but usually you'll have to integrate.

Example 53E You are given:
(i) Claim sizes follow a two-parameter Pareto distribution with parameters α and θ.
(ii) The prior joint distribution for α and θ has probability density function

$$f(\alpha, \theta) = \begin{cases} 0.1 & 3 \le \alpha \le 5,\ 5 \le \theta \le 10 \\ 0 & \text{otherwise} \end{cases}$$

Calculate Bühlmann's k for claim sizes.

Answer: The hypothetical mean is µ(α, θ) = θ/(α − 1). Let's double integrate it to get the first and second moments of the hypothetical mean.

$$\operatorname{E}[\mu(\alpha, \theta)] = 0.1\int_5^{10}\!\!\int_3^5 \frac{\theta}{\alpha - 1}\,d\alpha\,d\theta = 0.1\int_5^{10} \theta\,\ln(\alpha - 1)\Big|_3^5\,d\theta = 0.1(\ln 2)\left(\frac{10^2 - 5^2}{2}\right) = 2.599302$$

$$\operatorname{E}[\mu(\alpha, \theta)^2] = 0.1\int_5^{10}\!\!\int_3^5 \frac{\theta^2}{(\alpha - 1)^2}\,d\alpha\,d\theta = 0.1\int_5^{10}\theta^2\left(\frac{1}{3 - 1} - \frac{1}{5 - 1}\right)d\theta = 0.1(0.25)\left(\frac{10^3 - 5^3}{3}\right) = 7.291667$$

$$a = 7.291667 - 2.599302^2 = 0.535296$$

The process variance is

$$v(\alpha, \theta) = \frac{2\theta^2}{(\alpha - 1)(\alpha - 2)} - \left(\frac{\theta}{\alpha - 1}\right)^2$$

We already calculated the second moment of θ/(α − 1) as 7.291667, so it remains to calculate the expected value of 2θ²/((α − 1)(α − 2)), which will require partial fractions. We find that

$$\frac{1}{(\alpha - 1)(\alpha - 2)} = -\frac{1}{\alpha - 1} + \frac{1}{\alpha - 2}$$

We already calculated the integral of 1/(α − 1) from 3 to 5 as ln 2. Similarly, the integral of 1/(α − 2) from 3 to 5 is ln 3 − ln 1 = ln 3. Therefore (the density function 0.1 is left outside the integral)

$$v = 0.2\int_5^{10}\theta^2\int_3^5\left(-\frac{1}{\alpha - 1} + \frac{1}{\alpha - 2}\right)d\alpha\,d\theta - 7.291667 = 0.2(\ln 1.5)\left(\frac{10^3 - 5^3}{3}\right) - 7.291667 = 16.3605$$

The Bühlmann k is 16.3605/0.535296 = 30.56.

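The partial-fraction work in Example 53E can be double-checked by brute-force double integration. A sketch of mine, assuming scipy (note that dblquad integrates func(y, x) with x as the outer variable, so here x = θ and y = α):

```python
# Numeric check of Example 53E.
from scipy.integrate import dblquad

f = 0.1                                            # joint density on the box
hm = lambda al, th: th / (al - 1)                  # hypothetical mean
pv = lambda al, th: 2 * th ** 2 / ((al - 1) * (al - 2)) - hm(al, th) ** 2

# theta runs 5..10 (outer), alpha runs 3..5 (inner)
e = lambda g: dblquad(lambda al, th: g(al, th) * f, 5, 10, 3, 5)[0]
a = e(lambda al, th: hm(al, th) ** 2) - e(hm) ** 2  # ~0.5353
v = e(pv)                                           # ~16.36
print(v / a)                                        # ~30.56
```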

Coverage of this material in the three syllabus options

This material is required. It is covered in all three syllabus reading options.

Exercises

53.1. [4-F01:38] You are given:
(i) Claim size, X, has mean µ and variance 500.
(ii) The random variable µ has a mean of 1000 and variance of 50.
(iii) The following three claims were observed: 750, 1075, 2000
Calculate the expected size of the next claim using Bühlmann credibility.
(A) 1025

(B) 1063


(C) 1115

(D) 1181

(E) 1266


53.2.

[4B-S90:41] (2 points) You are given the following:

(i) The number of claims made by an individual insured follows a Poisson distribution.
(ii) The expected number of claims, λ, for insureds in the population has the probability density function f(λ) = 4λ⁻⁵ for 1 ≤ λ < ∞.

Determine the value of the Bühlmann k used for estimating the expected number of claims for an individual insured. (A) (B) (C) (D) (E)

k < 5.7 5.7 ≤ k < 5.8 5.8 ≤ k < 5.9 5.9 ≤ k < 6.0 k ≥ 6.0

Use the following information for questions 53.3 and 53.4: You are given the following: (i) The claim count N for an individual insured has a Poisson distribution with mean λ; and (ii) λ is uniformly distributed between 1 and 3 53.3. [4B-S91:42] (2 points) Determine the probability that a randomly selected insured will have no claims. (A) (B) (C) (D) (E) 53.4.

Less than 0.11 At least 0.11, but less than 0.13 At least 0.13, but less than 0.15 At least 0.15, but less than 0.17 At least 0.17 [4B-S91:43] (2 points) An insured has one claim during the first period.

Use Bühlmann’s credibility formula to estimate the expected number of claims for that insured in the next period. (A) (B) (C) (D) (E)

Less than 1.20 At least 1.20, but less than 1.40 At least 1.40, but less than 1.60 At least 1.60, but less than 1.80 At least 1.80

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 53

1051

53.5. [4B-S90:52] (2 points) The number of claims each year for an individual insured has a Poisson distribution. The expected annual claim frequency of the entire population of insureds is uniformly distributed over the interval (0, 1). An individual’s expected claim frequency is constant through time. A particular insured had 3 claims during the prior three years. Determine the Bühlmann credibility estimate of this insured’s future annual claim frequency. (A) (B) (C) (D) (E) 53.6.

Less than 0.60 At least 0.60, but less than 0.65 At least 0.65, but less than 0.70 At least 0.70, but less than 0.75 At least 0.75 [C-S05:6] You are given:

(i) Claims are conditionally independent and identically Poisson distributed with mean Θ. (ii) The prior distribution function of Θ is: F (θ)  1 −

1 1+θ

! 2.6 ,

θ>0

Five claims are observed. Determine the Bühlmann credibility factor. (A) (B) (C) (D) (E) 53.7.

Less than 0.6 At least 0.6, but less than 0.7 At least 0.7, but less than 0.8 At least 0.8, but less than 0.9 At least 0.9 You are given the following:

(i)

Losses for a given policyholder follow a two-parameter Pareto distribution with parameters α and θ. (ii) α does not vary by policyholder. (iii) θ varies by insured, and has density function π (θ)  (iv)

1 θ 8 e −θ/1000 10009 Γ (9)

θ>0

The Bühlmann credibility of one loss observation is 0.05.

Determine α. (A) (B) (C) (D) (E)

Less than 4.0 At least 4.0, but less than 4.2 At least 4.2, but less than 4.4 At least 4.4, but less than 4.6 At least 4.6

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1052

Use the following information for questions 53.8 and 53.9: You are given the following: •

Number of claims for a single insured follows a Poisson distribution with mean µ.



The amount of a single claim has an exponential distribution given by f (x ) 

1 −x/λ λe

x > 0, λ > 0.



µ and λ are independent random variables.



E[µ]  0.10, Var ( µ )  0.0025



E[λ]  1000, Var ( λ )  640,000



Number of claims and claim severity distributions are independent.

53.8. [4B-S93:21] (2 points) Determine the expected value of the pure premium’s process variance for a single risk. (A) (B) (C) (D) (E)

Less than 150,000 At least 150,000, but less than 200,000 At least 200,000, but less than 250,000 At least 250,000, but less than 300,000 At least 300,000 [4B-S93:22] (2 points) Determine the variance of the hypothetical means for the pure premium.

53.9. (A) (B) (C) (D) (E)

Less than 10,000 At least 10,000, but less than 20,000 At least 20,000, but less than 30,000 At least 30,000, but less than 40,000 At least 40,000

53.10. [4B-S93:29] (2 points) You are given the following: •

The distribution for number of claims is binomial with parameters q and m  1.



The prior distribution of q has mean = 0.25 and variance = 0.07. Determine the Bühlmann credibility to be assigned to a single observation of one risk.

(A) (B) (C) (D) (E)

Less than 0.20 At least 0.20, but less than 0.25 At least 0.25, but less than 0.30 At least 0.30, but less than 0.35 At least 0.35

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 53

1053

Use the following information for questions 53.11 and 53.12: You are given the following: •

The number of claims for a single risk follows a Poisson distribution with mean m.



m is a random variable with E[m]  0.40 and Var ( m )  0.10.



The amount of an individual claim has a uniform distribution on [0, 100,000].



The number of claims and the amount of an individual claim are independent.

53.11. [4B-S94:22] (3 points) Determine the expected value of the pure premium’s process variance for a single risk. (A) (B) (C) (D) (E)

Less than 400 million At least 400 million, but less than 800 million At least 800 million, but less than 1200 million At least 1200 million, but less than 1600 million At least 1600 million

53.12. [4B-S94:23] (2 points) Determine the variance of the hypothetical means for the pure premium. (A) (B) (C) (D) (E)

Less than 400 million At least 400 million, but less than 800 million At least 800 million, but less than 1200 million At least 1200 million, but less than 1600 million At least 1600 million

53.13. [4B-S96:11,4B-F98:21] (3 points) You are given the following: •

The number of claims for a single risk follows a Poisson distribution with mean λ.



The amount of an individual claim is always 1,000λ.



λ is a random variable with the density function f (λ) 

4 , λ5

1 < λ < ∞.

Determine the expected value of the process variance of the aggregate losses for a single risk. (A) (B) (C) (D) (E)

Less than 1,500,000 At least 1,500,000, but less than 2,500,000 At least 2,500,000, but less than 3,500,000 At least 3,500,000, but less than 4,500,000 At least 4,500,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1054

Use the following information for questions 53.14 and 53.15: You are given the following: •

A large portfolio of automobile risks consists solely of youthful drivers.



The number of claims for one driver during one exposure period follows a Poisson distribution with mean 4 − g, where g is the grade point average of the driver.



The distribution of g within the portfolio is uniform on the interval [0, 4].

A driver is selected at random from the portfolio. During one exposure period, no claims are observed for this driver. 53.14. [4B-S97:4] (2 points) Determine the posterior probability that the selected driver has a grade point average greater than 3. (A) (B) (C) (D) (E)

Less than 0.15 At least 0.15, but less than 0.35 At least 0.35, but less than 0.55 At least 0.55, but less than 0.75 At least 0.75

53.15. [4B-S97:5] (2 points) A second driver is selected at random from the portfolio. During five exposure periods, no claims are observed for this second selected driver. Determine the Bühlmann credibility estimate of the expected number of claims for this second driver during the next exposure period. (A) (B) (C) (D) (E)

Less than 0.375 At least 0.375, but less than 0.425 At least 0.425, but less than 0.475 At least 0.475, but less than 0.525 At least 0.525

53.16. [4B-F99:20] (3 points) You are given the following: •

The number of claims follows a Poisson distribution with mean λ.



Claim sizes follow a distribution with density function f (x ) 

1 −x/λ ,0 λe

< x < ∞.



Given λ, the number of claims and claim sizes are independent.



The prior distribution of λ has density function g ( λ )  e −λ , 0 < λ < ∞. Determine the value of Bühlmann’s k for aggregate losses. Hint:

(A) 0

R

∞ 0

λ n e −λ dλ  n! (B) 3/5

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1

(D) 2

(E) ∞

Exercises continue on the next page . . .

EXERCISES FOR LESSON 53

1055

Use the following information for questions 53.17 and 53.18: You are given the following: •

In a large portfolio of automobile risks, the number of claims for one policyholder during one year m , where m is the number of miles driven each and follows a Bernoulli distribution with mean 100,000 every year by the policyholder.



The number of claims for one policyholder for one year is independent of the number of claims for the policyholder for any other year. The number of claims for one policyholder is independent of the number of claims for any other policyholder.



The distribution of m within the portfolio has density function m   , 0 < m ≤ 10,000    100,000,000  f (m )   20,000 − m   , 10,000 < m < 20,000  100,000,000 

A policyholder is selected at random from the portfolio. During Year 1, one claim is observed for this policyholder. During Year 2, no claims are observed for this policyholder. No information is available regarding the number of claims observed during Years 3 and 4. Hint: Use a change of variable such as p  m/100,000. 53.17. [4B-F97:9] (2 points) Determine the posterior probability that the selected policyholder drives less than 10,000 miles each year. (A) 1/3

(B) 37/106

(C) 23/54

(D) 1/2

(E) 14/27

53.18. [4B-F97:10] (3 points) Determine the Bühlmann credibility estimate of the expected number of claims for the selected policyholder during Year 5. (A) 3/31

(B) 1/10

(C) 7/62

(D) 63/550

(E) 73/570

53.19. The mortality rate for individual in a group is Q. Q varies by group. The distribution of Q is uniform on [0.005, 0.010]. Determine the Bühlmann credibility factor Z based on 100 observations of mortality from a single group. 53.20. Claim sizes follow a two-parameter Pareto distributed with parameters θ and α  3. θ varies among insureds according to the distribution with the following density function: f (θ) 

θe −θ/1000 . 1,000,000

Determine the Bühlmann credibility to assign to a single observation.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1056

53.21. [4B-S98:7] (2 points) You are given the following: •

The number of claims during one exposure period follows a Bernoulli distribution with mean p.



The prior density function of p is assumed to be f (p ) 

πp π sin , 2 2

0 < p < 1.

Determine the expected value of the process variance. Hint: (A) (B) (C) (D) (E)

1

Z 0

πp πp 2 sin dp  and 2 2 π

1

Z 0

πp 2 πp 4 sin dp  2 ( π − 2) . 2 2 π

4 ( π − 3) π2 2 (4 − π ) π2 4 ( π − 2) π2 2 π 4−π 2 ( π − 3)

53.22. [4B-S98:26] (3 points) You are given the following: •

The number of claims follows a Poisson distribution with mean m.



Claim sizes follow a distribution with mean 20m and variance 400m 2 .



m is a gamma random variable with density function f (m ) 



m 2 e −m , 2

0 < m < ∞.

For any value of m, the number of claims and claim sizes are independent. Determine the expected value of the process variance of the aggregate losses.

(A) (B) (C) (D) (E)

Less than 10,000 At least 10,000, but less than 25,000 At least 25,000, but less than 40,000 At least 40,000, but less than 55,000 At least 55,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 53

1057

53.23. [4B-S99:13] (3 points) You are given the following: •

The number of claims follows a distribution with mean λ and variance 2λ.



Claim sizes follow a distribution with mean µ and variance 2µ2 .



The number of claims and claim sizes are independent.



λ and µ have a prior probability distribution with joint density function f ( λ, µ )  1, 0 < λ < 1, 0 < µ < 1. Determine the value of Bühlmann’s k for aggregate losses.

(A) (B) (C) (D) (E)

Less than 3 At least 3, but less than 6 At least 6, but less than 9 At least 9, but less than 12 At least 12

Use the following information for questions 53.24 and 53.25: You are given the following: •

Claim sizes for a given policyholder follow a distribution with density function: f (x ) 



2x , b2

0 < x < b.

The prior distribution of b has density function g (b ) 

1 , b2

1 < b < ∞.

53.24. [4B-F99:4] (2 points) Determine the expected value of the process variance. (A) 0

(B) 1/18

(C) 4/9

(D) 1/2

(E) ∞

53.25. [4B-F99:5] (3 points) The policyholder experiences a claim of size 2. Determine the expected value of a second claim from this policyholder. (A) 1

(B) 3/2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2

(D) 3

(E) ∞

Exercises continue on the next page . . .

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1058

53.26. You are given the following: (i) For a given policyholder, claim count has mean m and variance m 1.5 . (ii) m has a lognormal distribution with parameters µ  0.1 and σ  0.4. (iii) The policyholder had 1 claim in year 1, 2 in year 2, and 0 in year 3. Determine the Bühlmann credibility estimate of the number of claims in year 4. (A) (B) (C) (D) (E)

Less than 1.06 At least 1.06, but less than 1.08 At least 1.08, but less than 1.10 At least 1.10, but less than 1.12 At least 1.12

53.27. [4-S00:37] You are given: (i) X i is the claim count observed for driver i for one year. (ii) X i has a negative binomial distribution with parameters β  0.5 and r i . (iii) µ i is the expected claim count for driver i for one year. (iv) The µ i ’s have an exponential distribution with mean 0.2. Determine the Bühlmann credibility factor for an individual driver for one year. (A) (B) (C) (D) (E)

Less than 0.05 At least 0.05, but less than 0.10 At least 0.10, but less than 0.15 At least 0.15, but less than 0.20 At least 0.20

53.28. [4-S01:38] You are given the following information about workers’ compensation coverage: (i)

The number of claims for an employee during the year follows a Poisson distribution with mean

(100 − p ) /100 where p is the salary (in thousands) for the employee. (ii) The distribution of p is uniform on the interval (0, 100]. An employee is selected at random. During the last 4 years, the employee has had a total of 5 claims. Determine the Bühlmann credibility estimate for the expected number of claims the employee will have next year. (A) 0.6

(B) 0.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1.0

(D) 1.1

(E) 1.2

Exercises continue on the next page . . .

EXERCISES FOR LESSON 53

1059

53.29. [4-F01:18] You are given the following information about a book of business comprised of 100 insureds: (i)

Xi 

(ii)

P Ni

j1

Yi j is a random variable representing the annual loss of the i th insured.

N1 , N2 , . . . , N100 are independent random variables distributed according to a negative binomial distribution with parameters r (unknown) and β  0.2 . (iii) Unknown parameter r has an exponential distribution with mean 2. (iv) Yi1 , Yi2 , . . . , YiNi are independent and identically distributed random variables following a Pareto distribution with α  3.0 and θ  1000. Determine the Bühlmann credibility factor, Z, for the block of business. (A) 0.000

(B) 0.045

(C) 0.500

(D) 0.826

(E) 0.905

53.30. [4-F02:18] You are given: (i) Annual claim frequency for an individual policyholder has mean λ and variance σ2 . (ii) The prior distribution for λ is uniform on the interval [0.5, 1.5]. (iii) The prior distribution for σ2 is exponential with mean 1.25. A policyholder is selected at random and observed to have no claims in Year 1. Using Bühlmann credibility, estimate the number of claims in Year 2 for the selected policyholder. (A) 0.56

(B) 0.65

(C) 0.71

(D) 0.83

(E) 0.94

53.31. [4-F03:11] You are given: (i) Claim counts follow a Poisson distribution with mean θ. (ii) Claim sizes follow an exponential distribution with mean 10θ. (iii) Claim counts and claim sizes are independent, given θ. (iv) The prior distribution has probability density function: π (θ) 

5 , θ6

θ>1

Calculate Bühlmann’s k for aggregate losses. (A) (B) (C) (D) (E)

Less than 1 At least 1, but less than 2 At least 2, but less than 3 At least 3, but less than 4 At least 4

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1060

53.32. [4B-S95:4] (3 points) You are given the following: •

The number of losses for a single risk follows a Poisson distribution with mean m.



The amount of an individual loss follows an exponential distribution with mean 1000/m and variance (1000/m ) 2 .



m is a random variable with density function f ( m )  (1 + m ) /6, 1 < m < 3



The number of losses and the individual loss amounts are independent. Determine the expected value of the pure premium’s process variance for a single risk.

(A) (B) (C) (D) (E)

Less than 940,000 At least 940,000, but less than 980,000 At least 980,000, but less than 1,020,000 At least 1,020,000, but less than 1,060,000 At least 1,060,000

53.33. You are given: (i) Claim sizes follow a lognormal distribution with parameters µ and σ  2. (ii) The prior distribution of µ has probability density function: f ( µ )  5e −5µ

µ>0

Calculate Bühlmann’s k for claim sizes. 53.34. [4-F04:29] You are given: (i) Claim counts follow a Poisson distribution with mean λ. (ii) Claim sizes follow a lognormal distribution with parameters µ and σ. (iii) Claim counts and claim sizes are independent. (iv) The prior distribution has joint probability density function: f ( λ, µ, σ )  2σ, 0 < λ < 1, 0 < µ < 1, 0 < σ < 1 Calculate Bühlmann’s k for aggregate losses. (A) (B) (C) (D) (E)

Less than 2 At least 2, but less than 4 At least 4, but less than 6 At least 6, but less than 8 At least 8

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 53

1061

53.35. [C-S05:17] You are given: (i) The annual number of claims on a given policy has a geometric distribution with parameter β. (ii) The prior distribution of β has the Pareto density function π (β) 

α , ( β + 1) ( α+1)

0 3) Pr ( N  0)

Pr ( g > 3 | N  0)  Pr ( N  0)  Pr ( N  0 | g > 3) Pr ( g > 3) 

Z

1 4

Z

4

0

3

4

e g−4 dg 

1 g−4 4 4e 0

 14 (1 − e −4 )

e g−4 dg  14 (1 − e −1 )

1 − e −1  0.6439 1 − e −4

Pr ( g > 3 | N  0)  53.15.

1 4

µ  2, v ( θ )  4 − g, v  E[4 − g]  2, a  Var (4 − g ) 

16 12

(D)

 43 .

5 (4/3) 20 10   5 (4/3) + 2 26 13 6 E[X6 | X]  2 + Z (−2)   0.4615 13 Z

(C)

53.16. µ ( λ )  λ2 v ( λ )  λ ( λ2 + λ 2 )  2λ 3 We will use the hint three times! But this hint is in your tables as the n th moment of an exponential with mean 1. v  E 2λ

f

3

g

2



Z

0 g 4

a  Var ( λ 2 )  E λ

f

k

3 12  20 5

3 −λ

λ e

!

dλ  2 (6)  12

− E[λ 2 ]2  24 − 4  20

(B)

53.17. A rather difficult pair of questions. Probably not too many students got the 5 points these questions were worth. We will use the recommended change of variable, although another change of variable worth considering is p  m/10,000, which makes p vary between 0 and 2. With the recommended change of variable: m  100,000p C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 53

1065

dm  100,000 dp 100,000p    (100,000)  100p    100,000,000 ! f (p )    20,000 − 100,000p    (100,000)  20 − 100p  100,000,000 

0 < p < 0.1 0.1 < p < 0.2

We want to calculate Pr (1, 0 and p < 0.1) Pr (1, 0) Pr (1, 0 and p < 0.1)  Pr (1, 0 and p < 0.1) + Pr (1, 0 and p ≥ 0.1)

Pr ( p < 0.1 | 1, 0) 

We will obtain the numerator by integrating, from 0 to 0.1, Pr (1, 0)  p (1 − p ) over the density function of p. We will obtain the second term of the denominator by integrating the same integrand from 0.1 to 0.2. Pr (1, 0 and p < 0.1) 

0.1

Z

100p ( p )(1 − p ) dp

0

 100

Z

 100



Pr (1, 0 and p ≥ 0.1)  

0

0.2

Z

0.1

0.1 Z 0.2

( p 2 − p 3 ) dp 

0.001 0.0001 37 −  3 4 1200

(20 − 100p )( p )(1 − p ) dp

(20p − 120p 2 + 100p 3 ) dp 0.1   

0.008 − 0.001 0.0016 − 0.0001 0.04 − 0.01 − 120 + 100 2 3 4  0.3 − 0.28 + 0.0375  0.0575



 20

So the final answer is

Pr ( p < 0.1 | 1, 0)  53.18. We calculate µ and v.

37/1200 37 37   37/1200 + 0.0575 37 + 69 106

µ(p )  p





(B)

µ  E[P]  0.1

The process variance is Bernoulli, so v ( p )  p (1 − p )

v  E P (1 − P ) 

f

g

37 106 + 0.0575  1200 1200

by previous exercise’s solution

To calculate a  Var ( P ) , we need E P 2 . We can calculate it directly:

f

f

E P

2

g

0.1

Z 

0

g

3

100p dp +

Z

0.2 0.1



20p 2 − 100p 3 dp



100 (0.0001) 20 100  + (0.008 − 0.001) − (0.0016 − 0.0001) 4 3 4 0.14 7  0.0025 + − 0.0375  3 600 C/4 Study Manual—17th edition Copyright ©2014 ASM

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1066

A faster way is to observe that v  E P (1 − P )  E[P] − E P 2 , so

f

g

f

E[P 2 ]  E[P] − v  0.1 −

g

106 7  1200 600

Finishing up, 0.035 0.005 − 0.01  3 3 0.04 2 2 (0.005/3)   Z 2 (0.005/3) + 1.06/12 1.1 55 a

5.5 + 0.8 2 63  PC  0.1 + 0.4  55 55 550

!

(D)

53.19. Mortality is a Bernoulli variable—either a death occurs or it doesn’t—with mean Q. Then the hypothetical mean is Q and the process variance is Q (1 − Q ) . v ( Q )  Q (1 − Q )

v  E[Q] − E[Q 2 ]  0.0075 − a  Var ( Q ) 

Z

2 1 12 (0.005 )





2 1 12 (0.005 )

+ 0.00752 



0.000025 12

0.0893 12

100a 0.0025   0.02723 100a + v 0.0893 + 0.0025

53.20. θ 2 Prior is gamma with parameters 2 and 1000. µ(θ) 

1 θ  Var (Θ)  41 (2)(10002 )  500,000 2 4

!

a  Var

θ 2 3θ 2  4 4  3 v  E[v ( θ ) ]  6 (10002 )  4,500,000 4 a 500,000 Z   0.1 a + v 500,000 + 4,500,000

v ( θ )  θ2 −

53.21. The process variance for a Bernoulli is p (1 − p ) . v ( p )  p (1 − p )

v  E p (1 − p )  E[p] − E[p 2 ]

f

1 πp πp πp 2 πp  sin dp − sin dp 2 2 2 2 0 0 2 4  − 2 ( π − 2) π π 2 (4 − π ) 2π − 4π + 8  (B)  2 π π2

Z

C/4 Study Manual—17th edition Copyright ©2014 ASM

1

g

Z

EXERCISE SOLUTIONS FOR LESSON 53

1067

53.22. We use equation (14.4) to obtain that the process variance is m[400m 2 + (20m ) 2 ]. So v ( m )  m (800m 2 )  800m 3 We recognize the prior distribution of m, f ( m ) , as a gamma with α  3 (one less than the exponent) and θ  1 (the denominator of the exponent of e). Then the expected value of v ( m ) is obtained from the tables, which state that the third moment of a gamma is θ 3 ( α )( α + 1)( α + 2) . v  E[800m 3 ]  800 E[m 3 ]  800 (5)(4)(3)  48,000

(D)

53.23. µ ( λ, µ )  λµ We use the compound variance formula, equation (14.2), to calculate process variance. v ( λ, µ )  λ (2µ2 ) + 2λ ( µ2 )  4µ2 λ v  E 4µ2 λ  4 ( 13 )( 12 ) 

f

g

2 3

For the previous and next line, recall that if X has a uniform distribution on (0, 1) , then E[X]  1/2 and E[X 2 ]  1/3. a  Var ( λµ )  E λ2 µ2

f

k

g

1 − E[λµ]2  3

1 * 1 −. 3 2

  

!

!

,

2/3 96  7/144 7

2

1 + 1 1 7 /  −  2 9 16 144

-

(E)

53.24. We calculate the process variance by calculating the conditional mean and the conditional second moment, then subtracting the square of the conditional mean from the conditional second moment. 1 b2

Z

1 E[X | b]  2 b

Z

E[X | b]  2

Var ( X | b )  b 2

b 0 b 0



b

1 2x 3 2b 3 2  b  b 2 3 0 3b 2 3

2x 2 dx 

b

1 2x 4 b2  2x dx  2 2 b 4 0 3

1 4 b2 −  2 9 18



The prior distribution of b is a single-parameter Pareto with θ  1 and α  1. We know that the positive moments of such a distribution are infinite. (E) If you didn’t recognize the prior, you could calculate the expected value of the process variance by integrating the process variance over the prior distribution of b. 1 1 E[b 2 ]  v 18 18

C/4 Study Manual—17th edition Copyright ©2014 ASM



Z 1

b2 db  ∞ b2

(E)

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1068

53.25. We use Bayes theorem. As indicated on page 1002, whenever you are asked for expected value and not told how to calculate it, use Bayesian methods. Let g be the posterior distribution. g ( b | X1  2) 

f (2 | b ) g ( b ) (4/b 2 )(1/b 2 )  R∞ f (2) (4/b 4 ) db

b>2

2

Note that b > 2, since a policyholder’s claim must be no greater than b. This point is easy to miss, and if you miss it, the lower bound of the integral for the predictive expected value will be wrong, affecting the result. ∞

−4 ∞ 1 4 db   6 3b 3 2 b4 2 24 g ( b | X1  2)  4 b > 2 b

Z

To calculate the expected value of a second claim, we integrate x f ( x ) using the posterior density as a weight. E[X2 | X1  2]  

∞Z b

Z

Z2 ∞ 2

 16

0

48 b 3 db b6 3 ∞

Z 2

24 2x 2 dx db b4 b2

1 −1 db  8 2 3 b b

! ∞  2 2

(C)

53.26. This exercise shows that fractional moments have a use! In this solution, µ is a lognormal parameter, not the expected value of the hypothetical mean. Expected hypothetical mean  E[m]  exp (0.18)  1.1972 v  E[m 1.5 ]  exp 1.5 (0.1) + 1.52 (0.42 ) /2  e 0.33  1.3910



a  Var ( m )  e 2µ+σ



2



3a  0.3491 3a + v x¯  1

2

e σ − 1  e 0.36 e 0.16 − 1  0.2487







Z

PC  1.1972 − 0.3491 (0.1972)  1.1284

(E)

53.27. The hypothetical mean is µ i , which has an exponential distribution with mean 0.2, so the variance of the hypothetical means is the square of the mean, or a  0.22  0.04. The process variance is v ( r i )  r i (0.5)(1.5)  0.75r i . We are given that µ i  E[X i | r i ]  0.5r i , so E[µ i ]  0.2  0.5 E[r i ] E[r i ]  0.4 v  E[v i ]  0.75 E[r i ]  0.3 The credibility factor is Z

C/4 Study Manual—17th edition Copyright ©2014 ASM

a 0.04 2    0.1176 a + v 0.34 17

(C)

EXERCISE SOLUTIONS FOR LESSON 53 53.28. The hypothetical mean is

100−p 100 .

1069

The variance of this is

a  Var (1 − 0.01p )  0.0001 Var ( p )  0.0001

1 1002  , 12 12

since the variance of a uniform distribution on [0, θ] is θ 2 /12. The process variance is the same as the hypothetical mean, since the process is Poisson. The expected value of the process variance is v  E (1 − 0.01p )  1 − 0.01 (50)  0.5 since the mean of a uniform distribution on [0, θ] is θ/2. na 4/12 The credibility factor is Z    0.4. The expected value of the hypothetical mean, na + v 4/12 + 1/2 µ  0.5, so the credibility estimate is PC  0.6 (0.5) + 0.4 (1.25)  0.8 53.29. The hypothetical mean is E[N | r] E[Y | r]  0.2r means is a  10,000 (22 )  40,000. The process variance is E[N | r] Var ( Y | r ) + Var ( N | r ) E[Y | r]2  (0.2r )

1000 3−1

(B)

 100r. The variance of the hypothetical

2 (10002 ) − 5002 + r (0.2)(1.2) 5002 (3 − 1)(3 − 2)

!

 0.2 (750,000) r + 0.24 (250,000r )  150,000r + 60,000r  210,000r The expected value of the process variance is 210,000 (2)  420,000. There are 100 exposures, so n  100. The credibility factor is Z

na 4,000,000   0.9050 na + v 4,000,000 + 420,000

(E)

53.30. The hypothetical mean is λ, whose mean is µ  1 and whose variance is a  of a uniform distribution on [a, b] is ( b − a ) 2 /12. The process variance is σ2 whose mean is v  1.25. Then a 1/12 1   a + v 1/12 + 1.25 16 15 1 15 (1) + (0)   0.9375 PC  16 16 16 Z

(E)

53.31. The prior distribution is a single-parameter Pareto with α  5, θ  1, so 5 4 5 E[Θ2 ]  3 5 E[Θ3 ]  2 E[Θ] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 12

since the variance

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1070

E[Θ4 ]  5 5 Var (Θ )  5 − 3 2

!2 

20 9

2000 The hypothetical mean is 10θ 2 , so a  100 20 9  9 . The process variance for this compound Poisson distribution is θ (2)(10θ ) 2  200θ 3 , so v  200 52  500. So



k

500  2.25 2000/9

(C)

53.32. Let S be the pure premium. To evaluate the process variance, Var ( X | m ) , we use the compound variance formula Var ( S | m )  E[N | m] Var ( X | m ) + Var ( N | m ) E[X | m]2 1000 Var ( S | m )  m m

!2

1000 +m m

!2



2,000,000 . m

We calculate the expected value of the process variance by integrating. v  E[Var ( S | m ) ] 2,000,000 E m

"

3

Z 

1

#

1 + m 2,000,000 dm 6 m

3 2,000,000 1  + 1 dm 6 m 1 2,000,000  (ln 3 + 2)  1,032,871 6

Z





(D)

2

2

2

53.33. The hypothetical mean is e µ+σ /2  e µ+2 and the process variance is e 2µ+2σ − ( e µ+0.5σ ) 2  e 2µ ( e 8 − e 4 ) . The prior distribution of µ is exponential with mean 0.2. The variance of the hypothetical means is a  E[e 2µ+4 ] − E[e µ+2 ]2  e 4 (E[e 2µ ] − E[e µ ]2 )

and the expected value of the process variance is

v  E[e 2µ ( e 8 − e 4 ) ]  ( e 8 − e 4 ) E[e 2µ ]

Now, E[e aµ ] is the moment generating function of µ. According to the tables, the moment generating function of an exponential is M ( t )  (1 − θt ) −1

so with θ  0.2, M ( t )  (1 − 0.2t ) −1 . Then E[e µ ]  M (1)  1/0.8  1.25 and E[e 2µ ]  M (2)  1/0.6  5/3, and a  e 4 (5/3 − 1.252 )  (5/48) e 4

v  (5/3)( e 8 − e 4 ) k

C/4 Study Manual—17th edition Copyright ©2014 ASM

(5/3)( e 8 − e 4 )  857.57 (5/48) e 4

EXERCISE SOLUTIONS FOR LESSON 53

1071

2

53.34. The hypothetical mean is λe µ+σ /2 and for this compound Poisson distribution, the process vari2 ance is λe 2µ+2σ . Since λ, µ, and σ are independent because the joint density is the product of the individual densities: f ( λ, µ, σ )  2σ  (1)(1)(2σ )  f ( λ ) f ( µ ) f ( σ ) So we can calculate expectations separately and then multiply them together. Even if you didn’t notice this, you’d end up doing this anyway when calculating the triple integral. E[HM]  E[λ] E[e µ ] E[e σ 1

2 /2

]

1



1 2



  1 ( e − 1)(2) e 1/2 − 1  1.114686 2

!Z 0

e µ dµ

Z 0

2σe σ

2 /2



!

E HM 2  E λ 2 E e 2µ E e σ

f

g

f

1  3

!

g

f

g

f

2

g

e2 − 1 ( e − 1)  1.829700 2

!

a  1.829700 − 1.1146862  0.587175

E[PV]  E[λ] E e 2µ E e 2σ

f

g

f

2

g

1 e2 − 1 e2 − 1  5.102505 2 2 2 5.102505 k  8.6899 (E) 0.587175

!

!

!



53.35. The prior distribution is a Pareto with parameters θ  1 and α, with mean E[β]  1/ ( α −1) , second .  2 moment E[β ]  2 ( α − 1)( α − 2) , and variance Var ( β ) 

2

( α − 1)( α − 2)



1

( α − 1) 2



α

( α − 1) 2 ( α − 2)

The hypothetical mean is β. The overall mean is µ  1/ ( α − 1) . The variance of the hypothetical mean is a  Var ( β ) 

α

( α − 1) 2 ( α − 2)

The process variance is β (1 + β ) . The expected value of the process variance is v  E[β] + E[β 2 ]  Bühlmann’s k is

v a

1 2 α +  α − 1 ( α − 1)( α − 2) ( α − 1)( α − 2)

 α − 1. Credibility for 1 observation is Z 

number of claims in Year 2 is

α−1 PC  α

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

1 1  . The estimate of the 1 + ( α − 1) α

1 1 x+1 + (x )  α−1 α α

!

!

(D)

53. BÜHLMANN CREDIBILITY: CONTINUOUS PRIOR

1072

Quiz Solutions

53-1. The Pareto has parameters α = 3, θ = 1.

$$\mu = \operatorname{E}[\beta] = \frac{1}{2}$$

$$a = \operatorname{Var}(\beta) = \frac{2}{2 \cdot 1} - \left(\frac{1}{2}\right)^2 = \frac{3}{4}$$

$$v = \operatorname{E}[\beta(1 + \beta)] = \frac{1}{2} + \frac{2}{2 \cdot 1} = \frac{3}{2}$$

$$Z = \frac{4a}{4a + v} = \frac{3}{3 + 3/2} = \frac{2}{3}$$

$$P_C = \frac{2}{3}(0) + \frac{1}{3}\left(\frac{1}{2}\right) = \frac{1}{6}$$

53-2. The hypothetical mean and process variance are both Λ². The expected value of the process variance, E[Λ²], may be computed as the variance plus the mean squared of Λ, or v = 0.6²/12 + 0.4² = 0.19. The variance of the hypothetical means, Var(Λ²), is computed as the fourth moment minus the square of the second moment. We've just computed the second moment as 0.19. The fourth moment will be calculated by integration. Notice that the density function of a uniform is the reciprocal of the range, or 1/0.6 here.

$$\operatorname{E}[\Lambda^4] = \int_{0.1}^{0.7} \frac{\lambda^4\,d\lambda}{0.6} = \frac{0.7^5 - 0.1^5}{(5)(0.6)} = 0.05602$$

Then a = 0.05602 − 0.19² = 0.01992. The Bühlmann k is 0.19/0.01992 = 9.53815.


Lesson 54

Bühlmann-Straub Credibility

Reading: Loss Models Fourth Edition 18.6 or SN C-24-05 1.2 or Introduction to Credibility Theory 6.7, 7.2

This lesson deals with two generalizations of Bühlmann credibility. The first generalization is to a case in which exposure varies. The second is to a case in which the variance of the observations has a constant component as well as a component depending on the exposure.

54.1

Bühlmann-Straub model: Varying exposure

The Bühlmann model assumes one exposure in every period. We would like to generalize to a case where there are m_j exposures in period j. If the process variance is v, then the variance of the mean of the observations in period j is v/m_j, because the variance of the sample mean is the distribution variance divided by the number of observations. For example, suppose you observed 20 claims from a group of 5 in a period. This would translate into a sample mean of 4. The variance of this sample mean is 1/5 of the process variance.
The generalization of the Bühlmann formula weights each observation with the reciprocal of its variance. For example, suppose you observed an individual for a period of 6 months and had c₁ claims. Then you observed the same individual for 8 months and had c₂ claims. Then you observed the individual for 3 months and had c₃ claims. Assume the expected process variance for a full year is v. Then the variance of the first observation is v/(6/12), the variance of the second observation is v/(8/12), and the variance of the third observation is v/(3/12). You would therefore give the first observation a weight of 6/12, the second observation a weight of 8/12, and the third observation a weight of 3/12. The observed mean would be

$$\frac{(6/12)c_1 + (8/12)c_2 + (3/12)c_3}{(6/12) + (8/12) + (3/12)}$$

The credibility factor would be based on 6/12 + 8/12 + 3/12 = 17/12 observations, and would be

$$\frac{17/12}{17/12 + k}$$

Notice that the above calculation is equivalent to using one month as the exposure unit, and then counting the first observation as 6 exposure units, the second as 8 exposure units, and the third as 3 exposure units. In fact, you may use any time amount as an exposure unit. After picking your exposure unit, you can proceed exactly as in the simple Bühlmann model. The next example shows you how easy this is.

Example 54A Two urns A and B have balls labeled 0 and 1. The number labeled 0 and 1 is indicated in the following table:

Urn    Number Labeled 0    Number Labeled 1
A      80                  20
B      60                  20

An urn is selected at random. Ten balls are drawn with replacement; the sum of the balls is 3. Then five balls are drawn from the same urn with replacement; they are all 0. After that, six more balls are drawn from the same urn with replacement.

Determine the Bühlmann-Straub credibility estimate of the sum of the values on these 6 balls.

Answer: You may be wondering what the big deal is. As long as fifteen balls were drawn and three 1's showed up, does it matter that the three 1's all appeared within the first ten draws? It doesn't. Indeed, this type of problem can be solved by anybody who has read the previous lesson. There is nothing new about "Bühlmann-Straub" other than the second name. Let the exposure unit be 1 ball. Then

$$\mu_A = 0.2 \qquad \mu_B = 0.25 \qquad \mu = \tfrac{1}{2}(0.2 + 0.25) = 0.225$$

$$v_A = (0.8)(0.2) = 0.16 \qquad v_B = (0.75)(0.25) = 0.1875 \qquad v = \tfrac{1}{2}(0.16 + 0.1875) = 0.17375$$

Using the usual Bernoulli shortcut, a = (0.25 − 0.2)²(0.25) = 0.000625. Since there were 15 exposure units,

$$Z = \frac{15(0.000625)}{15(0.000625) + 0.17375} = 0.051195$$

The observed mean is calculated as the average of the numbers on the 15 balls: x̄ = 3/15 = 0.2. The prediction for the next 6 balls is

$$6\bigl(0.225 + Z(0.2 - 0.225)\bigr) = 1.3423$$

We could let the exposure unit be any number of balls as long as we adjust µ, v, and a accordingly. For example, let's redo the example using an exposure unit of 5 balls. Then the average sum of 5 balls for urn 1 is 1, and the average sum of 5 balls for urn 2 is 1.25. The variance of the sum of 5 (independent) balls is 5 times the variance of 1 ball, or 0.8 for urn 1 and 0.9375 for urn 2. Therefore

$$\mu = \tfrac{1}{2}(1 + 1.25) = 1.125 \qquad v = \tfrac{1}{2}(0.8 + 0.9375) = 0.86875$$

Using the usual Bernoulli shortcut, a = (1.25 − 1)²(0.25) = 0.015625. There are 3 exposure units of 5 balls apiece, so

$$Z = \frac{3(0.015625)}{3(0.015625) + 0.86875} = 0.051195$$

The observed mean is x̄ = 3/3 = 1. The prediction for the next 6 balls is (6/5)(1.125 + Z(1 − 1.125)) = 1.3423.

Quiz 54-1 Individual losses on an insurance follow a Pareto distribution with parameters α = 5 and Θ. The parameter Θ varies by insured uniformly on [5000, 7000]. For a group policyholder, you observe the following experience:

Year     Number of individuals    Number of losses    Total losses
Year 1   200                      10                   8,000
Year 2   240                      15                  24,000

Calculate the Bühlmann-Straub credibility estimate of the size of one loss from this group.

As you see, Bühlmann-Straub is simple when you're given a fully specified model. It gets complicated only when you have to estimate the parameters yourself. The formulas for estimating sample variances (process variance, variance of hypothetical means) are simpler when you have individual information than when you are only given summarized information such as "10 losses summing to 8000". We will discuss these formulas in Lesson 57.
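For the quiz above, here is a minimal sketch of one possible working (mine, not the manual's official solution; it assumes the exposure unit is one loss, so the 25 observed losses carry all the weight):

```python
# Buhlmann-Straub sketch for Quiz 54-1: Pareto(alpha=5, Theta) severity,
# Theta ~ uniform[5000, 7000].
e_theta, var_theta = 6000.0, 2000.0 ** 2 / 12
e_theta2 = var_theta + e_theta ** 2

mu = e_theta / 4                         # E[X | Theta] = Theta / (alpha - 1)
a = var_theta / 16                       # Var(Theta / 4)
v = (5 / 48) * e_theta2                  # E[Var(X | Theta)] = E[5 Theta^2 / 48]

n, x_bar = 25, (8_000 + 24_000) / 25     # 25 losses averaging 1280
z = n * a / (n * a + v)
print(mu + z * (x_bar - mu))             # ~1473
```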

54.2

Hewitt model: Generalized variance of observations

This section discusses an extension of the Bühlmann-Straub model developed by Hewitt. It is included here because Loss Models covers it. However, this extension is not covered by Herzog or Dean. Hence, it should not appear on the exam. (Only one of the seven released exams in 2000–2004, when it was a required topic, had a question on it.) You're safe skipping it. Or if you want to be sure you really understand Bühlmann-Straub, read up to Example 54B and skip the rest.
Even though the above answer to Example 54A is good enough, let's consider an alternative way of looking at the problem. This alternative way will help us understand the next generalization.
In the formula Z = na/(na + v), the numerator and denominator can be divided by v to obtain

$$Z = \frac{a\left(\dfrac{n}{v}\right)}{1 + a\left(\dfrac{n}{v}\right)} \tag{54.1}$$

v v + 52 Var ( X¯ )  102 10 5

!

!! ,

152 

v 15

We see that we end up with v divided by the number of exposure units, so that the reciprocal is n/v, the parenthesized expression in equation (54.1). You can generalize that equation by replacing n/v with the sum of the reciprocals of the variances of the observations, whatever it is. The sample mean is the average of the observations weighted by the reciprocals of their variances. Example 54B Annual claim counts are Bernoulli with probability Q. The parameter Q varies by group and is 0.1 with probability 0.75, 0.2 with probability 0.25. The process variance is v. You have the following experience: Group

Mean annual claim count

Variance of mean annual claim count

#1 #2 #3

0.10 0.12 0.09

0.2v 0.25v 0.3v

Calculate the Bühlmann-Straub estimate of claim count for this group. Answer: Experience is weighted using the reciprocals of the variances, so x¯  C/4 Study Manual—17th edition Copyright ©2014 ASM

0.10/0.2 + 0.12/0.25 + 0.09/0.3  0.103784 1/0.2 + 1/0.25 + 1/0.3

54. BÜHLMANN-STRAUB CREDIBILITY

1076

The expression n/v in formula (54.1) for Z is replaced with the sum of the reciprocals of the variances, or 1 1 1 12.333 + +  . We calculate the Bühlmann parameters now. 0.2v 0.25v 0.3v v µ  E[Q]  0.125 a  Var ( Q )  (0.12 )(0.25)(0.75)  0.001875 v  E[Q (1 − Q ) ]  E[Q] − E[Q 2 ]  0.125 − (0.001875 + 0.1252 )  0.1075 0.001875 (12.333/0.1075) Z  0.177033 1 + 0.001875 (12.333/0.1075) PC  0.125 + 0.177033 (0.103784 − 0.125)  0.121244



Suppose m j is the exposure for class j, and we observe an average loss per exposure (not per class!) of X j . In the above examples, where v was the expected process variance for each exposure, the variance of the mean losses per exposure in class j was v/m j . In the Hewitt model, however, the expected process variance is larger: w + v/m j , where w is some positive constant or a positive function of the class. This will lower credibility. µ (the expected hypothetical mean, or the overall mean) and a are calculated the same way as before. However, instead of using v, we use the generalized formula based on equation (54.1): Z

am ∗ 1 + am ∗

where m ∗ is the sum of the reciprocals of the variances for all the exposures. m ∗ replaces variance in each class is v/m j , m ∗ is: m∗ 

X



X

j

n v.

Since the

1 w + v/m j mj wm j + v

j

(54.2)

(54.3)

¯ is calculated as a weighted mean, weighted by the reciprocals of the The mean of the observations, x, variances. In other words

P x¯  P 

Xj j w+v/m j 1 j w+v/m j

1 X mjXj m∗ wm j + v

(54.4)

j

Notice how equation (54.1) is a special case of equation (54.2) where m j  1 for all j and w  0. If w  0 and m j  m for all j, where m is some constant, then equation (54.2) reduces to equation (54.1) with n replaced with nm. Example 54C An insurance portfolio has two types of risk, A and B, each comprising half the portfolio. The mean claim count per year for a risk of type A is 0.1. The mean claim count for a risk of type B is 0.3. For either type, the variance of the claim count is 0.1 + 1/m j , where m j is the number of exposures. A group in the portfolio has 40 members with 5 claims in the first year, and 10 members with 0 claims in the second year. 1. Determine the Bühlmann-Straub credibility estimate for the number of claims per member in the third year. C/4 Study Manual—17th edition Copyright ©2014 ASM

54.2. HEWITT MODEL: GENERALIZED VARIANCE OF OBSERVATIONS

1077

2. Determine the limit of the Bühlmann-Straub credibility factor as m j goes to infinity. Answer: 1. µ  21 (0.1 + 0.3)  0.2 and using the Bernoulli shortcut, a  41 (0.22 )  0.01. Since there are 40 exposures in the first year, the variance in the first year is v1  0.1 +

1  0.125. 40

Since there are 10 exposures in the variance in the second year is v2  0.1 +

1  0.2. 10

The weights to be used for calculating the mean of the observations are the reciprocals of these variances, or v11  8 which is applied to the first year observation and v12  5 which is applied to the second year observation. The sum of these weights is m ∗  8 + 5  13. The observation of the first year losses per exposure is 5 claims/40 members = 81 . The observation of the second year losses per member is 0 claims/10 members = 0. Therefore, the mean of the observations is x¯ 

8

1 8

+ 5 (0) 13



1 13

am ∗  (0.01)(13)  0.13. Using equation (54.2) we calculate the credibility factor: Z The credibility premium is PC 

0.13 13 am ∗   . ∗ 1 + am 1 + 0.13 113

13 1  113 13

+

100 113 (0.2)



21 113

 0.1858 .

2. As m j goes to infinity, v1 and v2 go to 0.1. Then m ∗ , which is Z→

1 v1

+

1 v2

goes to 20. So

(0.01) 20 0.2 1   1 + (0.01) 20 1.2 6



Unlike the situation without the extra w of variance, credibility does not go to 1 as m j goes to infinity. In fact, the v component of the variance will go to zero (since it is divided by m j ) and you will be left with the w component, obtaining an equation that looks like equation (54.1) with v replaced with w: Z

an/w 1 + an/w

Credibility will still go to 1 as the number of years goes to infinity. If you remember that exposures are weighted by the reciprocal of the variance in calculating the mean of the observations, and you remember formula (54.2), you should be able to do this type of problem. The end of the textbook section has another “generalization”, but all this generalization does is allow a to vary from case to case, depending on the total number of exposures. In each case, the credibility factor would be calculated using the value of a for the case, in exactly the same way as we have been calculating it until now.

Coverage of this material in the three syllabus options The Bühlmann-Straub model of Section 54.1 is covered by all three syllabus reading options and is required. The Hewitt model of Section 54.2 is covered only by Loss Models, and would be hard to derive by yourself. I therefore do not expect any exam questions on it. C/4 Study Manual—17th edition Copyright ©2014 ASM

54. BÜHLMANN-STRAUB CREDIBILITY

1078

Exercises 54.1. For a portfolio of insurance risks, aggregate losses per year per exposure follow a normal distribution with mean θ and variance 1,000,000. θ varies by class, as indicated in the following table: Class

Mean Aggregate Losses Per Year Per Exposure

Percent of Business in Class

A B C

1000 1500 2000

70% 20% 10%

A randomly selected risk has the following experience over 3 years: Year

Number of Exposures

Aggregate Losses

1 2 3

12 15 13

12,000 18,000 14,000

Determine the Bühlmann-Straub estimate of mean aggregate losses per year per exposure in the next year for this risk. 54.2. Loss size for an insurance coverage follows a two parameter Pareto distribution with parameters α  3 and θ. θ varies by insured according to an exponential distribution. An insured has 4 losses in the first year with average size 1000 and 8 losses in the second year with average size 1300. The resulting Bühlmann-Straub estimate of average loss size for this insured is 1100. Determine the mean of the exponential distribution. 54.3. For a portfolio of insurance risks, claim frequency per month has a Poisson distribution with mean λ/12. λ varies by insured, and its distribution has density function π ( λ )  81 (4 − λ ) ,

0 ≤ λ ≤ 4.

A given risk has the following experience: Exposure Period

Number of Claims

6 months 12 months 3 months

0 2 0

Determine the Bühlmann-Straub estimate of the number of claims for this risk in the next 9 months. 54.4.

Two urns contain balls marked 0 or 1 as follows:

Urn I Urn II 1.

Number of Balls Marked 0

Number of Balls Marked 1

12 14

8 6

An urn is selected at random. Three balls are drawn from it with replacement. The sum of the balls is

Determine the Bühlmann-Straub credibility estimate of the sum of the next 2 balls drawn from the same urn. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 54

1079

54.5. For a portfolio of insurance risks, claim frequency per year per exposure has a negative binomial distribution with parameters r and β  0.2. r varies by class, and has an exponential distribution with probability density π ( r )  2e −2r r ≥ 0. For a randomly selected risk of unknown class, experience for 3 years is as follows: Year

Number of Exposures

Total Number of Claims

1 2 3

100 110 140

50 50 40

Determine the Bühlmann-Straub estimate of average number of claims per exposure in the next year for this risk. 54.6.

[1999 C4 Sample:28] Four urns contain balls marked either 1 or 3 in the following proportions: Urn 1 2 3 4

Marked 1 p1 p2 p3 p4

Marked 3 1 − p1 1 − p2 1 − p3 1 − p4

An urn is selected at random (with each urn being equally likely) and balls are drawn from it in three separate rounds. In the first round, two balls are drawn with replacement. In the second round, one ball is drawn with replacement. In the third round, two balls are drawn with replacement. After two rounds, the Bühlmann-Straub credibility estimate of the total of the values on the two balls to be drawn in the third round could range from 3.8 to 5.0 (depending on the results of the first two rounds). Determine the value of Bühlmann-Straub’s k. 54.7. [4-F00:38] An insurance company writes a book of business that contains several classes of policyholders. You are given: (i) The average claim frequency for a policyholder over the entire book is 0.425. (ii) The variance of the hypothetical means is 0.370. (iii) The expected value of the process variance is 1.793. One class of policyholders is selected at random from the book. Nine policyholders are selected at random from this class and are observed to have produced a total of seven claims. Five additional policyholders are selected at random from the same class. Determine the Bühlmann credibility estimate for the total number of claims for these five policyholders. (A) 2.5

(B) 2.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 3.0

(D) 3.3

(E) 3.9

Exercises continue on the next page . . .

54. BÜHLMANN-STRAUB CREDIBILITY

1080

54.8.

[4-F01:26] You are given the following data on large business policyholders:

(i)

Losses for each employee of a given policyholder are independent and have a common mean and variance. (ii) The overall average loss per employee for all policyholders is 20. (iii) The variance of the hypothetical means is 40. (iv) The expected value of the process variance is 8000. (v) The following experience is observed for a randomly selected policyholder: Year 1 2 3

Average Loss per Employee 15 10 5

Number of Employees 800 600 400

Determine the Bühlmann-Straub credibility premium per employee for this policyholder. (A) (B) (C) (D) (E)

Less than 10.5 At least 10.5, but less than 11.5 At least 11.5, but less than 12.5 At least 12.5, but less than 13.5 At least 13.5

54.9. [4-F02:32] You are given four classes of insureds, each of whom may have zero or one claim, with the following probabilities: Class

Number of Claims 0 1 0.9 0.1 0.8 0.2 0.5 0.5 0.1 0.9

I II III IV

A class is selected at random (with probability 14 ), and four insureds are selected at random from the class. The total number of claims is two. If five insureds are selected at random from the same class, estimate the total number of claims using Bühlmann-Straub credibility. (A) 2.0

(B) 2.2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2.4

(D) 2.6

(E) 2.8

Exercises continue on the next page . . .

EXERCISES FOR LESSON 54

1081

54.10. [4-F03:27] You are given: (i)

The number of claims incurred in a month by any insured has a Poisson distribution with mean λ. (ii) The claim frequencies of different insureds are independent. (iii) The prior distribution is gamma with probability density function: f (λ)  (iv)

(100λ ) 6 e −100λ 120λ

Month

Number of Insureds

Number of Claims

1 2 3 4

100 150 200 300

6 8 11 ?

Determine the Bühlmann-Straub credibility estimate of the number of claims in Month 4. (A) 16.7

(B) 16.9

(C) 17.3

(D) 17.6

(E) 18.0

54.11. For an insurance coverage, the number of claims per month has a Bernoulli distribution with mean θ. θ varies by policyholder, and has mean 0.05 and variance 0.004. A certain policyholder begins coverage on June 1, 2007. The policyholder experiences no claims in 2007 and 2008, and 1 claim in 2009. Only experience up to September 30, 2009 has been recorded. Determine the Bühlmann-Straub credibility estimate of the number of claims in 2010 for this policyholder. 54.12. [4-F04:9] Members of three classes of insureds can have 0, 1 or 2 claims, with the following probabilities: Class I II III

Number of Claims 1 0.0 0.1 0.2

0 0.9 0.8 0.7

2 0.1 0.1 0.1

A class is chosen at random, and varying numbers of insureds from that class are observed over 2 years, as shown below: Year 1 2

Number of Insureds 20 30

Number of Claims 7 10

Determine the Bühlmann-Straub credibility estimate of the number of claims in Year 3 for 35 insureds from the same class. (A) 10.6

(B) 10.9

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 11.1

(D) 11.4

(E) 11.6

Exercises continue on the next page . . .

54. BÜHLMANN-STRAUB CREDIBILITY

1082

The following questions, to the end of the lesson, are based on material not covered by Herzog or Dean. Thus they are on a topic unlikely to appear on an exam. 54.13. For a portfolio of insurance risks, average aggregate losses per exposure have mean ξ and variance 100,000 + 2,000,000 , where m j is the number of exposures in year j. ξ varies by risk, and has mean 2,000 and mj variance 2,000,000. The following is the experience for this risk over three years: Year

Number of Exposures

Average Losses Per Exposure

1 2 3

20 10 10

2000 1800 1900

Determine the Bühlmann-Straub estimate of average aggregate losses per exposure in the next year for this risk. Use the following information for questions 54.14 and 54.15: For a portfolio of insurance risks, aggregate losses have mean ξ and variance 100,000 + 600,000 m j , where m j is the number of exposures in year j. ξ has a two parameter Pareto distribution with parameters α  4, θ  1200. The number of exposures in each year over a five year period are 50, 60, 40, 50, 80. 54.14. Determine the Bühlmann-Straub credibility factor for this number of exposures. 54.15. Determine the limit of the Bühlmann-Straub credibility factor as the number of exposures goes to infinity. 54.16. An insurance portfolio has 2 classes of insureds. Let m j be the number of exposures in a class for a specific year. Aggregate losses per exposure for these classes, as a function m j , is as follows: Aggregate Losses Per Exposure Class

Mean

Variance of Mean

A

2100

15,000 +

B

3000

30,000 +

10,000 mj 40,000 mj

Class A has twice as many policyholders as Class B. For a certain insured selected at random, 2 years of experience are as follows: Year

Number of Exposures

Average Losses Per Exposure

1 2

5 10

2000 2500

Determine the Bühlmann-Straub credibility premium per exposure for this insured for year 3.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 54

1083

54.17. [4-S01:23] You are given the following information about a single risk: (i) The risk has m exposures in each year. (ii) The risk is observed for n years. (iii) The variance of the hypothetical means is a. (iv) The expected value of the annual process variance is w +

v m.

Determine the limit of the Bühlmann-Straub credibility factor as m approaches infinity. n (A) 2 n + n aw n (B) n + wa n (C) n + va n (D) n + w+v a (E) 1 Additional released exam questions: C-F06:19, C-S07:32,36

Solutions 54.1. Some students mistakenly think that the exposure unit in this problem is a year, and therefore n  3. If you were trying to predict aggregate losses per year, this would be true. But since you are trying to predict aggregate unit per exposure-year, the appropriate unit is an exposure-year: 12 + 15 + 13  40. The observed average losses per exposure year is: x¯ 

12,000 + 18,000 + 14,000  1100 12 + 15 + 13

The hypothetical mean is θ. The overall mean is µ  E[θ]  0.7 (1000) + 0.2 (1500) + 0.1 (2000)  1200 The process variance is a constant:

v  1,000,000

The variance of the hypothetical means is a  Var ( θ )  0.7 (1000 − 1200) 2 + 0.2 (1500 − 1200) 2 + 0.1 (2000 − 1200) 2  110,000











Finishing up, Z

40a 40 (110,000)   0.8148 40a + v 40 (110,000) + 1,000,000

PC  1200 + 0.8148 (1100 − 1200)  1118.52

C/4 Study Manual—17th edition Copyright ©2014 ASM



54. BÜHLMANN-STRAUB CREDIBILITY

1084

54.2.

The mean of the observations is x¯ 

4 (1000) + 8 (1300)  1200 12

Let λ  the mean of the exponential. θ 2 λ µ 2 a  14 λ 2

µ(θ) 

2θ 2 θ v (θ)  − 2·1 2

!2

 34 θ 2

v  E 34 θ2  43 (2λ 2 )  32 λ 2 v k 6 a 12 2 Z  12 + 6 3 ! 2 1 λ 1100  PC  (1200) + 3 3 2 λ  900 2 λ  1800

f

g

54.3. Let one year be the exposure unit. λ is both the hypothetical mean and the process variance for the Poisson distribution describing the process. x¯ 

2 (12) 2 (12) 8   6 + 12 + 3 21 7

1 µ  v  E[λ]  8

4

Z 0

4

λ2 λ 3 4  λ (4 − λ ) dλ  − 4 24 0 3

Instead of integrating, you can use the appendix’s formula for the beta distribution with θ  4, a  1, b  2. We will use the formula for the second moment. E[λ 2 ] 

42 Γ (3) Γ (3) 42 (2)(2) 8   Γ (1) Γ (5) 24 3

8 4 a − 3 3 Z

!2

7 8 4 9 7 8 4 4 9 + 3



8 9



56 7  104 13

3 *4 7 8 4 + 3 7 /1+ PC  . + − 4 3 13 7 3 4 13



,



!

−4 12  21 13

!

-

54.4. Let the exposure unit be one ball. x¯  13 , µ1  0.4, µ2  0.3, µ  0.35. a  41 (0.12 )  0.0025.   0.225 3 1 1 1 v1  0.24, v2  0.21, v  0.225. k  0.0025  90. Z  3+90  31 . PC  2 0.35 + 31 ( 3 − 0.35)  0.6989 .



C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 54

1085

54.5. We are trying to compute the number of claims per exposure in a year, so the exposure unit is exposure-years, not years, and we have n  100 + 110 + 140  350 of those, not 3. The average number of claims per exposure-year is 140 50 + 50 + 40   0.4 x¯  100 + 110 + 140 350 The hypothetical mean is µ ( r )  0.2r and since r is exponential with mean 0.5, the overall mean is µ  E[µ ( r ) ]  E[0.2r]  0.2 E[r]  0.1 The process variance of the negative binomial is v ( r )  rβ (1 + β )  ( r )(0.2)(1.2)  0.24r The expected value of the process variance is v  E[0.24r]  (0.24)(0.5)  0.12 The variance of the exponential is 0.52  0.25, so the variance of the hypothetical means is a  Var (0.2r )  (0.2) 2 (0.25)  0.04 (0.25)  0.01 Finishing up, 350 (0.01) 3.5 350a   350a + v 350 (0.01) + 0.12 3.62 3.5 PC  0.1 + (0.4 − 0.1)  0.3901 3.62 Z

54.6. The observed mean x¯ must be between 1 and 3. The Bühlmann-Straub estimate per ball is between 1.9 and 2.5. So an observed mean of 1 must lead to an estimate of 1.9, and an observed mean of 3 must lead to an estimate of 2.5. If µ is the overall mean, then Z + (1 − Z ) µ  1.9

3Z + (1 − Z ) µ  2.5

Subtracting the first equation from the second, we obtain Z  0.3. Then, since there are 3 observations (2 in the first round, 1 in the second), 3  0.3 3+k k 7 The above solution assumes that each ball drawn is an observation. If you consider 2 balls as a single 1.5  0.3, k  3.5. observation, then there were 1.5 observations and 1.5+k 54.7.

Since 9 policyholders produced 7 claims, x¯ 

7 9

9a 9 (0.370)   0.65001 9a + v 9 (0.370) + 1.793 PC  µ + Z ( x¯ − µ )  0.425 + 0.65001 ( 97 − 0.425)  0.65431 Z

That is the Bühlmann-Straub credibility premium for 1 policyholder. For 5 policyholders, 5 (0.65431)  3.2715 C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

54. BÜHLMANN-STRAUB CREDIBILITY

1086

54.8.

The total number of exposures is 800 + 600 + 400  1800 and the average loss per exposure is x¯ 

15 (800) + 10 (600) + 5 (400)  11 19 1800

k  8000/40  200. The credibility is Z  1800/ (1800 + k )  0.9. The credibility premium is 20 + 0.9 (11 91 − 20)  12 . (C) 54.9.

We calculate the hypothetical means and their squares to get the variance of the hypothetical means. 0.1 + 0.2 + 0.5 + 0.9  0.425 4 0.01 + 0.04 + 0.25 + 0.81 E[µ (Θ) 2 ]   0.2775  4 a  Var µ (Θ)  0.2775 − 0.4252  0.096875 E[µ (Θ) ] 

Each class has a Bernoulli distribution, so the expected process variance is v

(0.9)(0.1) + (0.8)(0.2) + (0.5)(0.5) + (0.1)(0.9) 4

The credibility is Z The answer is

 0.1475

4a 4 (0.096875)   0.7243 4a + v 4 (0.096875) + 0.1475

5 0.425 + 0.7243 (0.5 − 0.425)  5 (0.4793)  2.3966





(C)

54.10. For the gamma distribution, α  6 (as we can see from the exponent on λ) and θ  0.01 (the reciprocal of the exponent on e). µ  v  αθ  0.06 a  αθ2  0.0006 v k   100 a 450 9 Z  450 + 100 11 6 + 8 + 11 x¯   0.05556 450   9 300PC  300 0.06 + (0.05556 − 0.06)  300 (0.056367)  16.9091 11 54.11. We will use one month as our exposure unit. Then µ  E[θ]  0.05 a  Var ( θ )  0.004 v  E[θ (1 − θ ) ]  E[θ] − E[θ 2 ]  0.05 − (0.052 + 0.004)  0.0435 v k   10.875 a There are 28 months of exposure and 1 claim per 28 months, so 28  0.720257 28 + 10.875 PC  0.720257 (1/28) + (1 − 0.720257)(0.05)  0.039711 Z

Annualizing the estimate, we get 12 (0.039711)  0.47653 . C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

EXERCISE SOLUTIONS FOR LESSON 54

1087

54.12. First we calculate the hypothetical means. µ (I)  0.1 (2)  0.2 µ (II)  0.1 (1) + 0.1 (2)  0.3 µ (III)  0.2 (1) + 0.1 (2)  0.4 0.2 + 0.3 + 0.4 µ  0.3 3 0.22 + 0.32 + 0.42 E[µ (Θ) 2 ]   0.096667 3   a  Var µ (Θ)  0.096667 − 0.32  0.006667 Now we calculate the process second moments and variances. E N 2 | I  0.1 (4)  0.4

f

v (I)  0.4 − 0.22  0.36

g

E N 2 | II  0.1 (1) + 0.1 (4)  0.5

f

v (II)  0.5 − 0.32  0.41

g

E N 2 | III  0.2 (1) + 0.1 (4)  0.6

f

v (III)  0.6 − 0.42  0.44

g

v

0.36 + 0.41 + 0.44  0.4033 3

There are 50 insureds, or observations, so Z x¯ 

17 50

50 (0.006667)  0.4525 50 (0.006667) + 0.4033

 0.34, so the premium for 35 insureds PC  35 0.3 + 0.4525 (0.34 − 0.3)  11.1335





(C)

54.13. The variance in year 1 is 100,000+2,000,000/20  200,000. The variance in years 2 and 3 is 100,000+ 2,000,000/10  300,000. 1 2 7 +  200,000 300,000 600,000   1 2,000 3,700 6000 + 7400 13,400 +   x¯  ∗ m 200,000 300,000 7 7 µ  2000, a  2,000,000 am ∗ 140/6 70 Z   1 + am ∗ 146/6 73 ! 70 600 PC  2000 −  1917.81 73 7 m∗ 

54.14. We use equation (54.2). We must sum the reciprocals of the variances to get m ∗ . They are:

C/4 Study Manual—17th edition Copyright ©2014 ASM

54. BÜHLMANN-STRAUB CREDIBILITY

1088

Year

Exposures

Variance

1

50

100,000 +

600,000  112,000 50

2

60

100,000 +

600,000  110,000 60

3

40

100,000 +

600,000  115,000 40

4

50

100,000 +

600,000  112,000 50

5

80

100,000 +

600,000  107,500 80

Therefore m∗ 

1 1 1 1 1 + + + +  0.000044946 112,000 110,000 115,000 112,000 107,500

The hypothetical mean ξ has a Pareto distribution with mean µ and variance a

1200  400 3

2 (12002 ) − 4002  480,000 − 160,000  320,000 (3)(2)

So by equation (54.2), credibility is Z

320,000 (0.000044946) am ∗   0.9350 ∗ 1 + am 1 + 320,000 (0.000044946)

54.15. ∗

m 

5 X j1

Z

5 1 → 100,000 + 600,000/m j 100,000

am ∗ 5a/100,000 5 (320,000) /100,000 16 →   1 + am ∗ 1 + 5a/100,000 1 + 5 (320,000) /100,000 17

54.16. In this question, the variance varies by class. To calculate v j , we use the average of the process variances, which is 2 10,000 1 40,000 20,000 vj  15,000 + + 30,000 +  20,000 + . 3 mj 3 mj mj

!

So v1  24,000 and v2  22,000. Continuing, µ  32 (2100) + 31 (3000)  2400 C/4 Study Manual—17th edition Copyright ©2014 ASM

!

QUIZ SOLUTIONS FOR LESSON 54

1089

a  ( 23 )( 13 )(9002 )  180,000 1 1 46,000 m∗  +  24,000 22,000 (22,000)(24,000)   2,500 1 2,000 + x¯  ∗ m 24,000 22,000   (22,000)(24,000) 2,000 (22,000) + 2,500 (24,000)  46,000 (24,000)(22,000)  2260.87 180,000 (46,000) am∗   15.6818 (22,000)(24,000) am ∗ 15.6818 Z   0.9401 ∗ 1 + am 16.6818 PC  2400 − 0.9401 (2400 − 2260.87)  2269.21 54.17. This is the only question on the Hewitt model that ever appeared on a released exam. As m → ∞, v w n n w+ → w, so the k goes to and credibility goes to  . (B) m a n + k n + w/a

Quiz Solutions 54-1.

The first two columns of the table are not used. The observed mean loss size is x¯ 

8,000 + 24,000  1,280 10 + 15

If we treat each loss as an observation, there are n  10 + 15  25 observations. The Bühlmann parameters are Θ 4 !2 5Θ2 2Θ2 Θ  v (Θ )  − 12 4 48 6000 µ  1500 4 ! 5 20002 v + 60002  3,784,722 48 12

µ (Θ ) 

1 20002   20,833 13 16 12 3,784,722   181.667 20,833 31 25   0.120968 25 + 181.667  1500 + 0.120968 (1280 − 1500)  1473.39

!

a k Z PC

C/4 Study Manual—17th edition Copyright ©2014 ASM

1090

C/4 Study Manual—17th edition Copyright ©2014 ASM

54. BÜHLMANN-STRAUB CREDIBILITY

Lesson 55

Exact Credibility Reading: Loss Models Fourth Edition 15.3, 18.7 or SN C-21-01 5.4 or Introduction to Credibility Theory 8.4–8.5 When the model distribution is a member of the linear exponential family and a conjugate prior distribution is used as the prior hypothesis, the posterior mean is a linear function of the mean of the observations. If the prior mean exists, since the Bühlmann credibility estimate is the least squares linear estimate of the posterior mean, it is exactly equal to the posterior mean in this case. Loss Models discusses criteria for the existence of the prior mean, but neither of the other two syllabus options discusses this in detail, so I don’t expect any exam questions on it. The only exam questions I expect on this material are to test your understanding of the fact that the credibility (or Bühlmann) and Bayesian premiums are the same. Thus you may be asked for the Bühlmann credibility estimate for situations where the model distribution is a member of the linear exponential family and the prior is the conjugate prior. The fastest way to solve such a question is to realize that the Bühlmann estimate is the same as the Bayesian, and that the Bayesian estimate simply requires modifying the parameters as discussed in Lessons 47–50. Table 55.1 summarizes Bayesian information for the conjugate priors that we discussed: priors, posteriors, predictives, as well as the Bühlmann k. Example 55A For an insurance coverage, claim count for an insured in each period follows a Poisson distribution with mean λ. The parameter λ varies by insured, and follows an exponential distribution with mean 0.2. A single insured is selected at random and observed over 10 periods. The average number of claims ¯ submitted is x. The Bayesian credibility estimate of the number of claims submitted by this insured in the next period can be expressed as Z x¯ + (1 − Z )(0.2) . Determine Z. Answer: Since Bayesian and Bühlmann credibility are equal in this case, we calculate Bühlmann’s Z. We obtain k from Table 55.1. 1 1  5 θ 0.2 10 2 Z  10 + 5 3 k



Coverage of this material in the three syllabus options This material is covered in all three syllabus options to some extent, and the linear exponential family is discussed in the required part of Loss Models as well. The Herzog text even defines exponential (not necessarily linear) families. However, exams will limit themselves to the sort of questions in the exercises here.

C/4 Study Manual—17th edition Copyright ©2014 ASM

1091

55. EXACT CREDIBILITY 1092

Table 55.1: Priors, posteriors, predictives, and Bühlmann v , a , and k for linear exponential model/conjugate prior pairs.

Prior Gamma α γ  1/θ

Posterior Gamma α ∗  α + n x¯ γ+n

Predictive Negative binomial r  α∗

β  1/γ∗ Bernoulli

Beta

q

γ∗

Beta

Normal

 a + n x¯

a

µ  µ∗

a∗ a∗ + b∗

Normal

a∗

vµ + na x¯ v + na

σ2  a∗ + v

b ∗  b + n (1 − x¯ )

Normal

µ∗ 

av na + v

b

µ

a∗ 

Inverse gamma

α  α∗ θ  θ∗

Pareto

a Inverse gamma

α∗  α + n

θ∗  θ + n x¯

αθ

Bühlmann v

( a + b ) 2 ( a + b + 1)

αθ2

Bühlmann a

1/θ

Bühlmann k

α−1

v/a

a+b

( a + b )( a + b + 1)

a

θ2

ab

v

( α − 1) 2 ( α − 2)

θ2

ab

( α − 1)( α − 2)

In the following table, parameters in the Predictive column before the equal sign and any parameters in the Model column are not the same as parameters in the Prior and Posterior columns, even when they have the same names. Also, the Bühlmann a is not the same as the parameter a of a beta distribution. Model Poisson(λ)

Bernoulli(q)

Normal(θ, v)

Exponential(θ) α θ

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISES FOR LESSON 55

1093

Exercises 55.1. [4B-S90:47] (1 point) The probability distribution function of claims per year for an individual risk is a Poisson distribution with parameter h. The prior distribution of h is a gamma distribution given by g ( h )  h · e −h for 0 < h < ∞. What is the Bühlmann credibility to be assigned to a single observation?

(A) 1/4

(B) 1/3

(C) 1/2

(D) 2/3

(E) 1/ (1 + h )

55.2. [4B-S92:29] (2 points) The number of claims for an individual risk in a single year follows a Poisson distribution with parameter λ. The parameter λ has for a prior distribution the following gamma density function: f ( λ )  12 e −λ/2 , λ > 0. You are given that three claims arose in the first year. Determine the Bühlmann credibility estimate for the expected number of claims in the second year. (A) (B) (C) (D) (E)

Less than 2.25 At least 2.25, but less than 2.50 At least 2.50, but less than 2.75 At least 2.75, but less than 3.00 At least 3.00 [4B-S94:26] (2 points) You are given the following:

55.3. •

For an individual risk in a population, the number of claims for a single exposure period follows a Poisson distribution with mean µ.



For the population, µ is distributed according to an exponential distribution with mean 0.1, g ( µ )  10e −10µ ,

µ > 0.



An individual risk is selected at random from the population.



After one exposure period, one claim has been observed.

Determine the Bühlmann credibility factor, Z, assigned to the number of claims for a single exposure period. (A) 1/10 (B) 1/11 (C) 1/12 (E) The correct answer is not given by (A) , (B) , (C) , or (D) .

(D) 1/14

[4B-F94:3] (2 points) You are given the following:

55.4. •

The number of claims for a single risk follows a Poisson distribution with mean m.



m is a random variable having a prior gamma distribution with mean  0.50.



The value of k in Bühlmann’s partial credibility formula is 10.



After five exposure periods, the posterior distribution is gamma with mean  0.60. Determine the number of claims observed in the five exposure periods.

(A) 3

(B) 4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 5

(D) 6

(E) 10 Exercises continue on the next page . . .

55. EXACT CREDIBILITY

1094

55.5. For a given policyholder, losses have a normal distribution with mean Θ and variance 5,000,000. The distribution of Θ is normal with mean 18,000 and variance 2,000,000. Determine the Bühlmann k for this coverage. 55.6. For a given policyholder, losses have a normal distribution with mean Θ and variance 1,000,000. The distribution of Θ is normal with mean 12,000 and variance 200,000. For a given risk, 10 losses averaging 11,000 are observed. Determine the Bühlmann credibility factor for this experience. [4B-S97:9, 1999 C4 Sample:34] (3 points) You are given the following:

55.7. •

The number of claims for Risk 1 during a single exposure period follows a Bernoulli distribution with mean p.



The prior distribution for p is uniform on the interval [0, 1].



The number of claims for Risk 2 during a single exposure period follows a Poisson distribution with mean θ.



The prior distribution for θ has the density function f ( θ )  βe −βθ ,



0 < θ < ∞, β > 0.

The loss experience of both risks is observed for an equal number of exposure periods.

Determine all values of β for which the Bühlmann credibility of the loss experience of Risk 2 will be greater than the Bühlmann credibility of the loss experience of Risk 1. Hint:

R

∞ 0

(A) β > 0 55.8.

θ 2 βe −βθ dθ  2/β 2 (B) β < 1

(C) β > 1

(D) β < 2

(E) β > 2

[4-F02:3] You are given:

(i)

The number of claims made by an individual insured in a year has a Poisson distribution with mean λ. (ii) The prior distribution for λ is gamma with parameters α  1 and θ  1.2. Three claims are observed in Year 1, and no claims are observed in Year 2. Using Bühlmann credibility, estimate the number of claims in Year 3. (A) 1.35 55.9. (i)

(B) 1.36

(C) 1.40

(E) 1.43

[4-F04:1] You are given: The annual number of claims for an insured has probability function: 3 x p (x )  q (1 − q ) 3−x , x

!

(ii)

(D) 1.41

The prior density is π ( q )  2q,

x  0, 1, 2, 3

0 < q < 1.

A randomly chosen insured has zero claims in Year 1. Using Bühlmann credibility, estimate the number of claims in Year 2 for the selected insured. (A) 0.33

(B) 0.50

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1.00

(D) 1.33

(E) 1.50 Exercises continue on the next page . . .

EXERCISES FOR LESSON 55

1095

Use the following information for questions 55.10 and 55.11: You are given the following: •

The probability that a particular baseball player gets a hit in any given attempt is θ.



θ does not vary by attempt.



The prior distribution of θ is assumed to follow a distribution with mean 1/3, variance ab , and density function ( a + b + 1)( a + b ) 2 f (θ) 



Γ ( a + b ) a−1 θ (1 − θ ) b−1 , 0 ≤ θ ≤ 1. Γ(a )Γ(b )

The player is observed for nine attempts and gets four hits.

55.10. [4B-F98:23] (2 points) If the prior distribution is constructed so that the credibility of the observations is arbitrarily close to zero, determine which of the following is the largest. (A) f (0)

(B) f (1/3)

(C) f (1/2)

(D) f (2/3)

(E) f (1)

55.11. [4B-F98:24] (3 points) If the prior distribution is constructed so that the variance of the hypothetical means is 1/45, determine the probability that the player gets a hit in the tenth attempt. (A) 1/3

(B) 13/36

(C) 7/18

(D) 5/12

(E) 4/9

55.12. [4B-S91:33] (3 points) Assume that the number of claims, r, made by an individual insured in one year follows the binomial distribution 3 r p (r )  θ (1 − θ ) 3−r , for r  0, 1, 2, 3. r

!

Also assume that the parameter, θ, has the p.d.f. g ( θ )  6 ( θ − θ 2 ) , for 0 < θ < 1. What is the Bühlmann credibility assigned to a single observation? (A) 3/8

(B) 3/7

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1/2

(D) 3/5

(E) 3/4

Exercises continue on the next page . . .

55. EXACT CREDIBILITY

1096

55.13. [4B-S99:24] (2 points) You are given the following: •

The number of errors that a particular baseball player makes in any given game follows a Poisson distribution with mean θ.



θ does not vary by game.



The prior distribution of θ is assumed to follow a distribution with mean 1/10, variance αβ2 , and density function e −θ/β θ α−1 , 0 < θ < ∞. f (θ)  βα Γ(α)



The player is observed for 60 games and makes one error.

If the prior distribution is constructed so that the variance of the hypothetical means is 1/400, determine the expected number of errors that the player will make in the next 60 games. (A) (B) (C) (D) (E)

Less than 0.5 At least 0.5, but less than 2.5 At least 2.5, but less than 4.5 At least 4.5, but less than 6.5 At least 6.5

Use the following information for questions 55.14 and 55.15: You are given: (i)

The probability of an individual having exactly one claim in one exposure period is p, while the probability of no claims is 1 − p. (ii) p is a random variable with density function f ( p )  6p (1 − p ) , 0 ≤ p ≤ 1. 55.14. [4B-S94:2] (3 points) Determine the Bühlmann credibility factor, Z, for the number of observed claims for one individual for one exposure period. 1 (A) 12 (B) 16 (C) 51 (E) The correct answer is not given by (A) , (B) , (C) , or (D) .

(D)

1 4

55.15. [4B-S94:3] (2 points) You are given the following: (i) An individual is selected at random and observed for 12 exposure periods. (ii) During the 12 exposure periods, the selected individual incurs 3 claims. Determine the probability that the same individual will have one claim in the next exposure period. (A)

1 4

(B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 7

(C)

2 7

(D)

3 8

(E)

5 16

Exercises continue on the next page . . .

EXERCISES FOR LESSON 55

1097

55.16. [4B-F94:2] (3 points) You are given the following: (i) (ii) (iii) (iv) (v)

The density function f ( x | α ) is a member of the linear exponential family. The prior distribution of α, π ( α ) , is a conjugate prior distribution for the density function f ( x | α ) . X l and X2 have f ( x | α ) as their density function. E[X1 ]  0.50 and E[X2 | X1  3]  1.00. The variance of the hypothetical means, Varα (E[X | α])  6.

Determine the expected value of the process variance, Eα [Var ( X | α ) ].

(A) (B) (C) (D) (E)

Less than 2.0 At least 2.0, but less than 12.0 At least 12.0, but less than 22.0 At least 22.0, but less than 32.0 At least 32.0

Additional released exam questions: C-F05:2

Solutions 55.1. This is a Poisson/gamma conjugate prior, with the gamma having parameters α  2, θ  1. Therefore, k  1/θ  1 and Z  1/ (1 + 1)  1/2 . (C) 55.2. This is a Poisson/gamma conjugate prior, with the gamma having parameters α  1, θ  2. (The 1 gamma is in fact an exponential.) The mean of the exponential is 2. k  12 and Z  1+1/2  23 . The Bühlmann credibility estimate is 1 2 PC  (2) + (3)  8/3 (C) 3 3 55.3. This is a Poisson/gamma pair, the gamma having parameters α  1 and θ  0.1. Therefore, k  1 1/θ  10. Then Z  1+10  1/11 . (B) 55.4.

By the credibility premium formula PC  (1 − Z ) µ + Z x¯

0.6  (1 − Z )(0.5) + Z x¯  Z ( x¯ − 0.5) + 0.5 However, we are given that n  5 and k  10, so Z 

5 1  , and 5 + 10 3

1 ( x¯ − 0.5) 3 x¯  0.3 + 0.5  0.8

0.1 

The number of claims is 5 x¯  5 (0.8)  4 . (B) We didn’t need Poisson/gamma properties for this one. 55.5. You don’t need the special properties of the normal/normal to do this. You are given that the hypothetical mean’s (θ’s) variance is 2,000,000 and that the process variance is constant (5,000,000), so k

C/4 Study Manual—17th edition Copyright ©2014 ASM

v 5,000,000   2.5 a 2,000,000

55. EXACT CREDIBILITY

1098

2 10  . 10 + 5 3 55.7. Fortunately we don’t have to work things out from first principles which would then use the hint and justify the 3 points this question was granted. From exact credibility, the Bühlmann k for Risk 1 is a + b  1 + 1  2, and the Bühlmann k for Risk 2 is β. Lower k means higher credibility, so the answer is β < 2 . (D) 55.6.

v  1,000,000 and a  200,000, so k  v/a  5 and Z 

55.8. Bühlmann credibility is the same as Bayesian credibility in this case, so we use the conjugate prior. 1 The posterior parameters for the gamma are α ∗  1 + 3  4 and γ∗  1.2 + 2  3.4 1.2 . The answer is 4 (1.2/3.4)  1.4118 . (D) 55.9. In this case, Bühlmann credibility is the same as Bayesian, and we use the conjugate prior. The number of claims is binomial with m  3, so 1 year of claims is like 3 exposure periods of a Bernoulli. The prior density is a beta with a  2, b  1, as we can recognize by adding 1 to the powers on q (1) and 1 − q (0). Then a ∗  2 and b ∗  1 + 3  4. The number of claims in Year 2 is a∗  1 3 a∗ + b∗

!

(C)

9 a 55.10. Credibility is Z  9+a+b , and a+b  13 , so b  2a and Z  9/ (9 + 3a ) . Credibility Z will go to zero if a goes to infinity. The mode of f is most easily determined by logging and differentiating it:

ln f ( θ )  ln Γ ( a + b ) − ln Γ ( a ) − ln Γ ( b ) + ( a − 1) ln θ + ( b − 1) ln (1 − θ ) d ln f ( θ ) a − 1 b − 1  − 0 dθ θ 1−θ ( a − 1)(1 − θ ) − ( b − 1) θ  0 θ ( a + b − 2)  a − 1 a−1 θ a+b−2

So ( a − 1) / ( a + b − 2) is the mode. Setting b  2a, the mode is ( a − 1) / (3a − 2) . As a goes to infinity, goes to 1/3 . (B)

55.11. Since b  2a and

a−1 3a−2

1 ab  , we get 2 45 ( a + b + 1)( a + b ) 2a 2 1  2 45 (3a + 1)(3a ) (3a + 1)(9)  45 2 a3

The predictive mean is equal to the posterior mean. The posterior beta has parameters a∗  3 + 4  7 and 7 7 b ∗  6 + 5  11, so the posterior mean is  . (C) 7 + 11 18 55.12. For the Bernoulli-beta conjugate prior pair, the Bühlmann k  a + b. Here, g ( θ )  6θ (1 − θ )

so a  b  2 and the Bühlmann k is 4. Since the binomial has m  3, there are 3 Bernoulli observations. The Bühlmann credibility factor is 3/ (3 + 4)  3/7 . (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 55

1099

55.13. The hypothetical mean is θ. (See page 874 for the definition of hypothetical mean.) f ( θ ) is the density of a gamma distribution with parameters α and β. Var ( θ )  αβ  β

1  αβ2 400 1 10 1 40

α  4, γ  40 α ∗  4 + 1  5, γ∗  40 + 60  100

5 60  3 100

!

(C)

55.14. The Bühlmann k for a Bernoulli/beta pair is a + b. Here, the beta has a  b  2, so the Bühlmann k  4 and the credibility factor is 1/ (1 + 4)  1/5 . (C) 55.15. The revised beta parameters are a → 2 + 3  5, b → 2 + 12 − 3  11. The probability of a claim is a/ ( a + b )  5/16 . (E) 55.16. Bayesian and Bühlmann estimation are equivalent, so we can deduce Bühlmann credibility factors using the conditional expectation of X2 given X1 . Since E[X2 | X1  3]  1.00, and 1.00  (1 − Z ) µ + Z x¯  0.8 (0.5) + 0.2 (3) , we see that the credibility factor for one observation is 0.2. Then 1/ (1 + k )  0.2 implying k  4. The expected value of the process variance divided by the variance of the hypothetical means is 4. Then the expected value of the process variance is 4 (6)  24 . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM

1100

C/4 Study Manual—17th edition Copyright ©2014 ASM

55. EXACT CREDIBILITY

Lesson 56

Bühlmann As Least Squares Estimate of Bayes Reading: Loss Models Fourth Edition 18.4 or SN C-21-01 4 (introduction) and SN C-24-05 Appendix A or Introduction to Credibility Theory 10.2 Questions for which regression is helpful are rare, but do occur. There are two questions on least squares estimation of Bayesian credibility among the released exams since 2000. There were two questions on graphs on recent exams. There was also one question on covariance, which is also discussed in this lesson.

56.1

Regression

Loss Models fourth edition 18.4, Dean Appendix A, and Herzog 10.3 each derive the Bühlmann method as the linear least squares estimate of the Bayesian predictive mean. While the first two are within the syllabus, the Herzog reading is not on the syllabus, not even as background material. So you’re not responsible for the derivation. However, you could be asked to determine the set of Bühlmann credibility predictions corresponding to a set of Bayesian predictions, using the fact that the Bühlmann credibility estimate is a least squares estimate of the Bayesian premium. Given a series of Bayesian predictions, you can deduce the Bühlmann predictions, since they are a weighted least squares estimate of the Bayesian predictions. Your knowledge of two-variable regression will make these calculations easy. You may have learned this model as part of VEE-Econometrics. If not, here is the formula. Suppose you want to estimate Yi by Yˆ i which is a linear function of X i : Yˆ i  α + βX i Moreover, you would like to select α and β in such a way as to minimize the weighted least square difference, where p i is the weight of observation i: Minimize

X

p i ( Yˆ i − Yi ) 2

We can treat the ( X i , Yi ) pairs as a joint probability distribution with Pr ( X i , Yi )  p i , and use its moments. The formulas for α and β in terms of the moments are Cov ( X, Y ) Var ( X ) α  E[Y] − β E[X] β

In the context of Bühlmann credibility, X i are the observations, Yi are the Bayesian predictions and Yˆ i are the Bühlmann predictions. So β is the credibility factor Z. Also, the overall mean of the Bayesian predictions equals the original mean, since each Yi is E[X n+1 | X], and E[Y]  E E[X n+1 | X]  E[X n+1 ]

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

1101

56. BÜHLMANN AS LEAST SQUARES ESTIMATE OF BAYES

1102

so E[Y]  E[X], and the equation for α becomes α  (1 − Z ) E[X].

(56.1)

You can calculate variances and covariances as follows: Var ( X ) 

X

Cov ( X, Y ) 

X

p i X i2 − E[X]2 p i X i Yi − E[X] E[Y]

Example 56A You are given the following information about the Bayesian estimates of an event: Initial Probability Bayesian Outcome of Outcome Estimate X1 Pr ( X1 ) E[X2 | X1 ] 0 2 8

0.50 0.25 0.25

1 2 —

Determine the Bühlmann estimates of E[X2 | X1 ] for each possible outcome. Answer: First let’s fill in E[X2 | 8], using the fact that the expected value of the third column must equal the expected value of X1 . E[X1 ]  0.5 (0) + 0.25 (2) + 0.25 (8)  2.5 E E[X2 | X1 ]  2.5  0.5 (1) + 0.25 (2) + 0.25 (E[X2 | 8])

f

g

from which it follows that E[X2 | 8]  6. Let’s now calculate the credibility factor Z: Var ( X ) 

X

Cov ( X, Y ) 

X

Z

px 2 − E[X]2  0.5 (0) + 0.25 (4) + 0.25 (64) − 2.52  10.75

px y − E[X] E[Y]  0.5 (0) + 0.25 (2)(2) + 0.25 (8)(6) − 2.52  6.75

6.75 27  10.75 43

So the Bühlmann prediction given X1 , or PC ( X i ) , is PC (0)  (1 − Z )(2.5)  PC ( 2 ) 

40 94 + 2Z  43 43

PC ( 8 ) 

40 256 + 8Z  43 43

40 43

Calculator Tip Regression questions are rare. And statistical calculators cannot perform weighted least squares regression. In Example 56A, we can get around its inability to do weighted regression by entering the outcome 0 twice. The example can be solved as follows, once you determine that

C/4 Study Manual—17th edition Copyright ©2014 ASM



56.2. GRAPHIC QUESTIONS

1103

Calculator Tip E[X2 | 8]  6. Clear table

Enter outcomes in column 1

Enter Bayesian estimates in column 2

Check statistics

data data 4

0

s% 0 s% 2 s% 8 enter

t% 1 s% 1 s% 2 s% 6 enter 2nd [stat]2 (Set XDATA to L1, YDATA to L2, move cursor to CALC) enter

L1

L2

L3

L2

L3

L2 1 2 6

L3

L1(1)= L1 0 2 8 L1(5)= L1 0 2 8

L2(5)= 2-Var:L1,L2 1:n=4 2:¯x=2.5 3↓Sx=3.785938897

The calculator strangely uses a for the slope parameter of the regression and b for the intercept; usually, β is used for the former and α for the latter. So statistic D, a, is Z, and is 0.6279069767, which ¯ or 40/43, and is shown as 0.9302325581. Using these, you can calculate is 27/43. Statistic E is (1− Z ) x, three probabilities. If you want to be fancy, you can enter the formula 0.9302325581 + 0.6279069767L1 in column 3 to calculate the Bühlmann estimates.

56.2

Graphic questions

Two released exams featured a question with five graphs of purported Bühlmann and Bayesian estimates, and asked you which graph was possible. There were a couple of rules you used to eliminate the bad graphs: 1. The Bayesian prediction must always be within the range of the hypothetical means. If hypothetical means (expectations given θ) range from 2 to 6, the Bayesian estimate cannot be below 2 or above 6 (although the Bühlmann estimate may be). 2. The Bühlmann predictions must lie on a straight line as a function of the observation, or the mean of the observations. 3. There should be Bayesian predictions both above and below the Bühlmann line. This follows from the fact that the Bühlmann prediction is a least squares estimate. However, symmetry is not required, since it is a weighted least squares estimate. It is possible that the Bayesian estimate is much lower at one point and only slightly higher at other points, since the probability of the former point may be low. C/4 Study Manual—17th edition Copyright ©2014 ASM

Est. No. of Claims in Year 2

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Est. No. of Claims in Year 2

56. BÜHLMANN AS LEAST SQUARES ESTIMATE OF BAYES

1104

Bühlmann Bayes

0

2

4

6

8

10

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Bühlmann Bayes

0

No. of Claims in Year 1

2

Bühlmann Bayes

0

2

4

6

8

10

(b)

Est. No. of Claims in Year 2

Est. No. of Claims in Year 2

(a)

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

4

No. of Claims in Year 1

6

8

10

No. of Claims in Year 1

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Bühlmann Bayes

0

2

4

6

8

10

No. of Claims in Year 1

(c)

(d)

Figure 56.1: Graphs for Example 56B

4. The Bühlmann prediction must be between the overall mean and the mean of the observations. If there’s only one observation, this means that the Bühlmann prediction must be between the overall mean and the observation. The fourth rule was not used in these questions. Example 56B The graphs in Figure 56.1 purport to be the Bühlmann and Bayesian predictions of a second observation given the first observation. The expected number of claims in Year 2 ranges between 1 and 3, depending on the observed number of claims in Year 1. Discuss which of the graphs is possible. Answer: Subfigure 56.1a is no good since the Bayesian estimate is always higher than the Bühlmann estimate. Subfigure 56.1b is no good because the Bühlmann estimate is not linear. Subfigure 56.1c is no good because the Bayesian estimate is outside the range of expected number of observations. Subfigure 56.1d is OK, even though the Bayesian estimate appears to be biased downwards from the Bühlmann estimate, since the probability of the points for which it is lower may be small. 

C/4 Study Manual—17th edition Copyright ©2014 ASM

56.3. Cov ( X i , X j )

?

1105

Quiz 56-1 For different classes of insureds, average aggregate losses vary between 1000 and 2200. The overall average of aggregate losses is 1400. An insured incurs an average of 800 in aggregate losses over a period of 3 years. 1. What are the lowest and highest possible Bayesian predictions of aggregate losses for this insured if the mean square error loss function is used? 2. What are the lowest and highest possible Bühlmann predictions of aggregate losses for this insured?

56.3

Cov ( X i , X j )

A part of the derivation that appeared on the Fall 2005 exam was the calculation of the covariance of two events in a Bühlmann model. In a Bühlmann model, we assume that for every hypothesis Θ, the events X i | Θ are independent and have the same hypothetical mean µ (Θ) and the same process variance v (Θ) . However, Cov ( X i , X j ) is not 0. Indeed, the Bühlmann model would be useless if it were zero, since we’re trying to predict X n+1 using X1 . . . X n . What is the covariance, for i , j? We’ll calculate it. The trick is to use the conditional expectation formula to express it in terms of X i | Θ, and then we can use independence to factor. Cov ( X i , X j )  E[X i X j ] − E[X i ] E[X j ]

f

g

f

g

f

 EΘ E[X i X j | Θ] − EΘ E[X i | Θ] EΘ E[X j | Θ]

g

 EΘ E[X i | Θ] E[X j | Θ] − EΘ [µ (Θ) ] EΘ [µ (Θ) ]

f

g

 EΘ [µ (Θ) 2 ] − Eθ [µ (Θ) ]

2

 Var µ (Θ)  a





where a is the variance of the hypothetical means. So Cov ( X i , X j )  a for i , j. For i  j, Cov ( X i , X j )  Var ( X i ) , and we use the conditional variance formula to obtain Var ( X i )  EΘ Var ( X i | Θ) + VarΘ E[X i | Θ]

f

g





 E[v (Θ) ] + Var µ (Θ)  v + a





Example 56C Claim sizes for a group follow an inverse gamma distribution with parameters α  3 and Θ. Θ varies by insured according to an inverse Gaussian distribution with parameters µ  6, θ  2. You observe two years of experience from a member of the group, X1 and X2 . Calculate Cov ( X1 , X2 ) . Answer: All we need is a. The hypothetical mean is E[X | Θ]  The variance of an inverse Gaussian is Var (Θ)  Therefore

C/4 Study Manual—17th edition Copyright ©2014 ASM

Θ Θ  . 3−1 2

µ 3 63   108. θ 2

Θ 108 a  Var  2  27 . 2 2

!



56. BÜHLMANN AS LEAST SQUARES ESTIMATE OF BAYES

1106

Coverage of this material in the three syllabus options The fact that the Bühlmann estimator is a least squares approximation of the Bayesian estimator is something mentioned in all three syllabus reading options, and you must know it. You are responsible for the general properties of Bayesian and Bühlmann estimators discussed in Section 56.2. While Cov ( X i , X j ) is only explicitly mentioned in Loss Models, an exam can reasonably expect you do figure out how to calculate it. You are not responsible for the derivation of the Bühlmann formulas.

Exercises 56.1. [4B-S90:57] (3 points) Let X1 be the outcome of a single trial and let E[X2 | X1 ] be the expected value of the outcome of a second trial as described in the table below. Outcome k

Initial Probability of Outcome

0 3 12

1/3 1/3 1/3

Bayesian Estimate E[X2 | X1  k] 1 6 8

Which of the following represents the Bühlmann credibility estimates corresponding to the Bayesian estimates (1, 6, 8)? (A) (3, 5, 10)

(B) (2, 4, 10)

(C) (2.5, 4.0, 8.5)

(D) (1.5, 3.375, 9.0) (E) (1, 6, 8)

56.2. [Based on 4B-S92:24] (2 points) Let X1 be the outcome of a single trial and let E[X2 | X1 ] be the expected value of the outcome of a second trial. You are given the following information: Outcome T

Pr ( X1  T )

1 8 12

1/3 1/3 1/3

Bühlmann Credibility Estimate for E[X2 | X1  T] — — —

Bayesian Estimate for E[X2 | X1  T] 4.2 5.4 —

Determine the Bühlmann credibility estimate for E[X2 | X1  12]. [4B-S93:6] (1 point) Which of the following are true?

56.3. 1.

Bühlmann credibility estimates are the best linear least squares approximations to estimates from Bayesian analysis.

2.

Bühlmann credibility requires the assumption of a distribution for the underlying process generating claims.

3.

Bühlmann credibility estimates are equivalent to estimates from Bayesian analysis when the likelihood density function is a member of a linear exponential family and the prior distribution is the conjugate prior.

(A) 1

(B) 2

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 3

(D) 1,2

(E) 1,3

Exercises continue on the next page . . .

EXERCISES FOR LESSON 56

1107

You are given the following information concerning an insurance coverage:

56.4.

Number of Losses

Probability

0 1 2

1/4 1/2 1/4

Bayesian Premium E[X2 | X1  n] 0.5 0.9 1.7

Determine the Bühlmann credibility factor for this experience. You are given the following information concerning an insurance coverage:

56.5.

Number of Losses

Probability

0 2 5

0.5 0.3 0.2

Bayesian Premium E[X2 | X1  n] 1.0 1.0 4.0

Determine the three Bühlmann credibility premiums corresponding to the three aggregate loss levels of 0, 2, and 5. [4B-F93:24] (3 points) You are given the following:

56.6. •

An experiment consists of three possible outcomes, R 1  0, R 2  2, R 3  14.



The a priori probability distribution for the experiment’s outcome is: Outcome, R i 0 2 14



For each possible outcome, Bayesian analysis was used to calculate predictive estimates, E i , for the second observation of the experiment. The predictive estimates are: Outcome, R i 0 2 14



Probability, Pi 2/3 2/9 1/9

Bayesian Analysis Predictive Estimate E i Given Outcome R i 7/4 55/24 35/12

The Bühlmann credibility factor after one experiment is 1/12. Determine the values for the parameters a and b that minimize the expression: 3 X i1

(A) (B) (C) (D) (E)

a a a a a

Pi ( a + b · R i − E i ) 2 .

 1/12; b  11/12  1/12; b  22/12  11/12; b  1/12  22/12; b  1/12  11/12; b  11/12

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

56. BÜHLMANN AS LEAST SQUARES ESTIMATE OF BAYES

1108

[4-F02:7] You are given the following information about a credibility model:

56.7.

First Observation 1 2 3

Unconditional Probability 1/3 1/3 1/3

Bayesian Estimate of Second Observation 1.50 1.50 3.00

Determine the Bühlmann credibility estimate of the second observation, given that the first observation is 1. (A) 0.75

(B) 1.00

(C) 1.25

(D) 1.50

(E) 1.75

[4B-F97:12] (2 points) You are given the following:

56.8. •

A portfolio consists of a number of independent insureds.



Losses for each insured for each exposure period are one of three values: a, b, or c.



The probabilities for a, b, and c vary by insured, but are fixed over time.



The average probabilities for a, b, and c over all insureds are 5/12, 1/6, and 5/12, respectively.

One insured is selected at random from the portfolio and its losses are observed for one exposure period. Estimates of the same insured’s expected losses for the next exposure period are as follows: Observed Losses a b c

Bayesian Analysis Estimate 3.0 4.5 6.0

Bühlmann Credibility Estimate x 3.8 6.1

Determine x. (A) (B) (C) (D) (E) 56.9.

Less than 1.75 At least 1.75, but less than 2.50 At least 2.50, but less than 3.25 At least 3.25, but less than 4.00 At least 4.00 For a group medical coverage, you are given:

(i) X i is the number of claims submitted by a group in year i. (ii) Var ( X i )  28. (iii) Cov ( X i , X j )  12. Calculate the Bühlmann credibility assigned to 2 years of experience from this group.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 56

1109

56.10. [C-S05:32] For five types of risks, you are given: (i) The expected number of claims in a year for these risks ranges from 1.0 to 4.0. (ii) The number of claims follows a Poisson distribution for each risk. During Year 1, n claims are observed for a randomly selected risk. For the same risk, both Bayes and Bühlmann credibility estimates of the number of claims in Year 2 are calculated for n  0,1,2,. . . ,9. Which graph represents these estimates? (See next page)

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

56. BÜHLMANN AS LEAST SQUARES ESTIMATE OF BAYES

1110

(B) 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Est. No. of Claims in Year 2

Est. No. of Claims in Year 2

(A)

Bühlmann Bayes

0

2

4

6

8

10

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Bühlmann Bayes

0

No. of Claims in Year 1

6

8

10

(D) 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Est. No. of Claims in Year 2

Est. No. of Claims in Year 2

4

No. of Claims in Year 1

(C)

Bühlmann Bayes

0

2

4

6

8

10

No. of Claims in Year 1

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Bühlmann Bayes

0

2

4

6

8

10

No. of Claims in Year 1

Additional released exam questions: C-F05:26, C-S07:2

C/4 Study Manual—17th edition Copyright ©2014 ASM

4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0

Bühlmann Bayes

0

2

4

6

8

No. of Claims in Year 1

(E) Est. No. of Claims in Year 2

2

10

EXERCISE SOLUTIONS FOR LESSON 56

1111

Solutions 56.1. The straightforward way to do this 3 point exercise is by regression. Let X be the outcome and Y the Bayesian estimate. Then: E[X]  E[Y]  5 Cov ( X, Y )  13 (0 · 1 + 3 · 6 + 12 · 8) − 52  13 Var ( X )  13 (02 + 32 + 122 ) − 52  26 Cov ( X, Y ) 13 1   Z Var ( X ) 26 2 α  (1 − Z ) X¯  5 (0.5)  2.5

So the Bühlmann estimates are 2.5, 2.5 + 0.5 (3)  4, and 2.5 + 0.5 (12)  8.5, or (2.5, 4.0, 8.5) . (C) However, the set of choices is so limited that you could eliminate the incorrect choices without doing this work. First of all, since the Bühlmann estimate is a linear function of the outcome, the pattern has to be such that the difference between the third and second numbers is 3 times the difference between the second and first numbers. Thus (A) and (E) are out. The mean of the Bühlmann estimates has to equal the mean of the outcomes, or 5, so (A), (B), and (D) are out. This only leaves (C). 56.2. First, E[X1 ]  1+8+12  7. This means E[X2 ]  E E[X2 | X1  T] must also be 7, so the average of 3 the last column, the Bayesian estimate, must be 7, meaning

f

g

4.2 + 5.4 + E[X2 | X1  12] 7 3 E[X2 | X1  12]  21 − 4.2 − 5.4  11.4 Now we perform regression of the Bayesian estimate, which we’ll call Y, over the outcomes which we’ll call X. We already calculated E[X]  7.

(1 − 7) 2 + (8 − 7) 2 + (12 − 7) 2

62  3 3 (1)(4.2) + (8)(5.4) + (12)(11.4) Cov ( X, Y )  − 72  12.4 3 12.4  0.6 β 62/3 α  7 − (7)(0.6)  2.8 Var ( X ) 

The Bühlmann estimate when X1  12 is α + 12β  2.8 + 12 (0.6)  10 . 56.3. 2 is false because all you need are the mean and variance. The other two statements are true and have been discussed in this lesson and the previous one. (E) 56.4. Var ( X )  41 (02 ) + 12 (12 ) + 14 (22 ) − 12  0.5

Cov ( X, Y )  21 (0.9) + 14 (2)(1.7) − 12  0.3 Cov ( X, Y ) 0.3 Z   0.6 Var ( X ) 0.5

C/4 Study Manual—17th edition Copyright ©2014 ASM

56. BÜHLMANN AS LEAST SQUARES ESTIMATE OF BAYES

1112

56.5. x¯  1.6 Var ( X )  0.3 (4) + 0.2 (25) − 1.62  3.64

Cov ( X, Y )  0.3 (2) + 0.2 (20) − 1.62  2.04 Cov ( X, Y ) 2.04 Z  Var ( X ) 3.64 1.6 PC (0)  1.6 ( 3.64 )  0.7033

PC (2)  0.7033 + 2Z  1.8242 PC (5)  0.7033 + 5Z  3.5055 56.6. 2 9 (2)

You’re given half the answer: b is the Bühlmann credibility factor, or 1/12 . The mean is 23 (0) +

+ 91 (14)  2, so a  2 (1 −

1 12 )

 22/12 . (D)

56.7. The variance of the observations is 13 (12 + 02 + 12 )  23 . The covariance of the observations and the Bayesian estimates is Cov ( X, Y )  So the credibility factor is Z 

0.50 2/3

1 3



1 (1.50) + 2 (1.50) + 3 (3) − (2)(2)  0.50



 0.75 and the constant is (1 − 0.75)(2)  0.50 by equation (56.1). Then

the Bühlmann credibility estimate is 0.50 + 0.75 (1)  1.25 . (C) 56.8.

The mean of the Bühlmann estimates equals the mean of the Bayesian estimates, so 5 (3.0) + 12 5 (x ) + 12

1 (4.5) + 6 1 (3.8) + 6

5 (6.0)  4.5 12 5 (6.1)  4.5 12 ! 12 x (4.5 − 3.175)  3.18 5

(C)

56.9. As we discussed on page 1105, Var ( X i )  a + v and Cov ( X i , X j )  a. Therefore, a  12, v  16, and for 2 years of experience 2a 24 Z   0.6 2a + v 40 56.10. The Bühlmann estimate must be a straight line, eliminating (E), and a least squares unbiased estimator of Bayes, eliminating (C) where all estimates are below Bayes. It may go below 1 or above 4, since it is a weighted average of actual number of claims (which can be anything) and expected. Bayes, however, cannot be below 1 or above 4 since it is a weighted average of the expected number of claims (weighted with the posterior).This eliminates (B) and (D). We’re left with (A).

Quiz Solutions 56-1. 1. 2.

The Bayesian prediction must be between 1000 and 2200. The Bühlmann prediction must be between 800 and 1400.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 57

Empirical Bayes Non-Parametric Methods Reading: Loss Models Fourth Edition 19.1–19.2 or SN C-24-05 2.1 or Introduction to Credibility Theory 6.6.1– 6.6.3, 7.3 While the name of this topic in two of the textbooks is as above, it is called Empirical Bayesian NonParametric Estimation in the Herzog textbook. We have now spent a lot of time discussing how Bühlmann estimation works in theory. In practice, you will not have any distributions for the model or for the prior distribution. Instead, you will estimate the Bühlmann parameters from empirical data. By doing this, you will derive an estimate of the Bayesian prediction (remember, Bühlmann credibility is a least squares estimate of the Bayesian prediction) without having to specify any parametric distributions. Hence the name “Empirical Bayes Non-Parametric Estimation”. So it all reduces to estimating µ, v, and a. Easy, no? Well, to make it more challenging, we’re going to insist on unbiased estimators for µ, v, and a. Then you’ll calculate k and Z the usual way. Now, k  v/a, and the quotient of two unbiased estimators is biased (E[a/b] , E[a]/ E[b]), so k and Z and PC will all be biased anyway, but at least we tried. Sometimes, you may have some idea of a good parametric distribution for the model, even if you have no idea about a good distribution for the prior. If so, this parametric distribution may relate µ, v, and a automatically. You’ll only need to compute two of the three parameters, and the third one will follow from the model. This is called “Empirical Bayes Semi-Parametric Estimation”, and we’ll deal with this in the next lesson. In pre-2000 exams, no questions were asked on this material. Since 2000, however, the Exam C/4 exam committee has focused on this material with a vengeance. Expect at least one question on every exam on these two lessons, and two questions is possible as well. You will have to memorize the formulas—unless you want to rederive them each time you need them (and you won’t have time for that on an exam)—but most of them aren’t as bad as they look. In each of the empirical Bayes methods, you should concentrate on how estimators µ, v, and a are derived. Once you have estimators for these three, the formulas for k and Z are the same as before. These methods can be used in the following two situations: 1. There are r policyholder groups, and each one is followed for n i years, where n i may vary by group. Experience is provided by year. 2. There are r policyholder groups, and each group contains n i policyholders. Experience is provided for one year by policyholder. (If more than one year of experience is provided, each combination of a policyholder and a year can be treated separately.) Credibility is calculated for each group, not for each policyholder. Most of our examples will be for the first situation. The following notation is used in all three syllabus options: • There are r policyholder groups. Policyholder groups are indexed by i, so i goes from 1 to r. C/4 Study Manual—17th edition Copyright ©2014 ASM

1113

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1114

• There are n i years of experience (for the first situation) or n i policyholders (for the second situation) for each policyholder group. Years (for the first situation) or policyholders (for the second situation) are indexed by j, so j goes from 1 to n i . If n i is the same for all i, it can be written without a subscript as n.1 For the first situation: • There are m i j policyholders in group i in year j. There are a total of m i exposure-years in group i over all years; in other words mi 

ni X

mi j .

j1

There are a total of m exposure-years over all groups; in other words m

r X

mi .

i1

• The average per policyholder in group i in year j (this could be the count or the size or aggregate losses) are x i j . The average per policyholder in group i over all years is x¯ i ; in other words,

Pn i x¯ i 

j1

mi j xi j

mi

.

¯ in other words The average claims per policyholder overall is x; x¯ 

Pr i1 m i x¯ i P mi

In the second P situation, with policyholders indexed by j and group indexed by i, m i j  1 for all i and j. Then m i  j m i j is the total number of policyholders in group i. As mentioned above, the point of this lesson is to develop efficient unbiased estimators for µ, v, and a. The variances (v and a) are functions of sums of square differences from the sample mean. But to make these estimators unbiased, we have to come up with the right factors to multiply the sum of the square difference by.

57.1

Uniform exposures

First we study the Bühlmann framework: there is one exposure in every cell, and every individual is observed for the same number of years. In other words, m i j  1 for all i and j, and n i  n, a constant, for all i. For each individual, the mean and variance of the underlying distribution of losses is fixed. In other words, each X i j has mean µ (Θ) and variance v (Θ) , conditional on Θ, the hypothesis, which may vary by i but not by j. The overall expected value of each X i j is µ. Thus an unbiased estimator for µ is the sample mean: µˆ  x¯ 

r n 1 XX xi j rn i1 j1

1This n is not necessarily the n used in the Bühlmann credibility factor Z  n/ ( n + k ) . Instead, Z  m i / ( m i + k ) . C/4 Study Manual—17th edition Copyright ©2014 ASM

(57.1)

57.1. UNIFORM EXPOSURES

1115

We know from statistics that the sample variance with division by n − 1 is unbiased, or in other words

X n

 (Yi − Y¯ ) 2   ( n − 1) σ2  i1 

E 

for any independent identically distributed random variables Yi with variance σ2 . So

X  n   2 ¯ E  ( X i j − X i ) | Θ  ( n − 1) v (Θ)  j1  since v (Θ) is the conditional variance of each X i j . Taking the expected value of both sides of this equality fP g n 2 E ( X − X ) i j i j1 over Θ, we see that is an unbiased estimator of v. The average of this expression over n−1 all i is also unbiased, so our estimator for v is vˆ  Similarly,

r

n

i1

j1

1X 1 X ( X i j − X¯ i ) 2 r n−1

(57.2)

X  r ( X¯ i − X¯ ) 2   ( r − 1) Var ( X¯ i )  i1 

E 

The variance of the sample mean is the distribution variance divided by the number of observations. Using this, let’s calculate Var ( X¯ i ) with conditional variance: Var ( X¯ i )  Var E[X¯ i | θ] + E Var ( X¯ i | θ )





f

g

 Var E[X | θ] + E Var ( X | θ ) /n v a+ n





f

g

since E[X¯ i ]  E[X] and Var ( X¯ i )  Var ( X ) /n. So the estimator for a is aˆ 

r

1 X ¯ vˆ ( X i − X¯ ) 2 − r−1 n

(57.3)

i1

Here’s a summary of the estimators: µˆ is the sample mean. vˆ is the average of the sample variances of the rows. The sample variance of the row means has expected value a + v/n. The estimator aˆ is therefore the ˆ sample variance of the row means minus v/n. The uniform exposure formulas also work if m i j  l for all i and j for any constant l. In that case, treat each cell as a single credibility unit. By a “credibility unit”, we mean the item that gets counted up as n in the Bühlmann credibility formula Z  n/ ( n + k ) . In the following example, we do not specify how many exposures there are in each cell, since the answer does not vary with the number of exposures in each cell. C/4 Study Manual—17th edition Copyright ©2014 ASM

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1116

Example 57A An insurance portfolio has three groups, A, B, and C. Each group has the same number of exposures. The number of exposures does not vary by year. Claim counts in each year for each group are given in the following table: Group

Year 1

Year 2

A B C

1 1 2

0 1 1

Determine the expected third year claim counts for each group by using empirical Bayes nonparametric methods. Answer: We will use the notation v i for the sample variance of group i. X¯ A  vA 

1 2 1 2

X¯ B  1

X¯ C 

vB  0

vC  1 2 2

+

1 2 2

3 2 1 2

µˆ  1 vˆ 

1 3

1 3

1  2 2 12 2 (1/12) 1 Zˆ   2 (1/12) + (1/3) 3 aˆ 

PCA 



1 1 2 5 + (1)  3 2 3 6

!

PCB  1 PCC

?

2 1 3 7 + (1)   3 2 3 6

!



Quiz 57-1 For an insurance coverage: (i) For any insured the expected number of claims does not vary by year. (ii) You have the following four years of experience for claim counts of two insureds: Insured A B

Year 1 0 1

Year 2 1 0

Year 3 0 2

Year 4 0 2

Calculate the empirical Bayes nonparametric estimate of the expected number of claims for A in the following year.

57.2

Non-uniform exposures

57.2.1

No manual premium

In the general case, the Bühlmann-Straub framework, exposures (m i j ) vary by cell. We still assume that for each group, the underlying loss distribution does not vary by individual or by year. Now, E[X i j | Θ]  µ (Θ) , but Var ( X i j | Θ)  v (Θ) /m i j , since the variance of the sample mean of m i j observations is the distribution variance divided by m i j . C/4 Study Manual—17th edition Copyright ©2014 ASM

57.2. NON-UNIFORM EXPOSURES

1117

¯ the exposure-weighted sample mean, or The formula for µˆ is still x, µˆ  X¯ 

Pr Pn i i1

j1

mi j Xi j

(57.4) m The estimators for vˆ and aˆ are once again based on sums of squares horizontally and vertically, but we must determine the correct factor to multiply them by to make them unbiased. Most students will prefer to simply memorize the formulas. For those who would like to see the derivations, see the sidebar. The formula for vˆ isn’t too bad either; it’s the same as before, except that the square terms are weighted by exposures: Pr Pn i m ( X i j − X¯ i ) 2 i1 j1 i j vˆ  (57.5) Pr i1 ( n i − 1)

This formula is no surprise; it’s more or less what you would do if left to your own devices. The only thing you need to remember is the denominator. It is a sum of n i − 1, not a sum of the exposures, m. This is unlike the estimator for µ. The mean of X i j | Θ is µ (Θ) , but the variance is v (Θ) /m i j , so the m i j ’s have already been divided into the sample variance we calculate using X i j ’s. As a result, you only divide by n i − 1. The formula for aˆ has a more complicated factor: aˆ  * m − m −1 ,

r X i1

m 2i +

-

−1

r X * m i ( x¯i − x¯ ) 2 − vˆ ( r − 1) + , i1 -

(57.6)

However, even this formula isn’t so bad. If you weren’t concerned about bias, you’d divide by m. Similarly, the adjustment term would be mvi or something like that, just like it was v/n before. So the only thing to P memorize is the denominator, m − m −1 ri1 m 2i . Because exposures may vary by group, credibility factors Zˆ will also vary by group. This means that the sum of the predictive estimates of losses resulting from PCi  (1 − Z i ) X¯ + Z i X¯ i

when multiplied by historical exposures will not equal the experienced losses. To avoid this problem, instead of using µˆ  X¯ in the formula for PC , we can define µˆ as the credibility weighted average:

P ˆ ¯ Zi Xi . µˆ  P Zˆ i

(57.7)

The credibility weighted average—the method which preserves total losses—was featured on some of the released exams between 2000 and 2004, and is discussed both by Loss Models and Dean. Dean refers to this technique as “balancing the estimators”. However, Herzog doesn’t mention it, so they will not test on it. In other words, an exam question can ask you for credibility factors, but cannot ask you to calculate the credibility premium, since that depends on whether you use the raw mean or the credibility-weighted mean. In the unlikely chance that they ask you to calculate a credibility premium, use the unweighted mean. The exercises of this lesson will use the unweighted mean unless stated otherwise. Example 57B An insurance portfolio has two group policyholders. The following table shows claim count experience over 2 years for each group. Policyholder 1 2

C/4 Study Manual—17th edition Copyright ©2014 ASM

Claims Number in group Claims Number in group

Year 1 2 100 3 40

Year 2 4 100 3 60

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1118

Derivation of Formulas for vˆ and aˆ We are going to need formulas for two situations: the row variances and the variances of the row means. To handle both simultaneously, let’s assume that we have n random variables X j , with E[X j ]  µ and P Var ( X j )  β + α/m j . Also, let m  nj1 m j . Define X¯ as the weighted average of X j :

Pn

X¯ 

i1

mjXj

m

We would like a formula for E m j ( X j − X¯ ) 2 . By definition, Var ( X j )  E[ ( X j − µ ) 2 ], so we would Pn 2 like to express j1 m j ( X j − X¯ ) in terms of ( X j − µ ) 2 . A standard trick lets us express it as follows:

fP

n X j1

n j1

g

m j ( X j − X¯ ) 2 

n X j1

m j ( X j − µ ) 2 − m ( X¯ − µ ) 2

(57.8)

Let’s see how you do this. n X j1

m j ( X j − X¯ ) 2  

n X j1 n X j1



n X j1



n X j1



n X j1



n X j1

m j ( X j − µ ) + ( µ − X¯ )



m j (X j − µ)2 +

n X j1

2

m j ( X¯ − µ ) 2 − 2

n X j1

m j ( X j − µ ) 2 + m ( X¯ − µ ) 2 − 2 ( X¯ − µ ) m j ( X j − µ ) 2 + m ( X¯ − µ ) 2 − 2 ( X¯ − µ )

m j ( X j − µ )( X¯ − µ ) n X i1

m j (X j − µ)

X

m j X j − mµ



m j ( X j − µ ) 2 + m ( X¯ − µ ) 2 − 2 ( X¯ − µ ) m ( X¯ − µ ) m j ( X j − µ ) 2 − m ( X¯ − µ ) 2

Now that we’ve established equation (57.8), take expected values on both sides:

 X X n n  2 ¯ m j Var ( X j ) − m Var ( X¯ ) m j ( X j − X )   E   j1  j1

(*)

We need to evaluate Var ( X¯ ) .

Var ( X¯ )  Var * 

Pn

j1

mjXj

+ m , 2 m j ( β + α/m j ) m2 2 j1 m j

Pn β

C/4 Study Manual—17th edition Copyright ©2014 ASM

m2

+

α m

(**)

57.2. NON-UNIFORM EXPOSURES

1119

Substituting this and Var ( X j )  β + α/m j into (*), we obtain

Pn X  X ! 2 n n j1 m j α   2 −β E  −α m j ( X j − X¯ )   mj β + mj m  j1  j1 Pn 2 j1 m j * + + α ( n − 1) β m− m , -

(57.9)

We now use this equation to evaluate the row variances and the variances of the row means. In the case of the row variances, we have E[X j | Θ]  µ (Θ) and Var ( X j | Θ)  v (Θ) /m j . So α  v (Θ) and β  0. From equation (57.9):

X  ni   2 E  m j ( X i j − X¯ i ) | Θ  ( n i − 1) v (Θ)  j1  which explains why we divide the sum by n − 1 in equation (57.5) to obtain an unbiased estimator of v (Θ) . We then take a weighted average of these estimators, weighted by m i , to get an unbiased estimator of v. In the case of the variances of the row means, the unconditional variance of X¯ i , by the conditional variance formula, is Var ( X¯ i )  Var (E[X i | Θ]) + E[Var ( X i | Θ) ]  Var µ (Θ) + E v (Θ) /m i v a+ mi





f

g

where Var ( X i | Θ) was calculated by (**), with β  0 and α  v (Θ) . So apply formula (57.9) with α  v/m i and β  a, to obtain

Pr X  r 2 i1 m i + 2 E  m i ( X¯ i − X¯ )   a * m − + v ( n − 1) m  i1  , and formula (57.6) immediately follows. 1. Determine the credibility given to the experience for each group by using empirical Bayes nonparametric methods. 2. (As discussed above, not a likely exam question.) Determine the credibility weighted mean. Answer:

1. First we calculate the means. 6  0.03 200 6 x¯2   0.06 100 12 µˆ  x¯   0.04 300 x¯1 

The x i j ’s are the claims per individual, the quotients of claims over number in group, or C/4 Study Manual—17th edition Copyright ©2014 ASM

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1120

Policyholder 1 2

xi j mi j xi j mi j

Year 1 0.02 100 0.075 40

Year 2 0.04 100 0.05 60

By formula (57.5) vˆ 

100 (0.02 − 0.03) 2 + 100 (0.04 − 0.03) 2 + 40 (0.075 − 0.06) 2 + 60 (0.05 − 0.06) 2  0.0175 (2 − 1) + (2 − 1)

By formula (57.6)

aˆ 

200 (0.03 − 0.04) 2 + 100 (0.06 − 0.04) 2 − 0.0175 (2 − 1)

1 300 − 300 (2002 + 1002 ) 0.02 + 0.04 − 0.0175  0.00031875  400/3

ˆ ( m i aˆ + vˆ ) , as usual. Note that since this formula divides, Zˆ is The credibility factors are Zˆ i  m i a/ biased even though aˆ and vˆ are unbiased. 200 (0.00031875) 0.06375   0.7846 200 (0.00031875) + 0.0175 0.08125 100 (0.00031875) 0.031875 Zˆ 2    0.6456 100 (0.00031875) + 0.0175 0.049375 Zˆ 1 

2. The credibility weighted mean can now be calculated: µˆ cred 

0.7846 (0.03) + 0.6456 (0.06)  0.04354 0.7846 + 0.6456



All formulas given in this lesson for aˆ may produce non-positive values. If so, no credibility is assigned.

?

Quiz 57-2 For an insurance coverage, you are given the following experience for two group policyholders over three years: Group A B

Number of Members Average losses per member Number of Members Average losses per member

Year 1 20 60 5 30

Year 2 30 55 5 30

Year 3 30 50 10 30

Total 80 54.375 20 30

Calculate the credibility assigned to the experience of Group A using empirical Bayes nonparametric methods. When the number of exposures is uniform in each cell, the non-uniform formulas reduce to the uniform formulas. See the sidebar for a proof. Here is an example: Example 57C (Repeat of Example 57A) An insurance portfolio has three groups, A, B, and C. Each group has 10 exposures. The number of exposures does not vary by year. Claim counts in each year for each group are given in the following table: C/4 Study Manual—17th edition Copyright ©2014 ASM

57.2. NON-UNIFORM EXPOSURES

1121

Reducing the Non-Uniform Exposure Formulas into the Uniform Exposure Formulas Assume there are l exposures in each cell, with n cells (years of exposure) for each group and r groups. Then x i j is the average claims per exposure, which is the x i j of the uniform exposure formula divided by l. To distinguish, let’s use the letter y for average claims per exposure: y i j  x i j /l. Then the general formula gives µˆ

general

Pr Pn i1



j1

yi j

nr

while for the uniform exposures formula gives the same sum with x i j instead of y i j , so µˆ uniform l

µˆ general  ˆ For v: vˆ

general

l ( y i j − y¯ i ) 2

Pr Pn i1

  

j1

Pr

i1 ( n − 1) r X n X

1 r ( n − 1)

l ( x i j − x¯ i ) 2 l2

i1 j1

n r X X ( x i j − x¯ i ) 2 1 r ( n − 1) l i1 j1

so

vˆ uniform l

vˆ general  ˆ the denominator of the general formula is For a, m−m

−1

r X i1

m 2i

Pr  nrl −  nrl −

i1 ( nl )

2

nrl n 2 rl 2  nl ( r − 1) nrl

whereas the numerator is r X i1

m i ( y¯ i − y¯ ) 2 − vˆ general ( r − 1) 

r X nl ( x¯ i − x¯ ) 2 i1

l2



vˆ uniform ( r − 1) l

Dividing the denominator into the numerator, aˆ general 

r X 1 vˆ uniform 2 ¯ ¯ ( x − x ) − i l 2 ( r − 1) nl 2 i1

so aˆ general 

C/4 Study Manual—17th edition Copyright ©2014 ASM

aˆ uniform l2

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1122

The number of credibility units (i.e., the units that are counted in the N of the Bühlmann credibility factor Z  N/ ( N + K ) ) for the general formula is l times the number of credibility units in the uniform formula. Note that the number of credibility units is n (the number of years) in the uniform formula but it is nl in the general formula, so we have v general lv uniform  uniform  lk uniform a a general nl nl n     Z uniform general uniform uniform nl + lk n + k nl + k

k general  Z general

The credibility factors are equal for both formulas

Group

Year 1

Year 2

A B C

1 1 2

0 1 1

Determine the expected third year claim counts for each group by using empirical Bayes nonparametric methods. Answer: When we worked out this example with the uniform exposures formulas, we got µˆ  1, vˆ  1/3 and aˆ  1/12. Using the non-uniform exposures formulas, we must divide the total number of claims by the number of members in the group (10) to obtain number of claims per member. The resulting claim counts per member are Group

Year 1

Year 2

A B C

0.1 0.1 0.2

0 0.1 0.1

ˆ and aˆ using the general formula are (keep in mind that m i j  10, m i  20, and ˆ v, and the estimates of µ, m  60 for all i and j) µA  0.05 µ B  0.1 µ C  0.15 20 (0.05) + 20 (0.1) + 20 (0.15) µˆ   0.1 60 10 (0.1 − 0.05) 2 + 10 (0 − 0.05) 2 + 10 (0.2 − 0.15) 2 + 10 (0.1 − 0.15) 2 1 vˆ   3 30 ˆ we skipped group B because the variance is 0 for that group. In the calculation of v, aˆ 

20 (0.05 − 0.1) 2 + 20 (0.1 − 0.1) 2 + 20 (0.15 − 0.1) 2 − 2 (1/30) 60 −

202 +202 +202 60



1 1200

As we see, vˆ is now 1/30 instead of 1/3, and aˆ is 1/1200 instead of 1/12. There are 20 credibility units for C/4 Study Manual—17th edition Copyright ©2014 ASM

57.2. NON-UNIFORM EXPOSURES

1123

each group instead of 2. Then the premium per exposure is 20 (1/1200) 1  20 (1/1200) + 1/30 3 1 2 5 PCA  (0.05) + (0.1)  3 3 60 PCB  0.1 1 2 7 PCC  (0.15) + (0.1)  3 3 60 Zˆ 

These three results are multiplied by 10 exposures per group to get 5/6 , 1 , and 7/6 as expected third year claim counts for groups A, B, and C respectively, the same answers as we got with the uniform exposures formula. 

57.2.2

Manual premium

This method is only discussed by Loss Models, not by Herzog or Dean. Therefore I don’t expect any exam questions on it. It did not appear on released exams even before 2005 even though you were required to read this material in Loss Models at that time, but it is alleged to have appeared once on a non-released exam before 2005. Now that there are two other options which don’t discuss this topic, it would be unfair to ask a question on this topic. However, since Loss Models et al discusses it, I have included a discussion here for completeness, along with a couple of exercises. Instead of estimating µ, you may wish to assume it equals the manual premium. The formula used to estimate v is the same as before. 2 However, a simpler formula is available for estimating a in this case: aˆ 

r X mi i1

m

( x¯i − µ ) 2 −

r vˆ m

(57.10)

This formula can be used even if there is only one group. For an example of its application, see Loss Models Fourth Edition Example 19.5.

Coverage of this material in the three syllabus options Section 57.1 is required. It is covered in all three syllabus reading options. Section 57.2 is covered in all three syllabus reading options. However, while Loss Models and Dean mention the credibility-weighted mean, Herzog doesn’t mention it. They have not asked any questions on credibility-weighted mean since 2005 (when Herzog was added to the list of options for this material), so I doubt they will ask any future question on it. Subsection 57.2.2 is only covered by Loss Models, not by the other two syllabus options. I do not expect future exam questions on it.

2In case you’re stumped by the parenthetical question in the textbook on page 427 of Loss Models, second paragraph after Exˆ ample 19.4, you cannot use the following formula for v:

Pn i

j1

mi j (xi j − µ)2 n

,

because µ is the collective mean, not the mean for group i, which is unknown and which you are trying to estimate. C/4 Study Manual—17th edition Copyright ©2014 ASM

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1124

Table 57.1: Formula Summary for Empirical Bayes Non-Parametric Estimators

Uniform Exposures x¯

µˆ vˆ

Non-Uniform Exposures x¯

Pr Pn i

r X n X 1 ( x i j − x¯ i ) 2 r ( n − 1)

i1

j1

Pr

i1 ( n i

i1 j1



m i j ( x i j − x¯ i ) 2

r

Pr

1 X vˆ ( x¯ i − x¯ ) 2 − r−1 n

i1

m i ( x¯ i − x¯ ) 2 − vˆ ( r − 1) m − m −1

i1

− 1)

Pr

i1

m 2i

Exercises 57.1. Four policyholders have the following aggregate loss experience under a coverage over a 3 year period: Policyholder

Year 1

Year 2

Year 3

A B C D

3000 3500 1000 2000

2000 3000 2000 2000

2500 2500 2400 2000

The number of exposures does not vary by policyholder or year. You are to use empirical Bayes nonparametric methods to estimate the credibility premium for the fourth year. Determine the credibility premium for policyholder D. 57.2.

Two policyholders have the following number of claims over a 4 year period: Policyholder

Year 1

Year 2

Year 3

Year 4

A B

1 0

0 0

2 0

1 1

The number of exposures does not vary by policyholder or year. You are to use empirical Bayes nonparametric methods to estimate the number of claims in the fifth year. Determine the number of claims projected for policyholder A.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 57

57.3.

1125

Three policyholders have the following aggregate loss experience over a 2 year period: Policyholder

Year 1

Year 2

A B C

600 1000 500

0 1200 1500

The number of exposures does not vary by policyholder or year. No data is available for the third year. You are to use empirical Bayesian nonparametric methods to estimate the credibility premium for the fourth year. Determine the aggregate losses projected for policyholder A. 57.4.

Two individual policyholders have the following number of claims over a 3 year period: Policyholder

Year 1

Year 2

Year 3

A B

36 x − 12

36 x

36 x + 12

Based on this experience, using the empirical Bayes nonparametric method, a credibility factor of 0.8125 is calculated. Determine x. 57.5. [4-S00:15 and 4-F02:11] An insurer has data on losses for four policyholders for seven years. x i j is the loss from the i th policyholder for year j. You are given: 7 4 X X i1 j1

( x i j − x¯ i ) 2  33.60

4 X i1

( x¯ i − x¯ ) 2  3.30

Calculate the Bühlmann credibility factor for an individual policyholder using nonparametric empirical Bayes estimation. (A) (B) (C) (D) (E)

Less than 0.74 At least 0.74, but less than 0.77 At least 0.77, but less than 0.80 At least 0.80, but less than 0.83 At least 0.83

57.6. [4-F00:16] Survival times are available for four insureds, two from Class A and two from Class B. The two from Class A died at times t  1 and t  9. The two from Class B died at times t  2 and t  4. Nonparametric Empirical Bayes estimation is used to estimate the mean survival time for each class. Unbiased estimators of the expected value of the process variance and the variance of the hypothetical means are used. Estimate Z, the Bühlmann credibility factor. (A) 0

(B) 2/19

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 4/21

(D) 8/25

(E) 1 Exercises continue on the next page . . .

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1126

57.7.

[4-F03:15] You are given total claims for two policyholders: Policyholder X Y

1 730 655

2 800 650

Year

3 650 625

4 700 750

Using the nonparametric empirical Bayes method, determine the Bühlmann credibility premium for Policyholder Y. (A) 655

(B) 670

(C) 687

(D) 703

(E) 719

Use the following information for questions 57.8 and 57.9: [1999 C4 Sample:31] You wish to determine the nature of the relationship between sales and the number of radio advertisements broadcast. Data collected on four consecutive days is shown below. Day

Sales

1 2 3 4

10 20 30 40

Number of Radio Advertisements 2 2 3 3

You perform an Empirical Bayes nonparametric credibility analysis by treating the first two days, on which two radio advertisements were broadcast, as one group, and the last two days, on which three radio advertisements were broadcast, as another group. You are estimating the number of sales per day. ˆ of the data from each group. Determine the estimated credibility, Z,

57.8.

You are estimating the number of sales per radio advertisement. ˆ of the data from each group. Determine the estimated credibility, Z,

57.9.

57.10. Two policyholders have the following sizes and claim experience over 2 years: Policyholder A B

Total claims No. in group Total claims No. in group

Year 1 5000 50 8000 100

Year 2 6000 60 7750 110

You are to use empirical Bayes nonparametric methods to estimate the credibility premium for the third year. Determine the credibility factor for policyholder A.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 57

1127

57.11. Two policyholders have the following sizes and claim experience over 2 years: Policyholder A B

Total claims No. in group Total claims No. in group

Year 1 5000 50 8000 50

Year 2 6000 50 7500 50

There are 50 members of policyholder A in the third year. You are to use empirical Bayes nonparametric methods to estimate the credibility premium for the third year. Determine the aggregate losses projected for policyholder A. 57.12. Two group policyholders have the following sizes and claim experience over 3 years: Policyholder A B

Total claims No. in group Total claims No. in group

Year 1 1000 10 — —

Year 2 1800 12 2000 4

Year 3 1700 14 3600 10

You are to use an empirical Bayes nonparametric estimator to estimate the credibility premium for the fourth year. There are 15 members of policyholder B in year 4. Determine the aggregate losses projected for policyholder B. 57.13. [4-F04:17] You are given the following commercial automobile policy experience: Losses Number of Automobiles Losses Number of Automobiles Losses Number of Automobiles

Company I II III

Year 1 50,000 100 ? ? 150,000 50

Year 2 50,000 200 150,000 500 ? ?

Year 3 ? ? 150,000 300 150,000 150

Determine the nonparametric empirical Bayes credibility factor, Z, for Company III. (A) (B) (C) (D) (E)

Less than 0.2 At least 0.2, but less than 0.4 At least 0.4, but less than 0.6 At least 0.6, but less than 0.8 At least 0.8

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1128

57.14. [C-S05:25] You are given: Total Claims Number in Group Average Total Claims Number in Group Average Total Claims Number in Group Average

Group 1

Year 1

2

16,000 100 160

Year 2 10,000 50 200 18,000 90 200

Year 3 15,000 60 250

Total 25,000 110 227.27 34,000 190 178.95 59,000 300 196.67

You are also given that aˆ  651.03. Use the nonparametric empirical Bayes method to estimate the credibility factor for Group 1. (A) 0.48

(B) 0.50

(C) 0.52

(D) 0.54

(E) 0.56

57.15. You are given three group policyholders: I, II, and III. Over a period of one year, the number of claims submitted by members in the groups is: Number of Claims 0 1 2 Total Claims

Number of Members Group I Group II Group III 15 7 3 25

13 1 1 15

8 2 0 10

Use the nonparametric empirical Bayes method to estimate the credibility factor to apply for predicting claim counts for Group I. 57.16. You are given two group policyholders: I and II. Group I has 5 members. In 2009, they submitted aggregate claims of 500, 1000, 0, 0, and 500 respectively. Group II has 15 members. In 2009, 5 of them submitted aggregate claims of 0, 5 of them each submitted aggregate claims of 1000, and 5 of each submitted aggregate claims of 2000. Use the nonparametric empirical Bayes method to estimate the credibility factor to apply for predicting aggregate claims for Group I.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 57

1129

The following exercises use the credibility-weighted mean. Herzog does not discuss this topic although Loss Models and Dean do. I do not expect any exam questions on it. However, you may wish to try calculating the credibility factors in these exercises. 57.17. [4-S01:32] You are given the following experience for two insured groups: Group 1 2 Total

Number of members Average loss per member Number of members Average loss per member Number of members Average loss per member 3 2 X X i1 j1 2 X i1

1 8 96 25 113

Year 2 3 12 5 91 113 30 20 111 116

Total 25 97 75 113 100 109

m i j ( x i j − x¯ i ) 2  2020

m i ( x¯ i − x¯ ) 2  4800

Determine the nonparametric Empirical Bayes credibility premium for group 1, using the method that preserves total losses. (A) 98

(B) 99

(C) 101

(D) 103

(E) 104

57.18. [4-F01:30] You are making credibility estimates for regional rating factors. You observe that the Bühlmann-Straub nonparametric empirical Bayes method can be applied, with rating factor playing the role of pure premium. x i j denotes the rating factor for region i and year j, where i  1, 2, 3 and j  1, 2, 3, 4. Corresponding to each rating factor is the number of reported claims, m i j , measuring exposure. You are given: i 1 2 3

mi 

4 X

mi j

j1

50 300 150

4 1 X x¯ i  mi j xi j mi

4  2 1X vˆ i  m i j x i j − x¯ j 3

m i ( x¯ i − x¯ ) 2

1.406 1.298 1.178

0.536 0.125 0.172

0.887 0.191 1.348

j1

j1

Determine the credibility estimate of the rating factor for region 1 using the method that preserves ¯ i1 m i x i .

P3

(A) 1.31

(B) 1.33

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 1.35

(D) 1.37

(E) 1.39

Exercises continue on the next page . . .

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1130

57.19. [4-F00:27] You are given the following information on towing losses for two classes of insureds, adults and youths: Exposures Year 1996 1997 1998 1999 Total

Adult 2000 1000 1000 1000 5000

Youth 450 250 175 125 1000

Total 2450 1250 1175 1125 6000

Pure Premium Year 1996 1997 1998 1999 Weighted average

Adult 0 5 6 4 3

Youth 15 2 15 1 10

Total 2.755 4.400 7.340 3.667 4.167

You are also given that the estimated variance of the hypothetical means is 17.125. Determine the nonparametric empirical Bayes credibility premium for the youth class, using the method that preserves total losses. (A) (B) (C) (D) (E)

Less than 5 At least 5, but less than 6 At least 6, but less than 7 At least 7, but less than 8 At least 8

The following three exercises are manual premium problems. Herzog and Dean do not discuss this topic; only Loss Models mentions it. Hence I don’t expect it on the exam. 57.20. The following data are available for a group policyholder: Total claims Number in group

Year 1

Year 2

Year 3

35,000 70

36,000 90

— 100

The manual rate per exposure is 500 per year. Estimate the total credibility premium for year 3 using empirical Bayes nonparametric methods.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 57

1131

57.21. Three group policyholders have the following sizes and claim experience over 2 years: Policyholder A B C

Total claims No. in group Total claims No. in group Total claims No. in group

Year 1 20,000 100 22,000 80 — —

Year 2 24,000 120 23,000 100 12,000 50

The manual rate per group member per year is 200. You are to use the manual rate in conjunction with empirical Bayes nonparametric estimation methods to estimate the credibility premium in year 3. Determine the estimate of the variance of the hypothetical means. 57.22. Three individual policyholders have the following experience over 4 years: Policyholder

Year 1

Year 2

Year 3

Year 4

A B C

5 3 2

4 7 5

6 6 1

5 8 4

The manual rate per year is 5. You are to use the manual rate in conjunction with empirical Bayes nonparametric estimation methods to estimate the credibility premium in year 5. Determine the estimate of the variance of the hypothetical means. Additional released exam questions: C-F05:11, C-F06:27, C-S07:11

Solutions 57.1. Notice that the number of exposures per policyholder can be any constant, not just 1. The credibility premium is calculated for the entire policyholder, not for each exposure, so it is appropriate to use n  3 in the formula for Z. The means and variances of each policyholder are 3000 + 2000 + 2500  2500 3 3500 + 3000 + 2500  3000 xB  3 1000 + 2000 + 2400 xC   1800 3 x D  2000 xA 

By formula (57.1), µˆ  x¯  By formula (57.2), vˆ  C/4 Study Manual—17th edition Copyright ©2014 ASM

(3000 − 2500) 2 + (2000 − 2500) 2 + 02

 250,000 2 (3500 − 3000) 2 + 0 + (2500 − 3000) 2 vB   250,000 2 (1000 − 1800) 2 + (2000 − 1800) 2 + (2400 − 1800) 2 vC   520,000 2 vD  0 vA 

2500 + 3000 + 1800 + 2000  2,325 4

250,000 + 250,000 + 520,000 + 0  255,000 4

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1132

By formula (57.3), aˆ 

(2500 − 2325) 2 + (3000 − 2325) 2 + (1800 − 2325) 2 + (2000 − 2325) 2

3 1752 + 6752 + 5252 + 3252 255,000  −  204,166 23 3 3



255,000 3

The credibility factor and premium for policyholder D are then Zˆ 

3 (204,166 32 )

3 (204,166 23 ) + 255,000

 0.70605

Cred premium  2325 − 0.70605 (325)  2095.53 57.2.

First we calculate the means and variances for each policyholder. 0 + (0 − 1) 2 + (2 − 1) 2 + 0 2  3 3 3 (0 − 14 ) 2 + (1 − 14 ) 2 1 v2   3 4

1+0+2+1 1 4 0+0+0+1 1  x¯ 2  4 4 x¯ 1 

v1 

By equation (57.1), µˆ  x¯  By equation (57.2), vˆ 

2 3

1+ 2 + 2

By equation (57.3),

5 2 1 5 − + 8 4 8 The credibility factor and premium are then



aˆ  1 −



Zˆ 



1 4

2

1 4



5 8



11 24



11/24 16 1   4 96 6

4 (1/6) 16  4 (1/6) + 11/24 27

5 5 16 61 PC  + 1−  8 27 8 72



57.3.



The policyholder means and variances are x¯ 1  300 x¯ 2  1100 x¯ 3  1000

By formulas (57.1), (57.2), and (57.3), 300 + 1100 + 1000  800 3 180,000 + 20,000 + 500,000 700,000 vˆ   3 3

µˆ  x¯ 

C/4 Study Manual—17th edition Copyright ©2014 ASM

v¯ 1  180,000 v¯ 2  20,000 v¯ 3  500,000

EXERCISE SOLUTIONS FOR LESSON 57

aˆ 

1133

(300 − 800) 2 + (1100 − 800) 2 + (1000 − 800) 2 2



700,000/3 220,000  2 3

The credibility factor and premium are then Zˆ 

22 2 (220,000/3)  2 (220,000/3) + 700,000/3 57

PC  800 − 57.4.

22 57 (500)

 607.02

ˆ Let’s back out k. 3

 0.8125 3 + kˆ 3 (1 − 0.8125)  0.8125kˆ 3 (0.1875) kˆ   0.692308 0.8125 The variance of the first policyholder is 0 and the variance of the second policyholder is v2  2 (12) 2 /2  144 so the estimator for v is 72. Then vˆ kˆ  aˆ aˆ 

72  104 0.692308

The mean of the first policyholder is 36 and the mean of the second policyholder is x, so 36 − x aˆ  2 2

!2

36 − x 104  2 2

!2

vˆ 36 − x − 2 n 2

!2

− 24

− 24

!2

36 − x  64 2 36 − x  ±8 2 x  20, 52 57.5. This is a case with uniform exposures, m i j  1 for all i and j, r  4, n  7. For the credibility factor ˆ By formulas (57.2) and (57.3), Z, we don’t need µ. vˆ 

33.60

(6)(4)

 1.4

3.30 1.4 −  0.9 3 7 7aˆ 6.3 9 Zˆ     0.8182 7 aˆ + vˆ 7.7 11 aˆ 

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1134

57.6. Unlike most of the exercises in this lesson, where exposure groups are number of insured-years, here each person is an exposure group consisting of a single unit (regardless of the number of years in the study). Thus this is a case of uniform exposures (each of the 4 exposure groups has one life), and we can use the simpler formulas, (57.1), (57.2), and (57.3) with n  2, r  2. x¯ A  x¯ B  x¯  vˆ  aˆ 

1+9 5 2 2+4 3 2 5+3 4 2 (1 − 5) 2 + (9 − 5) 2 + (2 − 3) 2 + (4 − 3) 2  17 2 (5 − 4) 2 + (3 − 4) 2 17 −  −6.5 1 2

Since aˆ is negative, no credibility is assigned. (A) 57.7.

This is a case with uniform exposures. 730 + 800 + 650 + 700  720 4 655 + 650 + 625 + 750  670 x¯ 2  4 (730 − 720) 2 + (800 − 720) 2 + (650 − 720) 2 + (700 − 720) 2 vˆ 1   3933 13 3 (655 − 670) 2 + (650 − 670) 2 + (625 − 670) 2 + (750 − 670) 2 vˆ 2   3016 23 3 x¯ 1 

By formulas (57.1), (57.2), and (57.3), 720 + 670  695 2 3933 13 + 3016 32 vˆ   3475 2

µˆ  x¯ 

aˆ  (720 − 695) 2 + (670 − 695) 2 − Zˆ 

4 (381.25)  0.305 4 (381.25) + 3475

3475  381.25 4

PC  695 + 0.305 (670 − 695)  687.375 57.8.

(C)

Perhaps setting this up the way our other exercises are set up will make it clearer: Group

Day 1

Day 2

Average

2 ads per day 3 ads per day

10 30

20 40

15 35

This is a uniform exposure problem. The overall mean is µˆ  x¯  C/4 Study Manual—17th edition Copyright ©2014 ASM

15 + 35  25 2

EXERCISE SOLUTIONS FOR LESSON 57

1135

The estimator for v is v1  (10 − 15) 2 + (20 − 15) 2  50

v2  (30 − 35) 2 + (40 − 35) 2  50 vˆ  50 The estimator for a is

50  175 2 There are 2 days for each group, so n  2. The credibility for each group is aˆ  (15 − 25) 2 + (35 − 25) 2 −

Zˆ 

57.9.

350 7 2aˆ   2aˆ + vˆ 400 8

We set it up the way our other exercises are set up: Group A B

Radio advertisements Sales Radio advertisements Sales

Day 1 2 10 3 30

Day 2 2 20 3 40

We calculate the mean experience for each group: 10 + 20  7.5 2+2 30 + 40 35 x¯ 2   3+3 3 x¯ 1 

We calculate the overall mean, the estimator for µ: µˆ  x¯ 

10 + 20 + 30 + 40 100   10 2+2+3+3 10

We calculate the x i j , the average claims per member in each group for each year, so that we can calculate ˆ v: Group

Day 1

Day 2

A

Sales per advertisement

10 5 2

20  10 2

B

Sales per advertisement

30  10 3

40 3

We calculate vˆ using formula (57.5): 2 (5 − 7.5) 2 + 2 (10 − 7.5) 2 + 3 10 −



vˆ  The denominator of aˆ is

2−1+2−1

10 − C/4 Study Manual—17th edition Copyright ©2014 ASM

 35 2 3

42 + 6 2  4.8 10

+3



40 3



 35 2 3



125 6

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1136

We calculate aˆ using formula (57.6): aˆ 

4 (7.5 − 10) 2 + 6



35 3

4.8

− 10

2



125 6

125/6 5   4.8 24

!

125 6

!

The credibility factors are: 5 4 24 4aˆ Zˆ A   5 4aˆ + vˆ 4 24 +



Zˆ B 

5 6 24 6aˆ   5 6aˆ + vˆ 6 24 +

24 24



5 11



5 9

 24 24

57.10. By formula (57.4),

5000 + 6000 + 8000 + 7750  83.59375 50 + 60 + 100 + 110 The average numbers of claims per exposure (x i j ) are: µˆ 

Policyholder A B The group means are

xi j mi j xi j mi j

Year 1 100 50 80 100

x¯ 1  100

x¯ 2  75

Year 2 100 60 70.4545 110

By formula (57.5), 50 (02 ) + 60 (02 ) + 100 (80 − 75) 2 + 110 (70.4545 − 75) 2  2386.36 2

vˆ  By formula (57.6), aˆ 

110 (100 − 83.59375) 2 + 210 (75 − 83.59375) 2 − 2386.36 (1)  295.97 320 − (1102 + 2102 ) /320

The credibility factor is

Zˆ 

110 (295.97)  0.93171 110 (295.97) + 2386.36

57.11. Since exposures are 50 in all cells, the uniform exposures formulas can be used. 11,000  5500 2 15,500 x¯ B   7750 2 5500 + 7750 µˆ  x¯   6625 2 v A  (5000 − 5500) 2 + (6000 − 5500) 2  500,000

x¯ A 

v B  (8000 − 7750) 2 + (7500 − 7750) 2  125,000

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 57

vˆ 

1137

500,000 + 125,000  312,500 2

aˆ  (6625 − 5500) 2 + (7750 − 6625) 2 − Zˆ 

2 (2,375,000)  0.938272 2 (2,375,000) + 312,500

312,500  2,375,000 2

P A  0.938272 (5500) + (1 − 0.938272)(6625)  5569.44 We will also work this out with the general formulas. x¯ A  x¯ B  µˆ  vˆ  aˆ  Zˆ 

11,000  110 100 15,500  155 100 26,500  132.5 x¯  200 50 (100 − 110) 2 + 50 (120 − 110) 2 + 50 (160 − 155) 2 + 50 (150 − 155) 2  6250 2 2 2 100 (110 − 132.5) + 100 (155 − 132.5) − 6250 95,000   950 100 200 − (1002 + 1002 ) /200 100 (950)  0.938272 100 (950) + 6250

P A  50 0.938272 (110) + (1 − 0.938272)(132.5)  5569.44





57.12. We calculate the mean experience for each group policyholder 1000 + 1800 + 1700 4500   125 10 + 12 + 14 36 2000 + 3600 x¯ 2   400 10 + 4 x¯ 1 

We calculate the overall mean, which is the estimate for µ: µˆ 

1000 + 1800 + 1700 + 2000 + 3600 10,100   202 10 + 12 + 14 + 4 + 10 50

We calculate the x i j , the average claims per member in each group for each year, so that we can calculate ˆ v: Policyholder

Year 1

Year 2

Year 3

A

Claims per member

1000  100 10

1800  150 12

1700  121.4286 14

B

Claims per member



2000  500 4

3600  360 10

We use formula (57.5) to estimate v: vˆ 

10 (100 − 125) 2 + 12 (150 − 125) 2 + 14 (121.4286 − 125) 2 + 4 (500 − 400) 2 + 10 (360 − 400) 2 2+1

C/4 Study Manual—17th edition Copyright ©2014 ASM

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1138

10 (252 ) + 12 (252 ) + 14 (3.57142 ) + 4 (1002 ) + 10 (402 )  23,309.52 3



We calculate aˆ using formula (57.6). First the denominator: m − m −1

X

m 2i  50 −

2 1 50 (36

+ 142 )  20.16

So that aˆ is

36 (772 ) + 14 (1982 ) − 23,309.52 (1)  36,656.27 20.16 Policyholder B has 14 exposure units (4 members in the second year and 10 in the third year), so B’s credibility factor is 14aˆ 14 (36,656.27) Zˆ    0.95655 14 aˆ + vˆ 14 (36,656.27) + 23,309.52 aˆ 

Using the unweighted mean, projected aggregate losses are PC  15 202 + 0.95655 (400 − 202)  5871





57.13. Missing data, data with question marks, must be excluded, so we only have 2 years of experience from each company. 100,000  333.3333 300 300,000 x¯ 2   375 800 300,000 x¯ 3   1500 200 700,000  538.4615 x¯  1300 100 (500 − 333.3333) 2 + 200 (250 − 333.3333) 2 + 500 (300 − 375) 2 x¯ 1 

vˆ 

+ 300 (500 − 375) 2 + 50 (3000 − 1500) 2 + 150 (1000 − 1500) 2

 53,888,889

(2 − 1) + (2 − 1) + (2 − 1)

3002 + 8002 + 2002  707.6923 1300 300 (333.3333 − 538.4615) 2 + 800 (375 − 538.4615) 2

Denominator of aˆ  1300 − aˆ 

+ 200 (1500 − 538.4615) 2 − 2 (53,888,889)

707.6923 111,132,479   157,035.02 707.6923 200 (157,035.02) Zˆ   0.3682 200 (157,035.02) + 53,888,889

(B)

57.14. They made it easy for you by calculating aˆ and all the averages, and only giving you 2 groups with 2 years apiece. The formula for vˆ yields vˆ 

50 (200 − 227.27) 2 + 60 (250 − 227.27) 2 + 100 (160 − 178.95) 2 + 90 (200 − 178.95) 2 4−2

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 57

1139

 71,985.65 The credibility factor for Group 1 is based on 110 observations (total number in group), and is Zˆ 

110 (651.03) 110 aˆ   0.4987 110 aˆ + vˆ 110 (651.03) + 71,985.64

(B)

57.15. The mean number of claims in each group is 7 + 3 (2)  0.52 25 The estimate of the overall mean is x¯ I 

x¯ II 

1 + 2 (1)  0.2 15

x¯ III 

2  0.2 10

10 (1) + 4 (2)  0.36 50 For the estimated expected process variance, we sum up, over all 50 policyholders, the square difference between claims and mean claims for the group. Then we divide by the sum of the number of policyholders in each group minus 1. µˆ  x¯ 

15 (0 − 0.52) 2 + 7 (1 − 0.52) 2 + 3 (2 − 0.52) 2 + 13 (0 − 0.2) 2 + (1 − 0.2) 2 + (2 − 0.2) 2 + 8 (0 − 0.2) 2 + 2 (1 − 0.2) 2 vˆ  (25 − 1) + (15 − 1) + (10 − 1)  0.388085 For the estimated variance of hypothetical means, we sum the square differences between the group means and the overall means, and make the usual adjustment for vˆ ( r − 1) . There are three groups, so r  3. aˆ 

25 (0.52 − 0.36) 2 + 25 (0.2 − 0.36) 2 − 2 (0.388085) 0.503830   0.016253 31 50 − (252 + 152 + 102 ) /50

The credibility factor for Group I, which has 25 units of experience (25 members for 1 year), is Zˆ I 

25aˆ 25 (0.016253)   0.5115 25 aˆ + vˆ 25 (0.016253) + 0.388085

57.16. Mean aggregate claims are 2000/5  400 for Group I and 15,000/15  1000 for Group II. Then 5 (400) + 15 (1000)  850 20 The estimate of v is obtained by summing square differences from the group mean for all 20 policyholders: µˆ  x¯ 

2 (0 − 400) 2 + 2 (500 − 400) 2 + (1000 − 400) 2 + 5 (0 − 1000) 2 + 5 (1000 − 1000) 2 + 5 (2000 − 1000) 2 (5 − 1) + (15 − 1)  594,444

vˆ 

The estimate of a is obtained by summing square differences of group means from µ, with adjustments. aˆ 

5 (400 − 850) 2 + 15 (1000 − 850) 2 − 594,444 755,556   100,741 7.5 20 − (52 + 152 ) /20

The credibility factor for Group I, with 5 units of experience, is

vˆ 594,444 kˆ    5.900716 aˆ 100,741 5 5 Zˆ    0.4587 ˆ 10.900716 5+k C/4 Study Manual—17th edition Copyright ©2014 ASM

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1140

57.17. By formula (57.5), vˆ  By formula (57.6), aˆ 

2020  505 4

4800 − 505

100 −

252 +752 100



4295  114.5333 37.5

The credibility factors, credibility-weighted mean, and credibility premiums are 25 (114.5333)  0.8501 25 (114.5333) + 505 75 (114.5333)  0.9445 Zˆ 2  75 (114.5333) + 505 0.8501 (97) + 0.9445 (113) x¯ CRED  0.8501 + 0.9445 189.1829   105.4208 1.7945 PC  105.4208 + 0.8501 (97 − 105.4208)  98.2625 Zˆ 1 

(A)

ˆ denominator should be (4 − 1) + (4 − 1) + (4 − 1)  9, whereas vˆ i 57.18. Since there are n  4 years, v’s has 3 in the denominator, so we add them up and divide by 3: vˆ 

0.536 + 0.125 + 0.172  0.2777 3

There are r  3 regions. The denominator of aˆ is 500 −

502 +3002 +1502 500

 270. Then

0.887 + 0.191 + 1.348 − 2 (0.2777)  0.006928 270 50 (0.006928)  0.5551  50 (0.006928) + 0.2777 300 (0.006928)   0.8822 300 (0.006928) + 0.2777 150 (0.006928)   0.7892 150 (0.006928) + 0.2777 0.5551 (1.406) + 0.8822 (1.298) + 0.7892 (1.178)  0.5551 + 0.8822 + 0.7892 2.8551   1.2824 2.2264  1.2824 + 0.5551 (1.406 − 1.2824)  1.3510 (C)

aˆ  Z1 Z2 Z3 x¯ CRED

PC

57.19. Pure premium is aggregate losses per exposure. The total pure premium shown in the table is the ) +2 (250) weighted average of the pure premiums for adults and youths. For example, for 1997, 5 (10001250  4.400. CRED So the x¯ i ’s are 3 and 10, and x¯  4.167. Let P2 denote the credibility premium for the youth class that we are seeking. 2000 (32 ) + 1000 (22 ) + 1000 (32 ) + 1000 (12 ) + 450 (52 ) + 250 (82 ) + 175 (52 ) + 125 (92 ) vˆ   12,291.67 6 C/4 Study Manual—17th edition Copyright ©2014 ASM

QUIZ SOLUTIONS FOR LESSON 57

1141

5000 (3 − 4.167) 2 + 1000 (10 − 4.167) 2 − 12,291.67 6000 − (50002 + 10002 ) /6000 28,541.67   17.125 1666.67 5000 (17.125)  0.874468  5000 (17.125) + 12,291.67 1000 (17.125)  0.582153  1000 (17.125) + 12,291.67 0.874468 (3) + 0.582153 (10)   5.7976 0.874468 + 0.582153  5.7976 + 0.582153 (10 − 5.7976)  8.2440 (E)

aˆ 

Zˆ 1 Zˆ 2 x¯ CRED PC 57.20.

x¯  443.75 vˆ  70 (56.252 ) + 90 (43.752 )  393,750  1  a˜  160 (56.252 ) − 393,750  703.125 160 160 (703.125) 2 Zˆ   160 (703.125) + 393,750 9 PC  100 (500 − 56.25Zˆ )  48,750 57.21. 45,000 12,000 44,000  200 x¯ 2   250 x¯ 3   240 220 180 50   1 vˆ  100 (02 ) + 120 (02 ) + 80 (252 ) + 100 (202 ) + 50 (02 )  45,000 2  1  a˜  220 (02 ) + 180 (502 ) + 50 (402 ) − 3 (45,000)  877 79 450

x¯ 1 

57.22. x¯ 1  5 x¯ 2  6 x¯ 3  3  26 1 2 vˆ  (0 + 12 + 12 + 02 ) + (32 + 12 + 02 + 22 ) + (12 + 22 + 22 + 12 )  9 9   1 17 a˜  4 (12 ) + 4 (22 ) − 3 ( 26 9 )  12 18

Quiz Solutions 57-1. µA  0.25 vA  C/4 Study Manual—17th edition Copyright ©2014 ASM

µ B  1.25

3 (0.252 )

3

+

0.752

µˆ  0.75

 0.25

57. EMPIRICAL BAYES NON-PARAMETRIC METHODS

1142

0.252 + 1.252 + 2 (0.752 ) 11  3 12   7 3 11  vˆ  0.5 + 12 12 12

vB 

aˆ  (0.25 − 0.75) 2 + (1.25 − 0.75) 2 − Zˆ 

7/12  0.354167 4

4aˆ 1.41667   0.708333 4aˆ + vˆ 1.41667 + 7/12

PC  0.708333 (0.25) + (1 − 0.708333)(0.75)  0.395833 57-2. 80 (54.375) + 20 (30)  49.5 100 20 (60 − 54.375) 2 + 30 (55 − 54.375) 2 + 30 (50 − 54.375) 2 + 0 + 0 + 0 vˆ   304.6875 4

µˆ 

! −1

  202 + 802 80 (54.375 − 49.5) 2 + 20 (30 − 49.5) 2 − 304.6875 100 9201.5625   287.5488 32 80 (287.5488) Zˆ   0.9869 80 (287.5488) + 304.6875 aˆ  100 −

C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 58

Empirical Bayes Semi-Parametric Methods Reading: Loss Models Fourth Edition 19.3 or SN C-24-05 2.2 or Introduction to Credibility Theory 6.6.4, 7.4 This topic is called Empirical Bayesian Semi-Parametric Estimation by Herzog. Exam questions on it are frequent.

58.1

Poisson model

If we make some assumptions about the conditional distribution of losses, the model, we can simplify the estimation of v and a. If the model is assumed to have a Poisson distribution, the hypothetical mean and process variance are equal, so that their expected values µ and v are equal. This means that there is no need to estimate v via ¯ Even if we have only one year of data, we will still be able to calculate equation (57.5); instead, vˆ  µˆ  x. the credibility estimate. aˆ is estimated in the usual way. Uniform exposures—1 year experience Let’s start with uniform exposures. By formula (57.3), aˆ is the ˆ sample variance of the row means minus v/n. In particular, if n  1, then aˆ is the sample variance minus ˆ and the formulas are v, µˆ  x¯ vˆ  x¯

(58.1) (58.2)

2

aˆ  s − vˆ

(58.3)

As usual, the sample variance is calculated in an unbiased fashion as s2 

X ( X¯ − X¯ ) 2 i r−1

where r is the number of policyholders. That is how it is calculated in Loss Models and in Dean. Herzog, on the other hand, divides by r instead of r − 1, resulting in a biased estimator. He does this in Chapter 6 and again in Chapter 7. In all his examples he divides by r. At the end of Chapter 7, however, he mentions that his estimator is biased and that he could’ve used an unbiased estimator instead by dividing by r − 1. (See the last sentence starting on page 127.) So what will exam questions expect? I think exam questions will avoid the issue by using large r, so it doesn’t make much of a difference.1 aˆ estimated using empirical Bayes semi-parametric methods may be non-positive. In this case the method fails, and no credibility is given.

1Another peculiarity in Herzog is that he repeats exactly the same example from Section 6.6.4 in Section 7.4. It is not clear what is added by this repetition. C/4 Study Manual—17th edition Copyright ©2014 ASM

1143

58. EMPIRICAL BAYES SEMI-PARAMETRIC METHODS

1144

Example 58A The distribution of policyholders by number of claims in a year is as follows: Number of Claims

Number of Policyholders

0 1 2 3

461 30 7 2

Total

500

For each policyholder, the number of claims is assumed to follow a Poisson distribution. You are to use empirical Bayes semi-parametric methods, with an unbiased estimator for the variance of the hypothetical means, to determine credibility. Determine the number of claims predicted for the second year for each policyholder. Answer: By equations (58.1) and (58.2), vˆ  µˆ 

461 (0) + 30 (1) + 7 (2) + 2 (3)  0.1 500

By equation (58.3), 461 (0.12 ) + 30 (0.92 ) + 7 (1.92 ) + 2 (2.92 ) 71   0.1423 499 499 aˆ  0.1423 − 0.1  0.0423

s2 

ˆ ( aˆ + vˆ ) , or The credibility factor, since there’s only one year of experience for each insured, is Zˆ  a/ 0.0423 Zˆ   0.2972 0.1423



The estimated number of claims in the next year would be 0.2972x i1 + (1 − 0.2972)(0.1) with x i1 equal to the number of claims in the first year: 0, 1, 2, or 3. Claims observed in first year

Claims predicted for second year

0 1 2 3

0.0703 0.3675 0.6647 0.9619

Notice that n  1, not 500. All 500 observations are used to estimate µ, v, and a. However, credibility is calculated for only one policyholder. Different policyholders may have different credibility premiums. Since there’s only one year of experience for the policyholder we’re calculating credibility for, n  1.

C/4 Study Manual—17th edition Copyright ©2014 ASM

58.1. POISSON MODEL

?

1145

Quiz 58-1 Claim counts follow a Poisson distribution. One year of experience from 50 policyholders is Number of Claims

Number of Policyholders

0 1 2

44 4 2

You are to use empirical Bayes semi-parametric methods to determine credibility. Determine the number of claims predicted in the second year for a policyholder with one claim in the first year. Calculator Tip Mean and variance can be computed on the TI-30XS/B calculator by entering the number of claims in one column, the frequencies in another, and then checking the statistical registers, as discussed on page 689. However, square statistic 3 Sx rather than statistic 4 σx to obtain the variance, unless you want to follow Herzog.

Uniform exposures—several years of experience experience.

Now let’s do an example with more than one year of

Example 58B The number of claims observed for 500 policyholders over a 3 year period is as follows: Number of Claims Over a 3-Year Period

Number of Policyholders

0 1 2 3

250 163 64 23

For each policyholder, the number of claims per year is assumed to follow a Poisson distribution. You are to use empirical Bayes semiparametric methods. Determine the number of claims predicted in the fourth year for each policyholder. Answer: In this question, you are not given the experience for each year, so you cannot calculate vˆ using ˆ equation (57.2). Fortunately, there is no need to, since with the Poisson assumption vˆ  µ. The easier way to work this out is to treat the 3-year period as a single unit having a Poisson parameter 3 times the annual parameter. When we have Z and calculate the credibility premium, we divide it by 3 to obtain a 1-year projection. We calculate µˆ and vˆ by equations (58.1) and (58.2). µˆ  vˆ  x¯ 

163 (1) + 64 (2) + 23 (3)  0.72 500

By formula (58.3), 500 163 (12 ) + 64 (22 ) + 23 (32 ) s  − 0.722  0.735070 499 500 2

aˆ  0.735070 − 0.72  0.015070 C/4 Study Manual—17th edition Copyright ©2014 ASM

Then the credibility factor is

Ẑ = 0.015070/0.735070 = 0.020502

and the projected number of claims is

Claims observed     Projected claims
in years 1–3        in year 4
     0          [0.72 + 0.020502(0 − 0.72)]/3 = 0.23508
     1          [0.72 + 0.020502(1 − 0.72)]/3 = 0.24191
     2          [0.72 + 0.020502(2 − 0.72)]/3 = 0.24875
     3          [0.72 + 0.020502(3 − 0.72)]/3 = 0.25558

The alternative is to treat each year as a unit. Then μ̂ = 0.72/3 = 0.24, 1/3 of what we had before, and v̂ = 0.24. The expected value of the variance of the X̄ᵢ's is a + v/3, as we discussed before equation (57.3), so that setting â = s² − v̂/3, where s² is the unbiased sample variance (as usual), results in an unbiased estimator for a. The X̄ᵢ's are 0 for 250 policyholders, 1/3 for 163 policyholders, 2/3 for 64 policyholders, and 1 for 23 policyholders. Since these X̄ᵢ's are 1/3 of the X̄ᵢ's used in the first method, their sample variance s² is 1/9 of the sample variance computed in the first method, or 0.735070/9 = 0.081674. So

â = 0.081674 − 0.24/3 = 0.001674
Ẑ = 3â/(3â + v̂) = 3(0.001674)/(3(0.001674) + 0.24) = 0.020502

the same as before.
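That the two unit choices agree can be checked numerically; the following throwaway Python sketch (names ours, not the manual's) runs both routes on this example.

    counts = [0, 1, 2, 3]
    policyholders = [250, 163, 64, 23]
    n = sum(policyholders)
    xbar3 = sum(c * p for c, p in zip(counts, policyholders)) / n      # 0.72
    s2_3 = sum(p * (c - xbar3) ** 2
               for c, p in zip(counts, policyholders)) / (n - 1)
    z_block = (s2_3 - xbar3) / s2_3          # 3-year block as the unit

    vhat = xbar3 / 3                          # per-year unit instead
    s2_1 = s2_3 / 9                           # means are 1/3 as large
    ahat = s2_1 - vhat / 3                    # since E[s^2] = a + v/3
    z_year = 3 * ahat / (3 * ahat + vhat)
    print(round(z_block, 6), round(z_year, 6))   # both print 0.020502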



Now, suppose you were given year-by-year experience. For example, suppose you were given the data of Example 57A. Now you can calculate v̂ using formula (57.2). Should you set v̂ = μ̂, or should you calculate v̂ from formula (57.2) and set μ̂ = v̂? Notice that this is the situation in the example in the Dean study note, page 29, and he takes it for granted that you set v̂ = x̄ rather than setting μ̂ equal to the estimator for v. If μ̂ differs from v̂ significantly, it puts the Poisson assumption into question. In any case, an exam question cannot allow this ambiguity, and presumably will never give you enough data to estimate v̂ directly.

Non-uniform exposures   Let's now consider a case with non-uniform exposures. The Poisson model implies that v̂ = μ̂, but â must be estimated using equation (57.6). This type of question has appeared on at least three recent exams.

Example 58C  You examine the following experience for two group policyholders A and B:

       Exposures    Claims
  A       10           1
  B       20           8

The number of claims per insured follows a Poisson distribution. Using semi-parametric empirical Bayes estimators, calculate the credibility assigned to policyholder A.

Answer: By the Poisson property, the estimate of v is the sample mean: v̂ = (1 + 8)/(10 + 20) = 0.3. The individual sample means are

x̄₁ = 1/10 = 0.1
x̄₂ = 8/20 = 0.4

We estimate a using formula (57.6).

â = (30 − (10² + 20²)/30)⁻¹ [10(0.1 − 0.3)² + 20(0.4 − 0.3)² − 0.3]
  = (3/40)(0.4 + 0.2 − 0.3) = 0.0225

The credibility assigned to A is

Ẑ = 10(0.0225)/(10(0.0225) + 0.3) = 3/7


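The bookkeeping of formula (57.6) is easy to get wrong with unequal exposures, so here is a minimal Python sketch of the calculation in Example 58C; the function name and layout are illustrative, not from the manual.

    def semiparametric_nonuniform(exposures, claims):
        """Poisson model: vhat is the overall mean; ahat is formula (57.6)."""
        m = sum(exposures)
        vhat = sum(claims) / m
        xbars = [c / e for c, e in zip(claims, exposures)]
        r = len(exposures)
        denom = m - sum(e * e for e in exposures) / m
        num = sum(e * (xb - vhat) ** 2
                  for e, xb in zip(exposures, xbars)) - vhat * (r - 1)
        return vhat, num / denom

    vhat, ahat = semiparametric_nonuniform([10, 20], [1, 8])
    z_A = 10 * ahat / (10 * ahat + vhat)
    print(round(ahat, 4), round(z_A, 4))   # 0.0225 0.4286 (= 3/7)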

58.2  Non-Poisson models

If the model is not Poisson, but there is a linear relationship between μ and v, we can use the same technique as for a Poisson model. Namely, estimate μ as the sample mean; estimate v from μ; estimate â = s² − v̂. Here are two examples of distributions with linear relationships between μ and v:

1. Negative binomial with fixed β. Then E[N | r] = rβ and Var(N | r) = rβ(1 + β). We can estimate μ̂ = x̄ and v̂ = x̄(1 + β).

2. Gamma with fixed θ. Then E[X | α] = αθ and Var(X | α) = αθ². We can estimate μ̂ = x̄ and v̂ = x̄θ.

Example 58D  (Same data as Example 58A) The distribution of policyholders by number of claims in a year is as follows:

Number of Claims      Number of Policyholders
      0                        461
      1                         30
      2                          7
      3                          2
Total                          500

For each policyholder, the number of claims is assumed to follow a negative binomial distribution with β = 0.4. You are to use empirical Bayes semi-parametric methods, with an unbiased estimator for the variance of the hypothetical means, to determine credibility. Determine the number of claims predicted for the second year for each policyholder.

Answer: In Example 58A we already calculated x̄ = 0.1 and s² = 0.1423. Then μ̂ = 0.1. Since the conditional variance of the number of claims given r,

Var(N | r) = rβ(1 + β) = 1.4rβ = 1.4 E[N | r],

is 1.4 times the mean, we estimate v̂ = 0.14. Then

â = 0.1423 − 0.14 = 0.0023

The credibility factor for one year's experience is Ẑ = â/(â + v̂) = 0.0023/0.1423 = 0.01616. The estimated number of claims in the next year is 0.01616x_{i1} + (1 − 0.01616)(0.1), with x_{i1} equal to the number of claims in the first year: 0, 1, 2, or 3.

Claims observed       Claims predicted
in first year         for second year
      0                   0.0983
      1                   0.1145
      2                   0.1307
      3                   0.1469

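Only the estimate of v changes from the Poisson case, so a one-parameter variation of the earlier sketch covers these linear-relationship models. Again the code is illustrative rather than anything from the readings.

    def semiparametric_linear_v(counts, policyholders, v_factor):
        """v_factor: 1 for Poisson, (1 + beta) for negative binomial with
        fixed beta, theta for gamma with fixed theta."""
        n = sum(policyholders)
        xbar = sum(c * p for c, p in zip(counts, policyholders)) / n
        s2 = sum(p * (c - xbar) ** 2
                 for c, p in zip(counts, policyholders)) / (n - 1)
        vhat = v_factor * xbar
        ahat = s2 - vhat
        return xbar, vhat, ahat, ahat / (ahat + vhat)

    xbar, vhat, ahat, z = semiparametric_linear_v([0, 1, 2, 3],
                                                  [461, 30, 7, 2], 1.4)
    print(round(z, 4))   # 0.0161; the text's 0.01616 carries the rounded s2 = 0.1423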

Feel free to skip the rest of this section or to just skim it, since I doubt any exam question will be based on it. The Dean study note does not discuss non-Poisson models, and as we'll see, the other two textbooks are not so clear.

What can be done if there is no linear relationship between the conditional mean and the conditional variance? You can relate all three parameters μ, v, and a, and perhaps do something. The end of Loss Models 19.3, Example 19.7, along with Herzog Example 7.6, both discuss a binomial model. Unfortunately, the textbooks do not give a specific method. Instead, they say the following: Suppose θᵢ is the probability of submitting a claim for group i. Either only one claim is possible or we are not interested in the number of claims. Then the number of members in group i submitting claims in period j, assuming it has m_{ij} members in that period, is binomial with parameters m_{ij} and θᵢ. The hypothetical mean for each member of group i is θᵢ and the process variance is θᵢ(1 − θᵢ). Then we can relate a, the variance of the hypothetical means, to μ and v as follows:

μ = E[θᵢ]
v = E[θᵢ(1 − θᵢ)] = E[θᵢ] − E[θᵢ²]
a = Var(θᵢ) = E[θᵢ²] − E[θᵢ]² = −v + μ − μ²

so that

v = μ − a − μ²    (58.4)

So "there is a functional relationship between the parameters μ, v, and a which follows from the parametric assumptions made, and this often facilitates estimation of parameters." (To quote the exact words in both Loss Models and Herzog.)

The textbooks' point is that you can use the above parametric relationship to estimate only two of the three parameters μ, v, and a. Even if you have only one year of data, and cannot estimate v̂ directly, you can use the functional relationship to estimate v̂. This, despite the fact that the relationship isn't linear, so the resulting estimate of v is biased. For example, E[μ̂²] ≠ μ², yet the textbooks propose equation (58.4).

A peculiarity of the binomial relationship is that it doesn't help for a situation with uniform exposures and one year of data. In that situation, we have x̄ = μ̂ and s² = â + v̂, whereas equation (58.4) says a + v = μ − μ². It is impossible to extract an estimator for v from these equations. Therefore, the following example is one with non-uniform exposures.

Example 58E  (Same data as Example 58C) You examine the following experience for two group policyholders A and B:

       Exposures    Claims
  A       10           1
  B       20           8

The number of claims per insured follows a Bernoulli distribution. Using semi-parametric empirical Bayes estimators, calculate the credibility assigned to policyholder A.

Answer: We have μ̂ = x̄ = 0.3, as above. The formula for â results in

â = (30 − (10² + 20²)/30)⁻¹ [10(0.1 − 0.3)² + 20(0.4 − 0.3)² − v]
  = (3/40)(0.4 + 0.2 − v) = 0.045 − 0.075v

The functional relationship of parameters in equation (58.4) gives us

v̂ = μ̂ − â − μ̂²
  = 0.3 − (0.045 − 0.075v̂) − 0.3² = 0.165 + 0.075v̂
v̂ = 0.165/0.925 = 0.178378
â = 0.045 − 0.075v̂ = 0.045 − 0.075(0.178378) = 0.031622

The credibility assigned to A is

Ẑ = 10(0.031622)/(10(0.031622) + 0.178378) = 0.6393
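Because â is linear in v here, the two relations can be solved by direct substitution. A small Python sketch of that algebra for this example (the coefficients and names are ours):

    # From formula (57.6): ahat = (3/40)(0.6 - v) = 0.045 - 0.075 v
    c0, c1 = 0.045, -0.075
    mu = 0.3
    # Equation (58.4): v = mu - ahat - mu^2  =>  v(1 + c1) = mu - c0 - mu^2
    v = (mu - c0 - mu ** 2) / (1 + c1)
    a = c0 + c1 * v
    z_A = 10 * a / (10 * a + v)
    print(round(v, 6), round(a, 6), round(z_A, 4))   # 0.178378 0.031622 0.6393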

Exercise 19.7 in Loss Models Fourth Edition asks you to derive a similar relationship for a geometric model:

a = v − μ − μ²

Once again, the idea is to use formula (57.6) to get another relationship between a and v and then solve the two equations in the two unknowns a and v.

When there is more than one year of data, so that v can be estimated directly, it is unclear whether to calculate v and estimate a with the functional relationship or to calculate a and estimate v with the functional relationship. Herzog states explicitly to do the former—calculate v and estimate a with the functional relationship. Although Loss Models doesn't mention which method to use in the text, the solution manual solves exercise 20.73(c)/exercise 19.7(c) by calculating v and estimating a with the functional relationship.

Since the textbooks say so little about non-Poisson empirical Bayesian semi-parametric estimation, I think that any non-Poisson empirical Bayesian semi-parametric question will involve a distribution having a linear relationship between variance and mean, such as the negative binomial and gamma examples given above. Students reported that a question on the unreleased Spring 2009 exam used one of those models.

58.3  Which Bühlmann method should be used?

We've discussed several Bühlmann methods, each with its own formulas:

1. Bühlmann (Lessons 51–53)
2. Bühlmann-Straub (Lesson 54)
3. Empirical Bayes non-parametric with uniform exposures (Section 57.1)
4. Empirical Bayes non-parametric with non-uniform exposures (Section 57.2)
5. Empirical Bayes semi-parametric with uniform exposures (Example 58A)
6. Empirical Bayes semi-parametric with non-uniform exposures (Example 58C)

Some students get confused as to which method to use. First of all, you must distinguish between the first two methods and the others. The first two methods can only be used if you have a model specifying risk classes (which may be continuous) with means and variances. In other words, you are told there are so-and-so risk classes, in which X has such-and-such a mean and variance (if they tell you X is such-and-such a distribution, that automatically gives you the mean and variance), and the probability of each risk class is such-and-such. Or X for each insured has a mean and variance varying with λ, and λ varies among insureds with such-and-such a mean and variance (or such-and-such a distribution). If you don't have this information, you can't use the first two methods. If you can use the first two methods, the second one will always work and generalizes the first, but the first is easier if all cells have the same number of exposures. In fact, the way we carry out the second method is by breaking up the cells so that they all have the same number of exposures.

If all you have is data, then you must use methods 3 or 4. Method 4 is more general and will always work, but method 3 has easier formulas and may be used if all cells have the same number of exposures.

If in addition to data you hypothesize that each exposure's X has a Poisson (or some other) distribution, then you can use methods 5 or 6. Method 5 is used when you have uniform exposures, and method 6 is used when you don't.

Here are a couple of credibility problems. Which method is appropriate for each one?

1. For two risk classes A and B, aggregate losses per year have the following means and variances:

   Class    Mean    Variance
     A       40       1500
     B       30       1200

   25% of insureds are in risk class A and 75% in risk class B. You observe the following experience for 2 insureds selected at random over 3 years:

                Year 1    Year 2    Year 3
   Insured #1     30        20        25
   Insured #2     40        50        45

   Calculate the credibility factor used for the first insured.

2. For an insurance coverage, you observe the following experience for 2 insureds selected at random over 3 years:

                Year 1    Year 2    Year 3
   Insured #1     30        20        25
   Insured #2     40        50        45

   Calculate the credibility factor used for the first insured.

3. The number of claims for each insured follows a Poisson distribution with mean λ. λ varies by insured. The density function of λ has a beta distribution with a = 1, b = 3, θ = 1. An insured selected at random submits 1 claim in year 1, 0 claims in year 2, and 1 claim in year 3. Calculate the credibility factor assigned to this insured.

4. The number of claims for each insured follows a Poisson distribution. For 2 insureds, you observe the following claim counts:

                Year 1    Year 2    Year 3
   Insured #1      1         0         1
   Insured #2      2         1         2

   Calculate the credibility factor used for the first insured.

5. The number of claims for each insured follows a Poisson distribution. For 2 groups of insureds, you observe the following claim counts:

                              Year 1    Year 2    Year 3
   Group #1   Insureds           5         5         6
              Claim counts       1         2         3
   Group #2   Insureds           —        12        15
              Claim counts       —         2         4

   Calculate the credibility factor used for the first group.

The answers are in the footnote below.²

² 1. You have a complete model with uniform exposures (1 insured in each year)—method 1.
  2. You don't have a model, only data. Exposures are uniform, so you can use method 3, although method 4 would also work.
  3. You have a complete model with uniform exposures—method 1.
  4. Semi-parametric, and one insured in each class every year—method 5.
  5. Semi-parametric, and size of cells varies by group and year—method 6.

Coverage of this material in the three syllabus options   Empirical semi-parametric Bayesian estimation with a Poisson model is covered by all three syllabus reading options. Herzog fudges a bit with biased estimators, but the topic will still be on exams; they'll just arrange the sample size to be large enough so that it doesn't matter. You should know how to perform semi-parametric Bayesian estimation for Poisson models. You should also understand the generalization to models in which the process variance is a linear function of the hypothetical mean.


Estimation with a binomial model is covered cryptically (with almost identical wording) by Loss Models and Herzog, but not by Dean, and Loss Models also has an exercise involving a geometric model (exercise 19.7). I don't expect any exam questions on semi-parametric estimation based on these models.
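The decision process of Section 58.3 can be condensed into a few lines of Python. The function below is only an illustrative mnemonic of the list above, not something from the syllabus readings.

    def which_method(have_model, uniform_exposures, parametric_assumption=False):
        """Return the method number (1-6) from the list in Section 58.3."""
        if have_model:                        # risk classes with known means/variances
            return 1 if uniform_exposures else 2
        if parametric_assumption:             # e.g., Poisson claim counts assumed
            return 5 if uniform_exposures else 6
        return 3 if uniform_exposures else 4  # data only

    # The five practice problems above, in order:
    print([which_method(True, True), which_method(False, True),
           which_method(True, True), which_method(False, True, True),
           which_method(False, False, True)])   # [1, 3, 1, 5, 6]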

Exercises

58.1.  The distribution of policyholders by number of claims in a single year is as follows:

Number of Claims      Number of Policyholders
      0                        870
      1                        105
      2                         22
      3                          3
Total                        1,000

The distribution of number of claims is assumed to be Poisson. You are to use empirical Bayes semi-parametric methods, with unbiased estimators for the variance of the hypothetical means, to determine credibility. Determine the credibility factor Z.

58.2.

The distribution of policyholders by number of claims in a single year is as follows:

Number of Claims      Number of Policyholders
      0                         72
      1                         20
      2                          8
Total                          100

The distribution of number of claims is assumed to be Poisson. You are to use empirical Bayes semi-parametric methods, with unbiased estimators for the variance of the hypothetical means, to determine credibility. Determine the credibility estimate of the expected number of claims in the next year for a policyholder who had 1 claim this year.

58.3.  The distribution of policyholders by number of claims in a single year is as follows:

Number of Claims      Number of Insureds
      0                        410
      1                         55
      2                         25
      3                          5
      4                          5
Total                          500

The distribution of number of claims is assumed to be Poisson. You are to use empirical Bayes semi-parametric methods, with unbiased estimators for the variance of the hypothetical means, to determine credibility. Determine the credibility factor Z.

58.4.  [4B-S91:35] (2 points) The number of claims for each insured in a population has a Poisson distribution. The distribution of insureds by number of actual claims in a single year is shown below.

Number of Claims      Number of Insureds
      0                        900
      1                         90
      2                          7
      3                          2
      4                          1
Total                        1,000

Calculate the Bühlmann estimate of credibility to be assigned to the observed number of claims for an insured in a single year.

(A) Less than 0.10
(B) At least 0.10, but less than 0.13
(C) At least 0.13, but less than 0.16
(D) At least 0.16, but less than 0.19
(E) At least 0.19


58.5.  [4B-F97:7 and 1999 C4 Sample:39] (3 points) You are given the following:

• The number of losses arising from m + 4 individual insureds over a single period of observation is distributed as follows:

  Number of Losses      Number of Insureds
        0                        m
        1                        3
        2                        1
    3 or more                    0

• The number of losses for each insured follows a Poisson distribution, but the mean of each such distribution may be different for individual insureds.

• The variance of the hypothetical means is to be estimated using semi-parametric empirical Bayesian methods with unbiased estimators.

Determine all values of m for which the estimate of the variance of the hypothetical means will be greater than 0.

(A) m > 0    (B) m > 1    (C) m > 3    (D) m > 6    (E) m > 9

58.6.  [4B-F98:11] (2 points) You are given the following:

• The number of losses arising from 500 individual insureds over a single period of observation is distributed as follows:

  Number of Losses      Number of Insureds
        0                       450
        1                        30
        2                        10
        3                         5
        4                         5
    5 or more                     0

• The number of losses for each insured follows a Poisson distribution, but the mean of each such distribution may be different for individual insureds.

Using empirical Bayes semi-parametric methods, with unbiased estimators for the variance of the hypothetical means, determine the Bühlmann credibility of the experience of an individual insured over a single period.

(A) Less than 0.20
(B) At least 0.20, but less than 0.30
(C) At least 0.30, but less than 0.40
(D) At least 0.40, but less than 0.50
(E) At least 0.50


58.7.  [4-S00:33] The number of claims a driver has during the year is assumed to be Poisson distributed with an unknown mean that varies by driver. The experience for 100 drivers is as follows:

Number of Claims
During the Year       Number of Drivers
      0                       54
      1                       33
      2                       10
      3                        2
      4                        1

Determine the credibility of one year's experience for a single driver using semiparametric empirical Bayes estimation.

(A) 0.046    (B) 0.055    (C) 0.061    (D) 0.068    (E) 0.073

58.8.  [4-F00:7] The following information comes from a study of robberies of convenience stores over the course of a year:

(i)   Xᵢ is the number of robberies of the i-th store, with i = 1, 2, . . . , 500
(ii)  ΣXᵢ = 50
(iii) ΣXᵢ² = 220
(iv)  The number of robberies of a given store during the year is assumed to be Poisson distributed with an unknown mean that varies by store.

Determine the semiparametric empirical Bayes estimate of the expected number of robberies next year of a store that reported no robberies during the studied year.

(A) Less than 0.02
(B) At least 0.02, but less than 0.04
(C) At least 0.04, but less than 0.06
(D) At least 0.06, but less than 0.08
(E) At least 0.08


58.9.  [4-F04:37] For a portfolio of motorcycle insurance policyholders, you are given:

(i)  The number of claims for each policyholder has a conditional Poisson distribution.
(ii) For Year 1, the following data are observed:

Number of Claims      Number of Policyholders
      0                       2000
      1                        600
      2                        300
      3                         80
      4                         20
Total                         3000

Determine the credibility factor, Z, for Year 2.

(A) Less than 0.30
(B) At least 0.30, but less than 0.35
(C) At least 0.35, but less than 0.40
(D) At least 0.40, but less than 0.45
(E) At least 0.45

58.10.  [Based on 4-F04:37] For a portfolio of motorcycle insurance policyholders, you are given:

(i)  The number of claims for each policyholder has a conditional negative binomial distribution with β = 0.5.
(ii) For Year 1, the following data are observed:

Number of Claims      Number of Policyholders
      0                       2200
      1                        400
      2                        300
      3                         80
      4                         20
Total                         3000

Determine the credibility factor, Z, for Year 2.


58.11.  You are given:

(i)  During a 9-month period, 100 policies had the following claims experience:

     Total Claims in
     Nine Months         Number of Policies
          0                      82
          1                      12
          2                       5
          3                       1
          4+                      0

(ii)  The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire 9-month period.

A randomly selected policyholder had one claim during the 9-month period. Using semiparametric empirical Bayes estimation, determine the Bühlmann estimate for the number of claims in the following year for the same policyholder.

58.12.  [C-S05:28] You are given:

(i)  During a 2-year period, 100 policies had the following claims experience:

     Total Claims in
     Years 1 and 2       Number of Policies
          0                      50
          1                      30
          2                      15
          3                       4
          4                       1

(ii)  The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire 2-year period.

A randomly selected policyholder had one claim over the 2-year period. Using semiparametric empirical Bayes estimation, determine the Bühlmann estimate for the number of claims in Year 3 for the same policyholder.

(A) 0.380    (B) 0.387    (C) 0.393    (D) 0.403    (E) 0.443

58.13.  For two group policyholders, you have the following data about number of losses over 3 years.

  Policyholder                        Year 1    Year 2    Year 3
  A    Number of exposures              10        10        10
       Number of losses                  2         3         4
  B    Number of exposures              10        10        10
       Number of losses                  1         1         1

For each exposure, the number of losses in a year has a Poisson distribution. Determine the credibility factor assigned to the experience of Policyholder A using empirical Bayes semi-parametric methods.


Use the following information for questions 58.14 and 58.15:

For two group policyholders, you have the following data about number of losses over 3 years:

  Policyholder                        Year 1    Year 2    Year 3
  A    Number of exposures               5        10        15
       Number of losses                  2         3         4
  B    Number of exposures               —        10        10
       Number of losses                  —         1         1

58.14.  For each exposure, the number of losses in a year follows a Poisson distribution. Determine the credibility factor assigned to the experience of Policyholder A using empirical Bayes semi-parametric methods.

58.15.  For each exposure, the number of losses in a year follows a negative binomial distribution with β = 0.5. Determine the credibility factor assigned to the experience of Policyholder A using empirical Bayes semi-parametric methods.

58.16.  Your company has insured employee group 1 for the last 3 years and employee group 2 for the last 2 years. You have had the following claim count experience for these groups:

  Employer                   Year 1    Year 2    Year 3
  1    Employees                5         6         7
       Claim count              1         2         0
  2    Employees                —         4         6
       Claim count              —         7         4

Claim counts for each employee are assumed to follow a Poisson distribution. Group 1 will have 8 employees next year. Estimate the claim count for group 1 next year using empirical semi-parametric Bayesian credibility methods.

58.17.  For ten policyholders:

(i)   Aggregate annual losses for each policyholder follow a gamma distribution with θ = 1000.
(ii)  Based on one year's experience, average aggregate losses per policyholder for the group were 2000, and the sample variance was 3,000,000.
(iii) For one of the ten policyholders, aggregate losses for the year were 2600.

Using empirical Bayes semi-parametric methods, calculate the credibility premium for this policyholder.

Additional released exam questions: C-F05:22,30, C-F06:13, C-S07:25


Solutions

58.1.  By equation (58.2),

v̂ = x̄ = [105(1) + 22(2) + 3(3)]/1000 = 0.158

By equation (58.3),

s² = [870(0.158²) + 105(0.842²) + 22(1.842²) + 3(2.842²)]/999 = 0.195231 = â + v̂
â = 0.195231 − 0.158 = 0.037231

The credibility factor is

Ẑ = â/(â + v̂) = 0.037231/0.195231 = 0.1907

58.2.  By equations (58.2) and (58.3),

μ̂ = v̂ = x̄ = [20(1) + 8(2)]/100 = 0.36
s² = [72(0.36²) + 20(0.64²) + 8(1.64²)]/99 = 0.394343 = â + v̂
â = 0.394343 − 0.36 = 0.034343
Ẑ = â/(â + v̂) = 0.08709

The credibility premium is P_C = μ̂ + Z(x̄ᵢ − μ̂), or

P_C = 0.36 + 0.08709(1 − 0.36) = 0.4157

vˆ  x¯  s2 

aˆ  0.482565 − 0.28  0.202565 aˆ Zˆ   0.4198 aˆ + vˆ 58.4.

499

 0.482565

By equations (58.2) and (58.3), vˆ  s2 

1 1000 (90 + 14 + 6 + 4)  0.114  2 2 1 999 900 (0.114 ) + 90 (0.886 )

+ 7 (1.8862 ) + 2 (2.8862 ) + 1 (3.8862 )



 0.139143 aˆ  0.139143 − 0.114  0.025143 aˆ  0.18070 (D) Zˆ  aˆ + vˆ Using the biased estimator for aˆ (dividing by 1000 instead of 999) results in s 2  0.139, aˆ  0.025, and Z  0.1799, still the same answer choice, and in fact the answer the original exam expected, since at the time Herzog Section 6.4 was the text material for the exam, and Herzog divides by 1000. C/4 Study Manual—17th edition Copyright ©2014 ASM


58.5.  By equation (58.2),

v̂ = 5/(m + 4)

We need â > 0, that is, s² = â + v̂ > v̂. The raw second empirical moment is

μ₂′ = 7/(m + 4)

so the biased sample variance is

σ̂² = 7/(m + 4) − (5/(m + 4))² = (7m + 28 − 25)/(m + 4)² = (7m + 3)/(m + 4)²

and unbiasing it,

s² = [(m + 4)/(m + 3)] σ̂² = (7m + 3)/[(m + 3)(m + 4)]

Let's set this to be greater than v̂:

(7m + 3)/[(m + 3)(m + 4)] > 5/(m + 4)
(7m + 3)/(m + 3) > 5
7m + 3 > 5m + 15
2m > 12
m > 6  (D)

58.6.  By equation (58.2),

v̂ = (30 + 20 + 15 + 20)/500 = 0.17

By equation (58.3),

s² = [450(0.17²) + 30(0.83²) + 10(1.83²) + 5(2.83²) + 5(3.83²)]/499 = 0.36182
â = 0.36182 − 0.17 = 0.19182

The credibility factor is

Ẑ = â/(â + v̂) = 0.53016  (E)

58.7.  By equation (58.2),

v̂ = [33(1) + 10(2) + 2(3) + 1(4)]/100 = 63/100 = 0.63

By equation (58.3), since 33(1²) + 10(2²) + 2(3²) + 1(4²) = 107,

s² = (100/99)(1.07 − 0.63²) = 0.6799
â = s² − v̂ = 0.6799 − 0.63 = 0.0499

The credibility factor is

Ẑ = â/(â + v̂) = 0.0499/0.6799 = 0.0734  (E)


58.8.  μ̂ and v̂ are the sample mean, or ΣXᵢ/n = 50/500 = 0.1. The sample variance is

s² = (500/499)(ΣXᵢ²/500 − x̄²) = (500/499)(220/500 − 0.1²) = 0.430862

By equation (58.3),

â = 0.430862 − 0.1 = 0.330862

The credibility factor and premium are

Ẑ = 0.330862/(0.330862 + 0.1) = 0.76791
P_C = 0.1 + 0.76791(0 − 0.1) = 0.023209  (B)

58.9.  By equations (58.2) and (58.3),

v̂ = [2000(0) + 600(1) + 300(2) + 80(3) + 20(4)]/3000 = 0.50667
s² = (3000/2999)([2000(0²) + 600(1²) + 300(2²) + 80(3²) + 20(4²)]/3000 − 0.50667²) = 0.69019
â = 0.69019 − 0.50667 = 0.18352
Ẑ = 0.18352/0.69019 = 0.26590  (A)

µˆ 

3000 400 + 300 (22 ) + 80 (32 ) + 20 (42 ) s  − 0.442  0.68663 2999 3000

!

2

aˆ  s 2 − v  0.68663 − 0.66  0.02663 aˆ 0.02663  0.03878 Zˆ   aˆ + vˆ 0.68663 58.11. We’ll estimate a Poisson parameter for 9 months, and then multiply by 4/3 to get an annual parameter. The mean and expected process variance are estimated with the sample mean 0.25. The sample variance is 12 + 5 (22 ) + 32 − 0.252  0.3475 100 100 s2  (0.3475)  0.351010 99

σˆ 2 

0.101010  0.351010 0.287770. The credibility estimate of number of claims in the following year after one claim in nine months So aˆ  0.351010 − 0.25  0.101010. The credibility factor for one unit of time (9 months) is Z 

C/4 Study Manual—17th edition Copyright ©2014 ASM

58. EMPIRICAL BAYES SEMI-PARAMETRIC METHODS

1162

is

 4 0.25 + (1 − 0.25)(0.287770)  0.6211 3

58.12. Treat the 2-year period as one time unit, and then when projecting Year 3, project half a time period (divide your projection by 2). For this 2-year time unit:

 1  50 (0) + 30 (1) + 15 (2) + 4 (3) + 1 (4)  0.76 100  1   50 (02 ) + 30 (12 ) + 15 (22 ) + 4 (32 ) + 1 (42 )  1.42 100 100  (1.42 − 0.762 )  0.8509 99  0.8509 − 0.76  0.0909 0.0909 aˆ    0.1068 aˆ + vˆ 0.8509  (1 − 0.1068)(0.76) + (0.1068)(1)  0.7856

µˆ  vˆ  µˆ 02 σˆ 2 aˆ Zˆ Pˆ C

The credibility estimate of Year 3 is 0.7856/2  0.3928 . (C) 58.13. The mean for A is x¯ 1  (2 + 3 + 4) /30  0.3, and the mean for B is (1 + 1 + 1) /30  0.1. Because of the Poisson assumption, this is also the variance,  . so the estimate for v is the weighted average of v1  x¯ 1 and v 2  x¯ 2 , or µˆ  vˆ  x¯  30 (0.3) + 30 (0.1) 60  0.2. a is estimated using the general formula (57.6). aˆ  * m −

m 12 + m 22 m

,

−1

+ -

302 + 302  60 − 60 0.4 1   30 75



m 1 ( x¯ 1 − x¯ ) 2 + m 2 ( x¯ 2 − x¯ ) 2 − vˆ ( r − 1)

! −1 



30 (0.3 − 0.2) 2 + 30 (0.1 − 0.2) 2 − 0.2 (1)



30aˆ 30/75 2   . 30aˆ + vˆ 30/75 + 0.2 3 Notice that since there were the same number of exposures for both policyholders, the uniform exˆ (57.3) could also have also been used. But then, each cell would be treated as one posures formula for a, exposure. The mean and expected process variance would be estimated as µˆ  vˆ  x¯  2, and The credibility factor is then Zˆ 

aˆ 

X

Then the credibility factor is Zˆ 

( x¯ i − x¯ ) 2 −

2 4 vˆ  (3 − 2) 2 + (1 − 2) 2 −  n 3 3

3aˆ 3 (4/3) 2   . ˆ ˆ 3a + v 3 (4/3) + 2 3
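As a numerical check that the two routes in this solution agree, here is a throwaway Python sketch (not from the manual):

    # Route 1: formula (57.6) with exposures (m1 = m2 = 30) as the units
    m1 = m2 = 30
    v = 0.2
    num = m1 * (0.3 - 0.2) ** 2 + m2 * (0.1 - 0.2) ** 2 - v * (2 - 1)
    ahat = num / (60 - (m1 ** 2 + m2 ** 2) / 60)        # = 1/75
    z = m1 * ahat / (m1 * ahat + v)

    # Route 2: uniform formula (57.3) with policyholder-years as the units
    v_u, n = 2, 3
    ahat_u = (3 - 2) ** 2 + (1 - 2) ** 2 - v_u / n       # r - 1 = 1 divisor
    z_u = n * ahat_u / (n * ahat_u + v_u)
    print(round(z, 4), round(z_u, 4))                    # 0.6667 0.6667 (= 2/3)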

58.14.  x̄₁ = (2 + 3 + 4)/(5 + 10 + 15) = 0.3 and x̄₂ = (1 + 1)/(10 + 10) = 0.1. μ and v are both estimated as the overall sample mean, or 11/50 = 0.22. a is calculated using formula (57.6).

â = (50 − (30² + 20²)/50)⁻¹ [30(0.3 − 0.22)² + 20(0.1 − 0.22)² − 0.22(1)] = 0.26/24 = 0.010833

The credibility factor is then

Ẑ = 30â/(30â + v̂) = 30(0.010833)/(30(0.010833) + 0.22) = 0.59633

58.15.  We again estimate μ̂ = x̄ = 0.22. We estimate v̂ = 1.5μ̂ = 0.33. Now we estimate â:

â = (50 − (30² + 20²)/50)⁻¹ [30(0.3 − 0.22)² + 20(0.1 − 0.22)² − 0.33(1)] = 0.15/24 = 0.00625

The credibility factor is then

Ẑ = 30â/(30â + v̂) = 30(0.00625)/(30(0.00625) + 0.33) = 0.36232

58.16.  The overall sample mean is x̄ = (1 + 2 + 0 + 7 + 4)/(5 + 6 + 7 + 4 + 6) = 14/28 = 0.5, which is our estimate for μ, and for v as well since Poisson implies v = μ. Since exposures are not uniform, we must use formula (57.6).

m₁ = 5 + 6 + 7 = 18        x̄₁ = (1 + 2 + 0)/18 = 1/6
m₂ = 4 + 6 = 10            x̄₂ = (7 + 4)/10 = 1.1

â = (28 − (18² + 10²)/28)⁻¹ [18(1/6 − 0.5)² + 10(1.1 − 0.5)² − 0.5] = 5.1/12.8571 = 0.39667

The credibility factor is

Ẑ = 18(0.39667)/(18(0.39667) + 0.5) = 0.93456

making the credibility premium for a group of 8

8[0.93456(1/6) + (1 − 0.93456)(0.5)] = 1.508

This assumes that the unweighted mean, rather than the credibility-weighted mean, is used. The credibility of the second group is 10(0.39667)/(10(0.39667) + 0.5) = 0.88806, so the credibility-weighted mean is

[0.93456(1/6) + 0.88806(1.1)]/(0.93456 + 0.88806) = 0.62143

so the credibility premium using the credibility-weighted mean is

8[0.93456(1/6) + (1 − 0.93456)(0.62143)] = 1.572

58.17.  We estimate μ̂ = x̄ = 2000. The conditional variance given gamma parameter α is θ times the mean, so we estimate v̂ = 1000(2000) = 2,000,000. The estimate for a is â = s² − v̂ = 3,000,000 − 2,000,000 = 1,000,000. Then Ẑ = â/(â + v̂) = 1/3, and the credibility premium is (2/3)(2000) + (1/3)(2600) = 2200.


Quiz Solutions

58-1.

μ̂ = v̂ = 8/50 = 0.16
s² = (50/49)(12/50 − 0.16²) = 0.218776
â = 0.218776 − 0.16 = 0.058776
Ẑ = 0.058776/0.218776 = 0.268657

The answer is (1 − 0.268657)(0.16) + 0.268657(1) = 0.3857.
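The quiz can be checked with the same one-year routine sketched after Example 58A (assuming that earlier illustrative function is in scope):

    xbar, s2, ahat, z = semiparametric_poisson([0, 1, 2], [44, 4, 2])
    print(round(z, 6))                        # 0.268657
    print(round(z * 1 + (1 - z) * xbar, 4))   # 0.3857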


Lesson 59

Supplementary Questions: Credibility

59.1.  Claim frequency follows a Poisson distribution. The coefficient of variation for claim severity is 1.4. Claim counts and claim sizes are independent. The methods of limited fluctuation credibility are used, with a standard of aggregate losses being within 4% of expected losses 95% of the time. Determine the number of expected claims needed for 90% credibility.

(A) Less than 6000
(B) At least 6000, but less than 6250
(C) At least 6250, but less than 6500
(D) At least 6500, but less than 6750
(E) At least 6750

Use the following information for questions 59.2 and 59.3:

There are two classes of insureds. In Class A, the number of claims per year has a Poisson distribution with mean 0.1 and claim size has an exponential distribution with mean 500. In Class B, the number of claims per year has a Poisson distribution with mean 0.2 and claim size has an exponential distribution with mean 250. Each class has the same number of insureds. An insured selected at random submits two claims in one year. Claim sizes are 200 and 400.

59.2.  Calculate the probability that the insured is in class A.

(A) 0.01    (B) 0.04    (C) 0.15    (D) 0.19    (E) 0.27

59.3.  Calculate the Bühlmann estimate of the aggregate losses for this insured in the following year.

(A) Less than 42
(B) At least 42, but less than 47
(C) At least 47, but less than 52
(D) At least 52, but less than 57
(E) At least 57

59.4.  On an insurance coverage, the number of claims follows a binomial distribution with parameters m = 3 and Q. The probability density function for Q is

π(q) = 6q(1 − q),    0 ≤ q ≤ 1

A randomly selected insured submits 2 claims in 4 years. Calculate the expected number of claims from this insured in the fifth year.

(A) 0.25    (B) 0.50    (C) 0.64    (D) 0.75    (E) 1.50


59.5.  You are given the following experience for 3 group policyholders:

                               Year 1    Year 2
  A    Number of policies        10        15
       Average losses             5        10
  B    Number of policies         —        30
       Average losses             —        12
  C    Number of policies         —        20
       Average losses             —         5

Using empirical Bayes non-parametric methods, calculate the credibility factor Z for policyholder C.

(A) Less than 0.47
(B) At least 0.47, but less than 0.49
(C) At least 0.49, but less than 0.51
(D) At least 0.51, but less than 0.53
(E) At least 0.53

59.6.  The number of claims on an insurance coverage follows a Poisson distribution. The mean of the Poisson distribution, λ, varies by insured, with the following probabilities:

   λ      Prior probability
  0.1           0.5
  0.2           0.3
  0.3           0.2

A randomly selected insured submits 0 claims in 3 years. Calculate the probability that this insured submits at least one claim in the fourth year.

(A) 0.132    (B) 0.140    (C) 0.142    (D) 0.148    (E) 0.155

Use the following information for questions 59.7 and 59.8:

There are two dice and two spinners. One die is marked with three 0's, two 1's, and one 2 on its six sides. The other die is marked with two 0's, two 1's, and two 2's on its six sides. Each spinner has 4 equally sized sectors. On one spinner, the numbers on the sectors are 10, 20, 30, and 40. On the other spinner, the numbers on the sectors are 20, 30, 40, and 50. You select one die and one spinner at random. You then perform the following procedure. You toss the die, and then you spin the spinner the number of times indicated by the die. If the die falls on 0, you record 0. Otherwise you record the number, or the sum of the numbers, indicated on the spin(s) of the spinner. The resulting number that you record is 30.

59.7.  Calculate the expected value of the number you record when you repeat this procedure with the same die and spinner.

(A) 23.6    (B) 24.6    (C) 24.8    (D) 25.0    (E) 27.9


59.8.  Calculate the Bühlmann estimate of the expected value of the number you record when you repeat this procedure with the same die and spinner.

(A) 25.3    (B) 25.6    (C) 26.0    (D) 27.4    (E) 29.4

59.9.  For an insurance coverage, claim frequency is assumed to follow a Poisson distribution. You have observed the following experience for one year:

Number of claims      Number of insureds
      0                       52
      1                       10
      2                        2

Empirical Bayes semi-parametric credibility methods are used. Calculate the expected number of claims in the following year for an insured who had no claims in the observation period.

(A) Less than 0.20
(B) At least 0.20, but less than 0.21
(C) At least 0.21, but less than 0.22
(D) At least 0.22, but less than 0.23
(E) At least 0.23

59.10.  Annual claim counts follow a Poisson distribution. Claim sizes have mean 10 and variance 50. Full credibility is assigned to annual aggregate losses if the probability that aggregate losses are less than 5% higher than the mean is 90%. Determine the expected number of claims needed for full credibility.

(A) 986    (B) 1082    (C) 1312    (D) 1624    (E) 1972

59.11.  The monthly number of losses on an insurance coverage follows a Poisson distribution with mean λ. The prior distribution of λ is gamma with parameters α and θ. A randomly selected insured is observed for n months and submits no claims. Determine the smallest n such that the expected number of claims for this policyholder is half of the expected number of claims for the general population.

(A) θ    (B) 1/θ    (C) αθ    (D) α/θ    (E) θ/α

59.12.  You are given:

(i) Claim size x given Θ follows a distribution with density function

f(x | Θ) = Θ/x²

If S = v₁² + v₂² > 1, discard the pair of uniform random numbers and start again from the first step. After you get a usable pair, then compute T = √(−(2 ln S)/S). The two standard normal random numbers are v₁T and v₂T. Don't forget to transform the uniform random numbers you start off with into uniform random numbers on [−1, 1], by multiplying each one by 2 and subtracting 1.

Example 61I  You are simulating a random variable having a standard normal distribution. Use the following two uniform random numbers on [0, 1):

0.73    0.18

and the polar method to generate two normal random numbers.

Answer: v₁ = 2(0.73) − 1 = 0.46 and v₂ = 2(0.18) − 1 = −0.64. Then S = 0.46² + 0.64² = 0.6212 ≤ 1, so we can use this pair of numbers. (Otherwise we'd need to be provided with more uniform numbers.) Then T = √(−(2 ln 0.6212)/0.6212) = 1.23808. The two standard normal random numbers are 0.46(1.23808) = 0.56952 and −0.64(1.23808) = −0.79237.
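The arithmetic of the polar method is easy to script. Here is a minimal Python sketch of Example 61I; the function name is ours, not the manual's.

    import math

    def polar_method(u1, u2):
        """Return two standard normal draws, or None if the pair must be discarded."""
        v1, v2 = 2 * u1 - 1, 2 * u2 - 1      # rescale the uniforms to [-1, 1]
        s = v1 * v1 + v2 * v2
        if s > 1 or s == 0:                   # outside the unit circle: reject
            return None
        t = math.sqrt(-2 * math.log(s) / s)
        return v1 * t, v2 * t

    print(polar_method(0.73, 0.18))   # (0.56952..., -0.79237...), as in Example 61I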


Table 61.1: Summary of Special Simulation Techniques

For mixture distributions, use two uniform random numbers; one to select the component of the mixture, the other to generate a random number from that component.

To generate multinomial random variables with population size n and k categories having probabilities p₁, p₂, . . . , p_k:

1. Generate k uniform random numbers on [0, 1): u₁, . . . , u_k.
2. Generate x₁ as binomial with m = n, q = p₁.
3. Generate xᵢ as binomial with m = n − Σ_{j<i} x_j and q = pᵢ/(1 − Σ_{j<i} p_j). Do this for i = 2, . . . , k.

A typical situation with a multinomial random variable is an insured population of size n subject to k decrements. The binomial distribution may be approximated with a normal distribution if m is large.

To generate both the year of a decrement and the type of a decrement with one uniform random variable, order all the outcomes.

The (a, b, 0) class may be simulated as a stochastic process. Interevent times are exponential with mean 1/λᵢ, i = 0, 1, . . ., where

• For Poisson, λᵢ = λ
• For binomial, λᵢ = −m ln(1 − q) + i ln(1 − q)
• For negative binomial, λᵢ = r ln(1 + β) + i ln(1 + β)

The Box–Muller transformation of two uniform random variables on [0, 1], u₁ and u₂, results in two standard normal random variables √(−2 ln u₁) cos 2πu₂ and √(−2 ln u₁) sin 2πu₂.

To generate standard normal random variables using the polar method,

1. Generate two uniform random variables on [0, 1], u₁ and u₂.
2. v₁ = 2u₁ − 1 and v₂ = 2u₂ − 1.
3. If S = v₁² + v₂² > 1, discard the pair and start again.
4. T = √(−(2 ln S)/S)
5. The random numbers are Tv₁ and Tv₂.
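Two of the table's techniques lend themselves to short scripts. The Python sketch below is illustrative only (the function names and test data are ours): one routine picks a mixture component with the first uniform number and inverts the second, and one counts (a, b, 0) events by summing exponential interevent times until time 1 is passed.

    import math

    def simulate_mixture(u1, u2, weights, inverters):
        """Pick component j with u1 (cumulative weights), then invert u2."""
        cum = 0.0
        for w, invert in zip(weights, inverters):
            cum += w
            if u1 < cum:
                return invert(u2)
        raise ValueError("weights must sum to 1")

    def ab0_count(lambdas, uniforms):
        """Sum exponentials with rates lambdas[i] until total time exceeds 1;
        the generated value is the number of completed events."""
        t = 0.0
        for i, u in enumerate(uniforms):
            t += -math.log(1 - u) / lambdas[i]
            if t > 1:
                return i
        raise ValueError("ran out of uniform numbers")

    # two-point mixture: exponential(mean 0.5) w.p. 0.3, uniform[-3, 3] w.p. 0.7
    y = simulate_mixture(0.25, 0.69, [0.3, 0.7],
                         [lambda u: -0.5 * math.log(1 - u),  # exponential inversion
                          lambda u: -3 + 6 * u])             # uniform inversion
    print(round(y, 5))   # 0.58559

    # Poisson with lambda = 3 and five supplied uniforms
    print(ab0_count([3] * 5, [0.30, 0.70, 0.30, 0.50, 0.90]))   # 4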


Exercises

61.1.  A random variable is a mixture of two 2-parameter Pareto distributions. The first component has a weight of 0.6, α = 2, and θ = 100. The second component has a weight of 0.4, α = 3, and θ = 1000. You use the first uniform random number to determine the component and the second uniform random number to determine the result. Use the following three pairs of uniform random numbers:

{0.3, 0.7}    {0.7, 0.3}    {0.5, 0.9}

Calculate the mean of the resulting random numbers.

61.2.  [Sample:290] A random variable X has a two-point mixture distribution with pdf

f(x) = (1/8)e^(−x/2) + (1/4)e^(−x/3),    x ≥ 0

You are to simulate one value, x, from this distribution using uniform random numbers 0.2 and 0.6. Use the value 0.2 and the inversion method to simulate J, where J = 1 refers to the first random variable in the mixture and J = 2 refers to the second random variable. Then use 0.6 and the inversion method to simulate a value from X. Calculate the value of x.

(A) 0.45    (B) 1.02    (C) 1.53    (D) 1.83    (E) 2.75

61.3.  [3-F01:13] You wish to simulate a value, Y, from a two point mixture. With probability 0.3, Y is exponentially distributed with mean 0.5. With probability 0.7, Y is uniformly distributed on [−3, 3]. You simulate the mixing variable where low values correspond to the exponential distribution. Then you simulate the value of Y, where low random numbers correspond to low values of Y. Your uniform random numbers from [0, 1] are 0.25 and 0.69 in that order. Calculate the simulated value of Y.

(A) 0.19    (B) 0.38    (C) 0.59    (D) 0.77    (E) 0.95

[Sample:304] For a product liability policy, you are given:

61.4. (i)

Loss amounts, Y, follow the discrete mixture distribution denoted by: FY ( y ) 

4 X

α k FX k ( y ) ,

where α k  kα 1 , k  2, 3, 4

k1

(ii)

The random variables X k follow the exponential distribution with mean θk 

1 . αk

Three loss amounts are to be simulated using the following uniform (0, 1) random numbers in order: a)

For the required discrete random variable J, where Pr ( J  k )  α k

b)

0.235

0.456

0.719

0.435

0.298

0.678

For the exponential distributions Calculate the average of the three simulated loss amounts.

(A) (B) (C) (D) (E) 61.5.

Less than 0.5 At least 0.5, but less than 1.0 At least 1.0, but less than 1.5 At least 1.5, but less than 2.0 At least 2.0 Number of claims follows a Poisson distribution with mean λ. λ is uniformly distributed on (0, 0.6) .

You simulate number of claims by simulating λ with the first uniform random number and then simulating the Poisson distribution with the second uniform number using the inversion method. Use the following pair of uniform random numbers on [0, 1) : {0.438,0.805}. Determine the resulting number of claims. 61.6. A portfolio of 100 insurance policies is subject to two decrements, death and surrender. The probability of death for each individual is 0.015 and the probability of surrender is 0.05. You are to simulate the number of deaths and surrenders, in that order. Use the following uniform random numbers on [0, 1) : 0.15 for the number of deaths and 0.25 for the number of surrenders. Use the inversion method to generate binomial random variables. Determine the resulting number of deaths and surrenders. 61.7. A group of 200 employees is subject to three decrements, death, disability, and termination. The probabilities of these decrements are 0.01, 0.08, and 0.20. You are to simulate the three decrements in the order death, disability, and termination. Generate random numbers using the normal approximation and the inversion method. Use the following three uniform random numbers on [0, 1) in the order specified: 0.8212

0.7088

0.1357

Determine the number of terminations.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

61. SIMULATION—SPECIAL TECHNIQUES

1214

61.8. [Sample:291] There are four decrements, labeled (1), (2), (3), and (4). Over the next year the probabilities for each are 0.05, 0.10, 0.15, and 0.20, respectively. There are 500 independent lives subject to these decrements. You plan to simulate the results for the next year using successive binomial distributions and do so in the order in which the decrements are numbered. The first binomial simulation had 23 instances of decrement (1) and the second binomial simulation had 59 instances of decrement (2). Determine the values of m and q for the binomial distribution to be used for simulating decrement (3). (A) (B) (C) (D) (E)

m m m m m

 500, q  500, q  418, q  418, q  418, q

 0.15  0.21  0.15  0.18  0.21

61.9. For an insurance policy, the probability of death in the first year is 0.01 and the probability of surrender is 0.10. For policies that are in force at the end of one year, the probability of death in the second year is 0.02 and the probability of surrender is 0.05. The outcome (year of termination, and whether termination is by death or surrender) is simulated. The outcomes are ordered as follows: 1.

Policy in force for two years

2.

Death in first year

3.

Death in second year

4.

Surrender in first year

5.

Surrender in second year Five policies are simulated with the following five uniform random numbers on [0, 1) : 0.08

0.92

0.84

0.32

0.65

Determine the number of policies for each of the five outcomes. 61.10. [4B-F98:15] (1 point) You wish to generate a single random number from a Poisson distribution with mean 3. A random number generator produces the following uniform numbers in the unit interval [0, 1) : Position in Generation Sequence

Random Number

1 2 3 4 5

0.30 0.70 0.30 0.50 0.90

Representing the Poisson random number as the result of a stochastic process, determine the simulated number. (A) 0

(B) 1

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 2

(D) 3

(E) 4

Exercises continue on the next page . . .

EXERCISES FOR LESSON 61

1215

61.11. [4B-F94:21] (2 points) You wish to generate single random number from a Poisson distribution with mean 3. A random number generator produces the following uniform numbers in the unit interval [0, 1) : Position in Generation Sequence

Random Number

1 2 3 4 5

0.909 0.017 0.373 0.561 0.890

Representing the Poisson random number as the result of a stochastic process, determine the simulated number. (A) 1

(B) 2

(C) 3

(D) 4

(E) 5

61.12. [130-S87:9] The number of flies In a cup of soup has a Poisson distribution with mean 2.5. You are to simulate z, the number of flies in two cups of soup by treating the Poisson distribution as a stochastic process. You are to use the following uniform random numbers on [0, 1) in the order given: First Cup: 0.20 0.90 0.90 0.90 Second Cup: 0.60 0.60 0.90 0.80 Determine z. (A) 1

(B) 2

(C) 3

(D) 4

(E) 5

61.13. [130-S88:1] You are to generate a random observation from the Poisson distribution with mean λ  3 treating the Poisson distribution as a stochastic process. Use the following sequence of random numbers from the uniform distribution over [0, 1) : 0.36

0.50

0.75

0.25

0.50

0.33

0.95

0.85

What is your random observation? (A) 0

(B) 2

(C) 4

(D) 6

(E) 7

61.14. [Sample:299] For a health insurance policy effective January 1, the number of claims in a one-year period follows the Poisson distribution with mean λ  5. The following uniform (0, 1) random numbers are used in the given order to simulate the time of the first claim and the times between occurrences of subsequent claims. 0.605

0.529

0.782

Calculate the simulated date of occurrence of the third claim. (A) (B) (C) (D) (E)

Before May 1 On or after May 1, but before August 15 On or after August 15, but before September 1 On or after September 1, but before September 30 On or after September 30

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

61. SIMULATION—SPECIAL TECHNIQUES

1216

61.15. For a population of 250, you are simulating the number of deaths in a year. The probability of death is 0.01. You will model the random variable for the number of deaths as a stochastic process. Use the following uniform random numbers on [0, 1) in the order given: 0.67

0.35

0.57

0.22

0.40

Determine the resulting number of deaths. 61.16. You are to generate a random observation from a binomial distribution with m  4, q  0.4, using a stochastic process. Use the following uniform random numbers on [0, 1) in the order given: 0.12

0.67

0.28

0.55

Determine the resulting number. 61.17. You are to generate a random observation from a negative binomial distribution with r  2, β  1 using a stochastic process. Use the following uniform random numbers on [0, 1) in the order given: 0.12

0.67

0.28

0.55

0.80

Determine the resulting number. 61.18. You are to generate a random observation from a negative binomial distribution with r  20, β  1.5 using a stochastic process. You have so far generated ten uniform numbers for the process. The cumulative sum of the corresponding times in the process is 0.989027. Determine the greatest lower bound for the eleventh uniform number such that the negative binomial random number that is generated is 10. 61.19. [4B-S90:51] (2 points) Simulate two observations, x1 and x2 from a normal distribution with mean 0 and standard deviation 1 using the polar method. Use the two random numbers, 0.30 and 0.65, taken from the uniform distribution on [0, 1) . Determine x1 and x2 . (A) −1.33 and 1.00 (B) −0.67 and 0.50 (E) Cannot be determined

(C) −2.67 and 2.00

(D) −0.65 and 0.48

61.20. [4B-S92:12] (2 points) You are given the following ordered pairs of points from the uniform distribution on [0, 1) : (0.111, 0.888);

(0.652, 0.689);

(0.194, 0.923)

Using the polar method, determine the first two simulated standard normal values, X (1) and X (2) . (A) (−0.243, 0.242) (B) (0.083, 0.661)

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) (0.374, 0.313)

(D) (1.066, 1.326)

(E) (1.599, −1.357)

Exercises continue on the next page . . .

EXERCISES FOR LESSON 61

1217

61.21. [4B-F95:26, 4B-F98:22] (3 points) Y and Z are a pair of independent, identically distributed lognormal random variables with parameters µ  5 and σ  3. Use the polar method to determine the first simulated pair of observations of Y and Z, denoted ( y1 , z 1 ) from the following list of pairs of random numbers generated from the uniform distribution over [0, 1) : 0.20, 0.95 0.60, 0.85 0.30, 0.15 Calculate y1 + z 1 . (A) (B) (C) (D) (E)

Less than 10 At least 10, but less than 100 At least 100, but less than 1,000 At least 1,000, but less than 10,000 At least 10,000

61.22. [130-F87:3] You are to use the polar method to generate a pair of independent normal random variables with mean 0 and standard deviation 1. You are to use the following uniform random numbers on [0, 1) in the order given:

Let Q 



0.82 √ − ln 0.36 and R  − ln 1.082.

0.09

0.50

0.20

Calculate the mean of the two generated normal variables. (A) −0.7Q

(B) −0.1R

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 0.1R

(D) 1.0R

(E) 0.7Q

Exercises continue on the next page . . .

61. SIMULATION—SPECIAL TECHNIQUES

1218

61.23. [130-F88:7] You are to generate a pair of independent normal variables, X (1) and X (2) , with mean 0 and standard deviation 1, using the polar method. You are to first generate a random point, ( R, θ ) , inside the unit circle as follows: R 2  U1 θ  2πU2 where R and θ represent the random point in polar coordinates, and each U i represents a uniform random variable on [0, 1) . You are given 1 e U2  0.500

U1 

Determine X (1) and X (2) . (A) (B) (C) (D) (E)

−1.4, 0.0 −0.8, 0.0 −0.6, 0.0 −0.4, 0.0 0.0, 0.6

61.24. [130-S89:4] You are to generate an ordered pair ( X1 , X2 ) of independent standard normal random variables using the polar method. Let (V1 , V2 ) be the pair of independent random variables, each uniformly distributed on [−1, 1], used to generate ( X1 , X2 ) . If V1  e −1 cos 30◦ and V2  e −1 sin 30◦ , determine the value of X2 . √ (A) e −2 (B) e −1 (C) 1 (D) 2 (E) e

Solutions 61.1. According to the tables, the inversion formula for a Pareto is θ (1− u ) −1/α −1 . For the first number and third numbers, we use α  2 and θ  100, and for the second, α  3 and θ  1000.





0.7 → 100 0.3−1/2 − 1  82.57419





0.3 → 1000 0.7−1/3 − 1  126.24788





0.9 → 100 0.1−1/2 − 1  216.22777





The average of the three numbers is 141.68 . 61.2. The exponential parameters are 2 for the first component of the mixture and 3 for the second component of the mixture. Thus the first component of the mixture has density 12 e −x/2 ; since we see a coefficient of 1/8, we conclude that this component has a weight of (1/8) / (1/2)  1/4. The second component of the mixture has density 13 e −x/3 but has a coefficient of 1/4, so its weight is (1/4) / (1/3)  3/4. Therefore the value 0.2, which is less than 1/4, the weight of the first component, implies that the first component of the mixture is used for the simulation. For an exponential distribution, the inversion formula is −θ ln (1 − u ) , and θ for the first component is 2, so the simulated number is −2 ln (1 − 0.6)  1.8326 . (D) C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 61

1219

61.3. 0.25 is less than 0.3, so the exponential variable is selected. Then under an exponential, the inversion formula is u → −θ ln (1 − u ) , so 0.69 → −0.5 ln (1 − 0.69)  0.58559 . (C)

61.4. We use the method for generating random numbers from a mixture distribution that was discussed at the beginning of this lesson. Since the α k s have to sum up to 1, α 1 + 2α 1 + 3α 1 + 4α 1  1, from which it follows that α k  0.1k. That means that the map from uniform numbers to values of k (the subscript of α k is: ui

k

[0,0.1) [0.1,0.3) [0.3,0.6) [0.6,1)

1 2 3 4

Using the uniform numbers we are given, 0.235 → 2, 0.456 → 3, and 0.719 → 4. Then the three means of the exponential distributions are 1/0.2, 1/0.3, and 1/0.4 respectively. Exponential random numbers x i are generated using x i  −θ ln (1 − u i ) . ln (1 − 0.435)  2.8546 0.2 ln (1 − 0.298) 0.298 → −  1.1794 0.3 ln (1 − 0.678) 0.678 → −  2.8330 0.4 0.435 → −

The average of these three numbers is 2.2890 . (E) 61.5. We generate λ  0.6 (0.438)  0.2028. Then p0  e −0.2028  0.81644, and 0.805 < 0.81644, so there are 0 claims. 61.6. For deaths, the probability of no deaths is 0.985100  0.2206, and 0.15 < 0.2206, so the number of deaths is 0 . For surrenders, we use the probability q  0.05/ (1 − 0.015)  0.050761 p0  (1 − q ) 100  0.005464

100 p1  (1 − q ) 99 q  0.029221 1

!

100 (1 − q ) 98 q 2  0.07735 2

!

p2 

100 p3  (1 − q ) 97 q 3  0.13512 3

!

100 p4  (1 − q ) 96 q 4  0.17522 4

!

The sum up to p3 is 0.2472 and the sum up to p4 is 0.4224, so 4 surrenders occur. 61.7. For deaths, the mean is 2 and the variance is 2 (0.99)  1.98. √ The uniform number goes to the standard normal random number 0.92, which then goes to 2 + 0.92 1.98  3.29, which rounds to 3. For disability, the binomial parameters are m  200 − 3  197 and q  0.08/ (1 − 0.01) . The mean is mq  15.9192 and the variance is mq (1 − q )  14.6328. The uniform number goes to the standard normal C/4 Study Manual—17th edition Copyright ©2014 ASM

61. SIMULATION—SPECIAL TECHNIQUES

1220

√ random number 0.55. The resulting number of disabilities is 15.9192+0.55 14.6328  18.02, which rounds to 18. For terminations, the binomial parameters are m  197 − 18  179 and q  0.2/ (1 − 0.01 − 0.08)  0.219780. The mean is mq  39.34066 and the variance is mq (1 − q )  30.69436. The uniform number goes √ to −1.1. The resulting number of terminations is 39.34066 − 1.1 30.69436  33.25, which rounds to 33 .

61.8. For m, we subtract from 500 the number of instances of decrements (1) and (2), so 500 − 23 − 59 = 418. For q, we condition on not having the first two decrements: q = 0.15/(1 − 0.05 − 0.10) = 0.1764. (D)

61.9. The probabilities of the five outcomes are (0.89)(0.93) = 0.8277, 0.01, (0.89)(0.02) = 0.0178, 0.10, and (0.89)(0.05) = 0.0445. Accumulating, we get 0.8277, 0.8377, 0.8555, 0.9555, 1 respectively. Thus 0.08, 0.32, and 0.65 go to policy in force for two years, 0.92 goes to surrender in first year, and 0.84 goes to death in second year. Thus the numbers for each outcome are 3, 0, 1, 1, 0 respectively.

61.10. We generate exponential random numbers with mean 1/3, and sum them up until they are greater than 1. So we want

−(1/3) Σ ln(1 − u_i) > 1
−Σ ln(1 − u_i) > 3
Π(1 − u_i) < e⁻³ = 0.049787

The product of the first two numbers is (0.7)(0.3) = 0.21. The product of the first three numbers is (0.21)(0.7) = 0.147. The product of the first four numbers is 0.147(0.5) = 0.0735. The product of the first five numbers is (0.0735)(0.1) = 0.00735 < 0.049787. It took 5 events to go past time 1, so the random number is 4. (E)

61.11. We multiply the (1 − u_i) until the product is below e⁻³ = 0.049787.

(1 − 0.909)(1 − 0.017) = 0.089453
(0.089453)(1 − 0.373) = 0.056087
(0.056087)(1 − 0.561) = 0.024622

It took 4 numbers, so the answer is 3. (C)
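Solutions 61.10 and 61.11 both use the same trick for counting Poisson events in one period: multiply factors (1 − u_i) until the running product drops below e^(−λ). A short Python sketch of that rule (my own code, with a hypothetical function name):

```python
import math

def poisson_count(uniforms, lam):
    # Each -ln(1 - u)/lam is an exponential interarrival time; summing
    # those logs is the same as multiplying the (1 - u) factors until
    # the product falls below exp(-lam).
    product, count = 1.0, 0
    for u in uniforms:
        product *= 1 - u
        if product < math.exp(-lam):
            return count          # this event falls past time 1
        count += 1
    raise ValueError("ran out of uniform numbers")

# Solution 61.10: complements 0.7, 0.3, 0.7, 0.5, 0.1 with lambda = 3
print(poisson_count([0.3, 0.7, 0.3, 0.5, 0.9], 3))        # 4
# Solution 61.11
print(poisson_count([0.909, 0.017, 0.373, 0.561], 3))     # 3
```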

61.12. e^(−2.5) = 0.082. Multiplying the first cup's numbers, (0.8)(0.1) = 0.08 < 0.082, so we have 1 fly. For the second cup, 0.4(0.4) = 0.16 and (0.16)(0.1) = 0.016, so we have 2 flies. The total is 3. (C)

61.13. e⁻³ = 0.0498.

(0.64)(0.5) = 0.32
(0.32)(0.25) = 0.08
(0.08)(0.75) = 0.06
(0.06)(0.50) = 0.03

It took 5 numbers to get below e⁻³, so we generate 4. (C)

61.14. Time between claims is exponential, with mean 1/5. Simulated numbers are −0.2 ln(1 − u). We can first multiply the complements of the three uniform numbers, then take the logarithm:

−0.2 ln[(1 − 0.605)(1 − 0.529)(1 − 0.782)] = 0.641

In months, 0.641 of a year is 12(0.641) = 7.69 months, so the simulated date of occurrence of the third claim is in the second half of August, a little more than 7.5 months after the start of the year. (C)


61.15. ln(1 − q) = ln 0.99 = −0.010050. The first λ is −250(−0.010050) = 2.512584, with resulting exponential random number −(ln 0.33)/2.512584 = 0.441244. The next numbers use λ₁ = −249(−0.010050) = 2.502534, λ₂ = −248(−0.010050) = 2.492484, and λ₃ = −247(−0.010050) = 2.482433, and are

−(ln 0.65)/2.502534 = 0.172139
−(ln 0.43)/2.492484 = 0.338606
−(ln 0.78)/2.482433 = 0.100088

These 4 numbers add up to 1.052. It took 4 numbers to get the sum above 1, so there are 3 deaths.

61.16. ln(1 − q) = ln 0.6 = −0.51083. The first λ is −4(−0.51083) = 2.04330. The resulting exponential random number is −(ln 0.88)/2.04330 = 0.062562. The next numbers are

−(ln 0.33)/1.53248 = 0.72345
−(ln 0.72)/1.02165 = 0.32154

These 3 numbers add up to 1.108. Thus the binomial random number is 2.

61.17. ln(1 + β) = ln 2 = 0.6931472. The first λ is 2(0.6931472) = 1.38630. The resulting exponential random number is −(ln 0.88)/1.38630 = 0.092212. The next numbers are

−(ln 0.33)/2.07944 = 0.533154
−(ln 0.72)/2.77259 = 0.118483
−(ln 0.45)/3.46574 = 0.230401
−(ln 0.20)/4.15888 = 0.386988

These 5 numbers add up to 1.361238. Since 5 numbers are required to exceed 1, the negative binomial number is 4.

61.18. ln(1 + β) = ln 2.5 = 0.9162907. The 11th hazard rate is λ₁₀ = 20(0.9162907) + 10(0.9162907) = 27.48872. We want

−ln(1 − u)/27.48872 > 1 − 0.989027 = 0.010973

Therefore, u > 1 − e^(−27.48872(0.010973)) = 0.260391.
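Solutions 61.15–61.17 share one device: generate exponential waiting times whose hazard rate changes after each event (λ_k = −(m − k) ln(1 − q) for a binomial, λ_k = (r + k) ln(1 + β) for a negative binomial) and count how many events fit in one unit of time. A sketch of that device, with parameters read off solutions 61.16 and 61.17 (the helper name is my own):

```python
import math

def count_by_hazards(uniforms, hazard):
    # Count events in [0, 1) when the k-th waiting time is exponential
    # with rate hazard(k), k = 0, 1, 2, ...
    t, k = 0.0, 0
    for u in uniforms:
        t += -math.log(1 - u) / hazard(k)
        if t > 1:
            return k              # this waiting time overshoots time 1
        k += 1
    raise ValueError("ran out of uniform numbers")

# Solution 61.16: binomial m = 4, q = 0.4, so hazard(k) = -(4 - k) ln 0.6
print(count_by_hazards([0.12, 0.67, 0.28],
                       lambda k: -(4 - k) * math.log(0.6)))      # 2

# Solution 61.17: negative binomial r = 2, beta = 1, hazard(k) = (2 + k) ln 2
print(count_by_hazards([0.12, 0.67, 0.28, 0.55, 0.80],
                       lambda k: (2 + k) * math.log(2)))         # 4
```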

61.19. Translate the numbers to uniform on [−1, 1):

0.3 → 2(0.3) − 1 = −0.4
0.65 → 2(0.65) − 1 = 0.3

Then S = 0.4² + 0.3² = 0.25 and √(−(2 ln S)/S) = √(−(2 ln 0.25)/0.25) = 3.3302. The two standard normal numbers are −0.4(3.3302) = −1.3321 and 0.3(3.3302) = 0.9991. (A)



61.20. The first pair gives

0.111 → 2(0.111) − 1 = −0.778
0.888 → 2(0.888) − 1 = 0.776
(−0.778)² + (0.776)² = 1.207 > 1

so that pair is discarded. The second pair gives

0.652 → 2(0.652) − 1 = 0.304
0.689 → 2(0.689) − 1 = 0.378
0.304² + 0.378² = 0.2353
√(−(2 ln 0.2353)/0.2353) = 3.5069

The resulting standard normal numbers are 0.304(3.5069) = 1.066 and 0.378(3.5069) = 1.326. (D)

61.21. For the first pair, 0.20 → 2(0.20) − 1 = −0.6 and 0.95 → 2(0.95) − 1 = 0.9, and 0.6² + 0.9² > 1, so we discard this pair. For the second pair, 0.60 → 2(0.60) − 1 = 0.2 and 0.85 → 2(0.85) − 1 = 0.7, and 0.2² + 0.7² = 0.53. Then √(−(2 ln 0.53)/0.53) = 1.547827. The standard normal random numbers generated are

0.2(1.547827) = 0.309565
0.7(1.547827) = 1.083479

Then we multiply by σ = 3, add µ = 5, and exponentiate to obtain the desired lognormal random variables.

y₁ = e^(3(0.309565)+5) = 376
z₁ = e^(3(1.083479)+5) = 3829
y₁ + z₁ = 376 + 3829 = 4205 (D)

61.22. 0.82 → 2(0.82) − 1 = 0.64 and 0.09 → 2(0.09) − 1 = −0.82. Then 0.64² + 0.82² = 1.082 > 1, so we discard that pair. 0.50 → 2(0.50) − 1 = 0 and 0.20 → 2(0.20) − 1 = −0.6. Then

S = 0² + 0.6² = 0.36
√(−(2 ln 0.36)/0.36) = √(2/0.36) Q

The mean of the two generated normal numbers is

((0 − 0.6)/2) √(2/0.36) Q = −0.707Q (A)

61.23. Usually we're given the point in Cartesian coordinates. Here,

x₁ = R cos θ = √(1/e) cos π = −√(1/e)
x₂ = R sin θ = √(1/e) sin π = 0

So S = 1/e and √(−(2 ln S)/S) = √(2e). The numbers are X⁽¹⁾ = −√(1/e) · √(2e) = −√2 = −1.414 and X⁽²⁾ = 0. (A)

You could also use the Box-Muller transformation directly. √(−2 ln U₁) = √(−2 ln(1/e)) = √((−2)(−1)) = √2, and cos 2πU₂ = cos π = −1, sin 2πU₂ = sin π = 0. So we get {−√2, 0}, or answer (A).
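Solutions 61.19–61.22 all run the polar (rejection) method, and 61.23–61.24 the Box-Muller variant. A compact Python sketch of the polar rejection step (my own code; polar_pair is a hypothetical name):

```python
import math

def polar_pair(u1, u2):
    # Map the uniforms to [-1, 1), reject if the point falls outside
    # the unit circle, otherwise scale by sqrt(-2 ln S / S).
    v1, v2 = 2 * u1 - 1, 2 * u2 - 1
    s = v1 * v1 + v2 * v2
    if s >= 1 or s == 0:
        return None               # discard and use a fresh pair
    factor = math.sqrt(-2 * math.log(s) / s)
    return v1 * factor, v2 * factor

print(polar_pair(0.111, 0.888))   # None: S = 1.207 > 1, as in 61.20
print(polar_pair(0.652, 0.689))   # (1.066, 1.326), as in 61.20
```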

61.24. S = V₁² + V₂² = e⁻². So √(−(2 ln S)/S) = √(4e²) = 2e. Then

X₂ = e⁻¹ sin 30° (2e) = e⁻¹(1/2)(2e) = 1 (C)

Quiz Solutions

61-1. d = ln(1 − q) = ln 0.8; c = −ln 0.8; λ₀ = −ln 0.8. Then

−ln(1 − 0.34)/(−ln 0.8) > 1
−ln(1 − 0.14)/…

… Pr(X > 30).

62.10. You are estimating Pr(X > 200) using simulation. You would like to be 95% confident of being within 1% of the correct answer. If the probability is approximately 75%, estimate the number of data values to generate.

62.11. You are estimating Pr(5 < X ≤ 50) for a random variable X using simulation. In the first 100 runs, the distribution of generated numbers is

Range      Number
(0, 5]     15
(5, 50]    62
(50, ∞)    23

Estimate the number of runs needed to be 90% confident of the estimated probability being greater than 99% and less than 101% of the true probability.

62.12. You are estimating F(1000) using simulation. Which of the following would lead to the highest level of confidence that the observed proportion is within 5% of the correct value?

      Number of Values    Number of Values below 1000
(A)   2500                1000
(B)   3000                500
(C)   4000                500
(D)   4000                1000
(E)   5000                1000

62.13. You are estimating the median of a random variable using simulation. After 500 runs, the order statistics Y₂₃₀ through Y₂₄₀ are:

Y230  Y231  Y232  Y233  Y234  Y235  Y236  Y237  Y238  Y239  Y240
 324   326   329   331   332   335   338   347   349   351   355

Determine the lower bound of a 90% confidence interval for the median.

62.14. You are estimating the 90th percentile of a random variable using simulation. After 1600 runs, a 95% confidence interval is given by [Y_a, Y_b], where Y_j is the j-th order statistic of the sample. Determine a and b.

62.15. You are estimating the 95th percentile of a random variable. After 10,000 runs, the 9475th order statistic is 789 and the 9526th order statistic is 797. Determine the confidence level for the statement that the 95th percentile of the random variable is between 789 and 797.

Additional released exam questions: C-F05:16, C-F06:11, C-S07:37


Solutions

62.1. Your objective is met when 1.96 times the standard deviation (estimated by the last column, S_i/√i) is less than or equal to 0.01, or

S_i/√i ≤ 0.01/1.96 = 0.005102

(Technically speaking we need a t coefficient rather than 1.96, but the difference is small for large i.) i = 104 is the first time that this happens.

62.2.

(5000 − k)/k ≤ (10/1.96)² = 26.0308
5000 − k ≤ 26.0308k
k ≥ 5000/27.0308 = 184.97

Generate 185 values.

62.3. The sample variance is

s² = (10/9)(1527/10 − (48/10)²) = 144.07

For 90% confidence, we need:

1.645 s/√k ≤ 1
√k ≥ 1.645s
k ≥ 1.645² s² = 2.706025(144.07) = 389.9

Generate at least 390 values.

62.4. The estimated mean is 48/10 = 4.8 and the estimated variance, the unbiased sample variance of the 10 observations, is

s² = (10/9)(1527/10 − (48/10)²) = 144.07

By formula (42.1), the limited fluctuation credibility formula for exposures needed for full credibility, the number of runs needed is

n₀ CV² = (1.96/0.005)² (144.07/4.8²) = 960,867

62.5. The sample variance is

s² = [(1 − 3)² + (2 − 3)² + (3 − 3)² + (4 − 3)² + (5 − 3)²]/4 = 2.5

The standard deviation of the estimator of E[X] after k simulations will be about s/√k. We therefore solve

0.05 = √(2.5/k)
k = 2.5/0.05² = 1000 (E)

62.6. We want 1.96√(s²/k) ≤ 1, or 3.8416s² ≤ k. We therefore set up the following table:

k     s²     3.8416s²
110   29.1   111.79
111   28.9   111.02
112   29.0   111.41

112 values are required.

62.7. We have 1000 runs here. Using formula (42.1), the limited fluctuation credibility formula for exposures (or runs) needed for full credibility of aggregate losses, we want the n₀ such that n₀ CV² = 1000. We'll calculate the estimated coefficient of variation.

x̄ = 58.135
s² = (1000/999)(4825.312 − 58.135²) = 1447.0809
CV² = 1447.0809/58.135² = 0.428171

Now we back out the confidence level.

0.428171 n₀ = 1000
n₀ = 1000/0.428171 = 2335.51
z_π/0.05 = √2335.51 = 48.3272
z_π = 2.42
π = Φ(2.42) = 0.9922
p = 2π − 1 = 0.9844

62.8. The estimated coefficient of variation is

x̄ = 272
s² = (100/99)(86,235.1 − 272²) = 12,375
CV² = 12,375/272² = 0.167264

By formula (42.1), the number of runs needed is

n₀ CV² = (1.96/0.01)²(0.167264) = 38,416(0.167264) = 6426

62.9. The estimated variance is F̂(30)(1 − F̂(30))/300 = (0.75)(0.25)/300 = 0.000625. The estimated value for S(30) is 75/300 = 0.25. The confidence interval is 0.25 ± 1.645√0.000625 = (0.2089, 0.2911).


62.10. The estimated variance of n runs is (0.75)(0.25)/n = 0.1875/n. We would like the half-width of the confidence interval to be 1% of the true value, which is estimated as 0.75, so

1.96√(0.1875/n) = 0.0075
0.1875/n = (0.0075/1.96)²
n = 0.1875(1.96²)/0.0075² = 12,805 1/3

and this is rounded up to 12,806. Alternatively, you could use formula (62.1). The proportion P_n is about 0.75n, so (n − P_n)/P_n = 1/3. And n₀ = (1.96/0.01)² = 38,416. So the formula yields n = 38,416(1/3) = 12,805 1/3, the same as above.
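The same arithmetic generalizes: for an estimated proportion p̂, the number of runs that makes the half-width of the confidence interval equal to a chosen fraction of p̂ is n = z²p̂(1 − p̂)/(relative error × p̂)². A small Python helper along those lines (my own sketch, assuming the normal approximation used throughout this lesson):

```python
import math

def runs_needed(p_hat, rel_error, z):
    # Smallest n with z * sqrt(p(1-p)/n) <= rel_error * p
    return math.ceil(z * z * p_hat * (1 - p_hat) / (rel_error * p_hat) ** 2)

print(runs_needed(0.75, 0.01, 1.96))    # 12806, as in solution 62.10
print(runs_needed(0.62, 0.01, 1.645))   # 16586, as in solution 62.11 below
```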

62.11. The estimated proportion in the range (5, 50] is 62/100 = 0.62. The confidence interval's half-width is 1.645√((0.62)(0.38)/n), and you want this to be less than 0.0062.

1.645√((0.62)(0.38)/n) ≤ 0.0062
n ≥ 1.645²(0.62)(0.38)/0.0062² = 16,585.31

Rounding up to the next integer, n = 16,586.

62.12. Refer to Table 62.1 for the confidence interval for F(x). The half-width of the confidence interval is z_π √(P_n(n − P_n)/n³). The half-width is set equal to 5% of the estimate, or 5% of P_n/n. In other words

z_π √(P_n(n − P_n)/n³) = 0.05 P_n/n
z_π = 0.05 (P_n/n) √(n³/(P_n(n − P_n))) = 0.05 √(nP_n/(n − P_n))

To maximize π, we must maximize nP_n/(n − P_n). For the five choices, this is

(A) (1000)(2500)/1500 = 1666 2/3
(B) (500)(3000)/2500 = 600
(C) (500)(4000)/3500 = 571 3/7
(D) (1000)(4000)/3000 = 1333 1/3
(E) (1000)(5000)/4000 = 1250

Thus (A) is the answer.


Although not required, we'll calculate the confidence level for (A).

z_π = 0.05√(1666 2/3) = 2.0412
π = Φ(2.04) = 0.9793
p = 2(0.9793) − 1 = 0.9586

62.13. The formula for the subscript of the order statistic for the lower bound is

a = ⌊0.5(500) + 0.5 − 1.645√((500)(0.5)(0.5))⌋ = ⌊232.11⌋ = 232

so the answer is Y₂₃₂ = 329.

62.14. By the formula, z_π √(1600(0.9)(0.1)) = 12(1.96) = 23.52, so

a = ⌊1600(0.9) + 0.5 − 23.52⌋ = 1416
b = ⌈1600(0.9) + 0.5 + 23.52⌉ = 1465

62.15. The half-width of the confidence interval is z_π √(10,000(0.95)(0.05)), and √(10,000(0.95)(0.05)) = 21.79. The half-width of the interval (9475, 9526) that we are given is 25.5. Therefore, z_π = 25.5/21.79 = 1.17, and π = Φ(1.17) = 0.8790. Then the confidence level is 2π − 1 = 0.7580.
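The order-statistic subscripts in solutions 62.13 and 62.14 come from one recipe: a = ⌊nq + 0.5 − z√(nq(1 − q))⌋ and b = ⌈nq + 0.5 + z√(nq(1 − q))⌉. A sketch of that recipe (my own helper function):

```python
import math

def percentile_ci_indices(n, q, z):
    # Order-statistic subscripts (a, b) bracketing the 100q-th percentile
    mid = n * q + 0.5
    half = z * math.sqrt(n * q * (1 - q))
    return math.floor(mid - half), math.ceil(mid + half)

print(percentile_ci_indices(500, 0.5, 1.645))   # (232, 269); 62.13 uses a = 232
print(percentile_ci_indices(1600, 0.9, 1.96))   # (1416, 1465), as in 62.14
```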

Quiz Solutions

62-1. The sample mean of the five numbers is 51.8, and the sample variance is

(32² + 60² + 53² + 49² + 65²)/5 = 2811.8
(5/4)(2811.8 − 51.8²) = 160.7

so the coefficient of variation squared is 160.7/51.8² = 0.059890. Also, n₀ = (1.96/0.01)² = 38,416. Therefore, 38,416(0.059890) = 2301 runs are needed.

62-2. F₁₀₀₀(50) = 0.92. The variance of F₁₀₀₀(50) is estimated as (0.92)(0.08)/1000 = 0.0000736. The confidence interval is 0.92 ± 1.645√0.0000736 = (0.906, 0.934).

62-3. For the 95th percentile, √(nq(1 − q)) = √(1000(0.95)(0.05)) = 6.892. The confidence interval's sequence numbers are 950.5 ± 15.5, where 950.5 is the midpoint based on our formulas (nq + 0.5), so the level of confidence is

2Φ(15.5/6.892) − 1 = 2Φ(2.25) − 1 = 2(0.9878) − 1 = 0.9756

Lesson 63

Simulation—Applications

Reading: Loss Models Fourth Edition 20.1, 20.4

63.1 Actuarial applications

Modern exam questions have you apply simulation to an actuarial problem. The textbook discusses two examples which require simulation.

The first example of the textbook involves the time value of money, where time to settle a claim is correlated with claim size and interest rates are stochastic. To simulate this, a model is assumed where the distribution of settlement time has a parameter involving claim size. Then a random number is generated for claim size, another one (using the first one) for settlement time, and a third one for the interest rate.

Example 63A Claims occur in a Poisson process at a rate of 3 per year. This means that the amount of time between claims is exponential with mean 1/3. Claim sizes have a Weibull distribution with parameters τ = 0.5 and θ = 1000. In a simulation of claims, you are given the following random numbers uniform on [0, 1) to simulate the time of each claim:

0.52  0.29  0.78  0.37  0.69

You are given the following random numbers uniform on [0, 1) to simulate claim sizes:

0.21  0.72  0.33  0.84  0.09

In each case, use the numbers in order as needed. Calculate the present value at 6% of one year of claims in this simulation run.

Answer: The time between claims is exponential, so we simulate the times between claims until those times add up to more than 1, using equation (60.1) on page 1181.

Simulated Time                           Cumulative Time
0.52 → −(1/3) ln 0.48 = 0.244656        0.244656
0.29 → −(1/3) ln 0.71 = 0.114163        0.244656 + 0.114163 = 0.358819
0.78 → −(1/3) ln 0.22 = 0.504709        0.358819 + 0.504709 = 0.863528
0.37 → −(1/3) ln 0.63 = 0.154012        0.863528 + 0.154012 = 1.017541

Therefore three claims occur in the first year. To simulate claim sizes, we'll use the equation from the tables for VaR_p(X) of a Weibull:

VaR_p(X) = θ(−ln(1 − p))^(1/τ) = 1000(−ln(1 − u))²

0.21 → 1000(−ln 0.79)² = 55.57
0.72 → 1000(−ln 0.28)² = 1620.44
0.33 → 1000(−ln 0.67)² = 160.38

The present value of claims is

55.57/1.06^0.244656 + 1620.44/1.06^0.358819 + 160.38/1.06^0.863528 = 1794.21
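Example 63A is small enough to script end to end. The following Python sketch is my own restatement of the example, not the textbook's code:

```python
import math

time_u = [0.52, 0.29, 0.78, 0.37, 0.69]
size_u = [0.21, 0.72, 0.33, 0.84, 0.09]

# Claim times: exponential interarrivals with mean 1/3
times, t = [], 0.0
for u in time_u:
    t += -math.log(1 - u) / 3
    if t > 1:
        break                     # this claim falls outside the first year
    times.append(t)

# Claim sizes: Weibull with tau = 0.5, theta = 1000
sizes = [1000 * (-math.log(1 - u)) ** 2 for u in size_u[:len(times)]]

pv = sum(x / 1.06 ** s for x, s in zip(sizes, times))
print(round(pv, 2))               # 1794.21 up to rounding
```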

The second example of the textbook involves the aggregate loss distribution. Claim payments may not be truly independent due to an aggregate limit. Consider the following case: individual claims are subject to an ordinary deductible. However, there is a maximum annual out-of-pocket for the insured. If the amount spent on deductibles by the insured in a year is above a certain amount, the excess is paid by the insurance company. This leads to a dependence between claim sizes.

Example 63B A major medical coverage covers 80% of claims after an annual deductible of 500, with 100% covered after the insured spends 1000. The number of claims in a year follows a Poisson distribution with mean 1 and the size of claims follows a Weibull distribution with parameters τ = 2, θ = 2000. Simulate the number of claims for 3 years using these uniform random numbers on [0, 1]:

0.50,  0.92,  0.70

Simulate the size of claims using these uniform random numbers on [0, 1]:

0.21,  0.74,  0.05,  0.42,  0.54,  0.83,  0.25,  0.13,  0.64

Calculate the simulated total payments over 3 years.

Answer: We tabulate the probabilities for the Poisson distribution up to the point where the distribution function is 0.92:

i    p_i               F(i)
0    e⁻¹ = 0.3679      0.3679
1    e⁻¹ = 0.3679      0.7358
2    e⁻¹/2 = 0.1839    0.9197
3    e⁻¹/6 = 0.0613    0.9810

So we have

0.3679 ≤ 0.50 < 0.7358 ⇒ 0.50 → 1
0.9197 ≤ 0.92 < 0.9810 ⇒ 0.92 → 3
0.3679 ≤ 0.70 < 0.7358 ⇒ 0.70 → 1

So we have 1 claim, 3 claims, and 1 claim in the 3 years. Inverting the Weibull:

u = F(x) = 1 − e^(−(x/2000)²)
1 − u = e^(−(x/2000)²)
−ln(1 − u) = (x/2000)²
x = 2000√(−ln(1 − u))

0.21 → 2000√(−ln(1 − 0.21)) = 971.02
0.74 → 2321.27
0.05 → 452.96
0.42 → 1476.11
0.54 → 1762.42


In the first year, the insurer pays 0.8(971.02 − 500) = 376.82. In the second year, the claims add up to 2321.27 + 452.96 + 1476.11 = 4250.34. The insurer would pay 0.8(4250.34 − 500), but this would be less than 4250.34 − 1000, with 1000 being the cap on copayments, so the insurer pays 3250.34. In the third year, the insurer pays 0.8(1762.42 − 500) = 1009.93. Total payments are 376.82 + 3250.34 + 1009.93 = 4637.09.
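A sketch of the Example 63B logic (my own code). The only subtlety is the out-of-pocket cap: once the deductible plus 20% coinsurance would cost the insured more than 1000, the insurer pays total losses minus 1000 instead:

```python
import math

def weibull_inverse(u, theta=2000, tau=2):
    return theta * (-math.log(1 - u)) ** (1 / tau)

def insurer_payment(losses, deductible=500, oop_max=1000, coinsurance=0.8):
    total = sum(losses)
    if total <= deductible:
        return 0.0
    # Insured keeps the deductible plus 20% of the excess, capped at oop_max
    return max(coinsurance * (total - deductible), total - oop_max)

claims_per_year = [1, 3, 1]                 # from 0.50, 0.92, 0.70 above
size_u = iter([0.21, 0.74, 0.05, 0.42, 0.54])
total = sum(insurer_payment([weibull_inverse(next(size_u)) for _ in range(n)])
            for n in claims_per_year)
print(round(total, 2))                      # 4637.09 up to rounding
```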

Quiz 63-1 Loss sizes follow an inverse exponential distribution with θ = 700. Annual loss counts follow a negative binomial distribution with r = 2, β = 0.5. An insurance coverage has a per-loss ordinary deductible of 500 and an annual aggregate deductible of 500. The aggregate deductible is applied after the per-loss deductible is applied. Simulate aggregate losses for one year using the following uniform random number on (0,1) for claim counts: 0.8, and the following uniform random numbers in order as needed for claim sizes:

0.14  0.56  0.38  0.76

Determine total insurance payments for the year.

When simulation was on Course/Exam 3, some applications of simulation used on exams were surplus processes, joint life functions, and Brownian motion. Some of these old questions are in the exercises, and you may try them, although I doubt such frameworks would be used for current exam questions.

63.2 Statistical analysis

I am not aware of any exam questions that have been asked on statistical analysis by simulation, except on the bootstrap method. We discuss the bootstrap method in the next lesson. Testing hypotheses, or testing goodness-of-fit, involves drawing a statistic from a sample. Then the distribution of the statistic under the null hypothesis is calculated, and you determine whether the likelihood of the statistic you actually observed is sufficiently high given the truth of the null hypothesis. The difficult (or impossible) part is calculating the distribution of the statistic. The classical method is to assume the distribution is normal. But a normal distribution may not be appropriate if the sample is small. Simulation offers an alternative. You assume the null hypothesis and simulate the distribution of the statistic given this assumption. You then determine the likelihood of the statistic you observed with the simulated distribution function. For example, for a chi-square goodness-of-fit test, if you are hypothesizing a Pareto distribution with parameters α and θ for a set of 100 observations in 5 intervals, each run of the simulation involves generating 100 observations from the Pareto distribution, placing them in the 5 intervals, and calculating the chi-square statistic.
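As a concrete illustration of this idea, the sketch below (my own code, using the Pareto and intervals of exercise 63.18 below) simulates the reference distribution of the chi-square statistic under the null hypothesis:

```python
import math
import random

def pareto_inverse(u, alpha, theta):
    return theta * ((1 - u) ** (-1 / alpha) - 1)

def one_chi_square(alpha, theta, cutpoints, n):
    # One simulated chi-square statistic under the null Pareto model
    edges = [0.0] + list(cutpoints) + [math.inf]
    def cdf(x):
        return 1.0 if x == math.inf else 1 - (theta / (theta + x)) ** alpha
    expected = [n * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]
    sample = [pareto_inverse(random.random(), alpha, theta) for _ in range(n)]
    observed = [sum(a <= x < b for x in sample) for a, b in zip(edges, edges[1:])]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

random.seed(1)
stats = sorted(one_chi_square(2, 4000, [1000, 6000], 100) for _ in range(1000))
print(stats[949])   # simulated 95th percentile of the reference distribution
```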

63.3 Risk measures

Usually Monte Carlo simulation is necessary to calculate VaR and TVaR, since the underlying process is too complex to calculate these analytically. VaR is a percentile, and we discussed estimators for percentiles in lesson 62. To summarize that discussion: from a simulation with n runs, the estimator for VaR is¹ V̂aR_q(X) = Y_k, where k = ⌊nq⌋ + 1 and Y_j is the j-th order statistic from the sample.² A p-confidence interval is (Y_a, Y_b), where

a = ⌊nq + 0.5 − z_((p+1)/2) √(nq(1 − q))⌋  and  b = ⌈nq + 0.5 + z_((p+1)/2) √(nq(1 − q))⌉

The estimator for TVaR_q(X) is the sample mean of the order statistics from ⌊nq⌋ + 1 to n. The sample variance of those observations is

s_q² = Σ_(j=k)^n (Y_j − TV̂aR_q(X))² / (n − k)

where, as usual, we divide by n − k rather than n − k + 1 to make the estimator unbiased. Since TV̂aR_q(X) is the mean of the upper tail of the distribution, we might think that all we need to do is divide s_q² by n, the same way we estimate the variance of the sample mean. However, the variance of TV̂aR_q(X) is greater than s_q²/n, since the variance of the estimator derives both from the sample error and the error in the estimate for the 100q-th percentile. So the variance of TV̂aR_q(X) should be calculated using the conditional variance formula. The formula for an asymptotically unbiased estimator of the variance is

V̂ar(TV̂aR_q(X)) = [s_q² + q(TV̂aR_q(X) − V̂aR_q(X))²] / (n − k + 1)    (63.1)

¹The textbook uses the subscript p for the security levels of the estimators of VaR and TVaR, and does not discuss confidence intervals, but I am using p for the confidence interval so I will use q for the security level of VaR and TVaR.
²The textbook almost contradicts itself concerning the estimator for VaR. In the simulation section, the textbook says to use Y_k as defined above. On page 45 (fourth edition), however, it says "When estimating the VaR directly from data, the methods of obtaining quantiles from empirical distributions described in Section 13.1 can be used." However, Section 13.1 only describes the smoothed empirical percentile, implying that the smoothed empirical percentile rather than Y_k should be used. I believe the reason for the difference is that estimating from data and estimating from a simulation are different. Sample data is a small set, and therefore greater precision is needed in estimating the percentile. For a simulation with a large number of runs, there shouldn't be much difference between Y_k and the smoothed empirical percentile.

Example 63C The following table records the highest 50 losses from a simulation with 500 runs:

340.5  340.7  341.1  341.5  341.8  342.5  344.6  347.1  348.0  348.2
349.3  349.4  349.6  349.9  350.1  350.3  350.7  351.1  351.2  352.0
353.4  353.8  353.9  354.2  354.5  354.9  355.2  355.8  356.1  356.5
356.9  357.0  357.5  358.0  358.2  358.3  358.9  359.2  360.1  360.7
360.9  361.6  362.4  363.0  364.1  365.1  365.3  365.7  365.9  366.8

Construct 90% confidence intervals for VaR₀.₉₅(X) and TVaR₀.₉₉(X).

Answer: k = ⌊0.95(500)⌋ + 1 = 476. To construct the confidence interval for V̂aR₀.₉₅(X), we use the formula for the confidence interval of a percentile:

z_((1+0.90)/2) √(500(0.95)(0.05)) = 1.645(4.873) = 8.017
a = ⌊475.5 − 8.017⌋ = 467
b = ⌈475.5 + 8.017⌉ = 484

so the 90% confidence interval for VaR₀.₉₅(X) is (Y₄₆₇, Y₄₈₄) = (350.7, 358.0).

The estimate for TVaR₀.₉₉(X) is the average of the top five observations, or

TV̂aR₀.₉₉(X) = (365.1 + 365.3 + 365.7 + 365.9 + 366.8)/5 = 365.76

Now let's calculate s_q², the sample variance of the top five observations.

(365.1² + 365.3² + 365.7² + 365.9² + 366.8²)/5 = 133,780.728
s_q² = (5/4)(133,780.728 − 365.76²) = (5/4)(0.3504) = 0.438

where we multiplied by 5/4 to change the denominator from 5 to 4. For the estimate of the standard deviation of TV̂aR₀.₉₉(X), we also need V̂aR₀.₉₉(X). The order statistic to use is ⌊0.99(500)⌋ + 1 = 496, so V̂aR₀.₉₉(X) = Y₄₉₆ = 365.1. Using equation (63.1),

V̂ar(TV̂aR₀.₉₉(X)) = [0.438 + 0.99(365.76 − 365.1)²]/5 = 0.17385

The confidence interval is 365.76 ± 1.645√0.17385 = (365.1, 366.4).



Quiz 63-2 A simulation of X has 100 runs. The highest 10 values are 78, 80, 80, 81, 81, 82, 83, 84, 86, 89. Calculate the estimated variance of the estimated value of the tail value-at-risk at the 95% security level.
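The quiz can be checked with a direct implementation of the estimators above; the sketch is mine, using equation (63.1) for the variance:

```python
import math

def var_tvar_estimates(sample, q):
    # Returns (VaR estimate, TVaR estimate, estimated variance of TVaR)
    ys = sorted(sample)
    n = len(ys)
    k = math.floor(n * q) + 1
    tail = ys[k - 1:]                  # order statistics Y_k, ..., Y_n
    var_hat = ys[k - 1]
    tvar_hat = sum(tail) / len(tail)
    s2 = sum((y - tvar_hat) ** 2 for y in tail) / (n - k)
    return var_hat, tvar_hat, (s2 + q * (tvar_hat - var_hat) ** 2) / (n - k + 1)

# Quiz 63-2: the 90 values below the listed ten never enter the tail,
# so zeros are used only to pad the sample to 100 runs.
sample = [0] * 90 + [78, 80, 80, 81, 81, 82, 83, 84, 86, 89]
print(var_tvar_estimates(sample, 0.95))   # (82, 84.8, 3.0296)
```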

Exercises

63.1. [3-F84:23] Exam grades are distributed according to the following continuous probability density function:

f(x) = x/25, 0 ≤ x < 5
…

… x > 200

You simulate the number of losses by drawing one sample for each risk from the uniform distribution on [0,1]. Your observations are:

0.65,  0.40,  0.95,  0.74

You then draw samples from the same distribution to simulate the amounts of the individual losses. The first eight random numbers, in order, are:

0.50,  0.20,  0.75,  0.90,  0.10,  0.40,  0.35,  0.65

Use as many of these random numbers as needed to determine the total loss generated.

(A) 100   (B) 116   (C) 266   (D) 446   (E) 450


63.7. [130-S91:19] Of a group of three independent lives with medical expense insurance, the number with medical expenses during a year is distributed according to a binomial distribution with m = 3 and q = 0.9. The amount, X, of medical expenses for any person, once expenses occur, has the following distribution:

x         f(x)
100       0.9
10,000    0.1

Each year, the insurance company pays the total medical expenses for the group in excess of 5,000. Use the uniform [0, 1] random numbers, 0.01 and 0.20, in the order given, to generate the number of claims for each of two years. Use the following uniform [0, 1] random numbers, in the order given, to generate the amount of each claim:

0.80,  0.95,  0.70,  0.96,  0.54,  0.01

Calculate the total amount that the insurance company pays for the two years.

(A) 0   (B) 5,000   (C) 5,100   (D) 5,200   (E) 10,200

63.8. [130-81-97:19] A company is insured against liability suits. The number of suits for a given year is distributed as follows:

Number of suits    Probability
0                  0.5
1                  0.2
2                  0.2
3                  0.1

For each of three years, you simulate the number of suits per year by generating one uniform random number from the interval [0, 1] and then applying the inversion method. Your three uniform random numbers are 0.83, 0.59, 0.19. The liability loss from an individual suit is a random variable with the following cumulative distribution function:

F(x) = 0             for x < 0
F(x) = 1/2           for x = 0
F(x) = x/20 + 1/2    for 0 < x ≤ 10

You generate uniform random numbers from the interval [0, 1] and apply the inversion method to simulate the amounts of the individual losses. The first nine uniform random numbers, in order, are:

0.51,  0.01,  0.78,  0.74,  0.03,  0.69,  0.17,  0.86,  0.82

Use, in order, as many of these uniform random numbers as needed to simulate the company's total loss due to liability suits over the three-year period.

(A) 0.0   (B) 0.2   (C) 5.6   (D) 5.8   (E) 14.4

63.9. [4B-S94:11] (2 points) You are given the following:

• The random variable X for the amount of an individual claim has the density function f(x) = 2x⁻³, x ≥ 1.
• The random variable R for the ratio of loss adjustment expense (LAE) to loss is uniformly distributed on the interval [0.01, 0.21].
• The amount of an individual claim is independent of the ratio of LAE to loss.
• The random variable Y having the uniform distribution on [0, 1) is used to simulate outcomes for X and R, respectively.
• Observed values of Y are Y₁ = 0.636 and Y₂ = 0.245.

Using Y₁ to simulate X and Y₂ to simulate R, determine the simulated value for X(1 + R).

(A) Less than 0.65
(B) At least 0.65, but less than 1.25
(C) At least 1.25, but less than 1.85
(D) At least 1.85, but less than 2.45
(E) At least 2.45

63.10. [3-S01:11] You are using the inversion method to simulate Z, the present value random variable for a special two-year term insurance on (70). You are given:

(i) (70) is subject to only two causes of death, with

k    k|q₇₀⁽¹⁾    k|q₇₀⁽²⁾
0    0.10        0.10
1    0.10        0.50

(ii) Death benefits, payable at the end of the year of death, are:

During year    Benefit for Cause 1    Benefit for Cause 2
1              1000                   1100
2              1100                   1200

(iii) i = 0.06
(iv) For this trial your random number from the uniform distribution on [0, 1] is 0.35.
(v) High random numbers correspond to high values of Z.

Calculate the simulated value of Z for this trial.

(A) 943   (B) 979   (C) 1000   (D) 1038   (E) 1068

1246

63.11. [3-F02:23] Your company insures a risk that is modeled as a surplus process as follows: (i) (ii) (iii) (iv) (v)

Interarrival times for claims are independent and exponentially distributed with mean 1/3. Claim size equals 10t , where t equals the time the claim occurs. Initial surplus equals 5. Premium is collected continuously at rate ct 4 . The interest rate is 0.

You simulate the interarrival times for the first three claims by using 0.5, 0.8, and 0.9 respectively, from the uniform distribution on [0, 1], where large random numbers correspond to long interarrival times. Of the following, which is the smallest c such that your company does not become insolvent from any of these three claims? (A) 22

(B) 35

(C) 49

(D) 113

(E) 141

63.12. Loss sizes for fire insurance have a two-parameter Pareto distribution with parameters θ  9000, α  1. Insurance coverage has an ordinary deductible of 1000 and a maximum covered loss of 50,000. You simulate 5 losses, with lower uniform random numbers corresponding to smaller loss amounts. The uniform random numbers used to generate values are: 0.21, 0.32, 0.53, 0.88, 0.05. Determine the average payment per loss. 63.13. [SOA3-F03:5] N is the random variable for the number of accidents in a single year. N follows the distribution: Pr ( N  n )  0.9 (0.1) n−1 n  1, 2, . . . X i is the random variable for the claim amount of the i th accident. X i follows the distribution: g ( x i )  0.01e −0.01x i ,

x i > 0,

i  1, 2, . . .

Let U and V1 , V2 , . . . be independent random variables following the uniform distribution on (0, 1) . You use the inversion method with U to simulate N and Vi to simulate X i , with small values of the random numbers corresponding to small values of N and X i . You are given the following random numbers for the first simulation: u 0.05

v1 0.30

v2 0.22

v3 0.52

v4 0.46

Calculate the total amount of claims during the year for the first simulation. (A) 0

(B) 36

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C) 72

(D) 108

(E) 144

Exercises continue on the next page . . .

EXERCISES FOR LESSON 63

1247

63.14. [SOA3-F03:40] You are the consulting actuary to a group of venture capitalists financing a search for pirate gold. It’s a risky undertaking; with probability 0.80, no treasure will be found, and thus the outcome is 0. The rewards are high: with probability 0.20 treasure will be found. The outcome, if treasure is found, is uniformly distributed on [1000, 5000]. You use the inversion method to simulate the outcome, where large random numbers from the uniform distribution on [0, 1] correspond to large outcomes. Your random numbers for the first two trials are 0.75 and 0.85. Calculate the average of the outcomes of these first two trials. (A) 0

(B) 1000

(C) 2000

(D) 3000

(E) 4000

63.15. [SOA3-F04:6] You are simulating a compound claims distribution: (i) The number of claims, N, is binomial with m  3 and mean 1.8. (ii) Claim amounts are uniformly distributed on {1,2,3,4,5}. (iii) Claim amounts are independent, and are independent of the number of claims. (iv) You simulate the number of claims, N, then the amounts of each of those claims, X1 , X2 , . . . , X N . Then you repeat another N, its claim amounts, and so on until you have performed the desired number of simulations. (v) When the simulated number of claims is 0, you do not simulate any claim amounts. (vi) All simulations use the inversion method. (vii) Your random numbers from (0, 1) are 0.7, 0.1, 0.3, 0.1, 0.9, 0.5, 0.5, 0.7, 0.3, and 0.1. Calculate the aggregate claim amount associated with your third simulated value of N. (A) 3

(B) 5

(C) 7

(D) 9

(E) 11

63.16. [Based on SOA3-F04:33] You are simulating the gain/loss from insurance where: (i) Claim occurrences follow a Poisson process with λ  2/3 per year. (ii) Each claim amount is 1, 2 or 3 with p (1)  0.25, p (2)  0.25, and p (3)  0.50. (iii) Claim occurrences and amounts are independent. (iv) The annual premium equals expected annual claims plus 1.8 times the standard deviation of annual claims. (v) i  0 You use 0.25, 0.40, 0.60, and 0.80 from the unit interval to simulate time between claims. You use 0.30, 0.60, 0.20, and 0.70 from the unit interval to simulate claim size. Calculate the gain or loss from the insurer’s viewpoint during the first 2 years from this simulation. (A) (B) (C) (D) (E)

loss of 5 loss of 4 0 gain of 4 gain of 5

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

63. SIMULATION—APPLICATIONS

1248

63.17. [SOA3-F04:34] Annual dental claims are modeled as a compound Poisson process where the number of claims has mean 2 and the loss amounts have a two-parameter Pareto distribution with θ  500 and α  2. An insurance pays 80% of the first 750 of annual losses and 100% of annual losses in excess of 750. You simulate the number of claims and loss amounts using the inversion method. The random number to simulate the number of claims is 0.8. The random numbers to simulate loss amounts are 0.60, 0.25, 0.70, 0.10, and 0.80. Calculate the simulated insurance claims for one year. (A) 294

(B) 625

(C) 631

(D) 646

(E) 658

63.18. You hypothesize that loss data you are examining has an underlying Pareto distribution with α  2 and θ  4000. You wish to test your hypothesis using the chi-square goodness-of-fit test. However you only have 5 losses. Therefore, you use simulation to determine the distribution of the chi-square statistic. For the purpose of the chi-square test, you group your data into 3 groups: below 1000, 1000–6000, and above 6000. The first run of simulation uses the uniform random numbers on [0, 1]: 0.15, 0.95, 0.20, 0.35, 0.60. Calculate the simulated chi-square statistic. 63.19. [CAS3-F04:37] An actuary is simulating annual aggregate loss for a product liability product, where claims occur according to a binomial distribution with parameters m  4 and q  0.5, and severity is given by an exponential distribution with parameter θ  500,000. The number of claims is simulated using the inverse transform method (where small random numbers correspond to small claim sizes) and a random value of 0.58 from the uniform distribution on [0, 1]. The claim severities are simulated using the inverse transform method (where small random numbers correspond to small claim sizes) using the following values from the uniform distribution on [0, 1]: 0.35, 0.70, 0.61, 0.20. Calculate the simulated annual aggregate loss for the product liability policy. (A) (B) (C) (D) (E)

Less than 250,000 At least 250,000, but less than 500,000 At least 500,000, but less than 750,000 At least 750,000, but less than 1,000,000 At least 1,000,000

63.20. You hypothesize that loss data you are examining has an underlying Pareto distribution with α  2 and θ  4000. You wish to test your hypothesis using the Kolmogorov-Smirnov test. However you only have 5 losses. Therefore, you use simulation to determine the distribution of the Kolmogorov-Smirnov statistic. The first run of simulation uses the uniform random numbers on [0, 1]: 0.15, 0.95, 0.20, 0.35, 0.60. Calculate the simulated Kolmogorov-Smirnov statistic.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 63

1249

63.21. [C-S05:34] Unlimited claim severities for a warranty product follow the lognormal distribution with parameters µ  5.6 and σ  0.75. You use simulation to generate severities. The following are six uniform (0, 1) random numbers: 0.6179

0.4602

0.9452

0.0808

0.7881

0.4207

Using these numbers and the inversion method, calculate the average payment per claim for a contract with a policy limit of 400. (A) (B) (C) (D) (E)

Less than 300 At least 300, but less than 320 At least 320, but less than 340 At least 340, but less than 360 At least 360

63.22. [Sample:298] For a warranty policy, the expenses, E, are the product of the random variable Y and the loss amount, X. That is, E  XY. Loss amounts, X, follow the lognormal distribution with the underlying normal distribution parameters µ  5.2 and σ  1.4. Given X, the conditional probability density function of Y is: √ x −y √x/2 f ( y | x)  e , y>0 2 Use the inversion method and the following three uniform (0, 1) random numbers to simulate three loss amounts: 0.937

0.512

0.281

For each simulated loss amount, use the following uniform (0, 1) random numbers, in order, to simulate three expense amounts. 0.433

0.298

0.978

Calculate the average value of expenses based on the three simulated amounts. (A) (B) (C) (D) (E)

Less than 23 At least 23, but less than 28 At least 28, but less than 33 At least 33, but less than 38 At least 38

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

63. SIMULATION—APPLICATIONS

1250

63.23. [Sample:302] For a health insurance policy, you are given: (i)

The annual number of claims follows the distribution given below: Number of Claims

Probability

0 1 2

0.7 0.2 0.1

(ii)

Unmodified claim amounts, with no deductible or limit, follow the exponential distribution with mean 20. (iii) Each claim is subject to a deductible of 2 and a limit of 40 applied before the deductible. There is also an aggregate annual limit of 50 applied after the deductible. Use the uniform (0, 1) random number 0.237 and the inversion method to simulate the payment when the number of claims is one. Use the uniform (0, 1) random numbers 0.661 and 0.967 and the inversion method to simulate the payments when the number of claims is two. Calculate the mean of annual payments, using the simulated values. (A) (B) (C) (D) (E)

Less than 6 At least 6, but less than 21 At least 21, but less than 36 At least 36, but less than 51 At least 51

63.24. To calculate Value-at-Risk at security level 0.99, VaR0.99 ( X ) , 5000 simulation runs are performed. A 95% confidence interval for VaR0.99 ( X ) is ( Yi , Yj ) , where Yk is the k th order statistic of the sample. Determine i and j. 63.25. For a simulation with 100 runs, the largest 20 values are 920 948

920 952

922 959

925 962

926 969

932 976

939 989

940 1005

943 1032

945 1050

Construct a 95% confidence interval for VaR0.90 ( X ) , the Value-at-Risk at security level 0.9. Use the following information for questions 63.26 through 63.28: For a simulation with 100 runs, the largest 10 values are 65

66

68

69

75

78

82

89

91

101

Let X be the random variable for the underlying process. 63.26. Estimate the Value-at-Risk at security level 95%, VaR0.95 ( X ) . 63.27. Estimate TVaR0.95 ( X ) . 63.28. Construct a 95% confidence interval for TVaR0.95 ( X ) , using the estimates calculated in the previous two exercises.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 63

1251

63.29. The Tail-Value-at-Risk at security level 95% is estimated using simulation. You are given: (i) There are 2000 runs. (ii) The average of the 100 highest results is 325. (iii) The sample standard deviation of the 100 highest results is 24. (iv)

M 0.95 ( X ) , the estimate of the Value-at-Risk at security level 95%, is 302. VaR

Construct a 90% confidence interval for TVaR0.95 ( X ) . Additional released exam questions: C-F05:8,27, C-F06:4,21,32, C-S07:9,19

Solutions 5  0.2. The triangle is graphed 63.1. We calculate F ( x ) . f ( x ) forms a triangle with base 10 and height 25 in Figure 63.1. A passing grade is generated by a uniform number great enough to generate a 6 (or greater),

0.2

f (x )

0.16 0.12 0.08 0.04 1

2

3

4

5

6

7

8

9

10

x Figure 63.1: Graph of f ( x ) in exercise 63.1

so instead of inverting F, let’s go backwards from 6 to the corresponding uniform number u ∗ and then see how many of the 5 numbers are below u ∗ . We calculate F (6) by calculating the area of the triangle. We will calculate the area of the triangle from 6 to 10, and use 1 minus this number. The base is 4 and the height is 0.2 54  0.16 (since 6 is 45 of the way from 10 to 5), so the area is 0.5 (4)(0.16)  0.32, and the area from 0 to 6 is 1 − 0.32  0.68. 0.77 and 0.92 are greater than 0.68, so 40% of the grades pass. (C)

63.2.

The binomial distribution is tabulated: i 0 1 2 3 4

pi

0.54  0.0625 4 (0.54 )  0.25 6 (0.54 )  0.375 4 (0.54 )  0.25 0.54  0.0625

F (i ) 0.0625 0.3125 0.6875 0.9375 1.0000

So the numbers of claims in the three weeks are 1, 2, and 3. Claim sizes are x  10u, so we simply add up 6 uniform numbers and multiply by 10: 10 (0.3 + 0.1 + 0.7 + 0.6 + 0.5 + 0.8)  30 C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

63. SIMULATION—APPLICATIONS

1252

63.3.

Let’s tabulate this: Uniform number ui

Standard normal number n i  Φ−1 ( u i )

Claim cost x i  15,000 + 2000n i

0.5398 0.1151 0.0013 0.7881

0.10 −1.20 −3.00 0.80

15,200 12,600 9,000 16,600

Payment max (0, x i − 10,000) 5200 2600 –0– 6600

The sum is 5200 + 2600 + 0 + 6600  14,400 . (B) 63.4. Note that average amount per person means total payments divided by 9. In general, always divide by the quantity after the word “per”. Interpreting this problem as “per payment” is incorrect. This is a discrete distribution, with the F (0)  0.10, F (100)  0.10 + 0.80  0.90, F (50,000)  0.90 + 0.09  0.99, and F (100,000)  1. Thus we map the uniform number u i to y i , and payment size (after deductible of $10,000) x i as follows:

So the payments are

Range for u i

Value of y i

Value of x i

0 ≤ u i < 0.10 0.10 ≤ u i < 0.90 0.90 ≤ u i < 0.99 0.99 ≤ u i ≤ 1

0 100 50,000 100,000

0 0 40,000 90,000

0.250 → 0

0.002 → 0

0.780 → 0

0.300 → 0

0.640 → 0

0.995 → 90, 000

0.910 → 40, 000

0.890 → 0

0.350 → 0

The average payment per person is (40,000 + 90,000) /9  $14, 444 . (A) 63.5. For the normal distribution, Φ (0.5)  0.6915. The parameters of the inflated lognormal distribution are µ  7 + ln 1.05  7.0488 and σ  2 (see page 29 to see how to scale a lognormal distribution) so 0.6915 corresponds to exp 2 (0.5) + 7.0488  3130.01 . (C) 63.6. The number of losses is a discrete distribution with F (0)  0.7, F (1)  0.9, and F (2)  1, so numbers below 0.7 (0.65, 0.40) are mapped to 0, numbers between 0.7 and 0.9 (0.74) are mapped to 1, and 0.95 is mapped to 2. Thus we have 3 losses. We invert F ( x ) , claim size: √ x u 0 ≤ x ≤ 100 20 x  400u 2 0 ≤ u ≤ 0.5 because F (100)  0.5 x u 100 < x ≤ 200 200 x  200u 0.5 ≤ u ≤ 1 So 0.50 → 100, 0.20 → 16, 0.75 → 150. The total is 100 + 16 + 150  266 . (C) 63.7.

The binomial distribution’s distribution function is

C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 63

1253

pi

F (i )

 0.001 3 (0.12 )(0.9)  0.027 3 (0.1)(0.92 )  0.243

0.001 0.028 0.271 1

i 0.13

0 1 2 3

Therefore 0.01 → 1 and 0.20 → 2. Claim size is a discrete distribution, with 0.80 and 0.70 going to 100 and 0.95 going to 10,000. Therefore, claims in the first year are 100 and claims in the second year are 10,000 + 100  10,100. Payment is 0 in the first year and 5100 in the second year. Total is 5100 . (C) 63.8. The distribution function for the number of suits is F (0)  0.5, F (1)  0.7, F (2)  0.9, and F (3)  1, so 0.83 → 2, 0.59 → 1, and 0.19 → 0. Inverting the size of loss sends numbers below 0.5 to 0. From 0 to 10, u

1 1 x+ 20 2



x  20 u −

1 2



So 0.51 → 0.2, 0.01 → 0, 0.78 → 5.6. The total is 0.2 + 5.6  5.8 . (D) 63.9.

F ( x )  1 − x −2  y1 . x  √ 1

1−y1





1 1−0.636

 1.6575. r  0.01 + 0.2y2  0.01 + 0.2 (0.245)  0.059. The

answer is 1.6575 (1.059)  1.7553 . (C) 63.10. Z is a discrete random variable with 5 possible values: 0, and the 4 combinations of causes and years. The probabilities of the 4 possible non-zero values of Z are 0.1 for Cause 1 year 1 (the lowest present value other than zero), 0.1 for Cause 1 year 2 (the second lowest), 0.1 for Cause 2 year 1 (the third lowest) and 0.5 for Cause 2 year 2 (the highest). This leaves a probability of 1 − 0.1 − 0.1 − 0.1 − 0.5  0.2 for 0. Thus values of the uniform random number below 0.2 are assigned to 0. Values from 0.2 to 0.3 are assigned to Cause 1 year 1. Values between 0.3 and 0.4 are assigned to Cause 1 year 2. That actuarial present value of that is 1100/1.062  979 . (B) 63.11. For an exponential with mean θ, the formula for a random number is x i  −θ ln (1 − u i ) , where u i is a uniform random number on [0, 1) . Therefore, the simulated interarrival times are x i  − ln (1 − u i ) /3. 5 Total premium collected is the integral of ct 4 through time t, or ct5 . Thus we have i

ui

1 2 3

0.5 0.8 0.9

Interarrival time i) x i  − ln (1−u 3 0.2310 0.5365 0.7675

Claim time P t i  ij1 x i

Claim size 10t

Cumulative claims Pi j1 x i

t 5 /5

0.2310 0.7675 1.5351

1.7024 5.8550 34.2813

1.7024 7.5574 41.8386

0.00013 0.05327 1.70472

The first claim is below 5, thus doesn’t require premium. We must solve 2 inequalities: 5 + 0.05327c ≥ 7.5574 ⇒c ≥ 48.01

5 + 1.70472c ≥ 41.8386 ⇒c ≥ 21.61

The stricter condition is the first one, so the answer is 49 . (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

63. SIMULATION—APPLICATIONS

1254

63.12. The Pareto has F ( x )  1 − as follows:

9000 9000+x

 u, so x 

9000 1−u

− 9000. Therefore, the losses and payments are

Random Number

Loss

Payment

0.21 0.32 0.53 0.88 0.05

2,392.41 4,235.29 10,148.94 66,000.00 473.68

1,392.41 3,235.29 9,148.94 49,000.00 –0–

The average payment per loss is 1392.41 + 3235.29 + 9148.94 + 49,000.00 + 0  12,555.33 5 63.13. With a 0.9 chance of 1 accident, u  0.05 corresponds to N  1. Then the exponential variable with mean 100 is simulated by x1  −100 ln (1 − v1 ) , as indicated in the previous question. The total amount of simulated claims is −100 ln 0.7  35.6675 (B)

63.14. The distribution function—the probability the outcome is less than x—is 0.8 for any x ≤ 1000, and then increases linearly up to x  5000, since it is uniform. Therefore, the outcome X has distribution function   0.8  x ≤ 1000 . F (x )    0.8 + 0.2 ( x − 1000) 4000 1000 ≤ x ≤ 5000



You don’t really have to write this down. We know 0.75 has to go to 0 since it is less than 0.8. Because it’s uniform between 0.8 and 1, linear interpolation gets you the inverse value between 1000 and 5000, so 0.85 would go to one-quarter of the way between 1000 and 5000, or 2000. If this doesn’t satisfy you, though, you can work out the algebra. The average is therefore 0+2000  1000 . (B) 2 63.15. For the binomial, since E[N]  mq  1.8, q  0.6, and the probabilities are p 0  0.43  0.064 2

p 1  3 (0.4 )(0.6)  0.288 2

p 2  3 (0.4)(0.6 )  0.432 3

p 3  0.6  0.216

F (0)  0.064 F (1)  0.352 F (2)  0.784 F (3)  1

For the severity distribution, [0, 0.2) goes to 1, [0.2, 0.4) goes to 2, etc. So 0.7 goes to 2 claims of 0.1 and 0.3, 0.1 goes to 1 claim of 0.9. Then 0.5 goes to 2 claims of 0.5 and 0.7 which go to 3 and 4 for a total of 7 . (C) 63.16. This question may not a realistic exam question because Poisson processes are not on the MLC syllabus, so even a student taking the exams in order may never have encountered them. What you need to know is that time between events  in a Poisson process are exponentially distributed. 2 Average annual claims are 3 0.25 + 0.25 (2) + 0.5 (3)  1.5. The variance of aggregate claims, by the

compound Poisson variance formula (14.4) is 32 (0.25 + 0.25 (4) + 0.5 (9)  3.8333. So the annual premium √ is 1.5 + 1.8 3.8333  5.0242. Time between events in a Poisson process is exponentially distributed with mean equal to the reciprocal of the Poisson parameter. The formula for generating exponential random numbers from uniform



C/4 Study Manual—17th edition Copyright ©2014 ASM

EXERCISE SOLUTIONS FOR LESSON 63

1255

random numbers is −µ ln (1 − u i ) . Thus we get

0.25 → − 23 ln 0.75  0.4315 0.40 → − 23 ln 0.60  0.7662 0.60 → − 23 ln 0.40  1.3744

These add up to 2.5721 > 2, while the first two add up to 1.1977 ≤ 2, so there are 2 claims. The claim sizes are 1 for u i < 0.25, 2 for 0.25 ≤ u i < 0.5, and 3 for u i ≥ 0.5, so 0.3 → 2 and 0.6 → 3. Thus the claims add up to 5, the premiums add up to 2 (5.024)  10.048, and the gain is 5.048. (E) 63.17. For the Poisson, p0  e −2  0.1353, p1  p2  0.2707, and p 3  0.1804, so F (2)  0.6767 ≤ 0.8 < F (3)  0.8571 and there are 3 claims. To invert the Pareto:

!2

500 u 1− 500 + x √ 500 1−u  500 + x 500 x√ − 500 1−u

So 0.60 → 290.57, 0.25 → 77.35, and 0.70 → 412.87, adding up to 780.79. Insurance payments are (0.8)(750) + (780.79 − 750)  630.79 . (C)

63.18. For the underlying distribution, we have

4000 F (1000)  1 − 4000 + 1000

!2

4000 F (6000)  1 − 4000 + 6000

!2

 1 − 0.82  0.36  1 − 0.42  0.84

So the expected number of observations in each of the three intervals, np i , are 5 (0.36)  1.8 below 1000, 5 (0.84 − 0.36)  2.4 in the interval 1000–6000, and 5 (0.16)  0.8 above 6000. There is no need to invert the Pareto, since we know by the above that any u i below 0.36 will go into the first interval, any u i between 0.36 and 0.84 will go into the second interval, and any u i above 0.84 will go into the third interval. So there are 3 simulated observations below 1000, 1 between 1000 and 6000, and 1 above 6000. The chi-square statistic is then Q

(3 − 1.8) 2 1.8

+

(1 − 2.4) 2 2.4

+

(1 − 0.8) 2 0.8

 0.8 + 0.8167 + 0.05  1.6667

63.19. The binomial distribution has probabilities: 1 16 4 p1  16 6 p2  16 p0 

1 so that F (0)  p 0  16 , F (1)  p 0 + p 1  claims are generated. C/4 Study Manual—17th edition Copyright ©2014 ASM

5 16 ,

F (2)  p 0 + p 1 + p 2 

11 16

and

5 16

< 0.58
10] Pr ( θ > 10) 5  E[θ | θ ≤ 10] (1 − e −2 ) + 15e −2

since an exponential is memoryless so E[θ − 10 | θ > 10]  θ  5 and E[θ | θ > 10]  15. E[θ | θ ≤ 10] 

5 − 15e −2 1 − e −2

as before. Either way, the posterior expected value of a claim is 5+

C/4 Study Manual—17th edition Copyright ©2014 ASM

5 − 15e −2   5 + 3.4348  8.4348 1 − e −2

(A)

PRACTICE EXAM 4, SOLUTIONS TO QUESTIONS 25–27

1474

25.

[Lesson 13] We need β0 such that rβ0 (1 + β0 )  0.11, or β0 (1 + β0 )  0.055 β02 + β0 − 0.055  0 √ −1 + 1.22 β0   0.052268 2

The adjustment to the actual β  0.1 that we need is that 1 − F ( d )  0.52268.

0.052268 0.1

 0.52268. So we want the deductible d such

!2

10,000  0.52268 10,000 + d 10,000  0.72297 10,000 + d 10,000 (1 − 0.72297)  3831.90 d 0.72297 26.

(A)

[Sections 3.2 and 4.1] Let X be the loss random variable. E[Λ]  θΓ (1 + 1/τ )  100Γ (3)  200 E Λ2  θ 2 Γ (1 + 2/τ )  1002 Γ (5)  240,000

f

g

Var (Λ)  240,000 − 2002  200,000

E[X]  E E[X | Λ]  E[10Λ]  2000

f

g

Var ( X )  E Var ( X | Λ) + Var E[X | Λ]

f

g





 E[10Λ2 ] + Var (10Λ)

 10 (240,000) + 100 (200,000)  22,400,000 Using the normal approximation, the 80th percentile is E[X] + Φ−1 (0.8) Var ( X )

p

 2000 + 0.842 22,400,000  5985.1

p

27.

(D)

[Lesson 53] The overall mean is the mean of the gamma, or 2 (1200)  2400. The sample mean is  1650. The credibility factor is backed out from

1400+1900 2

1650Z + 2400 (1 − Z )  1800 2 which implies that Z  0.8. Then 2+k  0.8, so k  0.5. The mean of three claims is 3 6 The revised Z is 3+k  7 . The revised credibility expectation is

6 1 + 2400  1714.286 1600 7 7

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

!

(D)

1400+1900+1500 3

 1600.

PRACTICE EXAM 4, SOLUTIONS TO QUESTIONS 28–30

1475

28. [Lesson 45] There are 3 ways to spend $4.00. The joint probabilities of spending $4.00 and one of these 3 ways are: 1.

4 pounds of Brand A on sale, probability (0.25)(0.25) .

2.

2 pounds of Brand A not on sale, probability (0.75)(0.25)(0.25) .

3.

4 pounds of Brand B, probability (0.75)(0.25)(0.25) .

We need only the relative probabilities, so we can factor out the (0.25)(0.25) . So we are looking at weights 1 on the first possibility and 0.75 on each of the second and third possibilities. Then the probability of the first and the third vs. the second is 1 + 0.75  0.70 (C) 1 + 0.75 + 0.75 29.

[Lesson 58] The expected value of the process variance vˆ is the overall mean, or vˆ 

92 (1) + x (2) 92 + 2x  800 + 92 + x 892 + x

The overall second moment is

92 (1) + x (4) 92 + 4x  800 + 92 + x 892 + x The biased estimator for the overall variance is µ02 

92 + 4x 92 + 2x µ2  − 892 + x 892 + x

!2

and the unbiased estimator, s 2 , is arrived at by multiplying this by s2 

892+x 891+x :

(92 + 2x ) 2 92 + 4x − , 891 + x (892 + x )(891 + x )

The final answer is higher if you skip this adjustment. Then aˆ 

92 + 4x (92 + 2x ) 2 92 + 2x − − 891 + x (892 + x )(891 + x ) 892 + x

We want aˆ > 0. We multiply through by (892 + x )(891 + x ) :

(92 + 4x )(892 + x ) − (92 + 2x ) 2 − (92 + 2x )(891 + x ) > 0 −8372 + 1418x − 2x 2 > 0 √ 1418 − 1,943,748 x>  5.95409 2 (2)

Therefore, at least 6 policyholders with 2 claims are necessary. (D) 30. [Lesson 60] Inversion works by taking the value from the y-axis to the corresponding value on the x axis. Thus 0.6 goes to 3, 0.7 goes to 3 (draw a line between the points (3, 0.6) and (3, 0.8) on the graph; then going horizontally from 0.7 on the y-axis, you’ll hit that line at x  3, 0.2 goes to 0 (draw a line between the points (0, 0) and (0, 0.3) on the graph; the graph starts at (0, 0) ), and 0.9 goes to 3.5 (by linear interpolation). 0.4 goes to 2 by the convention used in the textbook that assigns the highest possible number. The mean of the simulated values is 3 + 3 + 0 + 3.5 + 2  2.3 (D) 5 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 4, SOLUTIONS TO QUESTIONS 31–34

1476

31. [Lesson 17] In order for there to be four televisions, there must be at least two families. For two families, either there could be 1 and 3 or 2 and 2, for probability 0.4 2 (0.4)(0.2) + (0.32 )  0.1





For three families, there could be 0,1,3 (6 ways) or 0,2,2 (3 ways) or 1,1,2 (3 ways) for probability 0.1 6 (0.1)(0.4)(0.2) + 3 (0.1)(0.32 ) + 3 (0.42 )(0.3)  0.1 (0.048 + 0.027 + 0.144)  0.0219





The total probability is 0.1 + 0.0219  0.1219 . (D) 32.

[Lesson 31] The 40th percentile of an exponential, π0.4 , is determined by e −π0.4 /θ  0.6 π0.4  − ln 0.6 θ πˆ 0.4 θˆ  − ln 0.6

For a sample of size 2, the 40th smoothed empirical percentile is the 3 (0.4)  1.2th item. If the sample is X1 and X2 and the order statistics are Y1 and Y2 , the 40th smoothed empirical percentile is πˆ 0.4  0.8 ( Y1 ) + 0.2 ( Y2 ) . We need the expected value of this. For the minimum of two exponential variables, Y1 : SY1 ( y1 )  Pr ( Y1 > y1 )  Pr ( X1 > y1 ) Pr ( X2 > y1 )  e −2y1 /θ so we see that Y1 has an exponential distribution with mean θ/2. On the other hand, Y1 + Y2  X1 + X2 , and E[X1 ]  E[X2 ]  θ, so E[Y2 ]  2θ − E[Y1 ]  3θ 2 . It follows that θ 3θ + 0.2  0.7θ E[πˆ 0.4 ]  0.8 2 2

!

!

It follows that the expected value of our estimator is ˆ − E[θ] and the bias is



θ −

0.7θ ln 0.6

0.7 − 1  θ (1.37033 − 1) ln 0.6



and c  0.37033 . (A) 33. [Lesson 42] Since a one-sided interval is requested, we use the 95th percentile of the normal distri2 bution, 1.645, instead of the 97.5th percentile. The expected number of claims needed is 1.645  270.6025, 0.1 or 271 . (B) 34. [Lesson 47] For the Poisson/gamma conjugate prior pair, Bühlmann and Bayesian credibility methods are the same, so we will use Bayesian methods. We use equation (47.1) on page 939 to calculate the credibility premium, with α  20 and γ  θ1  10 for insured #1 and α  20 and γ  20 for insured #2. In the following, the first is on the left side, the second is on the right side, and we equate them. 20 + k 20 + 2k  10 + k 20 + k C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 4, SOLUTION TO QUESTION 35

1477

(20 + k ) 2  (20 + 2k )(10 + k ) 400 + 40k + k 2  200 + 40k + 2k 2 k 2 − 200  0 √ k  200  14.14 In the fifteenth year, the credibility premium for insured #1 will be less than for insured #2. Since the first year is 2000, the fifteenth year is 2014 . (B) 35. [Lesson 3] Let X be payments for basic services, Y payments for major services, Z total payments. Then Z  X + Y and Var ( X )  mq (1 − q )  2 (0.3)(0.7)  0.42

Var ( Y )  mq (1 − q )  2 (0.2)(0.8)  0.32

Var ( Z )  Var ( X ) + Var ( Y ) + 2 Cov ( X, Y ) 0.6  0.42 + 0.32 + 2 Cov ( X, Y ) Cov ( X, Y )  −0.07

C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 1–3

1478

Answer Key for Practice Exam 5 1 2 3 4 5 6 7 8 9 10

C A C A D C C D D D

11 12 13 14 15 16 17 18 19 20

A A C A C B A D D B

21 22 23 24 25 26 27 28 29 30

E B D D B D A E A D

31 32 33 34 35

B E B C B

Practice Exam 5 1. [Lesson 8] Since TVaRp ( X ) −VaRp ( X )  θ for an exponential, we have θ  750. For an exponential, the 100p th percentile is −θ ln (1 − p ) , so −750 ln (1 − p )  2630

ln (1 − p )  −2630/750  −3.5067 1 − p  e −3.5067  0.03 p  0.97

2. or

(C)

[Subsection 25.1.2] There are 50 losses. The first moment is the average of the interval averages,

15 (500) + 11 (1000) + 12 (1500) + 9 (2000) + 3 (3500)  1300 50 The second moment is the average of the second moments. In the following expression, 15/50  0.3 is the probability of the first interval, 11/50  0.22 is the probability of 1000, and so on. For intervals ( a, b ) , the second moment is ( a 2 + ab + b 2 ) /3. E[X] 

E[X 2 ] 

0.3 (12 + 1 · 999 + 9992 ) 0.24 (10012 + 1001 · 1999 + 19992 ) + 0.22 (10002 ) + + 0.18 (20002 ) 3 3 0.06 (20012 + 2001 · 4999 + 49992 ) +  2,379,760 3

The variance is

3.

Var ( X )  2,379,760 − 13002  689,760

(A)

[Lesson 38] S∗ (1)  e −1 and F ∗ (1)  1 − e −1 . Using formula (38.1) A2  −1 + − ln e −1 − ln 1 − e −1









 −1 + 1 − ln 0.632121  0.4587

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 4–6

1479

You can calculate the statistic directly from the integral definition. Fn ( x ) is 0 in [0, 1) and 1 in [1, ∞) , while F ∗ ( x )  1 − e −x , so we evaluate two integrals: A2 

1

Z

Z ∞ ( e −x ) 2 (1 − e −x ) 2 −x + e dx e −x dx −x −x (1 − e ) e (1 − e −x ) e −x 1 {z } {z } |

|0

B

1

Z B

0

C

(1 − e −x ) dx 1

 ( x + e −x )

0

 1 + e −1 − 1  e −1 ∞

Z

C 

Z1 ∞ 1

e −2x dx (1 − e −x )   e −x e −x + dx 1 − e −x

 ∞

 −e −x − ln (1 − e −x )



1

 −e

−1

− ln 1 − e



−1



A2  B + C  e −1 − e −1 − ln 1 − e −1  ln 0.632121  0.4587





4. [Lesson 27, page 471] The shape parameter of the gamma is α. To make the mean 70, θ would 70 have to be 70 α  50 . x  74. Therefore, the form would be 1 θα Γ(α)

5.

x α−1 e −x/θ 

1

(70/50) 50 Γ (50)

7449 e −

(74)(50)

(A)

70

[Lesson 32] The likelihood of number of claims is e −θ ( θ n i /n i !) and the likelihood of claim sizes is so after dropping constants, the likelihood is

e −x i /3θ / (3θ ) ,

L ( θ )  e −5θ θ 1+1+2

e − (2+3+1+2)/3θ θ4

8 3θ 8  −5 + 0 3θ 2 8  15  0.7303 (D)

l ( θ )  −5θ − dl dθ

θ2 θ

6.

[Lesson 49] The prior is a beta with a  1.5, b  1.5. There are 4 out of 10 possible claims, so

a → 1 + 4  5.5, b → 1 + 6  7.5. The expected number of claims is C/4 Study Manual—17th edition Copyright ©2014 ASM

5a a+b



55 26

 2.1154 . (C)

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 7–9

1480

7.

[Lesson 57] 1+2+0+7+4  0.5 5+6+7+4+6 3 1 x¯ 1   18 6 11 x¯ 2   1.1 10 x¯ 

5 0.2 −



vˆ 

 1 2 6

+6



1 3



 1 2 6

+7 0−



 1.061111 aˆ 

18



1 6

− 0.5

2

 1 2 6 3

+ 4 (1.75 − 1.1) 2 + 6

+ 10 (1.1 − 0.5) 2 − 1.061111 182 +102 28

28 − 18 (0.353025) Zˆ   0.8569 18 (0.353025) + 1.061111

8.





2 3

− 1.1

2

4.538889  0.353025 12.85714

(C)

[Lesson 46] The likelihood of the counts is (leaving out the Poisson constants

1 1!

and

1 2! )

e −5θ θ 3 The likelihood of the sizes is (leaving out the constant denominators and 10’s) θ 3 e −10θ ( 20 + 5 + 10 )  θ 3 e −3.5θ 1

1

1

Multiplying the two together, the likelihood is θ 6 e −8.5θ . The density of the single parameter Pareto is π (θ)  Therefore the posterior is πΘ|X ( θ | x)  R

5 θ6

θ>1

e −8.5θ

∞ −8.5θ e dθ 1

θ>1

−8.5

The integral in the denominator is e8.5 . The distribution of X is inverse exponential with parameter 10, and Pr ( X > 10 | Θ)  1 − e −10θ/10  1 − e −θ . Hence the answer is the complement of

R

∞ −θ −8.5θ e e dθ 1 e −8.5 /8.5



8.5e −9.5  0.329155 9.5e −8.5

or 1 − 0.329155  0.670845 . (D) 9. [Lesson 27] We need F ( x )  0.3, where F ( x ) is the kernel-smoothed distribution. We know that in the empirical distribution, F ( x ) increases by 0.2 for each point, so by the time we get to 5 (past the span for 3), F ( x ) ≥ 0.4. Let’s try x  3. We have kernel weights of 0.5 from 3 and 0.75 from 2, making F (3)  (0.5 + 0.75) /5  0.25. From 3 to 4, F ( x ) will increase at the rate of 3 (0.25) /5  0.15, since 2, 3, and 5 each contribute at a rate of 0.25 (since the bandwidth is 2, so the kernel density is 1/ (2b )  0.25). Thus we need 0.25 + 0.15x  0.3, or x  1/3. Thus F (3 31 )  0.3 and the answer is 3 13 . (D) C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 10–13

1481

10. [Subsection 33.3.3] This question can be done easily with the Weibull shortcut. There are six uncensored claims. We sum up the fourth powers of the losses or, for censored claims, the policy limits, minus the fourth powers of the truncation points for the three truncated claims. We divide by the number of uncensored claims. Then we take the fourth power.

11.

θˆ 

r



p4

4

2004 + 3004 + 5004 + 2 (10004 ) + 12004 + 13004 + 15004 − 3 (10004 ) 6

1,510,733 × 106  1109

(D)

[Lesson 5] E[X ∧ 20]  4 + 0.7 (20)  18

E ( X ∧ 20) 2  78 + 0.7 (202 )  358

f

g

Var ( X ∧ 20)  358 − 182  34

(A)

12. [Lesson 30] The survival function of an exponential is e −x/θ , where θ is the mean. We are given that e −5/θ  0.4, so −5  ln 0.4 θ −5 θ  5.4568 ln 0.4 5.4568 must be the sample mean. We now set this equal to the mean of the lognormal: e 1+σ 1+

2 /2

 5.4568

σ2

 ln 5.4568  1.6969 2 σ2  2 (1.6969 − 1)  1.3938 √ σ  1.3938  1.1806

We now calculate the probability that a claim will be greater than 5. 1 − F (5)  1 − Φ

ln 5 − 1  1 − Φ (0.516)  1 − 0.6985  0.3015 1.1806

!

(A)

√ 13. [Section 21.1] By definition, M  x 1 x2 for a sample x1 , x2 . Let’s compute E[M]. Note that the density function for the uniform distribution is 1/θ. Since there are two integrals, we multiply by two density functions, or divide by θ 2 . 1 E[M]  2 θ

Z

1 θ2

Z



 

θ

0

0 θ 0

θ

Z √

 2 3/2 2 θ 3 θ2



x 1 x2 dx1 dx 2

x1 dx 1 

θ

Z 0



x 2 dx2

4 θ 9

ˆ  (8/9) θ and the bias is bias ˆ ( θ )  (8/9) θ − θ  −θ/9 . (C) So E[θ] θ C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 14–17

1482

14. [Lesson 15] Φ−1 (0.6)  0.25. For a geometric, E[N]  β and Var ( N )  β (1 + β ) . For a gamma, E[X]  αθ and Var ( X )  αθ2 . E[S]  E[N] E[X]  (0.5)(30)  15 Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]2  0.5 (300) + 0.75 (302 )  825 √ The normal approximation is 15 + 0.25 825  22.18 . (A) 15. [Section 28.1] For the actuarial estimate, age-0 exposure ignoring deaths and withdrawals is 0 for those born on 1/1/81 since they’re 31 during 2012. It is 1/12 for those born on 2/1/81, 2/12 for those born on 3/1/81, etc., up to 1 for those born on 1/1/82, then decreasing to 11/12 for those born on 2/1/82, etc., down to 1/12 for those born on 12/1/82. Total exposure in years is

(11)(12) 1 + 2 + · · · + 11 + 10  20 + 10  120 20 12 2 (12) !





We need to adjust for withdrawals and deaths. #1 leaves at age 31, so no adjustment is needed. The actuarial estimator does not adjust #2 since deaths count for the full year. #3 dies at age 31 so no adjustment is made. For #4, we lose 6 months of exposure. For #5, we lose 6 months of exposure. For #6, death occurs at age 29 so the scheduled 10 months of exposure are lost. For #7, since death gets a full year of exposure (through 7/1/2013), we add 6 months. For #8 we lose 6 months of exposure. So the total adjustment to actuarial exposure in months is −6 − 6 − 10 + 6 − 6  −22, or 1 65 years. There are 2 deaths at age 30 (#3 and #7), so the actuarial estimate of q30 is 2/ (120 − 1 65 )  0.01693 . (C) 16.

[Lesson 45] The expected value for Θ  0 is

0.1 (1) + 0.1 (2)  0.5. The expected value for Θ  1 is 0.4 + 0.1 + 0.1

0.2 (1) + 0.1 (2)  1. 0.1 + 0.2 + 0.1   The likelihood of {0,1} if Θ  0 is 23 16  91 . The likelihood of {0,1} if Θ  1 is value is then   0.6 19 (0.5) + 0.4 18 (1)  0.7143 (B)   0.6 19 + 0.4 81 17.

1 1 4 2

 18 . The expected

[Lesson 63] For the Poisson, p 0  e −0.8  0.4493 p 1  0.8 (0.4493)  0.3595 p 2  0.4 (0.3595)  0.1438

and these numbers add up to more than the highest uniform number 0.82. Then F (1)  0.4493 + 0.3595  0.8088 and the number of losses in the three years is 0, 2, and 1. We use VaRp ( X ) in the tables to invert the uniform numbers for loss sizes, with p  u: 0.52 → 200 (1 − 0.52) −1 − 1  216.67





0.93 → 200 (1 − 0.93) −1 − 1  2657.14





0.34 → 200 (1 − 0.34) −1 − 1  103.03





In the second year, the payments are 216.67 + 2657.14 − 200  2673.81, and in the third year the payment after deductible is 0. Annual average is 2673.81/3  891.27 . (A) C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 18–21

1483

18. [Lesson 26] We’ll construct a confidence interval for S (3) and then take the complement. The variances of Sˆ (3) and Fˆ (3) are the same. 19 15 11 Kaplan-Meier gives Sˆ (3)  20 17 14  0.658613. Greenwood gives

L Sˆ (3)  0.6586132 Var 

so σS (3) 



1



(20)(19)

+

2

(17)(15)

+

3



(14)(11)

 0.0129937

√ 0.0129937  0.113990. The log transformation gives 1.96 (0.113990) ZσS  exp  0.443839 U  exp S ln S 0.658613 ln 0.658613

!

!

(S1/U , SU )  (0.3903, 0.8308) The answer is the complement, (1 − 0.8303, 1 − 0.3903)  (0.1691, 0.6097) . (D) 19. [Lesson 31] The smoothed empirical median is the average of the fifth and sixth observations, or P µ ˆ  ln 24  3.178054. 24, and the empirical mean is 10 i1 x i /10  51. The distribution’s median is e , so µ 2 µ+σ /2 The distribution’s mean is e , so µ+

20.

σ2  ln 51  3.931826 2 p σˆ  2 (3.931826 − 3.178054)  1.2278

(D)

[Lesson 6] E[X]  E[X ∧ x] + e X ( x ) Pr ( X > x )

2000  400 + 1800 Pr ( X > 500) 2000 − 400 8 Pr ( X > 500)   1800 9 1 (B) Pr ( X ≤ 500)  9

21.

[Lesson 48] For the normal/normal conjugate prior: µv + na x¯ (100)(500) + (10)(60)(50)   80 v + na 500 + (10)(50) av (50)(500) a∗    25 v + na 500 + 10 (50)

µ∗ 

The mean of the predictive distribution is 80 and the variance is a∗ + v  525. The predictive distribution is normal. The probability of the next claim being greater than 90 is 90 − 80 1−Φ √  1 − Φ (0.44)  0.33 525

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

(E)

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 22–24

1484

22. [Lesson 24] Completion times (end minus start) in months are 25, 28, 21, 18 for the first four buildings respectively, with the fifth building censored at 18 months. The product-limit estimate is developed in the following table: yi

ri

si

Sˆ ( y i )

18 21

5 3

1 1

0.8 8/15

Hence the answer is Pr ( X ≤ 24)  1 − 8/15  7/15 . (B) 23.

[Lesson 26] The product-limit estimator gives yi

ri

si

Sˆ ( y i )

2 4

99 98

1 2

98/99 96/99

The Greenwood approximation of the variance is 96 99 The square root of this is

!2 

1

(99)(98)

+

2



(98)(96)

96 √ 0.00031566  0.01723 99

3 The estimated probability of death is the complement of the probability of survival, or 99  0.0303. The variance of the probability of death is the same as the variance of survival, since Var (1 − X )  Var ( X ) . The confidence interval is 0.0303 ± 1.96 (0.01723)  (−0.00347, 0.06407) , but since mortality can’t be below 0, we use (0, 0.06407) . (D)

24. [Lesson 32] To keep the numbers small, we’ll use 2 and 5 instead of 200 and 500, and multiply by 100 at the end; this works since θ is a scale factor. The likelihood function is 3θ 3 L (θ)  ( θ + 2) 4

!3

3θ 3 ( θ + 5) 4

!2 ∼

θ 15 ( θ + 2) 12 ( θ + 5) 8

l ( θ )  15 ln θ − 12 ln ( θ + 2) − 8 ln ( θ + 5) dl 15 12 8  − − 0 dθ θ θ+2 θ+5

(15θ2 + 105θ + 150) − (12θ2 + 60θ ) − (8θ2 + 16θ )  0 5θ 2 − 29θ − 150  0 √ 29 + 3841 θ  9.0976 10

Therefore, θˆ  909.76 . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 25–28

1485

25. [Section 61.3] ln (1 − q )  ln 0.8  −0.223144. Then λ 0  −4 (−0.223144)  0.89257. The first time generated is s 0  − ln (1 − 0.24) /0.89257  0.30747. Continuing, λ 1  0.89257 − 0.223144  0.66943

s1  − ln (1 − 0.52) /0.66943  1.0964

Clearly s 0 + s1 > 1, so the generated number is 1 . (B) 26. [Subsection 34.3] The loglikelihood is 100 ln (1 − q ) , which is maximized at q  0. The non-normal confidence interval consists of all q fo which 100 ln (1 − q ) > −3.84/2  −1.92, so ln (1 − q ) > −0.0192

1 − q > e −0.0192  0.9810

q < 1 − 0.9810  0.0190

(D)

27. [Lesson 30] Note that the 3 weights must add up to 1, so w 3  1 − w1 − w2 . The equation for the first moment is 10.5  1.05 10 3 1 2 w 1 + w 2 + 2 (1 − w 1 − w 2 )  1.05 1 2 w1

+ w 2 + 32 w3 

w 1 + 2w2 + 3 − 3w 1 − 3w2  2.1 2w 1 + w2  0.9

w2  0.9 − 2w 1 The second moment is the variance (1/12 for each component of the mixture) plus the mean squared, so the equation for the second moment is 1 12 1 4 w1

13.65  1.365 10 1 − 94 w 1 − 49 w 2  1.365 − 12

+ 14 w1 + w 2 + 49 w 3 

+ w2 +

9 4 9 4

− 2w 1 − 45 w 2  1.365 −

1 12 7 3

−2w 1 − 54 (0.9 − 2w 1 )  1.365 − w1 7 4.5  1.365 − +  0.1567 2 3 4 w 1  0.3133 (A)

28.

[Lesson 13] We need q 0 such that p00  (1 − q 0 ) 10  0.5, so 1 − q 0  0.50.1  0.933033 q 0  0.066967

q 0  vq  0.1v v  0.66967 Thus we need Pr ( X > d )  0.66967, or e − ( x/1000) e − ( x/1000) C/4 Study Manual—17th edition Copyright ©2014 ASM

0.5

0.5

 0.66967.

 0.66967

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 29–31

1486

x  (− ln 0.66967) 2  0.16078 1000 x  160.78 (E)

29.

[Section 19.2] Anything less than 1 gets sent to 0; between 1 and 3 to 2, etc. So p0  F (1)  1 − e −1/10

p2  F (3) − F (1)  e −1/10 − e −3/10

p4  F (5) − F (3)  e −3/10 − e −5/10

etc.

The mean is E[X] 

∞ X

2 jp2 j

j1

 2 e −1/10 − e −3/10 + 2 e −3/10 − e −5/10 + 3 e −5/10 − e −7/10 + · · ·









 2 e −1/10 + e −3/10 + e −5/10 + · · ·



2

∞ X









e − (2 j−1)/10

j1



2e −1/10  9.9834 1 − e −2/10

(A)

30. [Lesson 63] The p-value is the probability of observing what we actually observed or something less likely, given that the fit (or the null hypothesis) correctly describes the underlying distribution. What did we observe? We observed two in [0, 5) and four in [5, 10]. The chi-square statistic for these 2

2

) ) observations is (2−3 + (4−3  23 . 3 3 What is the simulated probability that the chi-square statistic is 2/3 or less? In other words, what percentage of the simulated runs have a chi-square statistic of 2/3 or greater? To generate simulated observations that are uniform on [0, 10], we multiply uniform numbers on [0, 1] by 10. Runs 1 and 2 have two observations in one of the intervals and four in the other, so they have statistic of 23 . (In the case of Run 1, four observations are less than 0.5 and two are 0.5 or higher. For run 2, it’s the other way around.) Run 3 has five observations below 0.5 and one above 0.5, so its chi-square statistic is (5−3) 2

2

) + (1−3  83 . Run 4 has three observations below 0.5 and three 0.5 and above, so it has a statistic of 0. 3 Thus 3 runs out of 4 have a chi-square statistic greater than or equal to the actual one of 2/3, making the p-value 0.75 . (D) Notice that we could calculate the p-value exactly (without simulation). The number of observations in [0,5) is binomial with m  6, q  0.5, so the (two-sided) likelihood of 2 or less (or 4 or more) in the lower  interval is the probability of not having exactly 3 in the interval, or 1 − 63 (0.56 )  11/16. 3

31.

[Section 34.1] Differentiate the loglikelihood function twice. L (θ) 

e−

l (θ)  − C/4 Study Manual—17th edition Copyright ©2014 ASM

P

θ P

x i /θ

20

xi − 20 ln θ θ

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 32–33

1487

x i 20 dl  2 − dθ θ θ P 2 d l 2 x i 20 − + 2 2 dθ θ3 θ " 2 # 2 (20θ ) 20 20 d l  − 2  2 I  −E dθ 2 θ3 θ θ

P

20 When θ  5, the information matrix I is 25  0.8 . (B) You may also use the fact that the MLE is the sample mean. The information matrix is the reciprocal of the asymptotic variance, which is equal to the variance of the sample mean, or θ 2 /20  25/20, so the information matrix is 20/25.

32. [Lesson 32] Let p be the probability of N claims in the range (1000, 5000) and 4N claims out of the range. To maximize the likelihood of N claims in this range and 4N claims, maximize L ( p )  p N (1 − p ) 4N As usual, maximum likelihood matches p with the observed proportion, or 0.2, but if you want to see the steps: l ( p )  N ln p + 4N ln (1 − p ) N 4N dl  − 0 dp p 1−p p  0.2

F (5000) − F (1000)  0.2 θ θ −  0.2 θ + 1000 θ + 5000 Multiplying through by ( θ + 1000)( θ + 5000) θ 2 + 5000θ − θ2 − 1000θ  0.2 ( θ 2 + 6000θ + 5,000,000) 0.2θ 2 − 2800θ + 1,000,000  0 θ  13,633.25 or 366.75

Since we are given θ < 5000, we use the solution 366.75. Then F (1000)  1 − 33.

[Lesson 45] f (5 | µ  5) 

√1 σ 2π

f (5 | µ  4) 



√1 2 2π

366.75  0.7317 1366.75

(E)

 0.1995, whereas

2 2 1 1 √ e − (4−5) /2σ  √ e −1/8  0.1760, σ 2π 2 2π

so the probability you are there given 5 is 0.95 (0.1995)  0.9556 0.95 (0.1995) + 0.05 (0.1760)

C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

PRACTICE EXAM 5, SOLUTIONS TO QUESTIONS 34–35

1488

34.

[Lesson 5] For a two-parameter Pareto, E[X ∧ θ] 

θ θ * 1− α−1 x+θ

! α−1

,!

x θ x+θ

+ -

when α  2. We’re given 10,000 3 20,000 θ  θ 10,000 + θ 4 20,000 + θ 1 6  1 + θ0 8 + 4θ0

!

!

where θ0  θ/10,000 and we divided numerator and denominator by 10,000 to make the numbers easier to handle. Then cross multiplying 6 + 6θ0  8 + 4θ0 2θ0  2 θ0  1 so θ  10,000 . (C) 35.

[Lesson 42] Equating the means 1000α  1200 α−1 1.2α − 1.2  α 0.2α  1.2 α6

For an exponential random variable X with mean θ, the coefficient of variation of severity is Var ( X ) θ 2  2 1 E[X]2 θ For a single-parameter Pareto, 1 + CV2s  

which is

αθ2 / ( α − 2)

( α − 1) 2  2  α ( α − 2) αθ/ ( α − 1)

25 52  in our case. So if 1600 claims are needed under the exponential, then n0  800 and (6)(4) 24

800 (25/24)  833 13 claims are needed under the Pareto. (B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 1–2

1489

Answer Key for Practice Exam 6 1 2 3 4 5 6 7 8 9 10

E A D A D E B C B C

11 12 13 14 15 16 17 18 19 20

D E B B B B B B E C

21 22 23 24 25 26 27 28 29 30

C C A B A C B E E E

31 32 33 34 35

C E B A E

Practice Exam 6 1. [Lesson 1] Looking up the tables for the inverse gamma distribution, we see that the mode is θ and the mean is α−1 , so

Dividing the second line into the first,

θ α+1

θ 4 α+1 θ 6 α−1 α−1 4  α+1 6 α5

θ  24

Then (γ1 is the coefficient of skewness). θ2 242   48 ( α − 1)( α − 2) 12 Var ( X )  48 − 62  12 E X2 

f

g

243 θ3   576 ( α − 1)( α − 2)( α − 3) 24 576 − 3 (48)(6) + 2 (63 ) γ1  121.5 √ 144  1.5  12  3.4641 (E) 12

E X3 

f

g

2. [Lesson 16] We will calculate payment per loss and savings per loss without the deductible. Then we do not have to modify the frequency distribution, since number of losses per year doesn’t change. Without the deductible, average loss size is E[X]  C/4 Study Manual—17th edition Copyright ©2014 ASM

30 (500) + 18 (1500) + 22 (3500) + 10 (12500)  3050 80

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 3–6

1490

The limited expected value at 500 is E[X ∧ 500] 

15 (250) + 65 (500)  453.125 80

The savings is 453.125/3050  0.1486 . (A) 3. [Subsection 25.1.1] We’ll write an expression for E[X ∧600] as the average of the interval averages, and solve for x. We split the (500, 1000] interval into two intervals, with (500, 600] having 5 losses and (600, 1000] having 20 losses. 22 (50) + x (300) + 5 (550) + 20 (600)  323 47 + x 15,850  15,181 + 23x 669 x  29 (D) 23

4.

[Lesson 30] The cumulative hazard function is calculated by integrating h ( x ) : x

Z H (x ) 

0

Z H (x ) 

θ dt  xθ

10

0

θ dt +

Z

0 ≤ x ≤ 10 x

10

2θ dt

10 ≤ x ≤ 15

 10θ + 2θ ( x − 10)  (2x + 10 − 20) θ  (2x − 10) θ 10

Z

H (x ) 

0

θ dt +

Z

15

10

2θ dt +

x

Z

15

3θ dt

x ≥ 15

 10θ + 5 (2θ ) + ( x − 15)(3θ )

 θ (10 + 10 + 3x − 45)  (3x − 25) θ Exponentiating, the survival function is

 e −xθ    S (x )   e (10−2x ) θ    e (25−3x ) θ 

0 ≤ x ≤ 10 10 ≤ x ≤ 15 x ≥ 15

f ( x )  h ( x ) S ( x ) , so multiply these S ( x ) ’s by h ( x ) to obtain the density function which is multiplied to obtain the likelihood. In setting up the likelihood, for h ( x ) , drop the constants 2 and 3. The likelihood of our sample is then L ( θ )  θ 6 e (−8−14−26−50−95−155) θ  θ 6 e −348θ You may recognize this form as a gamma distribution with parameters α  7 and θ  1/348 and mode ( α − 1) θ  6/348  0.01724 . Otherwise log and differentiate until you reach that answer. (A) 5.

[Section 8.1] As discussed in Example 8A, the answer is (D)

6.

[Section 28.1]

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 7–8

1491

Insuring Date of Birth

Start of Exposure for Age 40

End of Exposure for Age 40

Months of Exposure

2/1/1971 6/1/1971 5/1/1972 11/1/1972

1/1/2012 1/1/2012 5/1/2012 11/1/2012

2/1/2012 6/1/2012 12/31/2012 11/1/2013

1 5 8 12

Total actuarial exposure is 1 + 5 + 8 + 12  26 months. (E) 7.

[Lesson 60] Integrating and exponentiating h ( x ) , the distribution function is

  1 − e −0.05x F (x )    1 − e −1−0.15( x−20)

0 < x ≤ 20 x > 20



Inverting, u  1 − e −0.05x x  −20 ln (1 − u ) for x < 20, or u < 1 − e −1  0.63212. For u ≥ 0.63212,

u  1 − e −1−0.15 ( x−20)

ln (1 − u )  −1 − 0.15x + 3  2 − 0.15x 2 − ln (1 − u ) x 0.15 For our four uniform numbers, 0.82 → 24.765, 0.37 → 9.241, 0.15 → 3.250, and 0.76 → 22.847. The average of the four generated numbers is 15.03 . (B) 8.

[Section 2.2] Let’s carry out the transformation using the survival function, Pr ( Y > x ) . Pr ( Y > x )  Pr a e X/5000 − 1 > x

 





x +1 a    x X  Pr > ln +1 a  5000   x  Pr X > 5000 ln +1 a





 Pr e X/5000 >

and based on the exponential distribution, this probability is x * 5000 ln a + 1 +/ Pr ( Y > x )  exp .− 2000



,

C/4 Study Manual—17th edition Copyright ©2014 ASM



x+a a

! −5000/2000

a a+x

! 2.5





-

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 9–11

1492

which we recognize as a two-parameter Pareto with θ  a and α  2.5, having mean a/1.5. To set this equal to 2000, set a  3000 . (C) 9. [Lesson 50] The prior is inverse gamma with parameters 3 and 10000. The posterior parameters are 3 + 4  7 and 10000 + 5000 + 6000 + 9000 + 6000  36000. The posterior mode is 36000/8  4500 . (B) 10. [Section 24.1 and Subsection 25.1.3] The data for the first 4 days are uncensored. The data for the fifth day are censored. The risk sets are: yi

ri

si

7 8 9 10 11 12

14 13 10 8 3 1

1 2 2 4 2 1

We want the E[X | X ≥ 9]. Since Sˆ (9− )  Sˆ (8) , this is equivalent to calculating E[X | X > 8]. We only have to calculate Sˆ ( x | x > 8) , so we begin out product limit calculations at t  9: 4 8  Sˆ (9 | X > 8)  10! 5! 4 4 2 Sˆ (10 | X > 8)   5 8 5 2 Sˆ (11 | X > 8)  5 ˆ S (12 | X > 8)  0

!

1 2  3 15

!

To calculate the expected value, we integrate from 8 to 12 and add 8. The integral, since Sˆ ( x | X > 8) is constant between integers, is 1 + 4/5 + 2/5 + 2/15  2 31 , and the answer is 10 13 . (C)

11. [Lesson 45] E[X | 1]  0.4 (5) + 0.2 (10)  4 and E[X | 2]  0.4 (10) + 0.2 (20)  8, so the posterior probability of Type 1 is 4p + 8 (1 − p )  5.6 8 − 4p  5.6

p  0.6

This can be expressed in terms of the prior probability q of Type 1: p Solving for q,

0.2q 0.2q  0.2q + 0.4 (1 − q ) 0.4 − 0.2q 0.6 

0.2q 0.4 − 0.2q

0.24 − 0.12q  0.2q

q  0.75

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 12–14

1493

D ( X < 75,000) is the same as the variance of Pr D ( X ≥ 75,000)  12. [Lesson 26] The variance of Pr − ˆ ˆ S (75,000 ) , and that is the same as the variance of S (60,000) with the given data. yi

ri

si

Sˆ ( y i )

10,000 15,000 20,000 30,000 50,000 60,000

15 14 12 11 8 4

1 2 1 3 1 1

14/15 12/15 11/15 8/15 7/15 0.35

The Greenwood approximation of the variance is 0.352



1

(15)(14)

+

2

(14)(12)

+

1

(12)(11)

+

3 1 1 + +  0.019542 (11)(8) (8)(7) (4)(3)



(E)

13. [Section 57.2] The group means are x¯ 1  300, x¯ 2  90, and x¯ 3  560. The estimate of the expected process variance is 1202 + 502 + 702 + 302 + 302 + 2602 + 2602 vˆ   39,700 2+1+1 The overall mean is 3 (300) + 2 (90 + 560) means is



.

7  314.2857. The estimate of the variance of the hypothetical

  1 2 2 2 3 ( 300 − 314.2857 ) + 2 ( 90 − 314.2857 ) + 2 ( 560 − 314.2857 ) − 2 ( 39,700 )  31,187.5 7 − (32 + 22 + 22 ) /7

Z for Group 3 is

39,700  1.27295 31,187.5 2 Z  0.6111 2 + 1.27295 k

14.

(B)

[Section 19.1] The probabilities of 1 and 2 claims are p1M p2M



p1T (1



p0M )



2 2 1 (0.5 )

1 − 0.25  1 − 0.7 − 0.2  0.1

(1 − 0.7)  0.2

If there is 1 claim, the probability that it is greater than 300 is e −300/200  0.223130. If there are 2 claims, the sum of the claims is gamma with α  2 and θ  200. The probability that the sum is greater than 300 is the probability that a Poisson process with rate 1/200 will have fewer than 2 events by time 300, and that is the probability that a Poisson variable with λ  300/200 will be greater than or equal to 2: e −1.5 (1 + 1.5)  0.557825 The probability that aggregate claims exceed 300 is (0.2)(0.223130) + (0.1)(0.557825)  0.1004 . (B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 15–17

1494

ˆ  0.25cθ and ¯ The mean of x¯ is θ/ ( α − 1)  0.25θ, so E[θ] 15. [Subsection 21.1.3] Let θˆ  c x. 2 biasθˆ ( θ )  θ (0.25c − 1) . The second moment of X is θ /12, so θ2 θ2 θ2 −  12 16 48 θ2 θ2 Var ( x¯ )   13 (48) 624

Var ( X ) 

The Var ( θˆ )  c 2 θ 2 /624. We’ll minimize the bias squared plus the variance, which is θ 2 g ( c ) , where g ( c ) is defined on the next line. g ( c )  (0.25c − 1) 2 +

c2 624

g 0 ( c )  2 (0.25)(0.25c − 1) + 40c − 156  0 156  3.9 c 40

c 0 312

(B)

16. [Lessons 30 and 31] The smoothed 30th percentile is the 0.2 (6)  1.2 order statistic, or 0.2 (15) + 0.8 (10)  11. This is set equal to exp ( µ − 0.842σ ) . The empirical second moment is

P

x 2i

5

 2994.8

and this is set equal to exp (2µ + 2σ2 ) . Then µ  ln 11 + 0.842σ 2 (ln 11 + 0.842σ ) + 2σ2  ln 2994.8  8.004633 σ2 + 0.842σ + ln 11 − 0.5 (8.004633)  0 σ2 + 0.842σ − 1.60442  0 √ −0.842 + 7.12665 σˆ   0.91379 2 µˆ  ln 11 + 0.842 (0.91379)  3.16731

The estimated mean is exp ( µˆ + 0.5 σˆ 2 )  36.05 . (B) 17.

[Lesson 32] First calculate the MLE of a. L (a ) 

a4

Q

(100 − x i ) a−1

1004a X l ( a )  4 ln a + ( a − 1) ln (100 − x i ) − 4a ln 100

dl 4 X  + ln (100 − x i ) − 4 ln 100  0 da a X ln (100 − x i )  12.388889 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 18–19

a

1495

4  0.663153 4 ln 100 − 12.388889

To calculate the VaR, or the 90th percentile, we set the F ( x )  0.90. x

Z F ( x; a )  1−

100 − x 100

! 0.663153

f ( u ) du  −

0

100 − x 100

! a x !a  1 − 100 − x 100 0

 0.90

100 − x  0.11/0.663153  0.031049 100 x  100 − 3.1049  96.8951

(B)

18. [Lesson 15] Let S be aggregate losses for the entire block, S1 be aggregate losses for one small car, and S2 aggregate losses for one large car, so Var ( S )  100 Var ( S1 ) + 50 Var ( S2 ) . For small cars, E[S1 ]  0.2 (2000)  400 and for large cars E[S2 ]  0.3 (1000)  300, so E[S]  100 (400) + 50 (300)  55,000 We’ll use the compound variance formula. Var ( S1 | θ )  (0.2θ ) (2000/θ ) 2 + (2000/θ ) 2 





Var ( S1 )  E Var ( S1 | θ ) + Var E[S1 | θ]

f

g



1,600,000 θ



1,600,000 E + Var (400) θ

"

" E

1,600,000  θ

#

3

Z 1

#

0.5 (1,600,000) dθ  800,000 ln 3  878,890 θ

100 Var ( S1 )  100 (878,890) 50 Var ( S2 )  50 0.3 (10002 + 10002 )  30,000,000



p 1−Φ

19.

Var ( S ) 

p



100 (878,890) + 30,000,000  10,858

60,000 − 55,000  1 − Φ (0.46)  0.3228 10,858

!

(B)

[Lesson 32] The likelihood function, ignoring the constants k k , x k−1 , and Γ ( k ) , is e −125,000/θ e − kX k /θ L ( k )  1+2+3+4+5  θ θ 15 P

Logging and differentiating, −125,000 − 15 ln θ θ dl 125,000 15  − 0 dk θ θ2 125,000 θ  8333 13 15

l (k )  −

C/4 Study Manual—17th edition Copyright ©2014 ASM

(E)

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 20–24

1496

20.

[Lesson 46] The likelihood times the prior is e −120/θ θ5

which we recognize as the form of an inverse gamma with parameters α  4 and θ  120. (Not the same as our θ.) The mode is θ/ ( α + 1)  120/5  24 . (C) 21. [Lesson 36] The empirical distribution increases by 0.25 at 1, 0.5 at 3, and 0.25 at 4, so the probabilities of 1, 3, and 4 are 0.25, 0.5, and 0.25 respectively. The mean is 2.75 and the variance is P 0.25 ( x i − 2.75) 2  1.1875. The exponential distribution was fitted by maximum likelihood, so that its mean is 2.75. In fact, F (1)  1 − e −1/2.75  0.30486, which seems like the graph. The variance of the exponential is therefore 2.752  7.5625. The absolute difference in variances is 6.375 . (C) 22. [Section 36.2 and Lesson 37] The empirical distribution function at the four points, before adjustment for the p–p plot, is 0.25, 0.50, 0.75, 1.00. Comparing the fitted distribution function: i 1 2 3 4

F4 ( x −i ) 0 0.25 0.50 0.75

F4 ( x i )

F∗ (xi )

Largest difference

0.25 0.50 0.75 1.00

0.10 0.40 0.65 0.80

0.15 0.15 0.15 0.20

The largest difference is 0.20 . (C) 23.

[Section 40.1] For each distribution, we need to calculate the loglikelihood, which is 62 ln L (0) + 20 ln L (1) + 12 ln L (2) + 6 ln L (3)

where L ( k ) is the likelihood of k. The result of this formula is −111.875 for the negative binomial and −110.139 for the zero-modified negative binomial. The likelihood ratio statistic is 2 (−110.139 + 111.875)  3.472 . (A) 24. [Section 27.2] A uniform kernel-smoothed distribution is a mixture of uniform distributions at each observation. Let X be the empirical distribution random variable and Xˆ the kernel-smoothed distribution random variable. By conditional variance, conditioning on the observation x i , Var ( Xˆ )  E[Var ( Xˆ | x i ) ] + Var (E[Xˆ | x i ])

At each observation point, the variance of the uniform distribution is the range squared over 12, and the range is twice the bandwidth, so the variance is ( x i /k ) 2 /3. The expected value of the uniform distribution at each observation point is the point. Thus E[X 2 ] ( x i /k ) 2 + Var ( X )  + Var ( X ) Var ( Xˆ )  E 3 3k 2 √ The coefficient of variation is Var ( X ) / E[X], so E[X]2  Var ( X ) /v 2 . Thus

"

#

1

1 + v2 Var ( X ) + E[X]2 Var ( Xˆ )  + Var ( X )  Var ( X ) * + 1+ 2 3k 3k 2

, C/4 Study Manual—17th edition Copyright ©2014 ASM

-

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 25–27

1497

We want this to equal 1.5 Var ( X ) , so 1 + 1/v 2  0.5 3k 2

s k

2 (1 + 3

1 ) v2

(B)

25. [Lesson 64] The mode of the sample is 2. The mode of a bootstrap sample is 2 if three or more 2’s are drawn, leading to no error. Otherwise the mode of a bootstrap sample is 6, with an error of 4. What is the probability of drawing at least three 6’s? It is the binomial probability of three, four, or five 6’s: 5 5 5 (0.43 )(0.62 ) + (0.44 )(0.6) + (0.45 )  0.31744 3 4 5

!

!

!

The bootstrap approximation of the mean square error is 0.31744 (42 )  5.07904 . (A) 26.

[Lesson 11.2] Let c  Pr ( N > 0) . Then E[N]  c (1 + β )  4 Var ( N )  cβ (1 + β ) + c (1 − c )(1 + β ) 2 20  4β + 4 (1 − c )(1 + β )

 4β + 4 + 4β − 16  8β − 12

β4

4  0.8 1+β cβ 0.8 (4) Pr ( N > 1)    0.64 1+β 5 c

(C)

27. [Section 4.1] Let w be the weight of the component with mean 100. Then, comparing the first and second moments, 100w + (100 + a )(1 − w )  104

2

2 (100 ) w + 2 (100 + a ) 2 (1 − w )  10,944 + 1042  21,760

Let’s solve for a and w.

104 − 100w 1−w !2 104 − 100w 20,000w + 2 (1 − w )  21,760 1−w

(100 + a ) 

20,000w − 20,000w 2 + 21,632 − 41,600w + 20,000w 2  21,760 − 21,760w 160w − 128  0

w  0.8 104 − 100 (0.8) a − 100  20 1 − 0.8 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 28–29

1498

Use the Law of Total Probability on the two components of the mixture to calculate Pr ( X > 150) . Pr ( X > 150)  0.8e −150/100 + 0.2e −150/120  0.2358

(B)

28. [Lesson 48] Using the normal/normal conjugate prior, the posterior parameters θ0 and a 0 for the posterior are n x¯ 

X

x i  31,200

50,000 (31,200) + 10,000 (5000)  5193.548 6 (50,000) + 10,000 50,000 (10,000)  1612.903 a0  6 (50,000) + 10,000

θ0 

The predictive variance is the sum of the posterior variance and the process variance, or 10,000 + √ 1,612.903  11,612.903. The standard deviation is 11,612.903  107.76 . (E) 29.

[Lesson 34] The formula for αˆ is αˆ  −

n K

where, in the absence of truncation and censoring, K  n ln θ −

X

ln ( θ + x i )

In our case, K  5 ln 2 − ln (2.16)(2.50)(2.84)(3.10)(3.40)  −1.619645 5  3.087097 αˆ  1.619645 The asymptotic variance of the MLE, as listed in Table 34.1, is α 2 /n, which is estimated here by 3.0870972 /5  1.906033. For the delta method, the function is g ( α )  E[X 2 | α]  with θ  2. Differentiate g ( α ) using the product rule. 0

g ( α )  −8

2θ 2 ( α − 1)( α − 2)

1

( α − 1) 2 ( α − 2)

+

1

!

( α − 2) 2 ( α − 1)

Evaluate this at α  3.087097. g 0 (3.087097)  −8

1

(2.0870972 )(1.087097)

+

1

(1.0870972 )(2.087097)

The answer is Var ( αˆ ) g 0 ( αˆ ) 2  1.906033 (−4.932892) 2  46.38 . (E)

C/4 Study Manual—17th edition Copyright ©2014 ASM

!  −4.932892

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 30–32

1499

30. [Lesson 52] The expected hypothetical mean is µ  1.2β. The variance of the hypothetical means is a  (0.8)(0.2) β2  0.16β 2 . The expected process variance is 0.8 ( β )(1 + β ) + 0.2 (2β )(1 + 2β )  1.2β + 1.6β2 The credibility factor for one year is Z The credibility estimate is

0.16β 0.16β2 a   2 2 a + v 1.2β + 1.6β + 0.16β 1.2 + 1.76β 0.16β 1.44β + 1.92β2 1− (1.2β )  1.2 + 1.76β 1.2 + 1.76β

!

Equate this to 11.96 and solve for β. 1.44β + 1.92β 2  11.96 (1.2 + 1.76β )  14.352 + 21.0496β 1.92β2 − 19.6096β − 14.352  0 √ 19.6096 + 494.7598 β  10.90 3.84

(E)

31. [Lesson 35] Let p  β/ (1 + β ) . The likelihood of n claims is (1 − p ) p n . The likelihood of 1 or 2 claims is the sum of the likelihoods of 1 or 2 claims, or (1 − p )( p + p 2 )  (1 − p ) p (1 + p ) . The likelihood of the experience is therefore

Maximize this for p.

L ( p )  (1 − p ) 76 p 12+2 (2) +2 (1) (1 + p ) 2  (1 − p ) 76 p 18 (1 + p ) 2 l ( p )  76 ln (1 − p ) + 18 ln p + 2 ln (1 + p ) dl 76 18 2 − + + 0 dp 1−p p 1+p

(−76p − 76p 2 ) + (18 − 18p 2 ) + (2p − 2p 2 )  0 96p 2 + 74p − 18  0 √ −74 + 12,388  0.194278 p 192 p β  0.2411 (C) 1−p

[Sections 24.2 and 25.1.1 ] The empirical mean is 125 + 132 + 2 (135 + 147 + 160) /8  142.625. For



32.



Nelson-Åalen, we have

C/4 Study Manual—17th edition Copyright ©2014 ASM

yi

ri

si

Hˆ ( y i )

Sˆ ( y i )

125 132 135 147

8 7 6 4

1 1 2 2

0.12500 0.26786 0.60119 1.10119

0.88250 0.76502 0.54816 0.33248

PRACTICE EXAM 6, SOLUTIONS TO QUESTIONS 33–35

1500

ˆ We used Sˆ ( y i )  e −H ( yi ) in the above table. We integrate the survival function from 0 to 160:

ˆ ∧ 160]  125 + (132 − 125)(0.88250) + (135 − 132)(0.76502) E[X + (147 − 135)(0.54816) + (160 − 147)(0.33248)

 144.3726

The difference in the two estimates is 144.3726 − 142.625  1.7476 . (E) 33. [Lesson 63] Look up the inverse z i of the standard normal distribution for the five given uniform numbers. The random normal number is 0.9 + 0.2z i , where 0.2 is the square root of σ2 . Then calculate expenses as 1000R, calculate income as I  1000 + 300 − 1000R, and compare 5% of income to 15 (1.5% of premium) and 25 (2.5% of premium), taking the maximum of 15 and 0.05I but no more than 25. The following table shows the calculations: ui

zi

0.9 + 0.2z i

Income

5% of Income

Tax

0.11 0.52 0.38 0.62 0.67

−1.23 0.05 −0.31 0.31 0.44

0.654 0.910 0.838 0.962 0.988

646 390 462 338 312

32.3 19.5 23.1 16.9 15.6

25.0 19.5 23.1 16.9 15.6

The average is (25 + 19.5 + 23.1 + 16.9  15.6) /5  20.02 . (B) 34. [Lesson 9] Since frequency is constant, we just need to set the expected payment per loss equal between 2010 and 2012. In 2010, it is E[X ∧ 10000] − E[X ∧ 1000]  2000 ( e −1000/2000 − e −10000/2000 )  1199.5854

In 2012, θ  2000 (1.052 )  2205. Setting 2012’s expected payment per loss equal to 1199.5854, 2205 ( e −1000/2205 − e −u/2205 )  1199.5854 1199.5854  0.091361 e −u/2205  e −1000/2205 − 2205 u  −2205 ln 0.091361  5276.42 (A)

35.

[Lesson 43] We use the general formula for full credibility, equation (42.1) σ eF  n0 µ

!2

For a Poisson, σ 2  µ  λ, so n  e F  n0 /λ. For a negative binomial, σ2  rβ (1 + β ) and µ  rβ, so eF  n0

1+β rβ

Expressing the latter e F in terms of n, eF 

C/4 Study Manual—17th edition Copyright ©2014 ASM

nλ (1 + β ) rβ

(E)

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 1–2

1501

Answer Key for Practice Exam 7 1 2 3 4 5 6 7 8 9 10

D B A D A B C E C A

11 12 13 14 15 16 17 18 19 20

D C E B E E C E B C

21 22 23 24 25 26 27 28 29 30

C D C C D E E B C A

31 32 33 34 35

C C E B B

Practice Exam 7 1. [Lesson 42] By double expectation, the overall expected value is E[N]  E E[N | θ]  E[θ]  0.2. By the conditional variance formula, the overall variance is

f

g

Var ( N )  E[Var ( N | θ ) ] + Var (E[N | θ])  E[2θ 2 ] + Var ( θ )

 2 (Var ( θ ) + E[θ]2 ) + Var ( θ )  2 (0.6 + 0.22 ) + 0.6  1.88 By the general formula for full credibility, the number of exposures needed for full credibility is 1.96 n0 CV  0.1 2

!2

1.88  18,056 0.22

!

(D)

2. [Lesson 51] Let µ s and σs2 be the mean and variance of claim size. The hypothetical mean of aggregate losses is λµ s . Since µ s is constant, the variance of the hypothetical means is a  Var ( λ ) µ2s  0.2µ2s The process variance of aggregate losses, by the Poisson compound variance formula, is λ ( µ2s + σs2 ) . The expected process variance is v  E[λ] ( µ2s + σs2 )  0.1 ( µ2s + σs2 ) The Bühlmann k is k

0.1 ( µ2s + σs2 ) 0.2µ2s

 0.5 1 + CV2s  0.5 (1 + 32 )  5





We want n/ ( n + k )  0.9, or n/ ( n + 5)  0.9. Thus n  45 . (B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 3–6

1502

3.

[Lesson 5] In 2013, E[X ∧ 1500] 

θ 2000 θ 1−  2000 1 −  857.14 α−1 θ + 1500 3500









In 2014, inflation increases θ to 2000 (1.05)  2100. Then



E[X ∧ 1500]  2100 1 −

2100  875 3600



The increase is 875/857.14  1.020837, so the average bonus in 2014 is 9.80 (1.020837)  10.00 . (A) 4. [Lessons 36, 37, 38, and 39] (A) D ( x )  Fn ( x ) − F ∗ ( x ) , and if this is positive, it means F ∗ ( x ) is less than F ( x ) . ! (B) (C)

(D)

The p–p plot is ( Fn ( x ) , F ∗ ( x )) , and if this is below the 45° line, it means F ∗ ( x ) < Fn ( x ) . ! √ Doubling the number of observations increases n, making the Kolmogorov-Smirnov critical value lower. In Anderson-Darling, the integral expressing the difference between observed and fitted is multiplied by the number of observations, making the statistic larger if there are more observations. The chi-square test squares the difference between observed and fitted and divides by fitted, so doubling the number of observations multiplies the numerators by 4, the denominators by 2, and the resulting statistic by 2. ! This statement should be corrected to say that the expected number of observations in each group is equal. #

(D) 5. [Lesson 1.4] The k th derivative of the mgf at 0 is the k th moment, and the k th derivative of the pgf at 1 is the k th factorial moment, so E[X 2 ]  40 E[X ( X − 1) ]  35

E[X 2 ] − E[X]  35 E[X]  5

Var ( X )  E[X 2 ] − E[X]2  40 − 52  15

(A)

6. [Lesson 8] X and Y only differ by their parameters σ2 . Both VaR and TVaR are standard deviation premium principles, meaning that they are µ + cσ for an appropriate c. From the two values of VaRp , we get ! 72 − 40 σY  σ X  1.6σX 60 − 40 and then

TVaRp ( X )  40 + cσX  65 65 − 40 σX  80 TVaRp ( Y )  40 + 1.6cσX  40 + 1.6 σX

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 7–8

1503

7. [Lesson 13] To modify the zero-modified negative binomial, multiply β by 0.8 to obtain β∗  1.6. Use formula (13.1): p0∗

1  1 + 1.6

1 p0  1+2

! 0.5

! 0.5

1 − p0M∗  (1 − p0M )

! 0.5

 0.620174

! 0.5

1   0.577350 3 ! 1 − p0∗

1 − p0

1 − 0.620174  0.359471  (1 − 0.6) 1 − 0.577350

The probability of 1 is p 1T∗  p1M∗

8.

1  2.6

!

(0.5)(1.6)

 0.502395 2.61.5 − 2.6  (0.502395)(0.359471)  0.3612

(C)

[Lesson 19] For the exact 90th percentile, we want x such that Pr ( X > x )  0.1, and Pr ( X > x )  Pr ( X > x | X > 0) Pr ( X > 0)  0.4 Pr ( X > x | X > 0)

so we want Pr ( X > x | X > 0)  tables have

0.1 0.4

 0.25. That means Pr (0 < X ≤ x | X > 0)  0.75. For a Pareto, the

VaRp ( X )  θ (1 − p ) −1/α − 1





VaR0.75 ( X )  8000 (0.25−1/3 − 1)  4699.2

so π0.9  4699.2. For the normal approximation, the aggregate mean is

8000  1600 E[N] E[X]  0.4 3−1

!

and the aggregate variance is

8000 E[N] Var ( X ) + Var ( N ) E[X]  0.4 3−1 2

!2

3 8000 + (0.4)(0.6) 1 3−1

!

where we used a shortcut formula for the variance of a Pareto:

α Var ( X )  E[X] α−2 2

!

The approximate 90th percentile is hatπ0.9  1600 + 1.282 23,040,000  7753.6

p

The difference is 7753.6 − 4699.2  3054.4 . (E) C/4 Study Manual—17th edition Copyright ©2014 ASM

!2

 23,040,000

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 9–12

1504

9.

[Section 28.2] d  0.04301 e 40  0.04301 e e  930

Exact exposure differs from actuarial exposure only in that it removes one-half of the deaths, so that exact exposure is 20 less than actuarial exposure, or 910. 1 − e −d/e  1 − e −40/910  0.04300 10.

(C)

[Lesson 32]



ln f ( x )  ln (ln x ) − 2 ln θ − 1 +



l ( θ ) ∼ −2n ln θ − 1 +

1 ln x θ



 1 X ln x i θ

dl 2n ln x i − + 0 dθ P θ θ2 ln x i θˆ   4.4040 2n

P

(A)

11. [Lesson 45] Let p be the proportion of each use that is rural. Based on the relationship of total expected claims to rural and urban expected claims, 1.8  p + 2 (1 − p )

so p  0.2. That is the prior. The likelihood of 2 is the normal density function, and for the 4 categories, that is

Rural

2

) /2 (0.5) e − (2−1 √  0.146763 2π 2 /2 (1) − ( 2−2 ) e √  0.398942 2π

Urban We weight the expected values with prior times likelihood to obtain the final result: 0.2 (0.146763)(1) + 0.8 (0.398942)(2)  1.9158 0.2 (0.146763) + 0.8 (0.398942)

(D)

12. [Lesson 57] Usually we estimate the expected process variance by averaging sample variances of many people or groups, but we are only given the sample variance for one bowler, so that’s all we can use here. So we estimate vˆ  5000. ˆ so we estimate aˆ  15,000 − 5,000  10,000. The total variance for a game is 15,000, and that is aˆ + v, The Bühlmann calculation with n  10 games is then 5,000  0.5 kˆ  10,000 10 20 Zˆ   10 + 0.5 21 1 20 PC  (101) + (70)  99.52 21 21

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 13–16

13.

1505

[Section 34.2] The tables provide the following VaR: f ( α, θ )  VaR0.75 ( X )  θ (0.25−1/α − 1)

Differentiating with respect to θ and α: ∂f θ (0.25−1/α ln 0.25)  ∂α α2 −1/0.8 5.4 (0.25 ) ln 0.25  −66.1674  0.82 ∂f  0.25−1/α − 1 ∂θ  0.25−1/0.8 − 1  4.6569 By the delta method, the variance is 0.4 (66.16742 ) + 65.2 (4.65692 ) + 2 (−4.3)(−66.1674)(4.6569)  5815

14.

(E)

[Lesson 39] The overall proportion is 750 + 745 + 730 + 775  0.0625 11,000 + 11,600 + 12,200 + 13,200

Let n i be number of college graduates. Under the null hypothesis, expected is 0.0625n i and variance, since the distribution is binomial (either the graduate becomes an actuary or not), is (0.0625)(0.9375) n i . The chi-square statistic is Q

(750 − 687.5) 2 (745 − 725) 2 (730 − 762.5) 2 (775 − 825) 2 + + +  11.359 687.5 (0.9375) 725 (0.9375) 762.5 (0.9375) 825 (0.9375)

Since the mean is estimated, we lose one degree of freedom. There are three degrees of freedom, and the statistic is between 11.345 and 12.838, the 99th and 99.5th percentiles of chi-square at three degrees of freedom, so the answer is (B) 15.

[Lesson 30] The sample mean is 10,000. We need to match E[X]  E[e Y ], and E[e Y ]  MY (1) . MY (1)  (1 − θ ) −2  10,000 θˆ  1 −

16.

r

1  0.99 10,000

(E)

[Lesson 5] First of all, Var ( X ∧ 50)  E[ ( X ∧ 50) 2 ] − E[X ∧ 50]2

Using what we’re given, E[X ∧ 50]  C/4 Study Manual—17th edition Copyright ©2014 ASM

50

Z 0

S ( x ) dx  27

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 17–19

1506

To calculate E[ ( X ∧ 50) 2 ], let’s use the definition of E[ ( X ∧ 50) k ]: the integral of min ( x k , 50k ) times f ( x ) , or Z 50

E[ ( X ∧ 50) k ] 

0

x k f ( x ) dx + 50k Pr ( X > 50)

Then Pr ( X > 50)  1 −

50

Z 0

f ( x ) dx  1 − 0.9  0.1

E[ ( X ∧ 50) 2 ]  830 + 502 (0.1)  1080 Var ( X ∧ 50)  1080 − 272  351

(E)

[Lesson 32] The likelihood for losses on Coverage A is e −x/θ /θ for the uncensored losses and for the censored loss. For coverage B, the θ parameter is 3 times the parameter for coverage A. We will use the letter θ for the mean of the exponential. The likelihood for losses on Coverage B is a constant times θ 4 / (3θ + x ) 5 for the loss. The likelihood function is 17.

e −100,000/θ

e −300,000/θ θ4 L (θ)  (3θ + 50,000) 5 θ4 300,000 − 5 ln (3θ + 50,000) l (θ)  − θ dl 300,000 15  − 0 2 dθ 3θ + 50,000 θ 15θ 2 − 900,000θ − 15 × 109  0

!

!

θˆ  73,589

(C)

18. [Section 40.1] For the Poisson, the maximum likelihood estimate of λ is the sample mean, which is 0.4. The loglikelihood is then l ( λˆ )  174 ln p0 + 54 ln p1 + 20 ln p2 + 2 ln p3 ˆ

ˆ

ˆ

 174 ln e −λ + 54 ln e −λ λˆ + 20 ln e −λ λˆ 2 /2 + 2 ln e −λ λˆ 3 /6













 −250 λˆ + 54 + 2 (20) + 3 (2) ln λˆ − 20 ln 2 − 2 ln 6





 −250 (0.4) + 100 ln 0.4 − 20 ln 2 − 2 ln 6  −209.076 The likelihood ratio test statistic is 2 −207.176 − (−209.076)  3.800. There is one degree of freedom for the extra parameter p0M . At one degree of freedom, 3.841 > 3.800, so the zero-modified Poisson is rejected at 5% significance. (E)





19. [Lesson 63] Let’s generate the 4 drug prices. Linearly interpolating between 250 and 1000, 0.46 → 250 +

0.46 − 0.3 (1000 − 250)  650 0.6 − 0.3

Linearly interpolating between 2500 and 5000, 0.84 → 2500 + C/4 Study Manual—17th edition Copyright ©2014 ASM

0.84 − 0.8 (5000 − 2500)  3000 1 − 0.8

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 20–21

1507

It doesn’t matter what 0.25 goes to, since nothing is paid for prescriptions under 250. 0.7 − 0.6 (2500 − 1000)  1750 0.8 − 0.6 The payments are 0.8 (650 − 250)  320 for the first prescription, 0.8 (1000 − 250) + (3000 − 2000)  1600 for the second, 0 for the third, and 0.8 (1000 − 250)  600 for the fourth. The average is 0.7 → 1000 +

x¯ 

320 + 1600 + 0 + 600  630 4

(B)

20. [Lesson 45] Since the agents are selected randomly, the prior, which gives a probability of 1/2 to each, can be ignored. The 0.3 probability that an interview results in a sale is a “severity modification”. The number of sales per week follows a negative binomial distribution with the β from the interview distribution multiplied by 0.3. So for Sol, the number of sales follows a negative binomial with r  2 and β  0.6. The probability of exactly 2 sales in a week is r+2−1 2

!

β2 3  r+2 2 (1 + β )

!

!

0.62  0.164795 1.64

!

Expected number of sales in a week is 2 (0.6)  1.2 For Tina, the number of sales follows a binomial distribution with r  2 and β  0.9. The probability of exactly 2 sales in a week is ! ! 3 0.92  0.186463 2 1.94 Expected number of sales in a week is 2 (0.9)  1.8. The predictive probability of selling one policy in the next week is 0.164795 (1.2) + 0.186463 (1.8)  1.5185 0.164795 + 0.186463

21.

(C)

[Lesson 38] The formula for the Anderson-Darling statistic is A2  −nF ∗ ( u ) + n +n

k  X j1

k  X

Sn ( y j )

j0

Fn ( y j )

2 

2 

ln S∗ ( y j ) − ln S ∗ ( y j+1 )

ln F ∗ ( y j+1 ) − ln F ∗ ( y j )





(38.1)

Since there is no truncation, y0  0 and u  y j+1  ∞. Since y j+1  ∞, the first sum is only up to k − 1. Since k  2, the first sum has 2 terms and the second sum has 2 terms. Also n  2. Therefore, F ∗ ( u )  1, and the first summand is −2. Note that S2 ( y0 )  S2 (0)  1, S2 ( y1 )  S2 (3)  0.5, and S2 ( y2 )  S2 (7)  0. In each case, F2 ( x )  1 − S2 ( x ) . The first sum in the formula is

      (1) 2 ln 1 − ln S (3) + (1/2) 2 ln S (3) − ln S (7)  − ln S (3) + (1/4) ln S (3) − ln S (7)

The second sum in the formula is

      (1/2) 2 ln F (7) − ln F (3) + ln 1 − ln F (7)  (1/4) ln F (7) − ln F (3) − ln F (7)

Adding −2 to twice the sum of these two sums, we get (C). C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 22–26

1508

22.

[Lesson 62] The variance of Fˆ (100) is F (100) 1 − F (100)





n

This is maximized when F (100)  0.5, so that is the worst possible case. The variance is then 0.25/n. We want the half-width of the 95% confidence interval to be 0.005, so 1.96 0.25/n  0.005

p

p

0.25/n  0.002551 0.25  6.5077 × 10−6 n 0.25  38,416 n 6.5077 × 10−6

(D)

23. [Section 36.1 and Lesson 37] D ( x )  Fn ( x ) − F ∗ ( x ) , and the Kolmogorov-Smirnov D is the largest absolute value of D ( x ) . In the graph, the largest deviation from 0 occurs around 825, and is approximately 0.14 . (C) 24.

[Lesson 26]

L Sˆ ( y8 ) ! z0.975 Var q

U  exp





Sˆ ( y8 ) ln Sˆ ( y8 ) √ ! 1.96 0.0001  exp  0.7620 0.925 ln 0.925

Sˆ ( y8 ) 1/U  0.9251/0.7620  0.903

25.

(C)

[Lesson 32] θ ln f ( x )  ln τ + τ ln θ − τ ln x − x l ( τ, θ )  n ln τ + nτ ln θ − τ

X



− ln x

ln x i −

X θ !τ xi

X X ( θ/x ) τ ∂l n i  + n ln θ − ln x i − ln ( θ/x i ) ∂τ τ



X

ln x i

(D)

26. [Lesson 27] The mode of a gamma is θ ( α − 1) , so θ  x i / ( α − 1) . Let X be the unsmoothed distribution and Y the kernel-smoothed distribution. Use the conditional variance formula to calculate the variance of Y. The mean of a gamma distribution is the product of the parameters, which for our kernel is ! ! xi α α  xi α−1 α−1 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 7, SOLUTION TO QUESTION 27

1509

and the variance is the product of the α parameter and the square of the other parameter, which for our kernel is !2 ! xi α α  x2 α−1 ( α − 1) 2 i Var ( Y )  Var (E[Y | x i ]) + E[Var ( Y | x i ) ]

! !

α xi + E α−1

 Var

α  α−1 5  4

!2

!2

"

!

α x2 ( α − 1) 2 i

#

!

α E[X 2 ] Var ( X ) + ( α − 1) 2

Var ( X ) +

5 E[X 2 ] 16

where the moments of X are based on the empirical distribution. Based on our summary statistics, E[X 2 ] 

13,674  1367.4 10

Var ( X )  1367.4 − Var ( Y ) 

27.

250 10

!2

 742.4

!2

5 5 (742.4) + (1367.4)  1587.31 4 16

(E)

[Lesson 53] For claim counts, the hypothetical mean and process variance are θ. v  E[θ]  0.2 a  Var ( θ ) 

0.42 12

0.2  15 0.42 /12 n Z  0.8 n + 15 n  60 k

For aggregate losses, the hypothetical mean is θ (1000θ )  1000θ 2 and the process variance, by the Poisson compound variance formula, is v ( θ )  θ E[X 2 ]  θ 2 · (1000θ ) 2  2,000,000θ3





The moments of a uniform distribution on [0, b] are E[X k ] 

b

Z 0

b

bk xk x k+1  dx  b b ( k + 1) 0 k + 1

The expected process variance and variance of hypothetical means are v  2,000,000 E[θ 3 ]  2,000,000 C/4 Study Manual—17th edition Copyright ©2014 ASM

0.43  32,000 4

!

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 28–30

1510

0.44 0.44 a  1,000,000 (E[θ ] − E[θ ] )  1,000,000  2275 59 − 5 9 32,000 k  14.0625 2275 59 60 Z  0.8101 (E) 60 + 14.0625 4

28.

!

2 2

[Lesson 5] Let X be loss size. By the double expectation formula, E[X]  E[X | X ≤ 1000] Pr ( X ≤ 1000) + E[X | X > 1000] Pr ( X > 1000) 1380  500x + 2700 (1 − x )

where x  Pr ( X ≤ 1000) . Then 2700 − 2200x  1380 2200x  1320 x  0.6 Using double expectation again, E[ ( X − 1000)+ ]  E[ ( X − 1000)+ | X ≤ 1000]x + E[ ( X − 1000)+ | X > 1000] (1 − x )  (0)(0.6) + (2700 − 1000)(1 − 0.6)  680

(B)

29. [Section 24.2] The 25th percentile corresponds to the time t for which S ( t )  0.75, or H ( t )  − ln 0.75  0.2877. Summing up 1/r i starting with r i  9, we have 1 1 +  0.2361 9 8 1 0.2361 +  0.3790 7

So for any time up but not including to the third observation, the estimated probability of survival to that time is greater than 0.75, and for any time starting with the third observation, the estimated probability of survival is less than 0.75. That makes the third observation 15 the highest (and only) 25th percentile. (C) 30. [Subsection 21.1.3] Let X be the uniform random variable. The distribution function of the maximum random variable Y5 is FY5 ( x )  Pr ( X1 ≤ x, X2 ≤ x, . . . , X5 ≤ x )  This is a beta distribution with a  5, b  1, and θ. Its mean is aθ 5 5  θ θ a+b 5+1 6

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

x θ

!5

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 31–32 and its second moment is

1511

a ( a + 1) θ 2 (5)(6) 5 2  θ2  θ ( a + 1)( a + b + 1) (6)(7) 7

Let ψ be our estimator, and µ  θ/2 the distribution mean. Then biasψ ( µ ) 

θ θ 5 θ− − 12 2 12

Var ( ψ )  E[ ( Y5 /2) 2 ] − E[Y5 /2]2  The mean square error is MSEψ ( µ ) 

31.

1 5 2 1 5θ θ − 4 7 4 6





θ2 5θ 2 +  0.0119θ 2 2 1008 12

!2 

5θ 2 1008

(A)

[Section 24.1] The risk sets and events are yi

si

ri

500 5,000 8,000 12,000 15,000

1 1 1 1 1

3 4 4 3 2

The risk sets were computed using the usual rules for ties. For 500, the risk set excludes the 3 policies with deductibles. For 5000, the risk set excludes the policy with deductible 5000 as well as the loss of 500. Using the product limit estimator at the maximum covered loss, 2 Sˆ (20,000)  3 Extrapolating to 25,000,

!

3 4

!

3 4

!

2 3

!

1  0.125 2

!

Sˆ (25,000)  0.12525,000/20,000  0.074325

(C)

32. [Section 61.3] d  ln (1 + β )  ln 3  1.0986 and c  1.5d  1.6479. The mean time of the first number is 1/1.6479, so ln (1 − 0.73) s0  −  0.794538 1.6479 Continuing, λ 1  1.6479 + 1.0986  2.7465 λ 2  2.7465 + 1.0986  3.8451

ln (1 − 0.09)  0.034338 2.7465 ln (1 − 0.50) s2  −  0.180266 3.8451

s1  −

t  0.794538 + 0.034338  0.828876 t  0.828876 + 0.180266  1.009141

Three numbers were required to get above 1, so exactly 2 events occurred before time 1 and the generated number is 2 . (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 7, SOLUTIONS TO QUESTIONS 33–34

1512

33. [Lesson 46] The likelihood of no deaths is (1 − q ) 2 . The prior is a normal density function; let’s call it X. The posterior probability will be the integral of q times the likelihood times the prior, divide by the integral of the likelihood times the prior. In other words, it is E[Q (1 − Q ) 2 ] E[Q − 2Q 2 + Q 3 ]  E[ (1 − Q ) 2 ] E[1 − 2Q + Q 2 ]

We are given the mean and variance of the normal distribution, so E[Q]  µ  0.005

E[Q 2 ]  µ2 + σ2  0.000025 + 0.000001  0.000026 To obtain E[Q 3 ], we’ll use the fact that a normal distribution is symmetric, so that its third central moment is 0. E[ ( Q − µ ) 3 ]  0

E[Q 3 ] − 3 E[Q 2 ]µ + 2µ3  0

E[Q 3 ]  3 E[Q 2 ]µ − 2µ3  3 (0.000026)(0.005) − 2 (0.0053 )  0.00000014 The posterior expected value is 0.005 − 2 (0.000026) + 0.00000014  0.004998 1 − 2 (0.005) + 0.000026

(E)

34. [Section 56.1] We perform a regression. The independent variable X is the mean of the two observations, 1, 3, or 5, and the dependent variable Y is the Bayesian estimate, 2.0, 3.2, or 3.6. The weights are the probabilities; since 1 and 5 are equally likely, the probabilities of 1,1 and 5,5 are 1/4 apiece and the probability of 1,5 in either order is 1/2. Let w i be the weights. Since E[X] P  3, we subtract 3 from each X i : x i  X i − 3. There is no need to adjust Yi if we only need to calculate w i x i y i , since

X and

P

w i x i Yi 

X

w i x i ( y i + Y¯ ) 

X

w i x i y i + Y¯

X

wi xi

w i x i  0.

X

w i x i y i  0.25 (−2)(2.0) + 0.25 (2)(3.6)  0.8

X

w i x 2i  0.25 (4) + 0.25 (4)  2 0.8 βˆ   0.4 2

ˆ it follows that and since Z  β, 2  0.4 2+k 2  0.8 + 0.4k 2 − 0.8 k  3 0.4

C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

PRACTICE EXAM 7, SOLUTION TO QUESTION 35

1513

35. [Lesson 25] There are a total of 28 + 25 + 15 + 6  74 claims. The subinterval (25,40] of (25,50] is 3/5 of the length of the total interval, so of the 25 claims in the (25,50] interval, 3/5 of them, or 15, are below 40, based on the uniform distribution. That means a total of 10 + 15 + 6  31 claims are above 40. In the first two intervals, we integrate the density function, the histogram, times x 3 . E[ ( X ∧ 40) 3 ]  

C/4 Study Manual—17th edition Copyright ©2014 ASM

25

Z 0

28 x 3 dx + (74)(25)

28 (74)(25)

254 4

!

+

Z

40 25

25 31 x 3 dx + (403 ) (74)(25) 74

25 404 − 254 31 + (403 )  35,618 (74)(25) 4 74

!

(B)

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 1–2

1514

Answer Key for Practice Exam 8 1 2 3 4 5 6 7 8 9 10

A E D D C E D E B B

11 12 13 14 15 16 17 18 19 20

D B A E B A A E C D

21 22 23 24 25 26 27 28 29 30

E C B D D A B A E E

31 32 33 34 35

C C D B A

Practice Exam 8 1.

[Subsection 25.1.1] We calculate the first two moments. 6 + 8 + 9 + 3 (10)  8.8333 6 f g 62 + 82 + 92 + 3 (102 ) E (T ∧ 10) 2   80.1667 6 (A) Var (T ∧ 10)  80.1667 − 8.83332  2.139 E[T ∧ 10] 

2.

[Lesson 6] For a mixture, E[X ∧ 500] and E[X] are weighted averages of the (limited) expected

values of the components, and so is E[X] − E[X ∧ 500]  the θ  1000 component:

θ  θ  α−1 α−1 θ+500

for each Pareto component. For

1000  666.67 E[X] − E[X ∧ 500]  1000 1500

!

For the θ  2000 component: E[X] − E[X ∧ 500]  2000

2000  1600.00 2500

!

A franchise deductible pays an additional 500 per claim, or 500 1 − F (500) per loss.



1000 1 − F (500)  0.5 1500 so the average additional amount paid is

500 (122) 225

!2

2000 + 0.5 2500



!2 

 271.11. The total average payment per loss is

0.5 (666.67 + 1600) + 271.11  1404.44

C/4 Study Manual—17th edition Copyright ©2014 ASM

122 225

(E)

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 3–5

3.

1515

[Lesson 53] The hypothetical mean is the mean of the gamma distribution, AΘ. We calculate a. E[AΘ]  (3)(10)  30 E ( AΘ) 2  E A2 E Θ2

f

g

f

5

Z 

g

f

x 2 dx

!

4

1

since they’re independent since they’re independent

g

(102 + 2)

124 (102)  1054 12

!



a  1054 − 302  154 The process variance, or the variance of the gamma distribution, is AΘ2 . We calculate the expected process variance v. E AΘ2  E[A] E Θ2

f

g

f

g

since they’re independent

 3 (102)  306 a 154 Z   0.3348 a + v 154 + 306

(D)

4. [Section 8.3 and Lesson 15] We’ll calculate the moments of the aggregate distribution, using compound variance for the variance. E[N]  0.15 (1) + 0.05 (2)  0.25 E[N 2 ]  0.15 (1) + 0.05 (4)  0.35 Var ( N )  0.35 − 0.252  0.2875 E[X]  (4)(100)  400

Var ( X )  (4)(1002 )  40,000 E[S]  (0.25)(400)  100 Var ( S )  (0.25)(40,000) + (0.2875)(4002 )  56,000 By formula (8.7), 2

e −1.645 /2  0.103111 √ 2π ! p 0.103111  588.01 TVaR0.95 ( S )  100 + 56,000 0.05 φ (1.645) 

(D)

5. [Lesson 55] For the combination of a Poisson model and gamma prior, the Bühlmann k is 1/θ. Since αθ  21 and αθ2  18 , θ  14 and k  4. So n  0.75 n+4 n  12

C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 6–10

1516

6. [Lesson 40] We first compare the exponential and the inverse Gaussian models. 2 (123.2 − 121.4)  3.6 < 3.84, so we accept the exponential model. Next we compare the exponential and the transformed gamma models. 2 (123.2 − 120.4)  5.6 < 5.99, so we accept the exponential model. In order to prefer the transformed beta, we will need 2 (123.2 − x ) > 7.82, the critical value of the chi-square distribution with 3 degrees of freedom at 5%. Then x  123.2 − 3.91  119.3 . (E) 7.

[Lesson 21] limn→∞

n−1 n

 1, so I is asymptotically unbiased. In II,

!

θ  θ, α−1

ˆ  3 E[x] ¯ 3 E[ θ]

so it is unbiased. Similarly, III is unbiased, even though the variance of the third estimator is infinite. (D) 8. [Lesson 10] The bonus in millions for X expressed in millions is 0.01 ( X ∧ 11 − X ∧ 10) + 0.005 ( X ∧ 12 − X ∧ 11) . The tables for the single parameter Pareto don’t work for E[X ∧ k] for α  1, so we’ll calculate it by integrating the survival function from 0 to k, assuming k ≥ θ. k

Z E[X ∧ k] 

0 θ

Z 

0



0

S ( x ) dx +

k

Z

S ( x ) dx

θ θ

Z

S ( x ) dx

1 dx +

k

Z θ

θ dx x

! k + * /  θ + θ (ln k − ln θ )  θ .1 + ln θ , We substitute θ  9 (in millions). E[bonus]  0.01 (9) (1 + ln 11/9) − (1 + ln 10/9) + 0.005 (9) (1 + ln 12/9) − (1 + ln 11/9)









 0.09 (0.09531) + 0.045 (0.08701)  0.012493

Multiplying by a million, the answer is 12,493 . (E) 9. [Lesson 28] With the Kaplan-Meier product-limit estimate, each of the first five listed lives are in the risk set at time 46.3 There is one death at time 46.3. There are no other deaths at age 46. Therefore qˆ 46  1/5  0.2. With the interval-based estimate, the population at the beginning of age 46 is the first four lives. Two additional lives enter over during age 46 and one withdraws. With the uniform entry and withdrawal 1 assumption, that makes the risk set 4 + 0.5 (2 − 1)  4.5. There is one death, so q˜46  4.5  0.222222. The difference is |0.2 − 0.222222|  0.022222 . (B) 10. [Lessons 17 and 18] We need the aggregate probabilities of 0, 1, and 2, g0 , g1 , and g2 . We could ( m+1) q q use the recursive formula. For the binomial, a  − 1−q  −0.25 and b  1−q  1. g0  P ( f0 )  P (0.2)  1 + q (−0.8)



C/4 Study Manual—17th edition Copyright ©2014 ASM

m

 0.843  0.592704

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 11–12

1517

1 (−0.25 + 1)(0.5)(0.592704)  0.21168 1 + 0.25 (0.2)  1  g2  (−0.25 + 0.5)(0.5)(0.21168) + (−0.25 + 1)(0.2)(0.592704)  0.109872 1.05

g1 

You could also get these values by modifying the binomial distribution to eliminate claim sizes of 0. Then 0.5  0.625 and f2  0.2 for the modified distribution, q  0.2 (0.8)  0.16, f1  0.8 0.8  0.25, and g0  0.843  0.592704 g1  3 (0.842 )(0.16) (0.625)  0.21168





g2  3 (0.842 )(0.16) (0.25) + 3 (0.84)(0.162 ) (0.6252 )  0.109872









Either way, we then calculate the survival function: S (0)  1 − 0.592704  0.407296

S (1)  0.407296 − 0.21168  0.195616

and the answer is

11. or

S (2)  0.195616 − 0.109872  0.085744

0.407296 + 0.195616 + 0.4 (0.085744)  0.6372096 am ∗ 1+am ∗ ,

[Lesson 54] The credibility factor is

(B)

where m ∗ is the sum of the reciprocals of the variances,

1 1 1 19 + +  5 4 2 20 19/20 19 Z  1 + 19/20 39

m∗ 

(D)

12. [Section 35.4] Maximum likelihood sets the fitted mean equal to  . the sample mean for ( a, b, 1) distributions. The sample mean is x¯  11 + 12 (2) + 12 (3) + 16 (4) + 9 (5) 60  3. Therefore

(1/2) β 1 − (1 + β ) −1/2

3

0.5β − 3  −3 (1 + β ) −1/2 9 0.25β 2 − 3β + 9  1+β 0.25β2 (1 + β ) − 3β (1 + β ) + 9 (1 + β )  9

The 9’s on the two sides of the equation cancel, and we can divide by β since β  0 is a spurious solution. 0.25β + 0.25β 2 − 3 − 3β + 9  0 β2 − 11β + 24  0

( β − 3)( β − 8)  0 β  3, 8

However, β  8 is a spurious solution introduced by squaring the original equation; if you plug it into the second displayed line, you will see one side is positive and the other side is negative. Therefore, the solution is βˆ  3 . (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 13–17

1518

13. [Section 56.3] The correlation is equal to the covariance divided by the variance. The covariance is the variance of the hypothetical means and the variance is the sum of the expected process variance and the variance of the hypothetical means, so the answer is 2/ (2 + 6)  1/4 . (A) 14.

[Lesson 46] The posterior is proportional to θ 4 e −θ/1000 θ 3 e −θ ( 1000 + 2000 + 5000 )  θ 7 e −0.0027θ 1

1

1

This is the integrand for a gamma distribution with α  8 and scale parameter 8 0.0027  2962.96 . (E)

1 0.0027 ,

so its mean is

15. [Lesson 64] For the two samples {100, 500} and {500, 100}, the estimate is perfect. For {100, 100}, the LER is 1 and for {500, 500}, the LER is 0.2. The bootstrap approximation is 1* 1 . 0.2 − 4 3



2



+ 1−

1 3

2

, 16.

+/  -

2 2 15

+ 4

2 2 3



26  0.115556 225

(B)

[Lesson 32] The likelihood of each claim x i is

Therefore:

f (xi ) 3θ 3  1 − F (250) (θ + xi )4

!

θ + 250 θ

!3 

3 ( θ + 250) 3 (θ + xi )4

9 ( θ + 250) 6 ( θ + 400) 4 ( θ + 1000) 4 l ( θ )  ln 9 + 6 ln ( θ + 250) − 4 ln ( θ + 400) − 4 ln ( θ + 1000) dl 6 4 4  − − dθ θ + 250 θ + 400 θ + 1000 2 2 3 − − 0 θ + 250 θ + 400 θ + 1000 (3θ2 + 4200θ + 1,200,000) + (−2θ2 − 2500θ − 500,000) + (−2θ2 − 1300θ − 200,000)  0 L (θ) 

θ 2 − 400θ − 500,000  0 √ 400 + 2,160,000 400 + 1469.70   934.85 θˆ  2 2

The estimated mean of the distribution is

934.85 α−1



934.85 2

 467.43 . (A)

17. [Section 61.1] The first component is an inverse Weibull distribution with τ  3, θ  1000 and the second component is a loglogistic distribution with γ  4, θ  1000. The first, third, and fourth pairs of uniform numbers have first components that are 0.5 or greater, so they go to loglogistic, while the second pair goes to inverse Weibull. For a loglogistic distribution, the tables state VaRp ( X )  θ ( p −1 − 1) −1/γ  1000 ( p −1 − 1) −1/4 , so 0.418 → 1000 (0.418−1 − 1) −0.25  920.58

0.932 → 1000 (0.932−1 − 1) −0.25  1924.10 0.000 → 1000 (0−1 − 1) −0.25  0

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 18–20

1519

For the last one, if you work through the formula, 0−1  ∞ and 1/∞0.25  0, but it is unnecessary to work through the formula. We know that F (0)  0 so the inverse of 0 is 0. For an inverse Weibull distribution, the table states VaRp ( X )  θ (− ln p ) −1/τ  1000 (− ln p ) −1/3 , so 0.100 → 1000 (− ln 0.100) −1/3  757.29 The average of the four numbers is (920.58 + 757.29 + 1924.10 + 0) /4  900 . (A) 18.

[Lesson 53] Let N be the number of claims. Then the hypothetical mean is HM  E[N | α]  α (2) + (1 − α )(4)  4 − 2α a  Var (HM)  Var (4 − 2α )  4 Var ( α ) 

1 3

1 because the variance of a uniform distribution on [0, 1] is 12 . For the process variance, recall that the second moment of a geometric with mean β is the variance plus the mean squared, or β (1 + β ) + β2 , which is 10 for β  2 and 36 for β  4.

E ( N | α ) 2  α (10) + (1 − α )(36)  36 − 26α

f

g

PV  Var ( N |α )  E N 2 | α − E[N | α]2

f

g

 36 − 26α − (4 − 2α ) 2  −4α2 − 10α + 20

The second moment of a uniform distribution on [0, 1], E[α2 ], is 31 . v  E[PV]  E[−4α 2 − 10α + 20]  −4 Z

19.

a  a+v

1 3

1 3

+

41 3

1 42



1 41 1 − 10 + 20  3 2 3

!

!

(E)

[Section 19.1] The aggregate loss distribution is 0 with probability

1 1+β

 13 , and exponential with

mean θ (1 + β )  1500 (1 + 2)  4500 with probability 32 . Thus the median is x such that Pr ( S > x )  0.5, or Pr ( S > x|S > 0) Pr ( S > 0)  0.5, and Pr ( S > 0)  23 so Pr ( S > x|S > 0) must be 34 . Thus we need the 25th percentile of the exponential (the point for which the probability of being above the point is 3/4). Then e −x/4500  0.75 x  −4500 ln 0.75  1294.57

(C)

20. [Lesson 53] The hypothetical mean is 1000/ ( α − 1) . We’ll ignore the factor 1000 in all calculations, since it is a scale factor and cancels out when k  v/a is computed. The first and second moments of it are: 4

Z 4

Z 3 C/4 Study Manual—17th edition Copyright ©2014 ASM

3

dα  ln 3/2 α−1

dα 1 1 1  −  2 2 3 6 ( α − 1)

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 21–22

1520

so the variance of the hypothetical means is a  1/6 − (ln 3/2) 2  0.0022647. The process variance, with the 1000 factor, is





2 (10002 ) 10002 − ( α − 1)( α − 2) ( α − 1) 2

As before, we’ll ignore the 1000 factor. We’ve already computed the  expected value  of the second summand as 1/6. To integrate the first summand we’ll need to split 1/ ( α − 1)( α − 2) into partial fractions: 1

B C + ( α − 1)( α − 2) α − 1 α − 2 1  C ( α − 1) + B ( α − 2) 

C+B 0

C + 2B  −1 C1

B  −1

So 4

Z 3

2 dα  ( α − 1)( α − 2)

4

Z



3



2 2 + dα α−1 α−2



 −2 ln 3/2 + 2 ln 2  0.575364

and v  0.575364 − 1/6  0.408697. Bühlmann’s k is 0.408697/0.0022647  180.46 .

(D)

21. [Lesson 34] For an inverse Pareto distribution, Pr ( X > 500)  1 − FX (500; τ, θ )  1 − apply the delta method (equation (34.5) on page 660) to this function. 500 g ( τ, θ )  1 − 500 + θ ˆ θˆ )  − g τ ( τ, ˆ θˆ )  τˆ g θ ( τ,

! τˆ

500

! τ−1 ˆ

500 + θˆ

ln

500

q



L g ( τ, ˆ θˆ )  Var 







!

500 + θˆ 500

−

!

(500 + θˆ ) 2  0.01

L g ( τ, ˆ θˆ )  0.1266 0.002315 Var 

We



500

500 + θˆ

500  τ 500+θ .

−0.1

0.0001285  0.01134

5 6

2

!2 5 6

−0.1 5

ln

5  0.1266 6

!

!

5  0.002315 3600

!

0.1266  0.0001285 0.002315

!

!

(E)

22. [Lesson 33] Since the distribution is truncated at the deductible, for the 250 deductible, the likelihood of each loss x i is 1 −x i /θ 1 θe  e − ( x i −250)/θ θ e −250/θ For the 500 deductible, the likelihood of each loss y i is 1 − ( x i −100) /θ θe e −400/θ C/4 Study Manual—17th edition Copyright ©2014 ASM



1 − ( x i −500)/θ e θ

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 23–24

1521

In other words, shifting has no effect! The exponential distribution is memoryless. So we can use the usual shortcut of exposure over number of uncensored claims, or

(300 + 400 + 500 + 600 + 700) − 5 (250) + (600 + 700 + 800) − 3 (500) 1850   231.25 θˆ  8 8

23.

(C)

[Lesson 46] f ( α ) is the prior density for α. The likelihood of one claim given α is Pr (1|α )  α (0.1e −0.1 ) + (1 − α )(0.2e −0.2 )

The numerator of the posterior density of α is Pr (1|α ) f ( α )  0.2e −0.1 ( α ) + 0.4e −0.2 (1 − α )





2 (1 − α )



The denominator of the posterior density is the integral of this expression from 0 to 1. We want the posterior mean, and the mean is the integral of α times the posterior density. So we want (in the following expression, 2 was canceled)

R

1 (0.2e −0.1 α2 (1 − α ) + 0.4e −0.2 α (1 − α ) 2 ) dα 0 R1 (0.2e −0.1 α (1 − α ) + 0.4e −0.2 (1 − α ) 2 ) dα 0

(*)

These integrals are not hard to do, and you are invited to work them out directly. But you can avoid working them out by recognizing that the integrands are all constant multiplies of densities of beta functions. The density of a beta function is Γ ( a + b ) a−1 f (x )  x (1 − x ) b−1 Γ(a )Γ(b ) Since this must integrate to 1, it follows that 1

Z 0

α a−1 (1 − α ) b−1 dα 

Γ(a )Γ(b ) Γ(a + b )

So the numerator of (*) is 0.2e −0.1

Γ (3) Γ (2) Γ (2) Γ (3) 1 1 + 0.4e −0.2  (0.2) e −0.1 + (0.4) e −0.2  0.042372 Γ (5) Γ (5) 12 12

!

!

and the denominator of (*) is 0.2e

−0.1

Γ (1) Γ (3) 1 1 Γ (2) Γ (2) + 0.4e −0.2  (0.2) e −0.1 + (0.4) e −0.2  0.139325 Γ (4) Γ (4) 6 3

!

!

The answer is 0.042372/0.139325  0.30412 . (B) 24. [Lesson 33] You can use the exponential shortcut; exposure is 45 + 12 (5)  105, and there are 30 untruncated claims, so 105 30  3.5. However, we want an integer θ, so we must actually write down the likelihood function or its logarithm and see which integer maximizes it. The likelihood function is 1 −x i /θ for each of the claims below 5 and e −5/θ for the claims of 5, and the product of these terms is θ e L (θ)  C/4 Study Manual—17th edition Copyright ©2014 ASM

1 [−45−12 (5) ]/θ e −105/θ e  θ 30 θ 30

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 25–26

1522

l ( θ )  −30 ln θ −

105 θ

Plugging in values around 3.5, we get: θ 2 3 4 5

l (θ) −73.294 −67.958 −67.839 −69.283

The answer is therefore 4 . (D) 25.

[Lesson 57] 225 150 75 9 x¯ B   7.5 x¯ C   15 25 20 5 100 + 125 + 150 + 75 x¯  9 10 + 15 + 20 + 5  2  2 100 125 50 vˆ  10 − 9 + 15 −9  10 15 3 2 2 2 25 + 20 + 5 50 −  29 50 2 20 (7.5 − 9) 2 + 5 (15 − 9) 2 − 2 (50/3) 191 3 aˆ    6.6092 29 29 50/3 kˆ   2.5217 6.6092 25 ZA   0.9084 25 + 2.5217 20 ZB   0.8880 20 + 2.5217 5 ZC   0.6647 5 + 2.5217 ZA + Z B + Z C  0.9084 + 0.8880 + 0.6647  2.4611 (D) x¯ A 

n

26. [Lesson 35] The likelihood function for each individual is e −λ λn i !i , where n i is the number of claims for that individual. We can ignore the constant n i !. So for all 50 individuals, the likelihood and loglikelihood functions are L ( λ )  e −50λ

Y

λ ni

l ( λ )  −50λ + (ln λ )

X

ni

If the maximum of this occurs at λ  2, then its value must be higher than the value at λ  1 and at λ  3, and conversely, since the loglikelihood function is unimodal, this condition is sufficient. So we must solve 2 inequalities.

−50 (2) + (ln 2) C/4 Study Manual—17th edition Copyright ©2014 ASM

l (2) > l (1) X

n i > −50 (1)

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 27–29

1523

50  72.13 ln 2 l (2) > l (3)

−50 (2) + (ln 2)

X

ni >

X

n i > −50 (3) + (ln 3)

X

ni

50 > ni ln 3 − ln 2 X n i < 123.32

X

Since n  27.

P

n i must be an integer, it follows that it is within the range [73, 123] . (A)

[Lesson 45] The probability of zero claims in class A is 2e −0.1 e −0.2 +  0.876135 3 3

and the probability of zero claims in class B is e −0.1 2e −0.2 +  0.847433 3 3 so the probability of at least one claim is the complement, or 0.123865 for A, 0.152567 for B. Then the 0.123865 probability of being in class A, given at least one claim, is 0.123865+0.152567 , and the probability of being in 0.152567 class B, given at least one claim, is 0.123865+0.152567 . Then the probability of at least one claim in the following year is 0.1238652 + 0.1525672 (B)  0.139706 0.123865 + 0.152567 28.

[Lesson 11] Backing out a and b, b 3 b a+ 4 b 12 b a+

240 2 120 288   1.2 240 

 0.8  9.6

a  −1.2

So this is a binomial, and

120 3.6 120 n0   3.968 (3.6)(8.4)

b  3.6 2 b a +  8.4 1

a+

29.

n1 

[Lesson 7] Let X be loss size. The loss elimination ratio is 100 

Z E[X]  C/4 Study Manual—17th edition Copyright ©2014 ASM

0

1 − F ( x ) dx



E[X∧20] E[X] .

(A)

PRACTICE EXAM 8, SOLUTION TO QUESTION 30

1524

! 1/4 ! 1/2 3 x 1 x * + dx  1− + 4 100 4 100 0 !, ! ! ! 3 4 1 2  100 − (100) − (100)  23 13 Z

100

4

5

4

3

We will use equation (5.6) to evaluate E[X ∧ 20]. E[X ∧ 20] 

20

Z 0

 20 −

! 1/4 ! 1/2 1 x *.1 − * 3 x ++/ dx + 4 100 4 100 -, ! ,! ! ! ! ! 5/4 3/2 3 4

4 5

1 20 − 4 1001/4

2 3

20 1001/2

 20 − 8.0249 − 1.4907  10.4844 10.4844 LER   0.4493 (E) 23 13 30. [Lesson 6] The conditional survival function given that X > 20 for a single-parameter Pareto random variable X is !α θ α 20 x  S ( x | x > 20)  θ α x 20

which is a single-parameter Pareto with θ  20. So the expected value and variance for each class given that the loss size is greater than 20 can be computed using θ  20 instead of the actual value of θ for the class. Then, letting I be the class, αθ (3)(20)   30 α−1 2 f g αθ2 (3)(202 ) E X 2 | Class A & X > 20    1200 α−2 1 Var ( X | Class A & X > 20)  1200 − 302  300 (4)(20) E[X | Class B & X > 20]   26 32 3 f g (4)(202 ) E X 2 | Class B & X > 20   800 2 E[X | Class A &X > 20] 

Var ( X | Class B & X > 20)  800 − 26 32



2

 88 89

We must also revise the probabilities of being in Classes A and B to account for the conditioning. We use Bayes’ theorem to do this. Pr ( A | X > 20) 

Pr ( X > 20 | A ) Pr ( A ) Pr ( X > 20 | A ) Pr ( A ) + Pr ( X > 20 | B ) Pr ( B )

10 Pr ( X > 20 | A )  S (20 | A )  20 10 Pr ( X > 20 | B )  S (20 | B )  20 C/4 Study Manual—17th edition Copyright ©2014 ASM

!3



1 8



1 16

!4

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 31–32

1525

(1/8)(0.6)  0.75 (1/8)(0.6) + (1/16)(0.4) Pr ( B | X > 20)  1 − Pr ( A | X > 20)  1 − 0.75  0.25

Pr ( A | X > 20) 

Now we’re ready to calculate the variance.

Var ( X | X > 20)  E[Var ( X | I & X > 20) ] + Var E[X | I & X > 20]





 (0.75)(300) + (0.25)(88 98 ) + (0.75)(0.25)(30 − 26 32 ) 2







8900 25 +  249.3056 36 12

(E)

31. [Lesson 62] A 95% confidence interval is constructed by adding and subtracting 1.96 times the standard deviation. If σ 2 is the variance of the distribution, then the variance of the mean of n observations 2 is σn , and the standard deviation is √σn . If µ is the mean of the distribution, the required accuracy will be achieved if 1.96σ  0.02µ √ n p √ 1.96σ  0.02 n  0.02 50,806  4.50804 µ σ 4.50804   2.3 (C) µ 1.96 Alternatively, you can use limited fluctuation credibility formulas. The standard for full credibility is 50,806 exposures, which is the coefficient of variation squared times n0 according to formula (42.1) on page 828. So 50,806  n0 CV2 1.96 50,806  0.02

!2

CV2

0.02 CV  50,806 1.96 2

CV  2.3 32.

!2

 5.29

(C)

[Lesson 58] µˆ  vˆ  x¯  x¯ 1 

1+2+0+7+4  0.5 5+6+7+4+6 1 6

x¯ 2  1.1 aˆ 

18



1 6

− 0.5

2

+ 10 (1.1 − 0.5) 2 − 0.5 182 +102 28

28 − 18 (0.396667) Zˆ   0.9346 18 (0.396667) + 0.5

C/4 Study Manual—17th edition Copyright ©2014 ASM



(C)

5.1  0.396667 12.85714

PRACTICE EXAM 8, SOLUTIONS TO QUESTIONS 33–35

1526

33. [Section 46.3] This loss function is minimized at the median of the distribution. The distribution is exponential with survival function ∞

Z S (x ) 

x

e −θ dθ  e −x

and for the median m, S ( m )  0.5, so e −m  0.5 and m  − ln 0.5  0.6931 . (D) 34. [Lesson 36] The x coordinate is the smoothed percentile, or 51  0.20. The y coordinate is the value of the fitted cumulative distribution function at 500. The maximum likelihood fit sets θ equal to the sample mean, or 2000, so the fitted distribution function is F ( x )  1 − e −x/2000 . The y coordinate is F (500)  1 − e −500/2000  0.2212. The point is (0.2, 0.2212 ). (B) 35. [Lesson 15] For the normal approximation, we need the mean and variance of annual aggregate losses. We’ll use the compound variance formula to calculate the variance of aggregate losses. E[N]  (2)(0.5)  1 E[X] 

Var ( N )  (2)(0.5)(1.5)  1.5

10,000 2 (10,0002 )  2,500 Var ( X )  − 25002  10,416,667 4 (4)(3) E[S]  (1)(2,500)  2,500 Var ( S )  (1)(10,416,667) + (1.5)(2,5002 )  19,791,667

E[S] + 1.645 Var ( S )  2,500 + 1.645 (4,448.78)  9,818.25

p

C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 1–4

1527

Answer Key for Practice Exam 9 1 2 3 4 5 6 7 8 9 10

E D E D B A A B E A

11 12 13 14 15 16 17 18 19 20

D E E C A B C C D A

21 22 23 24 25 26 27 28 29 30

E B C A B B C A A D

31 32 33 34 35

E A D D A

Practice Exam 9 1.

[Section 8.2] We set the survival function of the mixture equal to 0.05. 0.5e −x/500 + 0.5e − ( x−500)/500  0.05 e −x/500 + e 500/500 e −x/500  0.1 e −x/500 (1 + e 1 )  0.1 0.1 e −x/500   0.02689 1 + e1 x  −500 ln 0.02689  1807.92

(E)

2. [Lesson 37] The largest difference occurs at 3, where the exponential is about 0.65, and the difference is 0.4 . (D) 3. [Section 40.2] The penalty function is k  (ln n ) /2 times the number of parameters. Thus we need 85.58 + 3k greater than 89.32 + k, etc. Let’s see what these inequalities imply. 85.58 + 3k < 89.32 + k 85.58 + 3k < 87.33 + 2k 85.58 + 3k < 84.09 + 4k



k < 1.87



k > 1.49



k < 1.75

The strongest upper bound is from comparison to the 2-parameter model. Then k < 1.75 ⇒ ln n < 3.5 ⇒ n < e 3.5  33.12, so the highest n for which the 3-parameter model is selected is 33 . (E) 4.

[Section 24.2] Cumulative hazard rates are 1 1 + n−3 n−4 1 1 Hˆ I I ( t5 )  H I ( t2 ) + + n−2 n−4 Hˆ I ( t5 )  H I ( t2 ) +

where H I ( t2 )  H II ( t2 ) since the events through time t2 are the same in both studies. Exponentiating negatives of these, e −1/n−4 factors out and we get (D). C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 5–8

1528

5. [Lesson 9] Even though Bühlmann credibility estimates means rather than probabilities, we can analyze the Bernoulli random variable indicating whether a claim is greater than 1000, and the expected value of that variable is the probability. Let’s call that variable I. 0.5 The hypothetical mean of I is Pr ( X > 1000)  e − (1000/1000)  0.367879 for the first class and 2 e − (1000/500)  0.018316 for the second class. The overall mean is 0.5 (0.367879 + 0.018316)  0.193098 and the variance of the hypothetical means is a  0.52 (0.367879 − 0.018316) 2  0.030549. The process variance of I is (0.367879)(1−0.367879)  0.232544 and (0.018316)(1−0.018316)  0.017980 in the two classes, so the expected process variance is 0.5 (0.232544 + 0.017980)  0.125262. The credibility factor for 4 observations is 4/ (4 + k )  4/ (4 + 0.120262/0.030549)  0.493802. 1 of the 4 observations is greater than 1000, so x¯  0.25, and the credibility estimate of the probability is 0.493802 (0.25) + (1 − 0.493802)(0.193098)  0.2212

(B)

6. [Lesson 45] The probabilities of each score for the first team are 1/21. The posterior probability of the first team is (2/3)(1/21)  0.603774 (2/3)(1/21) + (1/3)(1/16) The predictive probabilities of the scores are 1 1 + (0.396226)  0.041133 0.603774 21 32

!

for scores in [20, 29] and

!

1 1 + (0.396226)  0.053515 0.603774 21 16

!

!

for scores in [30, 40]. The mean is the sum of the probabilities of each range times the average score in each range, or 10 (0.041133)(24.5) + 11 (0.053515)(35)  30.68 . (A) 7. [Lessons 11 and 12] The mixing distribution is gamma with α  6 and θ  2.5. The unconditional distribution is negative binomial with r  6 and β  2.5. Then a  2.5/3.5 and b  12.5/3.5. Setting a + b/k  1, 2.5 12.5 + 1 3.5 3.5k 2.5k + 12.5  3.5k k  12.5 Since p13  ( a + b/13) p12 , then p13 < p12 and 12 is the mode. (A) 8. [Section 2.2] The function transforming X to Y is y  g ( x )  x −1  1/x. Notice that x  g −1 ( y )  1/y. The derivative of g −1 ( y ) is dg −1 1 − 2 dy y By formula (2.3), fY ( y )  f X (1/y )



C/4 Study Manual—17th edition Copyright ©2014 ASM

 1! y2

(1/ (100y )) 4 e −1/100y 1  (1/y ) Γ (4) y2

!

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 9–10

1529

1 e −1/100y  1004 Γ (4) y 5

!

Not surprisingly, inverting X results in an inverse gamma. If you check the tables, you will see that the new parameters are α  4 and θ  1/100. The mode of an inverse gamma according to the tables is θ/ ( α + 1)  1/ (100 · 5)  1/500 . (B) If you did not recognize the density as an inverse gamma, you could still find the mode by differentiating it and set the derivative equal to 0. It is easier to differentiate the log: 1 + ln y − ln Γ (4) − 2 ln y 100y d ln fY 400 1 1 5 1 − + − − + 0 2 dy 100y 100y y y 100y 2 1 −5y + 0 100 1 y 500 ln fY  −4 ln 100y −

9.

[Lesson 39] The chi-square statistic is Q

(3 − 7.2) 2 7.2

+

(12 − 8.7) 2 8.7

+

(7 − 9.3) 2 9.3

+

(15 − 9.5) 2 9.5

 7.455

There are 4 separate years and no fitted parameters, so there are 4 degrees of freedom. Since the critical value at 10% for 4 degrees of freedom is 7.779, the answer is (E). 10.

[Lesson 9] Let I be the indicator variable for the condition X > 30,000. By double expectation, E ( X − 30,000)+k  E E[ ( X − 30,000)+k | I]  Pr ( X > 30,000) E ( X − 30,000)+k | X > 30,000

f

g

f

g

f

g

since if X ≤ 30,000, then ( X − 30,000)+  0. X − 30,000 | X > 30,000 is a Pareto distribution with parameters α  5 and θ  40,000, so its moments are easily calculated. Since coefficient of variation is scale-free, we could divide everything by 10,000, but we won’t. 10,000 Pr ( X > 30,000)  10,000 + 30,000 40,000 40,000 E[X − 30,000 | X > 30,000]   5−1 4 E[ ( X − 30,000)+ ]  0.255 (40,000/4)

C/4 Study Manual—17th edition Copyright ©2014 ASM

 0.255

2 (40,000) 2 (4)(3) 0.255 (2)(40,0002 ) E[ ( X − 30,000)+2 ]   12  Var ( X − 30,000)+ Coefficient of variation2  E[ ( X − 30,000)+ ]2 E[ ( X − 30,000)+2 ] − E[ ( X − 30,000)+ ]2  E[ ( X − 30,000)+ ]2

E ( X − 30,000) 2 | X > 30,000 

f

!5

g

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 11–13

1530

E[ ( X − 30,000)+2 ] −1 E[ ( X − 30,000)+ ]2 0.255 (2)(40,0002 ) /12  −1 0.2510 (40,0002 /16) 47  −1 6 

The coefficient of variation is 11.

p

47 /6 − 1  52.2462 . (A)

[Subsection 33.4.2] We’ll use the Pareto MLE formula. For the first fit, K  3 ln 1500 − ln 1750 − ln 2500 − ln 3500  −1.51227 3 αˆ  −  1.983767 −1.51227 θ + 500 1500 e (500)   1524.75  0.983767 αˆ − 1

For the second fit, subtract 500 from each loss and use the Pareto MLE formula.

K 3 ln 1000 − ln 1250 − ln 2000 − ln 3000  −2.01490 3 αˆ  −  1.488905 −2.01490 θ 1000  2045.39 E[Y P ]   αˆ − 1 0.488905

The absolute difference is 2045.39 − 1524.75  520.63 .

(D)

12. [Lesson 27] The median is the m for which Fˆ ( m )  0.5. The uniform kernel’s density is piecewise constant, so its distribution function is piecewise linear, and since all the observations and the bandwidth are integers, it is linear between integers. Thus we only need to find an integer k such that F ( k ) < 0.5 and F ( k + 1) > 0.5 and then linearly interpolate. Let’s start in the middle, at 11. The kernel is a linear function decreasing (as a function of the observation) at a rate of 1/8, so it is 1 at 7, 5/8 at 10, 4/8 at 11, etc. Fˆ (11)  0.1 (1 + 1 + 0.625 + 0.5 + 0.5 + 0.375 + 0.125)  0.4125 At 12: At 13:

Fˆ (12)  0.1 (1 + 1 + 0.75 + 0.625 + 0.625 + 0.5 + 0.25)  0.475 Fˆ (13)  0.1 (1 + 1 + 0.875 + 0.75 + 0.75 + 0.625 + 0.375)  0.5375

Linearly interpolating between 12 and 13, the median m is m  12 +

13.

0.5 − 0.475  12.4 0.5375 − 0.475

[Lesson 57.2] The sample means of the three groups are

P  113.09 i1    j mi j xi j x¯ i  P  86.25 i2   j mi j  155.625 i  3  C/4 Study Manual—17th edition Copyright ©2014 ASM

(E)

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 14–15 and the overall mean is

1531

P P i j mi j xi j µˆ  P P  115.143 mi j

i

j

P

j mi j

The expected process variance is estimated by

P P vˆ  

i

j

m i j ( x i j − x¯ i ) 2

(3 − 1) + (3 − 1) + (2 − 1) P P P 2 i

j m i j x i j − 2 x¯ i

2 j m i j x i j + x¯ i



5   878,050 + 298,500 + 775,800 − 2 (113.088)(7690) + (86.25)(3450) + 155.625 (4980)

+ 113.0882 (68) + 86.252 (40) + 155.6252 (32) 5

  2025.29

The variance of the hypothetical means is estimated by m  68 + 40 + 32  140 68 (113.088 − 115.143) 2 + 40 (86.25 − 115.143) 2 + 32 (155.625 − 115.143) 2 − 2025.29 (2) 140 − (682 + 402 + 322 ) /140  930.20

aˆ 

The credibility factor for Group 1 is 68 (930.20) / 68 (930.20) + 2025.29  0.9690 . (E)





14. [Lesson 64] The mean of the empirical distribution (only needed to calculate the variance) is 37 and the variance of the empirical distribution is 2 (22 − 37) 2 + (59 − 37) 2 + (44 − 37) 2 + (67 − 37) 2 + (36 − 37) 2 + (15 − 37) 2 + (31 − 37) 2  300.5 8 For the sample mean as an estimator of the mean, the bootstrap approximation of the mean square error is the sample variance divided by the size of the sample, or 300.5/8  37.5625 . (C) 15. [Lesson 30] The observed losses are truncated at 100 and censored at 1000, so the fitted distribution has to be adjusted in the same way. If X is uniform on [0, θ], we wish to match 100 +

E[X ∧ 1000] − E[X ∧ 100] 1 − F (100)

to the sample mean, which is (200 + 300 + 400 + 600 + 1000) /5  500. If 100 < θ < 1000, then θ 2 ! !! 100 100 5000 E[X ∧ 100]  (50) + 1 − (100)  100 − θ θ θ 100 F (100)  θ

E[X ∧ 1000] 

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 16–17

1532

E[X ∧ 1000] − E[X ∧ 100] θ/2 − 100 + 5000/θ  1 − F (100) 1 − 100/θ 2 θ − 200θ + 10,000  2 ( θ − 100) ( θ − 100) 2 θ − 100   2 ( θ − 100) 2 and setting 100 plus this expression equal to 500, we find θ  900 . (A) An easier way to derive the fitted expected value is to observe that the truncated distribution is uniform on [0, θ − 100], and if θ < 1000, the mean is the midpoint of that range. If the solution for θ had been higher than 1000, we would have had to redo the computation with E[X ∧ 1000] adjusted. The estimate for θ is implausible in view of the claim at the limit. 16. [Lesson 32] To derive the likelihood function, we need the density and survival functions, which we’ll derive from the hazard rate function. For x ≤ 10, the hazard rate is constant, so the distribution is exponential with the reciprocal of the hazard rate as mean, so f ( x )  θe −θx For x > 10, H ( x )  H (10) +

x

Z

10

S ( x )  e −10θ−0.05θx

θ ≤ 10

h ( u ) du  10θ + 0.05θ ( x 2 − 102 )

2 +5θ

 e −5θ−0.05θx

2

2

f ( x )  S ( x ) h ( x )  0.1θxe −5θ−0.05θx

The likelihood function, omitting multiplicative constants like x i , is the product of four f ( x i ) ’s and six S (50) ’s, or 2 +302 +6 (502 ) ]θ

L ( θ )  θ 4 e −θ (5+10) e −40θ−0.05[20 l ( θ )  4 ln θ − 870θ 4 dl  − 870  0 dθ θ 4 θ 870 0.1 (4)(35)  0.01609 h (35)  870

(B)

17. [Lesson 63] There is no interest calculation involved,Pso wePmerely need to calculate when the −1 ( u )  −0.3 P ln (1 − u )  1, or sum of the interarrival times exceeds 1. In other words, when x  F i i i Q (1 − u i )  e −1/0.3  0.035674. The cumulative products of the uniform numbers are 0.65, (0.65)(0.35)  0.2275, (0.2275)(0.32)  0.0728, and (0.0728)(0.48)  0.034944, so 3 claims occur in the first year. The inversion formula for a Weibull given in the tables as Varp ( X ) is θ − ln (1 − p )



0.84 → −1000 (ln 0.16) 1.5  2480.82 0.35 → −1000 (ln 0.65) 1.5  282.74

0.76 → −1000 (ln 0.24) 1.5  1704.86

C/4 Study Manual—17th edition Copyright ©2014 ASM

 1/τ

, so

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 18–19

1533

These claims add up to 4468.42. To lower it to 3000, we’ll need a deductible larger than 282.74 (the second claim). 4468.42 − 282.74  4185.68. We’ll need half the difference between 4185.68 and 3000 deducted from the other two claims, or a deductible of 592.84. Rounded to the next 50, that is 600 . (C) 18. [Lesson 58] The sample means are 6, 2, 1, 3 for the four policyholders respectively. The expected hypothetical mean is estimated with the overall sample mean, 3. The expected process variance is estimated as 3 as well, due to the Poisson assumption. The variance of the hypothetical means is estimated by aˆ 

(6 − 3) 2 + (2 − 3) 2 + (1 − 3) 2 + (3 − 3) 2 3

The credibility factor is

2aˆ

(2aˆ + vˆ )



vˆ 14 3 19  −  n 3 2 6

19/3 19  19/3 + 3 28



Expected claims for Policyholder A is 19 9 (6) + (3)  5.0357 28 28 19.

(C)

[Lesson 34] The likelihood function is L ( α, θ )  Q

α n θ nα ( θ + x i ) α+1

l ( α, θ )  100 ln α + (100α ) ln θ − ( α + 1)

X

ln ( θ + x i )

Differentiate with respect to α and then with respect to θ to obtain a12 .

X ∂l 100  + 100 ln θ − ln ( θ + x i ) α ∂α 100

∂2 l 100 X 1  − θ θ + xi ∂α∂θ i1

Calculate the expected value of the second term.

 E

1  θ+X





Z 0

αθ α dx ( θ + x ) α+2



αθ α ( α + 1)( θ + x ) α+1 0 α  ( α + 1) θ −

Negate the second derivative of the loglikelihood function and take the expected value of that. 100 100α + θ ( α + 1) θ 100 100 (4) − +  −2 10 (5)(10)

a 12  −

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 20–22

1534

20.

[Section 28.1] Insuring Date of Birth

Start of Exposure for Age 40

End of Exposure for Age 40

Months of Exposure

2/1/1972 6/1/1972 5/1/1973 11/1/1972

2/1/2012 6/1/2012 5/1/2013 11/1/2012

8/1/2012 6/1/2013 5/1/2013 12/1/2012

6 12 0 1

Total exposure is 6 + 12 + 0 + 1  19 . (A) 21. [Lessons 8 and 15] We calculate mean and variance of S using the Poisson compound variance formula. E[X]  e 2+1.5 2

E[X ]  e

2 /2

4+2 (1.52 )

 22.7599  4914.7688

E[S]  3 (22.7599)  68.2797 Var ( S )  3 (4914.7688)  14,744.307 The normal approximation is with a distribution having µ  68.2797 and σ 2  14,744.307. Using formula (8.7), 2

e −0.842 /2  0.279873 √ 2π φ (0.842) TVaR0.8 ( S )  µ + σ 0.2 ! p 0.279873  68.2797 + 14,744.307  238.20 0.2 φ ( z0.80 )  φ (0.842) 

(E)

22. [Lessons 14 and Lesson 16] The deductible does not affect the number of losses, since Pr ( X > 500)  1. We’ll calculate the second moment of the payment distribution, then use the compound variance formula (14.4) for Poissons. One way to calculate the second moment of ( X − 500)+ is to add the square of the mean to the variance. The mean is 500 less than the mean without a deductible, which according to the tables is E[X] 

αθ 4  (2000)  2666 23 α−1 3

The variance is the same as the variance of X, since subtracting a constant from each loss does not affect the variance. The variance of X according to the tables is αθ 2  2 (20002 ) α−2 Var ( X )  2 (20002 ) − (2666 23 ) 2  888,888 98 E[X 2 ] 

So the variance of S is 0.25 (888,888 89 + (2166 23 ) 2 )  1,395,833. C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 9, SOLUTION TO QUESTION 23

1535

Alternatively, one could obtain the second moment of Y  ( X − 500)+ by evaluating the integral of the density function times y 2 . Since Y  X − 500, the density function of Y is fY ( x )  We evaluate

Z

∞ 1500

4 (20002 ) ( y + 500) 5

4 (20004 ) y 2 dy ( y + 500) 5

by substituting x  y + 500 and expanding the numerator.

Z

∞ 2000

( x − 500) 2 x5

!



1000 250,000 1 − 4 + dx 3 x x5 x 2000 1 1000 250,000  − + 2 (20002 ) 3 (20003 ) 4 (20004 )

dx 

Z





and multiplying this by 4 (20004 ) one gets 5,583,333 13 . Multiplying by λ  0.25, the variance of S is once again 1,395,833. √ The standard deviation is 1,395,833  1181.45 . (B) 23. [Section 4.2] We’ll use conditional variance several times. For Class I, the mean is E[λ]  0.2 and the variance is Var (E[N | λ]) + E[Var ( N | λ ) ]  Var ( λ ) + E[λ] 

0.22 2.44 + 0.2  12 12

For Class II, the mean is E[2λ]  0.4 and the variance is Var (2λ ) + E[2λ] 

4 (0.22 ) 4.96 + 2 (0.2)  12 12

The overall variance is Var ( N )  Var (E[N | class]) + E[Var ( N | class) ]   2.44 4.96  Var (0.2, 0.4) + E , 12 12 0.75 (2.44) + 0.25 (4.96)  0.22 (0.75)(0.25) +  0.26333 12

(C)

The question can also be solved by calculating first and second moments. The first moment is E[N]  Eλ Eclass E[N | λ, class]

f

f

 Eλ Eclass [λ, 2λ]

f

gg

g

 Eλ [1.25λ]  1.25 (0.2)  0.25 The second moment of a Poisson with mean λ is the mean squared plus the variance, or λ 2 + λ. Then E[N 2 ]  Eλ Eclass E[N 2 | λ, class]

f

f

gg

 Eλ Eclass [λ 2 + λ, 4λ 2 + 2λ]

f

 Eλ [1.75λ2 + 1.25λ] C/4 Study Manual—17th edition Copyright ©2014 ASM

g

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 24–27

1536

0.13  1.75 + 1.25 (0.2)  0.325833 3

!

On the last line, we calculated the second moment of the uniform distribution as the variance (0.022 /12) plus the mean square (0.022 ), with the result 0.13/3. The variance of N is Var ( N )  0.325833 − 0.252  0.26333 . 24. [Lesson 26 and Section 34.2] H (32)  − ln S (32) , so we apply the delta method on the function g ( x )  − ln x. Then g 0 ( x )  −1/x. The delta method says to multiply the variance of the original function ˆ (S (32) here) by the derivative of the transforming function  (ln x here) evaluated at the mean (S (32) here). 2 2 ˆ ˆ Therefore, the variance of ln S (32) is estimated as Var S (32) /S (32) , or 0.0004/0.26 . The confidence interval is − ln 0.26 ± 1.96 (0.02/0.26)  (1.196, 1.498) . (A) 25.

[Lesson 25] Calculate the first and second moments. 220 + 643 + 35 (50)  34.84 75 2,564 + 21,843 + 35 (502 ) E[ ( X ∧ 50) 2 ]   1492.093 75 Var ( X ∧ 50)  1492.093 − 34.842  278.27 (B) E[X ∧ 50] 

26. [Lesson 46] The likelihood of the experience is 2q 2 (1−3q ) . The likelihood times the prior, dropping multiplicative constants, is q 2 (1 − 3q ) 2 . If you recognize that this is a beta with θ  1/3, a  3, and b  3, you can immediately write down E[Q]  (1/3)(3/ (3 + 3))  1/6. Otherwise, to calculate the expected value of Q, we need the integral of that and the integral of that times q. 1/3

Z 0

2

2

q (1 − 3q ) dq 

1/3

Z 0

( q 2 − 6q 3 + 9q 4 ) dq

q 3 6q 4 9q 5 − + 3 4 5



 0.00123457 1/3

Z 0

3

2

q (1 − 3q ) dq  

1/3

Z 0

! 1/3 0

( q 3 − 6q 4 + 9q 5 ) dq

q 4 6q 5 9q 6 − + 4 5 6

 0.00020576

! 1/3 0

Thus the expected value of Q is E[Q]  0.00020576/0.00123457  1/6. Either way you get E[Q], the expected value of a claim is then E[Q] + 2 E[1 − 3Q]  2 − 5 E[Q]  7/6 . (B) 27. [Lesson 49] The marginal distribution, or the overall distribution, of N must be Bernoulli, since N can only have one of two values: 0 or 1. Once we know the expected value, this is equal to the probability of 1, and we’re done. But f g 2 E[N]  E E[N | Q]  E[Q]  5 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 28–30

1537

where the last equality follows from the fact that Q is beta with θ  1, a  2, b  3, and the mean of a beta is θa/ ( a + b ) . So the answer is (C). 28. [Lesson 50] The logarithms of the claims follow an exponential distribution, and the prior is inverse gamma with α  3 and θ  8, so we can use the exponential/inverse gamma conjugate prior on them. The mean of the logarithms of the claims is (ln 500 + ln 1000 + ln 1500) /3  6.811861. The posterior parameters for the inverse gamma are α0  3 + n  3 + 3  6 θ0  8 + n x¯  8 + 3 (6.811861)  28.4356 The posterior mean of θ is 28.4356/5  5.6871 . (A) 29. [Lesson 32] As usual, when writing the likelihood function, we can omit the +1 on the α exponent in the denominator, since that is a constant factor. The likelihood function, omitting multiplicative constants, is 2 3 α 4 (1000α (1+1/1.1+1/1.1 +1/1.1 ) ) L (α)  (1500α )(1700α/1.1 )(1600α/1.12 )(1700α/1.13 ) Logging and differentiating, ln 1700 ln 1600 ln 1700 + ł ( α )  4 ln α + 3.486852α ln 1000 − α ln 1500 + + 1.1 1.12 1.13

!

 4 ln α + 24.08632α − 25.76128α

 4 ln α − 1.67496α 4 dl  − 1.67496  0 dα α 4 α  2.388122 1.67496 The probability that aggregate losses in year 5 exceed 700 is 1000 1700

! α/1.14

10  17

! 2.388122/1.14  0.420833

(A)

30. [Lesson 53] The variable indicating whether there is at least one claim is Bernoulli. Given λ, its mean is 1 − e −λ and its variance is e −λ (1 − e −λ ) . For calculations of moments, it’ll be convenient to have integrals of e λ and ( e −λ ) 2  e −2λ over the distribution of λ in the prior hypothesis, which is uniform on [0, 0.5] (so it has density 2). 2 2

0.5

Z

Z

0 0.5 0

0.5  2 (1 − e −1/2 )  0.786939

e −λ dλ  −2e −λ

0

e −2λ dλ  −2

0.5 e −2λ

 1 − e −1  0.632121 2 0

Now we can calculate µ, a, and v. For calculating a, we use Var (1− e −λ )  Var ( e −λ ) since adding constants and multiplying by −1 does not affect variance. µ  E[1 − e −λ ]  1 − 0.786939  0.213061

a  Var ( e −λ )  E[e −2λ ] − E[e −λ ]2  0.632121 − 0.7869392  0.012848

v  E[e −λ (1 − e −λ ) ]  E[e −λ ] − E[e −2λ ]  0.786939 − 0.632121  0.154818 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 31–33

1538

0.154818  12.04991 0.012848 2 Z  0.142350 2 + 12.04991 P  Z x¯ + (1 − Z ) µ  (0.142350)(0.5) + (1 − 0.142350)(0.213061)  0.2539 k

(D)

31. [Section 61.4] To simulate a standard normal random number, first transform the uniform numbers to [−1, 1]. 0.7939 → 2 (0.7939) − 1  0.5878

0.5438 → 2 (0.5438) − 1  0.0876

0.58782 + 0.08762  0.3532 < 1, so we can use this pair. The first generated standard normal random number is r −2 ln 0.3532  1.4270 0.5878 0.3532 To approximate the aggregate loss distribution, we need the mean and variance. Let N be the claim count random variable, X the claim size random variable, and S the aggregate loss random variable. E[N]  rβ  (0.5)(1)  0.5

Var ( N )  rβ (1 + β )  1

E[X]  αθ  (6)(500)  3000

Var ( X )  αθ2  1,500,000

E[S]  E[N] E[X]  (0.5)(3000)  1500 Var ( S )  E[N] Var ( X ) + Var ( N ) E[X]2  (0.5)(1,500,000) + (1)(30002 )  9,750,000 √ The generated aggregate loss number is 1500 + 1.4270 9,750,000  5956 . (E) 32.

[Section 24.2] yi

ri

si

Hˆ ( y i )

0.6 0.8 0.9 1.2 1.8 2.2

15 14 13 11 8 7

1 1 1 1 1 1

0.06667 0.13810 0.21502 0.30593 0.43093 0.57378

From this table, Sˆ (1.8)  e −0.43093  0.6499, Sˆ (2.2)  e −0.57378  0.5634, and the probability of 2.2 is 0.6499 − 0.5634  0.0865 . (A) 33.

[Lesson 18] Calculate expected annual claim costs. E[N]  0.3 (1) + 0.2 (2)  0.7 E[X]  0.6 (2) + 0.3 (3) + 0.1 (5)  2.6 E[S]  0.7 (2.6)  1.82

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 9, SOLUTIONS TO QUESTIONS 34–35

1539

Calculate probabilities of S  0,2,3,4. Pr ( S  0)  0.5 Pr ( S  2)  0.3 (0.6)  0.18 Pr ( S  3)  0.3 (0.3)  0.09 Pr ( S  4)  0.2 (0.62 )  0.072 We’ll calculate E[S ∧ 5] directly as the sum of probabilities of each value up to 5 times the value. E[X ∧ 5]  0.18 (2) + 0.09 (3) + 0.072 (4) + (1 − 0.5 − 0.18 − 0.09 − 0.072)(5)  1.708 Expected annual payments under the reinsurance contract are E[S] − E[S ∧ 5]  1.82 − 1.708  0.112 . (D) 34. [Subsection 21.1.3] The estimator is twice the third order statistic of a sample of 5, which is beta (3, 3, θ ) . The mean of m is θ/2 and the variance is θ 2 (3)(3) θ2  (3 + 3) 2 (3 + 3 + 1) 28 Therefore the estimator is unbiased and the MSE equals Var (2m )  4 Var ( m )  θ 2 /7 . (D) 35. [Lesson 43] Use formula (43.1). Compare the coefficients of variation. For a gamma, µ  αθ and σ2  αθ2 , so σ2 /µ2  1/α. For an inverse gamma, µ  θ/ ( α − 1) and E[X 2 ]  θ 2 / ( α − 1)( α − 2) , so

For a Pareto

θ 2 / ( α − 1)( α − 2) α−1 σ2 1  −1 −1 α−2 α−2 µ2 θ 2 / ( α − 1) 2 2 ( α − 1) α σ2 2θ 2 / ( α − 1)( α − 2)  −1 −1 α−2 α−2 µ2 θ 2 / ( α − 1) 2

The inverse gamma’s coefficient of variation is higher than the gamma’s, and since α > 2, the Pareto’s is higher than the inverse gamma’s. (A) ss

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 1–2

1540

Answer Key for Practice Exam 10 1 2 3 4 5 6 7 8 9 10

D C B E C E B A A C

11 12 13 14 15 16 17 18 19 20

B E D D B B C D C D

21 22 23 24 25 26 27 28 29 30

B B E C C D A D B A

31 32 33 34 35

A E D D D

Practice Exam 10 1.

[Lesson 42] Calculate 1 + CV2X  E[X 2 ]/ E[X]2 for a Weibull. E[X]  θΓ 1 +



1 τ

E X 2  θ2 Γ 1 +

f

g

E X2

g

f

E[X]2





2 τ

 θΓ (5)  24θ



 θ 2 Γ (9)  40,320θ2

40,320  70 242

Expected claims needed for full credibility is 2.



2.576 2 (70) 0.1

 46,450 . (D)

[Section 63.1] For the negative binomial, 1 p0  2

!2

 0.25

1 p1  2 2

!3

1 p2  3 2

!4

F (0)  0.25

 0.25

F (1)  0.50

 0.1875

F (2)  0.6875

Accordingly, 0.2 → 0, 0.5 → 2 (by the textbook’s convention that the largest inverse value is selected), 0.3 → 1, and 0.6 → 2. So we’ll need 5 claims.

!2

1000 1000 + x √ 1000 1−u  1000 + x 1000 x√ − 1000 1−u 0.42 → 313.06 u 1−

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 3–6

1541

0.31 → 203.86

0.27 → 170.41 0.1 → 54.09

0.65 → 690.31 Annual payments are 0 in the first year, 313.06 + 203.86 − 500  16.92 in the second year, 0 in the third year since claims are below 500, and 54.09 + 690.31 − 500  244.40 in the fourth year. The average is (16.92 + 244.40) /4  65.33 . (C) 3. [Lesson 7] By definition, the loss elimination ratio is E[X∧500]/ E[X]. We use the tables to compute the two expectations. E[X]  exp µ + 12 σ2





2

 e 5+0.5 (2 )  1096.63

! 2

Φ

ln 500 − 5 − 2 2

 Φ (−1.39)  0.0823 ln 500 − 5  Φ (0.61)  0.7291 2

!

F (500)  Φ

E[X ∧ 500]  1096.63 (0.0823) + 500 (1 − 0.7291)  225.7 225.7 LER (500)   0.2058 (B) 1096.63 4.

[Lesson 55] Bühlmann’s k is

v a

αθ αθ 2





1 θ

 100. (α plays no role.) We want

n  0.9 n + 100 n  900

5.

(E)

[Subsection 28.1.1] By formula (28.1), 5 qˆ 60

 1 − e −5 (10)/400  0.117503

By formula (28.3),

s q

L (5 qˆ60 )  Var

(1 −

0.117503) 2 (52 )

10  0.03488 4002

!

The 95% confidence interval is 0.117503 ± 1.96 (0.03488)  (0.049, 0.186) . (C) 6.

[Lesson 30] The tables have this for the inverse gamma’s moments: E Xk 

f

g

so for k  −1 and k  −2, since Γ ( x + 1)  xΓ ( x ) , E X −1 

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

θk Γ(α − k ) Γ(α)

α θ

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 7–8

1542

E X −2 

f

f E E



g

X −2

g 

2 X −1

( α + 1)( α ) θ2 α+1 1 1+ α α

The empirical moments are µ0−1



µ0−2 

1 0.1

+

1 0.12

+

1 0.2

+

1 0.22

1 0.5

5 +

+

1 0.52

5

1 1

+

+

1 2

1 12

 3.7

+

1 22

 26.05

Matching moments, 1+

7.

[Lesson 26]

1 26.05  1.902849  α 3.72 1 αˆ   1.1076 1.902849 − 1

(E)

yi

ri

si

2 4 6 7

9 6 3 2

2 1 1 1

  2 1 1 1 Var Hˆ (7)  2 + 2 + 2 + 2  0.413580 9 6 3 2 8.

(B)

[Lesson 32] Let u  e −1000/θ . The likelihood function is L ( θ )  (1 − u ) 80 ( u − u 3 ) 15 u 15  (1 − u ) 95 (1 + u ) 15 u 30 l ( θ )  95 ln (1 − u ) + 15 ln (1 + u ) + 30 ln u dl 95 15 30 − + + 0 dθ 1−u 1+u u −95u − 95u 2 + 15u − 15u 2 + 30 − 30u 2  0 140u 2 + 80u − 30  0

14u 2 + 8u − 3  0 √ −8 + 64 + 168 u  0.258270 28 e −1000/θ  0.258270 1000 θ−  738.69 ln 0.258270 The payment per loss for a deductible of 500 is E[X] − E[X ∧ 500]  θe −500/θ  738.69e −500/738.69  375.40 C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 9–12

1543

[Section 4.1] The moment generating function for Λ is MΛ ( t )  (1 − θt ) −α  1 − ( t/2,000,000)



9.

and A ( x ) 

R

x 0

2t dt 

x2,

 −3

,

so the unconditional survival function is



1



S ( x )  MΛ −A ( x ) 

(1 + x 2 /2,000,000) 3

To find the median, we set S ( x )  0.5 and solve for x.

! 3 x2 *.1 + +/  2 2,000,000 , -! 2

√3 x  2−1 2,000,000 x 2  2,000,000

√3

2−1 q √3  x  2,000,000 2 − 1  721.00



(A)

Alternatively, we could recognize the distribution as a Burr with α  3, γ  2, and θ  look up VaR0.5 ( X ) in the tables: VaR0.5 ( X ) 

p

2,000,000 0.5−1/3 − 1



 1/2

 721.00

[Section 8.3] The VaR at 99% is exp ( µ + z0.99 σ )  exp 3 + 2.326 (2)  e 7.652  2105. Then



10.

√ 2,000,000 and



2

E[X]  e 3+0.5 (2 )  148.4 ln 2105 − 3 − 22 + 0.01 (2105) E[X ∧ 2105]  148.4Φ 2

!

 148.4Φ (0.33) + 21.05  148.4 (0.6293) + 21.05  114.4 Then TVaR0.99 ( X )  2105 +

148.4 − 114.4  2105 + 3400  5505 0.01

(C)

11. [Lesson 27] The density at x is the number of points within a distance of 5 from x, divided by 80 (2 times the bandwidth times the number of points). Seven of the points are within a distance of 5 from 6, and there is no other point with this property, so the answer is 6 . (B) 12.

[Lesson 31] Let u  e −12/2θ . 0.5u 2 + 0.5u  0.5 u2 + u − 1  0 √ −1 + 5 u  0.618034 2



C/4 Study Manual—17th edition Copyright ©2014 ASM

12  − ln 0.618034  0.481212 2θ 6 θ  12.4685 (E) 0.481212

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 13–16

1544

13.

[Lesson 15] E[N]  0.02 Var ( N )  0.02 (1.1)  0.022 2000 E[X]   400 6−1 2 (20002 ) − 4002  240,000 Var ( X )  (5)(4) E[S]  0.02 (400)  8 Var ( S )  0.02 (240,000) + 0.022 (4002 )  8320 12 − 8 Φ √  Φ (0.04)  0.5160 8320

!

14.

(D)

[Lesson 46] Likelihood is 1/θ 5 for θ ≥ 6. The posterior is π ( θ | x)  R 

1/θ6

∞ dθ/θ 6 6 5 ( 65 )

θ6

θ≥6

θ≥6

which you may recognize as a single-parameter Pareto with θ  6, α  5, so its mean is E[θ | x]  5 (6) /4  7.5. The expected number of claims is θ/2 which has expected value 7.5/2  3.75 . (D) 15. [Section 23.1] The estimate is the observed probability of 1, or 10/100  0.1. The estimator is a binomial proportion: the number of 1’s for a distribution which is either 1 or not 1, divided by 100. The variance of the distribution is 100q (1 − q ) /1002 . Since q, the probability of 1, is estimated as 0.1, the √ variance is 100 (0.1)(0.9) /1002  0.0009. The confidence interval is 0.1 ± 1.96 0.0009  (0.0412, 0.1588) . (B) 16.

[Subsection 34.1.3] Differentiate the loglikelihood twice. 2θ 2 (θ + x )3  ln 2 + 2 ln θ − 3 ln ( θ + x ) 2 3  − θ θ+x 2 3  2− θ (θ + x )2

L (θ)  l (θ) dl dθ d2 l − 2 dθ

To estimate θ, set the first derivative equal to 0 and x  5. 2 3  θ θ+5 2θ + 10  3θ θˆ  10

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 17–18

1545

To calculate observed information, plug x  5 into the second derivative, and it equals 3 1 2 −  102 152 150

(B)

17. [Lesson 58] We will treat two months (1/6 year) as a unit, and multiply by 6 at the end. In this time period, 5 7 P 2 xi 12 + 12 + 32 11   n 7 7 !2 11 5 52 − σˆ 2   7 7 49 µˆ  vˆ  x¯ 

s

2

aˆ Z PC

7 52 52   6 49 42 52 5 22  −  42 7 42 22/42 11   52/42 26   ! 11 5  6 (0.412088)  2.4725 6 1− 26 7

!

(C)

18. [Lesson 46] We cannot use the Bernoulli/beta shortcut here since the uniform distribution is only up to 0.5. So we do this from first principles. The likelihood of the observation may be calculated as the product of the probability of 1 claim in 1 year and 0 claims in 1 year using a binomial (3, q ) , or as the probability of 1 claim in 2 years using a binomial (6, q ) . Either way, the likelihood is 6q (1 − q ) 5 . The density of the prior is 2. Putting this together, we have 6q (1 − q ) 5 (2) π ( q | x)  R 0.5 6q (1 − q ) 5 (2) dq 0 We may cancel the constants 6 and 2. The denominator is easier to integrate with the substitution q 0  1−q. Then 0.5

Z 0

5

q (1 − q ) dq  

Z

1

0.5 Z 1 0.5

q 05 (1 − q 0 ) dq 0

( q 05 − q 06 ) dq 0 1

q 06 q 07 − 6 7 0.5 !   1 1 0.56 0.57  − − −  0.02381 − 0.00149  0.02232 6 7 6 7 

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 19–20

1546

The posterior function is then

q (1 − q ) 5 0.02232 The expected value of this is similarly easier to calculate with the substitution q 0  1 − q: π ( q | x) 

1 0.02232

Z

1  0.02232

Z

E[Q] 

0.5 0

1

0.5

q 2 (1 − q ) 5 dq

(1 − q 0 ) 2 q 05 dq 0 1

q 06 2q 07 q 08 − + 6 7 8 0.5 ! !  1 1 2 1 0.56 2 (0.57 ) 0.58  − + − − + 0.02232 6 7 8 6 7 8 0.005952 − 0.000860 0.005092    0.22814 0.02232 0.02232

1  0.02232

!

!

The mean number of claims is 3Q, and 3 (0.22814)  0.6844 . (D) 19. [Lesson 9] One way to compute the variance is to use conditional variance, conditioning on the range. The mean payment on claims higher than 20 is 20 and the variance is 0. E Var ( X | I )  0.5 (10) + 0.4 (10) + 0.1 (0)  9

f

g

E E[X | I]  0.5 (4) + 0.4 (12) + 0.1 (20)  8.8

f

g

E E[X | I]2  0.5 (42 ) + 0.4 (122 ) + 0.1 (202 )  105.6

f

g

Var E[X | I]  105.6 − 8.82  28.16





Var ( X )  9 + 28.16  37.16

(C)

An alternative is to calculate first and second moments. The mean is 8.8, as calculate above. To calculate the second moment, calculate the second moment of each interval using the mean and variance of each interval: E[ ( X ∧ 20) 2 | 0 < X < 10]  42 + 10  26

E[ ( X ∧ 20) 2 | 10 ≤ 10 < 20]  122 + 10  154 E[ ( X ∧ 20) 2 | X ≥ 20]  400

E[ ( X ∧ 20) 2 ]  0.5 (26) + 0.4 (154) + 0.1 (400)  114.6 Var ( X ∧ 20)

20.

 114.6 − 8.82  37.16

[Lesson 6] For a franchise deductible of 1000, the average payment per loss is E[X] − E[X ∧ 1000] + 1000 Pr ( X > 1000)

Therefore, the ratio of expected payments to expected losses is 1− C/4 Study Manual—17th edition Copyright ©2014 ASM

E[X ∧ 1000] − 1000 Pr ( X > 1000) E[X]

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 21–24

1547

The moments E[X] and E[X ∧ 1000] may be looked up since F ( x ) is a beta distribution with X is beta with θ  10,000, a  1/2, b  1, but we’ll calculate E[X] and E[X ∧ 1000] by integrating the survival function. 10,000

Z E[X] 

0

1−

r

x dx 10,000

!

2 10,0001.5  3333.33  10,000 − 3 10,0000.5

!

E[X ∧ 1000] 

1000

Z

1−

0

r

x dx 10,000

!

1000

 1000 −

x 3/2

0

150 10003/2  789.18  1000 − 150 r 1000 Pr ( X > 1000)  1 −  0.68377 10,000 789.18 − 1000 (0.68377) Answer  1 −  0.96838 3333.33 21.

(D)

[Lesson 11] a  0.6, b  3. Since a is positive, the distribution is negative binomial. 0.6 

β 1+β

3  ( r − 1)(0.6)

β



r6

7 (0.4) 6 (0.6) 2  0.030966 2

!

p2 

0.6  1.5 0.4



(B)

22. [Lesson 57] The sample means are µA  12 and µ B  12 − x, and the sample variances are both 2 2 vˆ  2 +2  4. Since each policyholder’s average experience is x/2 different from the mean, the variance of 2 hypothetical means is estimated as !2 x vˆ x2 4 aˆ  2 −  − 2 3 2 3 √ and this must be greater than 0 to have Z > 0. So x 2 > 8/3, x > 8/3  1.6330 . (B) 23.

[Lesson 50] The prior is an inverse gamma with parameters 4 and 1500, so 4 → 4 + 6  10

1500 → 1500 + 600 + 800 + 200 + 500 + 1200 + 600  5400 The mean of the posterior inverse gamma is 5400/ (10 − 1)  600 . (E) 24.

[Lesson 26] We have

1 n2

+

1

( n−1) 2

+

1

( n−2) 2

+

1

( n−3) 2

 0.0982  0.009604. Although squaring is not

a linear function, there’s little we can do other than approximate √ works. 4/0.009604  20.41, so we try n  22, and it works. C/4 Study Manual—17th edition Copyright ©2014 ASM

4

( n−1.5) 2

 0.009604 and see whether n

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 25–28

1548

or

For complete data (no censoring), the Greenwood formula reduces to the empirical variance formula,

  L Sˆ ( y4 )  (4)(18)  0.006762 Var 223

The square root of 0.006762 is 0.0822 . (C) 25.

[Section 4.1] The distribution function is 0.5x   0≤x≤5    10 ! ! F (x )    x−5 x−5   + 0.5  0.25 + 0.15 ( x − 5) 5 ≤ x ≤ 10  0.25 + 0.5 10 5 

Setting this equal to 0.75, we have 0.25 + 0.15 ( x − 5)  0.75 0.5 x 5+ (C)  8 13 0.15 26.

[Section 21.1] The distribution function of X is x

Z FX ( x ) 

0

2u du x2  θ2 θ2

Let Y be the maximum of a sample of n. The distribution function of Y is FY ( x ) n  x 2n /θ 2n . The density function of Y is 2nx 2n−1 fY ( x )  FY0 ( x )  θ 2n The expected value is f g Z θ 2nx 2n dx 2n E max ( x1 , . . . , x n )   θ 2n 2n +1 θ 0 The bias is

2n θ . (D) θ−θ − 2n + 1 2n + 1

[Lesson 45] Expected aggregate claims in Group A is 0.5 (2)(500)  500 and in Group B, 0.4 (4) +



27.

0.1 (8) 250  600. The probability of 1000 given A is 0.5 (2 claims); given B, 0.4 (4 claims). Weighting with the posterior probabilities, (0.5)(500) + (0.4)(600)  544 49 (A) 0.5 + 0.4



28. [Lesson 47] The parameters for the prior gamma are α and θ. Set up the two equations for the posterior (indicated with primes), with γ  1/θ, α0  α + claims, γ0  γ + periods. α+5 1 γ + 10 α + 30  1.2 γ + 20 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 29–32

1549

From the first equation, α  γ + 5. α + 30  1.2 α + 15 1.2 ( α + 15)  α + 30 1.2α + 18  α + 30 0.2α  12 α  60 γ  55 1 θ 55 60 αθ   1.09 55

29.

(D)

[Section 33.5] By Bernoulli maximum likelihood, e −λ  0.5, so λ  − ln 0.5  0.6931 . (B)

30. [Section 61.3] ln (1 + β )  ln 1.5  0.405465. For 50 insureds, r  50 (0.1)  5. The first 6 events used λ0 through λ 5 . We generate interevent times s k . λ6  5 (0.405465) + 6 (0.405465)  4.46011

s6  −

ln (1 − 0.613)  0.21285 4.46011

0.80 + 0.21285 > 1. It took 7 numbers to get above 1, so 6 is generated. (A) 31.

[Lesson 18] Let f k be the probability of a loss size equal to k. Then for the zero-truncated Poisson, 1  0.5820 e −1 12  0.2910 f2  2 ( e − 1)

f1 

We calculate the probability of aggregate losses of k  0, 1, 2, which we’ll call g k . g0  0.3 g1  0.4 f1  0.4 (0.5820)  0.2328 g2  0.1 f12 + 0.4 f2  0.1 (0.58202 ) + 0.4 (0.2910)  0.1503 The probability of 3 or more in aggregate losses is 1 − 0.3 − 0.2328 − 0.1503  0.3169. Expected retained losses are 0.2328 (1) + 0.1503 (2) + 0.3169 (3)  1.484 . (A) 32. [Section 35.4] Maximum likelihood sets p0M equal to the observed proportion, so p0M  0 and the fitted distribution is zero-truncated. For a Sibuya distribution, β → ∞, so β/ (1 + β )  1 and p1T 



(1 + β ) (1 + β ) r − 1 

  −r

since r < 0 and therefore (1 + β ) r  0. Also, a  1 and b  r − 1, so



p2T  (−r ) 1 + C/4 Study Manual—17th edition Copyright ©2014 ASM

r−1 (−r )( r + 1)  2 2



PRACTICE EXAM 10, SOLUTIONS TO QUESTIONS 33–35

1550

To calculate the likelihood, we can drop the multiplicative constant 1/2. Also, let u  −r so that u will be positive, and it is less confusing to work with a positive variable. Then the likelihood function is L ( u )  u 30 u (1 − u )



 10

 u 40 (1 − u ) 10

This familiar likelihood function form, x a (1 − x ) b , is maximized for x  a/ ( a + b ) . See the beginning of Section 33.5 for a derivation of this. So here, u  40/50  0.8 and r  −0.8 . (E) 33. [Lesson 24] It is a little confusing that payment amounts rather than loss amounts are given. Let X be the loss random variable. Considering payments from all three coverages, we have yi

ri

si

1500 3000 3500 5000

10 9 7 6

1 2 1 3

Sˆ X ( y i | X > 500) 0.9 0.7 0.6 0.3

Pr ( X  y i | X > 500) 0.1 0.2 0.1 0.3

The last column, Pr ( X  y i ) , is computed as differences of the survival functions. For example, Pr ( X  3000)  Pr ( X ≥ 3000) −Pr ( X > 3000)  SX (3000− | X > 500) −SX (3000 | X > 500)  0.9−0.7  0.2

Notice that the survival function can only be computed conditional on the loss being greater than 500, since there is no data for losses below 500. Let Y P be the average payment on the first coverage, which has a deductible of 500 and a maximum covered loss of 5000. The average payment is the sum of the probabilities times the amounts of the payments, taking into account that the payment is 4500 for any loss above 3500 (since such a loss is always 5000 or higher): E[Y P ] 

X x 500) + 4500S (3500 | X > 500)

 0.1 (1000) + 0.2 (2500) + 0.1 (3000) + 0.6 (4500)  3600

(D)

Alternatively, you can integrate the survival function from 500 to 5000: E[Y P ]  

Z

5000

500 Z 1500 500

SX ( x | X > 500) dx 1 dx +

Z

3000 1500

0.9 dx +

Z

3500 3000

0.7 dx +

Z

5000 3500

0.6 dx

 1000 + 0.9 (1500) + 0.7 (500) + 0.6 (1500)  3600 34.

[Subsection 33.1.1] Exposure is 3 (750 − 200) + 3 (200) + 4 (300) + 6 (10,000) + 4 (400 − 300)  63,850

There are 14 uncensored claims. Therefore, θ  63,850/14  4560.7 . (D) 35.

[Lesson 37] Taking the deductible into account, F∗ (x ) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

F ( x ) − F (5) e −10/x − e −2  1 − F (5) 1 − e −2

PRACTICE EXAM 10, SOLUTION TO QUESTION 35

1551

We compute this for the 5 points. x

F∗ (x )

Fn ( x − )

Fn ( x )

Max dif

10 12 15 18 30

0.2689 0.3461 0.4373 0.5070 0.6896

0 0.1667 0.3333 0.6667 0.8333

0.1667 0.3333 0.6667 0.8333 1

0.2689 0.1794 0.2294 0.3263 0.3104

The largest difference is 0.3263 . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 1–3

1552

Answer Key for Practice Exam 11 1 2 3 4 5 6 7 8 9 10

B A E A E B A C B C

11 12 13 14 15 16 17 18 19 20

E D C B A E C A A B

21 22 23 24 25 26 27 28 29 30

A D B B D D C B C A

31 32 33 34 35

B E A A E

Practice Exam 11 1. [Lesson 5] The expected payment per loss is the weighted average of E[X ∧ 2000] for both components. 1000 1000 * 1000 E[X ∧ 2000]  0.75 (1000) 1 − + 0.25 1− 3000 −0.5 3000





!

! −0.5

,

 0.75 (666.67) + 0.25 (1464.10)  866.03

+ -

(B)

Since every loss results in a payment, 866.03 is also the expected payment per payment. 2.

[Lesson 58] 35 + 20 (2) + 8 (3) + 4 (4) 115   0.839416 137 137 X x 2i  35 (1) + 20 (4) + 8 (9) + 4 (16)  251 µˆ  vˆ 

137 251 − 0.8394162  1.135788 136 137  1.135788 − 0.839416  0.296372 0.296372   0.260939 1.135788  3 (0.260939) + 0.839416 (1 − 0.260939)  1.4032

s2  aˆ Zˆ PC

3.







(A)

[Lesson 39] The Pareto probabilities of each interval are p i  F ( c i ) − F ( c i−1 ) , with F ( c i )  1 −

12,000

.

3 (12,000 + c i ) .

C/4 Study Manual—17th edition Copyright ©2014 ASM

ci

F (ci )

1000 2000 5000 10000 ∞

0.21347 0.37026 0.64828 0.83772 1

p i  F ( c i ) − F ( c i − 1) 0.21347 0.15679 0.27802 0.18944 0.16228

E i  np i 17.078 12.543 22.241 15.155 12.983

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 4–5

Q

1553

202 102 302 102 102 + + + + − 80  6.161 17.078 12.543 22.241 15.155 12.983

(E)

4. [Lesson 45] A payment of 2500 for the first coverage is equivalent to a loss of 3000. The density of a loss of 3000 is 2 (20002 )  6.4 × 10−5 f (3000)  50003 A payment of 2500 for the second coverage is equivalent to a loss of 3500. The likelihood of 3500, or the density of a loss of 3500 is f (3500) 

2 (20002 )  4.80841 × 10−5 55003

The question is asking for the average size of the next payment, so it is asking for the average payment per payment, or the mean excess loss. For a Pareto, the mean excess loss at d, by equation 6.10, is e ( d )  ( θ + d ) / ( α − 1) . Thus, for a deductible of 500, the mean excess loss is (2000 + 500) /1  2500, and for a deductible of 1000 the mean excess loss is (2000 + 1000) /1  3000. We weight these by the product of the prior (2/3 vs. 1/3) and the likelihoods. Notice that the prior distribution is a distribution for number of insureds, not for number of payments: 2/3 of the insureds, not 2/3 of the payments, have a deductible of 500. Since frequency of losses does not vary by deductible, 2/3 of losses have a deductible of 500. Therefore, the likelihoods are the likelihoods of losses, not the likelihoods of payments. The expected claim given a payment of 2500 is then

(2/3)(6.4 × 10−5 )(2500) + (1/3)(4.80841 × 10−5 )(3000)  2636.54 (2/3)(6.4 × 10−5 ) + (1/3)(4.80841 × 10−5 ) 5.

[Sections 8.3 and 33.4.2] Using the formulas and notation in Subsection 33.4.2, K  10 ln 2000 − (2 ln 2500 + 5 ln 3000 + 2 ln 7000 + ln 12,000)  −6.7709 9  1.3292 αˆ  − −6.7709

The 90th percentile is x such that

! 1.3292

2000  0.1 2000 + x 2000  0.11/1.3292  0.17688 2000 + x 2000 x − 2000  9307.1 0.17688 so VaR0.90 ( X )  9307.1. Then TVaR0.90 ( X )  9307.1 +

C/4 Study Manual—17th edition Copyright ©2014 ASM

9307.1 + 2000  43,654 1.3292 − 1

(E)

(A)

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 6–8

1554

6.

[Lesson 30] x 2i

1222  152.75 8 8 P 3 xi 24,012   3001.5 8 8 2 e 2µ+2σ  152.75

P



2

e 3µ+4.5σ  3001.5 2µ + 2σ 2  ln 152.75  5.02880 3µ + 4.5σ 2  ln 3001.5  8.00687 Multiply the first equation by 3, the second by 2, and subtract the first from the second, and divide by 3. 2 (8.00687) − 3 (5.02880)  0.30911 3 √ σˆ  0.30911  0.55598 (B)

σ2 

7.

[Lesson 24] Let x be the risk set after censoring at time 7. 197 0.953564  200

!

x−6 x

!

x−6 200  0.953564  0.968085 x 197 6 x  188 1 − 0.968085

!

Therefore 9 lives left. (A) 8.

[Lesson 6] From the tabular values with d  20. E[X]  E[X ∧ 20] + Pr ( X > 20) e (20)  19 + 0.9 (200)  199

We also have E[X]  E[X ∧ 50] + Pr ( X > 50) e (50) 199  E[X ∧ 50] + 195 Pr ( X > 50)

Let’s get a lower bound for Pr ( X > 50) by using the values of E[X ∧ 50]. E[X ∧ 20]  20 Pr ( X > 20) + 19  20 (0.9) + 20

Z 0

C/4 Study Manual—17th edition Copyright ©2014 ASM

x f ( x ) dx  1

20

Z 0

20

Z 0

x f ( x ) dx

x f ( x ) dx

(*)

PRACTICE EXAM 11, SOLUTION TO QUESTION 9

1555

E[X ∧ 50]  50 Pr ( X > 50) +

Z

 50 Pr ( X > 50) +

Z

50 0

20

0

x f ( x ) dx x f ( x ) dx +

Z

50 20

x f ( x ) dx

≥ 50 Pr ( X > 50) + 1 + 20 Pr (20 < X ≤ 50)

 50 Pr ( X > 50) + 1 + 20 1 − 0.1 − Pr ( X > 50)





 30 Pr ( X > 50) + 19

(**)

Plugging this into (*), we get 199 ≤ 30 Pr ( X > 50) + 19 + 195 Pr ( X > 50)  225 Pr ( X > 50) + 19

225 Pr ( X > 50) ≥ 180 Pr ( X > 50) ≥ 0.8

Then plugging into (**), E[X ∧ 50] ≥ 50 (0.8) + 19  43

(C)

To prove that this lower bound is attained, here is a discrete random variable satisfying the question’s assumptions with E[X ∧ 50]  43: x

Pr ( X  x )

10 20 245

0.1 0.1 0.8

2 1 9. [Section 63.1] For the geometric distribution, the probabilities are p0  1.5  23 , p 1  p 0 0.5 1.5  9 , 2 2 8 26 0.5 p2  p1 1.5  27 . Thus the cumulative distribution is F (0)  3 , F (1)  9 , and F (2)  27 , and there are 0 claims in the first two years and 2 in the third year. Inverting the paralogistic,



1 1 + ( x/θ ) α √ 1 α 1−u  1 + ( x/θ ) α !α 1 x 1+  √ α θ 1−u u 1−

x  θ

s α

√ α

s xθ

α

1

1−u

√ α

 1000

C/4 Study Manual—17th edition Copyright ©2014 ASM

−1

1

1−u

s





−1

1

1−u

−1

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 10–12

1556

0.41 → 549.44

0.60 → 762.32

After the per-claim deductible of 100, the average over three years is (449.44 + 662.32) /3  370.59 . (B) 10. [Section 33.5] Since there are only three intervals, or two probabilities, to fit, and two parameters, the maximum likelihood estimate matches the fitted and actual probabilities. To set up two equations in two unknowns, it is convenient to use Pr ( X > 1000)  0.5 and Pr ( X > 2000)  0.2. Then e − (1000/θ )  0.5

e − (2000/θ )  0.2

τ

1000 θ 1 2

11.



τ

2000 θ

 − ln 0.5



 − ln 0.2



ln 0.5  0.430677 ln 0.2 ln 0.430677 τˆ   1.21532 ln 0.5 

(C)

[Lesson 31] For the product-limit estimator, we have yi

ri

si

Sˆ ( y i )

7 8 10 75

9 7 6 2

2 1 2 2

7/9 6/9 4/9 0

Integrating the survival function, we have ∞

Z E[X] 

0

Sˆ ( x ) dx 

 7 (1) + (8 − 7)

7

Z 0

Sˆ ( x ) dx +

8

Z 7

Sˆ ( x ) dx +

10

Z 8

Sˆ ( x ) dx +

7 6 4 + (10 − 8) + (75 − 10)  38 9 9 9

!

!

Z

75 10

Sˆ ( x ) dx

!

For the single-parameter Pareto, the median of the data is 10; it is clear, despite the censored data, that 10 is the fifth highest loss of the nine. 6 10



 0.5

α ln 0.6  ln 0.5 α  1.3569 αθ 1.3569 (6) E[X]    22.811 α−1 0.3569 The difference is 38 − 22.811  15.189 . (E) 12. [Lesson 27] We will use double expectation. For the first four points {1,2,5,6}, the expected value of the kernel smoothed X ∧10 is the point. For the point 10, the left half of the triangle’s partial expectation C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 13–14

1557

is obtained by integrating x over the left half triangle’s line: 10

10

1 x3 x−6 dx  − 3x 2 16 16 3 6 6   1 1000 216  − 300 − + 108  4 13 16 3 3 In the right half of the triangle, X ∧10  10, and the probability of the right half is 0.5. The total is therefore Z

!

!

x

4 13 + 5  9 31 . So E[X ∧ 10] 

1 5



1 + 2 + 5 + 6 + 9 31  4 23 . (D)



13. [Lesson 14] Since a zero-truncated geometric is a non-truncated geometric shifted by 1, the mean is the mean of a non-truncated geometric plus the shift, or 1 + β, while the variance is the same as a non-truncated geometric, β (1 + β ) . That means β  1. E[N]  2 Var ( N )  1 (1 + 1)  2 E[X]  500Γ (3)  1000 Var ( X )  5002 Γ (5) − E[X]2  6,000,000 − 1,000,000  5,000,000 E[S]  2 (1000)  2000

Var ( S )  2 (5,000,000) + 2 (10002 )  12,000,000 12,000,000  3 (C) CV2  20002 14.

[Lesson 11] Solve for a and b. b  1.2 3 b a +  0.8 8 5 b  0.4 24 a+

24  1.92 b  0.4 5 1.92 a  1.2 −  0.56 3

!

Determine when p n /p n−1 > 1.

b 1 n 1.92 0.56 + 1 n 1.92  0.44 n 1.92 n  4.36 0.44 p5 < p4 a+

p4 > p3 The mode is 4 . (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 15–18

1558

15.

[Lesson 64] The bootstrap calculation depends on the number of 100’s:

Adding it all up,

Number of 100’s

Probability

0 1 2 3

8/27 12/27 6/27 1/27

Empirical E[X ∧ 400] 400 300 200 100

Square Error 1002 0 1002 2002

8 (10,000) + 6 (10,000) + 40,000  6666.67 27

(A)

16. [Lesson 9] The tables don’t help since the formula for E[X ∧ x] only works for α , k. So we’ll do it from first principles. The first moment is E[X ∧ 10,000] 

10,000

Z 0

 1000 +

S ( x ) dx

Z

10,000 1000

1000 dx x

 1000 + 1000 (ln 10,000 − ln 1000)  3302.585 E ( X ∧ 10,000) 2 

f

g



Z

10,000

Z

1000 10,000 1000

x 2 f ( x ) dx + 10,0002 S (10,000) 1000 1000 dx + 10,000 10,000 2

!

 9 · 106 + 107  19,000,000

Var ( X ∧ 10,000)  19,000,000 − 3302.5852  8,092,932

p

17.

8,092,932  2844.8

(E)

[Lesson 46] The likelihood of 20, plugging into the density function of the inverse gamma, is

( θ/20) 3 e −θ/20  cθ 3 e −θ/20 20Γ (3) where c is not dependent on θ. Multiplying by the prior density, the posterior density is of the form πΘ|x ( θ | X  20)  cθ 3 e −θ/20 e −θ/10  cθ 3 e −3θ/20 where c is a different constant, but still not dependent on θ. We recognize this as a gamma distribution with parameters 4 and 20/3, so the posterior mean for Θ is 4 (20/3)  80/3. Expected claim size for the Θ inverse gamma given Θ is α−1  Θ2 , so expected claim size is 40/3 . (C) Let N1 be the number of losses in (0, 1000) and N2 the number of losses in N1 3N2 (1000, 5000) . Then the estimated probability is + . We use the multinomial distribution with 200 800 18.

[Subsection 23.2]

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 19–20

1559

n  100 and probabilities p1  0.45 (for the first interval), p2  0.35 (for the second interval), and p3  0.20 (for the rest). The variance of a multinomial is np i (1 − p i ) and the covariance is −np i p j , so we have

L ( N1 )  100 (0.45)(0.55)  24.75 Var L ( N2 )  100 (0.35)(0.65)  22.75 Var M ( N1 , N2 )  −100 (0.35)(0.45)  −15.75 Cov

So

  L Pr D (500 ≤ X ≤ 2500)  24.75 + Var 2 200

!2

3 2 (22.75) − 800 200

 0.000348046875

!

3 (15.75) 800

!

(A)

2

19. [Lesson 53] The hypothetical mean is µ (Θ)  e Θ+0.5σ  e Θ+2 . Since Θ is normally distributed, e Θ is lognormally distributed. The mean and variance of the hypothetical mean are µ  E[e Θ+2 ]  e 2 E[e Θ ]  e 2 e 5+0.5 (3)  e 8.5 a  Var ( e θ+2 )  e 4 Var ( e Θ )  e 4 E[e 2Θ ] − E[e Θ ]2





 e 4 e 2 (5) +2 (3) − e 2[5+0.5 (3) ]  e 20 − e 17



The process variance is

2



2

v (Θ)  e 2Θ+2σ − ( e Θ+0.5σ ) 2  e 2Θ ( e 8 − e 4 )

The expected process variance is

v  ( e 8 − e 4 ) E[e 2Θ ]  ( e 8 − e 4 ) e 2 (5) +2 (3)  e 24 − e 20

For 20 observations, the credibility factor is Z The credibility premium is

20 ( e 20 − e 17 ) 20a   0.261758 20a + v 20 ( e 20 − e 17 ) + e 24 − e 20

PC  0.261758 (10,000) + (1 − 0.261758)( e 8.5 )  6246

20.

(A)

[Lesson 45] The likelihood of the 2 claims given λ is e

−λ

λ2 2

λ e −λ λ 4 35 *. λ (5/6) λ +/ *. λ (7/6) λ +/ 36   . λ2 / . λ2 / ∼  λ2  λ2 500 1 + 56 700 1 + 76 1 + 56 1 + 67 , -, -

We evaluate this for λ  0.5 and λ  0.6 and get 0.00236084 and 0.00442264 respectively. Then the expected number of claims next year is 0.00236084 (0.75)(0.5) + 0.00442264 (0.25)(0.6)  0.5384 0.00236084 (0.75) + 0.00442264 (0.25) C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 21–23

1560

21. [Section 34.2] We invert the information matrix to obtain the covariance matrix. The determinant of the matrix is (100)(9) − (1)(1)  899, so the inverse matrix is 1 9 899 −1

!

−1 . 100

The mean of a Pareto is g ( α, θ )  θ/ ( α − 1) . We use equation (34.5) to calculate the variance. In the 2-variable case, the equation reduces to ∂g Var g ( α, θ )  Var ( α ) ∂α





!2

∂g + Var ( θ ) ∂θ

!2

∂g + 2 Cov ( α, θ ) ∂α

!

∂g ∂θ

!

so we have ∂g θ 600 −  − 2  −150 2 ∂α ( α − 1) 2 ∂g 1 1   ∂θ α − 1 2

1 * 1 .9 (−150) 2 + 100 Var (estimated mean)  899 2

!2

1 + / + 2 (−1)(−150) 2

,

!

-

 225.445 √ The width of a 95% confidence interval is then 2 (1.96) 225.445  58.86 (A)

22. [Lesson 25] The ogive puts 300 losses in the range 2000–3000. Expected payments for all losses are the following divided by 1500: 300 (2500) + 340 (4000) + 80 (6000)  2,590,000 Expected number of payments per loss is the following divided by 1500: 300 + 340 + 80  720 Dividing, the answer is 2,590,000/720  3597 29 . (D) 23.

[Lesson 42] The mean of claim size X is αθ  500 and the variance is E Var ( X ) + Var E[X]  E[µ2 ] + Var ( µ )  (5)(6)(1002 ) + 5 (1002 )  350,000

f

g





Therefore, using z p to be the standard normal distribution’s ( p + 1) /2 quantile, 2000   zp

!2

0.05 zp 0.05 C/4 Study Manual—17th edition Copyright ©2014 ASM

zp

!2 

0.05 zp 0.05

!2

350,000 1+ 5002

(2.4)

2000  833.33 2.4 √  833.33  28.87 



PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 24–27

1561

z p  (28.87)(0.05)  1.44 p+1  Φ (1.44)  0.9251 2 p  0.8502 (B)

24. [Lesson 40] The 2-parameter fit would be preferred to the 1-parameter fit if twice the difference between the negative loglikelihoods is greater than chi-square with one degree of freedom at 95%, or 3.84. So it would be preferred if the negative loglikelihood is less than 320.4 − 1.92  318.48. If not, the 3-parameter model would be selected, since 2 (320.4 − 316.2)  8.4 > 5.99, the 95th percentile of chi-square with 2 degrees of freedom. If the 2-parameter model’s negative loglikelihood is less than 318.48, but greater than 316.2 + 1.92  318.12, then the 3-parameter model is selected over the 2-parameter one. So the 3 parameter model is accepted if x > 318.48 and also if 318.12 < x ≤ 318.48. x > 318.12 . (B) 25.

[Section 21.1] The mean square error is the variance plus the bias squared. The variance of x 3 is

2

E[X 6 ] − E[X 3 ] .



E[X 3 ]  6θ 3 biasx 3 ( θ 3 )  E[X 3 ] − θ 3  5θ 3 E[X 6 ]  720θ 6

Var ( X 3 )  720θ 6 − (6θ 3 ) 2  684θ 6

MSEx 3 ( θ 3 )  684θ 6 + (5θ 3 ) 2  709θ6

26.

(D)

[Lesson 26] We tabulate the results: yi

ri

si

Hˆ ( y i )

L Hˆ ( y i ) Var

300 400 500 600

50 49 48 147

1 1 1 3

0.02 0.040408 0.061241 0.081650

0.0004 0.000816 0.001251 0.001389

U H U HU





√ ! 1.96 0.001389  exp  2.4468 0.081650 0.081650   0.033371 2.4468  (0.081650)(2.4468)  0.199777

We exponentiate and complement Hˆ ( x ) to obtain Fˆ ( x ) . The answer is 1 − e −0.033371 , 1 − e −0.199777 



(0.0328, 0.1811) . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM



PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 28–29

1562

27. [Section 35.4] Let N be the truncated geometric random variable. Maximum likelihood sets p0M equal to the observed proportion, which is 0.74. We can then maximize the conditional distribution given N > 0. Let p 1T  1 − p. Then p Tk  (1 − p ) p k−1 . Since a truncated geometric’s probabilities are in geometric P k−1  p 3 . The likelihood of the conditional distribution is progression, Pr ( N ≥ 4)  ∞ k4 (1 − p ) p L ( p )  (1 − p ) 16 (1 − p ) p



6 

3 (1 − p ) p 2 p 3  (1 − p ) 25 p 15

This type of likelihood function is maximized for p  15/ (25 + 15)  3/8, as discussed in Section 33.5. Now, p  β/ (1 + β ) , so β  p/ (1 − p )  3/5. The mean of a truncated geometric is 1 + β  1.6. Then the mean of the zero-modified geometric is 0.26 (1.6)  0.416 . (C) 28.

[Section 61.2] For death, we simulate a binomial with m  75, q  0.003. p 0  0.99775  0.7982 p1  75 (0.99774 )(0.003)  0.1801

Since 0.7982 + 0.1801  0.9783 > 0.80, there is 1 death. For disability, we simulate a binomial with m  75 − 1  74, q  0.07/0.997  0.070211 p 0  (1 − 0.070211) 74  0.004576 p 1  74 (1 − q ) 73 q  0.025569

74 (1 − q ) 72 q 2  0.070473 2

!

p2 

74 p3  (1 − q ) 71 q 3  0.127717 3

!

74 p4  (1 − q ) 71 q 4  0.171185 4

!

The sum of the first four probabilities is 0.2283 and the sum of the first five probabilities is 0.3995, so there are 4 disabilities. That leaves 70 lives subject to termination. The parameters for termination are m  70, q  0.25/0.927  0.269687 . (B) 29. [Lesson 40] The maximum likelihood estimate for µ is pP √ hood estimate for σ is (ln x i − µˆ ) 2 /5  6.544/5  1.144. The density of the lognormal distribution is f ( x; µ, σ ) 

1 √

σx 2π

P

e − (ln x−µ)

(ln x i ) /5  2.514. The maximum likeli-

2 /2σ 2

The loglikelihood function for five observations is the sum of five logarithms of the density function, or l ( µ, θ )  − 52 ln 2π −

X

ln x i − 5 ln σ −

P

(ln x i − µ ) 2 2σ2

We want the difference of two loglikelihood functions for two different values of µ and σ, so we can ignore P the first and second summands 52 ln 2π and ln x i which will be the same regardless of the estimate for µ and σ. At the unconstrained maximum likelihood estimate µ  2.514, the sum of the third and fourth summands is P (ln x i − µ ) 2 6.544 −5 ln σˆ −  −5 ln 1.144 −  −3.173 2σ2 2 (1.1442 ) C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 30–33

1563

If µ is constrained to equal 2, the maximum likelihood estimate for σ is of the third and fourth summands is −5 ln σˆ −

P

(ln x i − 2) 2 2σ2

 −5 ln 1.254 −

pP

(ln x i − 2) 2 /5  1.254. The sum

7.866  −3.633 2 (1.2542 )

The likelihood ratio statistic is 2 (−3.173 + 3.633)  0.920 . (C) 30.

[Lesson 57] vˆ  aˆ   Zˆ  

31.

12,000  6000 1+1+0+0 1002 + 752 + 202 + 52 200 − 200 32,000  267.22 119.75 100aˆ 100 aˆ + vˆ 26722  0.8166 (A) 32722

! −1 

50,000 − 3 (6000)



[Section 1.4] p 1 is the first derivative of P ( z ) evaluated at 0. P0 ( z )  P 0 (0) 

0.5 + z

(1.1 − 0.5z − 0.5z 2 ) 2

0.5  0.41322 1.12

(B)

32. [Lesson 49] Since there are only two possibilities (either 0 claims are submitted or not), the model is Bernoulli. The prior is beta with a  3, b  1, which in the posterior go to a 0  3 + 2  5, b 0  1 + 0  1, and the posterior expected value of θ is then a/ ( a + b )  5/6  0.8333 , which is the posterior probability of no claims. (E) 33.

[Lesson 24 and Subsection 33.4.2] The Nelson-Åalen estimate is derived as follows: 2 1 Hˆ (30 | X > 20)  +  0.733333 5 3 ˆ S (30 | X > 20)  e −0.733333  0.480305

Pr ( X ≤ 30 | X > 20)  1 − 0.480305  0.519695

The MLE formula for a single-parameter Pareto is α  −n/K, where in this case 205  ln 0.097524  −2.32766 252 · 30 · 35 · 50 5 αˆ   2.14808 2.32766

K  ln

Therefore, Pr ( X ≤ 30 | X > 20)  1 −

(10/30) α 2 1− (10/20) α 3

! 2.14808

The absolute difference is 0.581455 − 0.519695  0.06176 . (A) C/4 Study Manual—17th edition Copyright ©2014 ASM

 0.581455

PRACTICE EXAM 11, SOLUTIONS TO QUESTIONS 34–35

1564

34. [Section 34.2] The variance of rˆ is 0.52  0.25; the variance of βˆ is 0.022  0.0004; and the covariance if −0.2 (0.5)(0.02)  −0.002. The mean is g ( r, β )  rβ, so the partial derivatives are g r  β, g β  r. The delta method calculates the variance as g 2r Var ( rˆ ) + g 2β Var ( βˆ ) + 2g r g β Cov ( rˆ , βˆ )  0.42 (0.25) + 32 (0.0004) − 2 (0.002)(3)(0.4)  0.0388

√ The 95% confidence interval is (3)(0.4) ± 1.96 0.0388  (0.814, 1.586) . (A) 35.

[Lesson 62] By the limited fluctuation credibility general formula, we need zp α

!2

1.96 CV  0.005 2

!2

6000  92,199 1002

!

runs, where we’ve rounded up to the next higher integer. (E)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 1–3

1565

Answer Key for Practice Exam 12 1 2 3 4 5 6 7 8 9 10

C B C E C A D B E C

11 12 13 14 15 16 17 18 19 20

A D E D A B B D A A

21 22 23 24 25 26 27 28 29 30

C D D A E A A C D B

31 32 33 34 35

A C C D B

Practice Exam 12 1. [Lesson 7] E[X]  E[X ∧ ∞]  2500, and after inflation this doubles to 5000. After inflation, X 800 becomes 2X and E[2X ∧ 2000]  2 E[X ∧ 1000]  2 (400)  800. So the revised LER is 5000  0.16 . (C) 2.

[Lesson 47] The expected number of claims, by equation (47.1) is 3 + ni1 x i  0.2 10 + n

P

where x i is the number of claims in year i and n is the number of years. Here, x i  0 for all i, so we have 3  0.2 10 + n n 5

(B)

3. [Lesson 33] See the exponential shortcut in Example 33A on page 601, and the paragraph before the example. The exposure for the payments x i below 9500 is x i , because the deductible is already taken into account in the payment size, so the sum of these exposures is 10 (2000)  20,000. The exposure for each of the six payments of 9500 is 9500 for the same reason. So the total exposure is 10 (2000) + 6 (9500)  77,000. Dividing by the number of uncensored claims, 10, the answer is 77,000 10  7700 . (C) You could do the problem from first principles too. The likelihood function is

x i (after the deductible) below 9500, and L (θ) 

1−F (10,000) 1−F (500)

1  − e θ 10

10 i1 ( x i +500) −6 (10,000)

e − (500/θ ) 16

1  10 e −77,000/θ θ 77,000 l ( θ )  −10 ln θ − θ

!

C/4 Study Manual—17th edition Copyright ©2014 ASM

for the payments

for the truncated payments of 9500. Then 

1 [−10 (2000) −10 (500) −6 (10,000) +16 (500) ]/θ e θ 10

!



P

f ( x i +500) 1−F (500)

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 4–5

1566

10 77,000 dl − + 0 dθ θ θ2 (C) θ  7700 4.

[Section 4.2] For rural drivers, solving for the probability p of pleasure use: 200p + 300 (1 − p )  260 300 − 100p  260 p  0.4

The second moment for rural drivers is

(0.4)(2002 + 30,000) + (0.6)(3002 + 35,000)  103,000 so the variance for rural drivers is 103,000 − 2602  35,400. For urban drivers, the probability q of pleasure use can be backed out using second moments, but I think conditional variance sets up the quadratic a little more easily. Letting I be the type of use and X the random variable for urban claim sizes, Var ( X )  E Var ( X | I ) + Var E[X | I]









27,600  30,000q + 25,000 (1 − q ) + q (1 − q )(400 − 300) 2





Collecting terms and dividing through by 100, 276  50q + 250 + 100q − 100q 2 100q 2 − 150q + 26  0

150 ± 1502 − 10,400  0.2, 1.3 q 200

p

and 1.3 is rejected as being more than 1. So the mean urban claim is 0.2 (300) + 0.8 (400)  380. To calculate overall variance, we can either use second moments or conditional variance. Using second moments, the overall mean is 0.75 (380) + 0.25 (260)  350 and the overall second moment is 0.75 (3802 + 27,600) + 0.25 (2602 + 35,400)  129,000 + 25,750  154,750 Then the variance is 154,750 − 3502  32,250 . (E) 5.

[Section 57.1] Use formulas (57.1), (57.2), and (57.3). 5  0.3125 16 1   0.25 4 3   0.75 4 0   1 1 1 3 5  + + +0  4 4 4 4 16  0.75 1  3 (0 − 0.25) 2 + (1 − 0.25) 2   0.25 3 3

µˆ  x¯ 1  x¯ 2 x¯ 3 x¯ 4 x¯ v1  v2  v3 C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 12, SOLUTION TO QUESTION 6

1567

v3 is the same as v1 and v2 , since the variance of three 1’s and one 0 is the same as the variance three 0’s and one 1; the latter random variable is 1 minus the former. v4  0 1 vˆ  [3 (0.25) ]  0.1875 4

 aˆ 

1 4



 5 2 16

+



1 4



 5 2 16

+



3 4



 5 2 16

3 0.296875  − 0.046875  0.05208 3 4aˆ Zˆ  4aˆ + vˆ 4 (0.05208)   0.5263 4 (0.05208) + 0.1875

+ 0−



 5 2 16



0.1875 4

PC  0.5263 (0.25) + (1 − 0.5263)(0.3125)  0.2796

(C)

6. [Section 54.1] The hypothetical mean is µ ( Q, Θ)  (3Q )(10Θ)  30QΘ. The overall mean is E[30QΘ]  (30)(0.5)(35)  525. The variance of the hypothetical means, using the fact that the variance of a uniform is the range squared over 12, is E[ (30QΘ) 2 ]  302 E[Q 2 ] E[Θ2 ]    0.01 100  900 + 0.52 + 352  278,425 12 12 a  278,425 − 5252  2800 The process variance, by the compound variance formula (14.2) on page 236, is v ( Q )  3Q (10Θ2 ) + 3Q (1 − Q )(10Θ) 2  330QΘ2 − 300Q 2 Θ2

The expected value of the process variance is



v  330 (0.5) 352 +

100 0.01 − 300 0.52 + 12 12







352 +

100  110,691.67 12



The exposure unit for aggregate claims is a member-year. The total of exposure units is 95 + 100 + 110  305 (2800) na 305. The credibility factor is Z  na+v  305 (2800 ) +110,691.67  0.88526. The observed aggregate losses is x¯ 

40,000 + 50,000 + 50,000  459.0164 95 + 100 + 110

The Bühlmann-Straub prediction of next year’s aggregate claims is 120 0.88526 (459.0164) + (1 − 0.88526)(525)  55,991



C/4 Study Manual—17th edition Copyright ©2014 ASM



(A)

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 7–8

1568

7.

[Lesson 28]

0( d ) We will use standard Life Contingencies notation for the death rates: q 60 for the

0( d ) first year death rate and q 61 for the second year death rate. The population is 100 for the first year and 100 − 1 − 20  79 for the second year, so 0( d ) qˆ60  0.01 0( d ) qˆ61  3/79

0( s ) 0( s )  0.1, the survival probability with no surrender for For a population with surrender rates q 60  q 61 two years is the probability of survival from death, or (1 − 0.01)(1 − 3/79)  0.9524, times the probability of not surrendering, or (1 − 0.1) 2  0.81. The answer is therefore (0.9524)(0.81)  0.7714 . (D)

8. [Section 25.1.2] We need the conditional variance, conditional on survival time being less than 5. The conditional distribution less than 2 and 2/3 of being greater. The  1  1 has probability 1/3 of  being 1   6 between 0 and 2, and 23 5−2  29 between 2 and 5. histogram is then 13 2−0 The conditional expected value can be calculated by multiplying the midpoint of each interval by the probability of being in the interval and adding up. If we let Y be the random variable for survival time conditional on survival time being less than 5, we have E[Y]  13 (1) +

2 7 3 2



8 3

Calculating the second moment requires integrating y 2 with the histogram as the weight:

f

E Y

2

g

2

Z 

0

1 2 x dx + 6

2

2 2 x dx 9

2 1* 1 . (8 − 0) + (125 − 8) +/ 3 6 9

!



5

Z

!

-

,

82  9

2

8 The variance is 82  2 . (B) 9 − 3 An alternative method for solving the problem is to use the conditional variance formula. Let I be the indicator variable indicating whether death is before 2 or after 2 (but before 5). Let X be survival time conditional on surviving 5 or less. Then

Var ( X )  E Var ( X | I ) + Var E[X | I]

f

g





Given I, X is uniformly distributed on the interval [0, 2] or [2, 5]. A uniform variable’s mean is the midpoint 1 of the interval, and its variance is 12 of the interval’s length squared. So Interval [0, 2]

E[X | I]

Var ( X | I )

3.5



1

[2, 5]

22 12 32 12



1 3 3 4

The probability that the interval is [0, 2] is 1/3 and the probability that the interval is [2, 5] is 2/3 (as discussed above). Thus E Var ( X | I ) 

f

C/4 Study Manual—17th edition Copyright ©2014 ASM

g

1 1 2 3 11 +  3 3 3 4 18

!

!

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 9–10 1 Var E[X | I]  3





!

1569

2 25 (3.5 − 1) 2  3 18

!

Note that E[X | I] is a Bernoulli variable since it can only have two values, 1 and 3.5, so the variance was computed using the Bernoulli shortcut as the product of the two probabilities and the square of the difference between the two possible values. Finally, Var ( X ) 

9.

11 25 +  2 18 18

(B)

[Lessons 17 and 18] First, the expectations: E[N]  (0.3)(10)  3 E[X]  0.4 (10) + 0.3 (20) + 0.3 (35)  20.5 E[S]  (3)(20.5)  61.5

Now we calculate the aggregate probabilities of 0, 10, 20. g 0  0.710  0.028248

! * 10 (0.79 )(0.3) +/ (0.4)  10 (0.79 )(0.3)(0.4)  0.048424 1 , ! ! 10 10 * * 8 2 + 2 . (0.7 )(0.3 ) / (0.4 ) + . (0.79 )(0.3) +/ (0.3) 2 1 , ,  45 (0.78 )(0.32 )(0.42 ) + 10 (0.79 )(0.3)(0.3)  0.073674

g10  .

g20

SS (0)  1 − 0.028248  0.971752

SS (10)  0.971752 − 0.048424  0.923328

SS (20)  0.951576 − 0.073674  0.849654

E[S ∧ 30]  10 (0.971752 + 0.923328 + 0.849654)  27.44734 Alternatively, you can compute E[S ∧ 30] as E[S ∧ 30]  10g10 + 20g20 + 30 (1 − g0 − g10 − g20 )

 10 (0.048424) + 20 (0.073674) + 30 (1 − 0.028248 − 0.048424 − 0.073674)  27.44734

The answer is then

E[ ( S − 30)+ ]  61.5 − 27.44734  34.0526

(E)

10. [Lesson 27] The base of the kernel is 8, so the height of the triangle is 41 . Figure A.2 shows the density kernels at 74 for the 3 observation points 72, 74, and 75. The kernels are 1/8 at 72, 1/4 at 74, and 3/16 at 75. Therefore: 3 + 6 1 *1 1 / fˆ(74)  . + + (2)  0.15 5 8 4 16 40

!

,

C/4 Study Manual—17th edition Copyright ©2014 ASM

-

(C)

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 11–12

1570

(74,1/4)

1/4

1/4

3/16

3/16

3/16

1/8

1/8

1/16

1/16

(74,1/8)

1/8 1/16 68

72 (a) k 72 (74)

74

70

76

1/4

74 (b) k 74 (74)

78

(74,3/16)

71

74 75 (c) k 75 (74)

79

Figure A.2: Density kernels for question 10

11. [Lesson 63] For the Poisson distribution, using ( a, b, 0) properties, p0  e −2  0.1353, p1  p2  2 2p 0  0.2707, p3  3 (0.2707)  0.1804. These sum up to more than 0.8, the highest uniform random number we use. Thus: x 0 1 2 3

px 0.1353 0.2707 0.2707 0.1804

F (x ) 0.1353 0.4060 0.6767 0.8571

We see that 0.8 → 3, while 0.2 and 0.4 go to 1. We have to simulate 5 claims. We calculate the point x i on the standard normal distribution such that Φ ( x i )  u i , then generate a lognormal random number y i by transforming it with y i  e µ+σx i . Uniform Number ui 0.1 0.5 0.2 0.1 0.6

Standard Normal Number x i  Φ−1 ( u i ) −1.282 0 −0.842 −1.282 0.25

Lognormal Number y i  e 5+2x i 11.437 148.413 27.571 11.437 244.692

The first year, the first three claims, add up to 187.42, or 87.42 after the deductible. The second year, the fourth claim, is less than 100. The third year, the fifth claim, is 244.69, or 144.69 after the deductible. Total payments are 87.42 + 144.69  232.11 . (A) 12.

[Lesson 62] The formula for the upper bound of a 75th percentile is Yb , where b  75.5 + z π n (0.75)(0.25)

j

p

k

and here π  (0.8 + 1) /2  0.9, so z π  1.282. b  75.5 + 1.282 (10) (0.75)(0.25)  d81.05e  82

j

and the answer is Y82  135 . (D) C/4 Study Manual—17th edition Copyright ©2014 ASM

p

k

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 13–15

13.

1571

[Lesson 46] The prior density is proportional to e −10λ . The posterior is π (λ | x )  R 

∞ 0

0.6e −11λ + 0.4e −12λ 0.6e −11λ + 0.4e −12λ dλ



0.6e −11λ + 0.4e −12λ 0.6 11

+

0.4 12

Claim counts conditional on λ are a mixture of a Poisson with mean λ and a Poisson with mean 2λ, so E[X R| λ]  0.6λ + 0.4 (2λ )  1.4λ. Therefore, we integrate 1.4λ times the posterior. It is helpful to note ∞ that 0 λe −cλ dλ  c12 . E[X2 | X1  0]  1.4

0.6 112 0.6 11

+ +

0.4 ! 122 0.4 12

0.00773646  0.12325 0.0878788

!

 1.4

(E)

14. [Lesson 9] For low spenders the average amount spent below 10,000 is 2,500. For high spenders, we need to calculate E[X ∧ 10,000]. The formula in the table for the single parameter Pareto doesn’t work for α  1, but we can calculate it directly as the integral of the survival function from 0 to 10,000. Since the survival function is 1 below 5000, the integral from 0 to 5000 is 5000. Then (with X being spending by high spenders) E[X ∧ 10,000]  5000 +

Z

10,000 5000

5000 dx x

 5000 + 5000 (ln 10,000 − ln 5000)  5000 + 5000 ln 2  8465.74

Weighting the two types of spenders, the average rebate is

0.01 0.5 (2500 + 8465.74)  54.8287



15.



(D)

[Lesson 45] The density of a claim of 500 with a deductible of 250 from the first insured is f (500)  1 − F (250)

1000 15002 1000 1250



1250 15002

The density of a claim of 1500 with a deductible of 500 from the first insured is f (1500)  1 − F (500)

1000 25002 1000 1500



1500 25002

Multiplying these together (both types of insureds are equally likely, so we can ignore the 1/2 factor when calculating relative likelihoods) we get 1250  1.3333 × 10−7 (1500)(25002 ) The density of a claim of 500 with a deductible of 250 from the second insured is f (500)  1 − F (250) C/4 Study Manual—17th edition Copyright ©2014 ASM

2 (20002 ) 25003 2000 2 2250



2 (22502 ) 25003

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 16–18

1572

The density of a claim of 1500 with a deductible of 500 from the second insured is f (1500)  1 − F (500)

2 (20002 ) 35003 2000 2 2500



2 (25002 ) 35003

The product of these two densities is 22 (22502 )  1.8892 × 10−7 2500 (35003 ) The posterior probability that the insured is of the first type is then 1.3333  0.4138 1.3333 + 1.8892

(A)

16. [Subsection 23.2] Essentially we use equation (23.5) on page 383, but we’ll work it out from first principles. The estimate of f is the number of losses in the interval over n and over the size of the interval, which are here 40 and 3. The number of losses in the interval is a binomial variable with parameters m  n  40 8.4 and (estimated) q  12 40  0.3. So its variance is 40 (0.3)(0.7)  8.4. The variance of f40 (4) is then [ (40)(3) ]2  √ 0.0005833. The width of a 95% confidence interval is (2)(1.96) 0.0005833  0.09468 . (B) 17. [Lesson 32] The likelihoods for the four uncensored exponential losses in terms of the mean loss size µ are µ1 e −x i /µ . The likelihoods for the two censored losses are e −10/µ . Since the Pareto has the same mean as the exponential, the Pareto’s θ  3µ, and the likelihood of 3 is plicative constant

4 ( 34 ) 35

, and write this as

µ4

( µ+1) 5

4 (3µ )

4

(3µ+3) 5

. We can ignore the multi-

. So the likelihood function is

µ4 e −34/µ e − (1+2+4+7) µ  −20µ   L (µ)  e ( µ + 1) 5 ( µ + 1) 5 µ4

!

!

We log and differentiate this, and solve for µ. 34 − 5 ln ( µ + 1) µ dl 34 5  − 0 dµ µ2 µ + 1

l (µ)  −

34 ( µ + 1) − 5µ2  0

5µ2 − 34µ − 34  0 √ 34 + 342 + 680 µ  7.6849 10 [Lesson 30] This is a Pareto with α  2. E ( X ∧ 50)  θ 1 −



18.

50 605 121   50 + θ 15 3

!

θ C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

θ 50+θ



. The observed mean is 605/15.

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 19–21

1573

(50θ )(3)  (121)(50) + 121θ 29θ  6050 6050 θ  208.62 29 The fitted probability of survival past 50 is then S (50) 

19.

θ θ + 50

!2 

208.62 258.62

!2  0.6507

(D)

[Section 46.3] The prior gamma is π (δ) 

Multiplying it by the likelihood δ

1 δ−1 4

1 δ 4

δ2 e −δ Γ (3)

we get a constant multiple of

! δ−1

δ2 e −δ  δ3 e (− ln 4)( δ−1) e −δ  δ3 e −δ (ln 4+1) e − ln 4

so the posterior must be a gamma with α  4, θ  1/ (ln 4 + 1) . The Bayesian estimate with the zero-one loss function is the mode. For a gamma distribution, the mode is θ ( α − 1)  3/ (ln 4 + 1)  1.2572 . (A) 20. [Lesson 8] For a normal random variable, VaRp ( X )  µ+z p σ, whereas TVaRp ( X )  µ+σφ ( z p ) / (1− p ) . Here, z p  z 0.95  1.645, and 2

e −1.645 /2 φ (1.645)  √  0.103111 2π φ (1.645)  2.062 0.05 Therefore, TVaR0.95 ( X ) − Var0.95 ( X )  (2.062 − 1.645) σ  0.417σ. Setting this equal to 10, σ  10/0.417  24 . (A) 21. [Lesson 64] The empirical mean is 0.75. For x1 and x 2 , the maximum likelihood estimator of α is derived by: L (α)  

α2

(1 + x1 )(1 + x2 )



(As usual, “+1” in the exponent in the denominator can be omitted as a multiplicative constant.) l ( α )  2 ln α − α ln (1 + x 1 )(1 + x 2 )



l0 ( α )  αˆ  C/4 Study Manual—17th edition Copyright ©2014 ASM

  2 − ln (1 + x1 )(1 + x2 ) α 2

ln (1 + x1 )(1 + x2 )







PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 22–23

1574

and then the mean is estimated as µˆ 

1 ˆ . α−1

Thus we get:

Bootstrap Sample

αˆ

(0.5,0.5)

2 2 ln 1.5

(0.5,1.0)

2 ln 3

(1.0,1.0)

2 2 ln 2

µˆ

 2.4663

0.6820

 1.8205

1.2188

 1.4427

2.2589

The bootstrap approximation of the mean square error is

(0.6820 − 0.75) 2 + 2 (1.2188 − 0.75) 2 + (2.2589 − 0.75) 2 4 0.004626 + 2 (0.21977) + 2.27675  0.6802  4

22. [Lesson 46] The prior is therefore

1 5

(C)

on [5, 10] and the likelihood of a loss less than 5 is L5 . The posterior is

π ( l|X )  R

1/l 10 dl l 5



1/l 1  ln 10 − ln 5 l ln 2

5 ≤ l ≤ 10

The probability of a claim less than 5 is then 10

Z

5 dl 5 1 1  − l l ln 2 ln 2 5 10 1   0.721348 2 ln 2

!

5

!



(D)

23. [Lessons 30 and 31] The median is fitted by logging the median of the observations and matching the median of a normal, µ, so µ  ln 37.5  3.624341. The mean is matched as follows: e µ+σ µ+

2 /2



10 + 25 + 50 + 100  46.25 4

σ2  ln 46.25  3.834061 2 σ2  3.834061 − 3.624341  0.2097 2 √ σ  0.4194  0.6476

The probability that an observation X is greater than 40 is Pr ( X > 40)  1 − Pr (ln X < ln 40)

ln 40 − 3.6243 1−Φ 0.6476

!

3.6889 − 3.6243 1−Φ  1 − Φ (0.100) 0.6476

!

and 1 − Φ (0.1)  0.4602 . (D) C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 24–26

1575

24. [Lesson 11] Claim counts are from an ( a, b, 0) class distribution with a  0.6 and b  0.3, so it is negative binomial and β 1+β β 0.6  1+β a

β  1.5 b  ( r − 1) a

0.3  ( r − 1) 0.6 r  1.5

E[N]  rβ  (1.5)(1.5)  2.25 Var ( N )  rβ (1 + β )  2.25 (2.5)  5.625 For the Pareto, 1000  500 2 2 (10002 ) Var ( X )  − 5002  750, 000 (2)(1) E[X] 

By the compound variance formula, Var ( S )  2.25 (750,000) + 5.625 (5002 )  3,093,750

25.

(A)

[Lesson 21] The sample mean is an unbiased estimator, so its expected value is the population

mean, which here is

1+3+9 3



13 3 .

The true median is 3. Thus the bias is

13 3

−3

4 3

. (E)

26. [Lesson 30] For a Pareto with a deductible d, mean excess loss is θ+d α−1 , or in our case with α  2, θ + d. This is k + 1 and 2k + 3 for the two deductibles. Since 3/4 of the claims are for deductible 1 and 1/4 are for deductible 3, the overall expected value is 3 1 5 3 ( k + 1) + (2k + 3)  k + 4 4 4 2 The observed mean excess loss (the amount above the deductible) is 3 + 6 + 8 + 9 + 10 + 14 + 10 + 30  11.25 8 We equate the overall expected value with the observed mean loss and solve for k. 5 3 45 k+  4 2 4 5 39 k 4 4 39 k  7.8 5 C/4 Study Manual—17th edition Copyright ©2014 ASM

(A)

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 27–28

1576

27.

[Lessons 42 and 44] The partial credibility factor is backed out from equation (44.2): PC  M + Z ( X¯ − M ) 18,500  20,000 + Z (−2000) Z  0.75

3356 Then full credibility requires 0.75 2  5966 claims. 2 2 Here, 1 + CVs  exp (1.2 )  4.2207 (see table 42.2 on page 831). So we have:

yp

!2

0.05

(4.2207)  5966 y p  1.8798

1.8798 is at the 97th percentile, so for a 2-sided test, 2 (0.97) − 1  0.94 ; there is a 94% probability of being within 5% of actual. (A) 28. [Lesson 15] Let’s create a table of the first and second moments and the variances of the 3 uniform random variables Λ, A, and Θ. Note that the expected value is the midrange, the variance is the range squared over 12, and the second moment is the variance plus the mean squared. X E[X]  Var ( X ) 

a+b 2 ( b−a ) 2 12

E X 2  Var ( X ) + E[X]2

f

g

Λ 0.1

A 3.5

0.0012

0.75

0.0112

13

Θ 2,250 187,500 5,250,000

Now we use the conditional variance formula on aggregate claims S. E[S]  E[Λ] E[A] E[Θ]  (0.1)(3.5)(2250)  787.5 Var ( S )  Var (E[S|Λ, A, Θ]) + E[Var ( S|Λ, A, Θ) ] E[S | Λ, A, Θ]  ΛAΘ

Var (ΛAΘ)  E[ (ΛAΘ) 2 ] − E[ΛAΘ]2

E[ (ΛAΘ) 2 ]  E[Λ2 ] E[A2 ] E[Θ2 ]

 (0.0112)(13)(5,250,000)  764,400 Var (ΛAΘ)  764,400 − 787.52  144,243.75

Var ( S|Λ, A, Θ)  ΛA ( A + 1) Θ2 The previous line used the compound Poisson variance formula and the formula for the second moment of a gamma distribution. E[ΛA ( A + 1) Θ2 ]  E[Λ] (E[A2 ] + E[A]) E[Θ2 ]  (0.1)(13 + 3.5)(5,250,000)  8,662,500 Var ( S )  144,243.75 + 8,662,500  8,806,743.75 Now we use the normal approximation. For 100 insureds, the expected value and variance are multiplied by 100. 1−Φ

100,000 − 78,750  1 − Φ (0.72) √ 880,674,375

!

 1 − 0.7642  0.2358 C/4 Study Manual—17th edition Copyright ©2014 ASM

(C)

PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 29–33

1577

29. [Lesson 24] The risk set at time 0.3, the time of the first event is 100. At time 1.6, the time of the second event, 3 lives have entered (times 0.6 and 1.3) and 2 lives have left (one death, one right censored at time 1.0), so there are 101 lives. Therefore 1 1 +  0.019901 Hˆ (1.6)  100 101 Sˆ (1.6)  e −0.019901  0.980296

(D)

30. [Section 61.3] ln (1 − q )  ln 0.9  −0.105361. For 10 insureds, m  10 (20)  200. The first 6 events used λ0 through λ 5 . We generate interevent times s k . λ 6  −200 (0.105361) + 6 (0.105361)  20.4399 λ7  −200 (0.105361) + 7 (0.105361)  20.3346

ln (1 − 0.813)  0.082028 20.4399 ln (1 − 0.325) s7  −  0.019329 20.3346 s6  −

0.90 + 0.082028 + 0.019329 > 1. It took 8 numbers to get above 1, so the generated binomial number is 7 . (B) 31. [Lesson 36] If the point corresponding to 1 is not 0 and the point corresponding to 10 is not 1, then the jump from 2 to 10 should be 8 times the jump from 1 to 2 if it is uniform. This eliminates (B) and (D). Similarly, if the point corresponding to 15 is not 1 and the point corresponding to 2 is not 0, the jump from 10 to 15 should be 5/8 the jump from 2 to 10. This eliminates (C) and (E). (A) works for [a, b]  [1, 9]. 32. [Lesson 53] The gamma has parameters θ  2 (a different θ from the Pareto) and α  10, and so has variance 22 (10)  40 and second moment 202 + 22 (10)  440. µ(θ)  a v (θ)  v Z

θ 3 1 40 Var ( θ )  9 9 2θ 2 θ 2 2θ 2 −  6 9 9 2 880 (440)  9 9 a 40 1    0.04348 a + v 40 + 880 23

(C)

33. [Lesson 26, page 439] First we estimate S (7) using the product-limit estimator, equation (24.1) on page 394: 98  94 S100 (7)  100 95  0.969684 Then we use the Greenwood formula, equation (26.1) on page 437 to estimate the variance of S100 (7) :

L S100 (7)  S100 (7) Var 

C/4 Study Manual—17th edition Copyright ©2014 ASM





2 

2 1 +  0.00029719 (98)(100) (95)(94)



PRACTICE EXAM 12, SOLUTIONS TO QUESTIONS 34–35

1578

Then we use equation (26.3) to construct a log-transformed confidence interval. √ ! 1.96 V U  exp  exp (−1.131897)  0.322421 S ln S

(S1/U , SU )  (0.9089, 0.9901) The width is 0.9901 − 0.9089  0.0812 . (C) 34. [Section 1.4] We recognize X as a discrete random variable with p1  0.1, p2  0.4, p 3  0.4, and p5  0.1, so the mean is 0.1 (1) + 0.4 (2) + 0.4 (3) + 0.1 (5)  2.6, the second moment is√0.1 (1) + 0.4 (4) + 0.4 (9) +

0.1 (25)  7.8, and the variance is 7.8 − 2.62  1.04. The coefficient of variation is

1.04 2.6

 0.39223 . (D)

35. [Lesson 44] We use the formula (42.1) on page 828, to determine the exposures needed for credibility of aggregate losses, since the severity mean and variance are not given. σ eF  n0 µ

!2

2.576  0.05

!2

200,000  2654.31 (1.25)  3317.8875 4002

!

The number of expected claims needed for full credibility is 3317.8875βr  3317.8875 (2)(1)  6635.775. For 25% credibility, we need (0.252 )(6635.775)  414.74 claims. (B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 13, SOLUTIONS TO QUESTIONS 1–3

1579

Answer Key for Practice Exam 13 1 2 3 4 5 6 7 8 9 10

E A A A B E D C B B

11 12 13 14 15 16 17 18 19 20

E B B C A E C D D E

21 22 23 24 25 26 27 28 29 30

B A D D C E E D E D

31 32 33 34 35

A B B E D

Practice Exam 13 1.

[Lesson 6] If the median is 500, then e −500/θ 

1 2

500  ln 2 θ 500 θ  721.35 ln 2 For an exponential, the mean excess loss is θ. Adding the deductible, which is paid for a franchise deductible if the loss is above the deductible, the average payment per paid claim is 721.35 + 250  971.35 . (E) 2. [Section 46.3 and Lesson 47] The posterior parameters are calculated by setting γ  θ1  7 and then α∗  α + 1  4, γ∗  γ + 3  10, so θ ∗  0.1. For the zero-one loss function, the Bayesian estimate is the mode. According to the tables in the Appendix (which you are given at the exam), the mode of a gamma is θ ( α − 1)  0.1 (4 − 1)  0.3 . (A) 3. [Lesson 64] In working out this problem, you must keep in mind that the way the bootstrap estimator works, the sample is treated as if it were the true population. The item being estimated is calculated for this population and compared against the estimator. In this case, you are estimating the variance. The “true” variance of the population is the empirical distribution variance, which means that the sum of the square differences is divided by n, not n −1. This is despite the fact that you are measuring the quality of an estimator that divides by n − 1. In fact, for this population, there are only two possible values, 2 and 5; 2 with probability 1/4 and 5 with probability 3/4. Hence this is a Bernoulli distribution, and the variance may be computed using the Bernoulli shortcut. (See Section 3.3.) The variance is 1 Var ( Fe )  4

!

3 2 27 (3 )  . 4 16

!

n Now we compute S n for every possible sample. Since S n is the sample variance, it is n−1  43 times the empirical variance, which may be computed for each sample using the Bernoulli shortcut. The probability

C/4 Study Manual—17th edition Copyright ©2014 ASM

PRACTICE EXAM 13, SOLUTION TO QUESTION 4

1580

of each sample occurring is binomial. For example, the probability of selecting 2 2’s and 2 5’s is The following table lists all samples, their probabilities, and the values of S n for each sample: Sample

Probability of sample

2,2,2,2

4 1 4 0 4

2,2,2,5

4 1 3 3 1 4 4



2,2,5,5

4 1 2 3 2 4 2 4



2,5,5,5

4 1 3 3 3 4 4

5,5,5,5

4 3 4 4 4

Value of S n for sample

1 256







4 1 2 3 2 2 4 4 .

0

12 256

   3 1

( 32 )



4 3



54 256

   1 1

( 32 )



4 3

3

108 256

   1 3

( 32 )



4 3



4

2

4

4

2

4

81 256

9 4

9 4

0

Hence the bootstrap approximation of the mean square error is 27 2 1 9 27 (1 + 81) 0 − − + (12 + 108) 256 16 4 16 364.5   1.4238 (A) 256







2

27 + (54) 3 − 16



2!

4. [Subsection 25.1.2] We first calculate the mean ignoring the franchise deductible as the average of the averages for each interval. Here, X is the loss before application of the deductible and Y is the loss after application of the deductible. 4 1000 + 5000 3 5000 + 10,000 5 0 + 1000 + +  3083 13 E[X]  12 2 12 2 12 2

!

!

!

Then we subtract the amount not paid as a result of the franchise deductible. In the presence of a franchise deductible, insurance pays the full claim if it is above the deductible, but pays nothing if the claim is below the deductible. The expected amount not paid, therefore, is the integral of x f ( x ) over the interval below the deductible (500). Because the ogive is used, this can be simplified to the probability of the loss being 5 below the deductible times the average amount below the deductible: 24 (250) . So the expected payment is 5 E[Y]  3083 13 − 24 (250)  3031.25.

Now we calculate the second moment of Y directly, integrating x 2 f ( x ) . The histogram f ( x ) is the number of claims in the interval divided by the total number of claims (n  12) and the length of the interval. Therefore, we have: E[Y 2 ] 

5 (0) + 24

0+

1000

Z

500

5 12,000

5 x 2 dx + (12)(1000)

10003

− 3

5003

!

+

Z

4 48,000

5000 1000

4 x 2 dx + (12)(4000)

50003

− 3

10003

!

+

Z

10,000 5000

3 10,0003 − 50003 60,000 3

 121,527.8 + 3,444,444.4 + 14,583,333.3  18,149,306 So the variance is

C/4 Study Manual—17th edition Copyright ©2014 ASM

3 x 2 dx (12)(5000)

Var ( Y )  18,149,306 − 3031.252  8,960,829

(A)

!

PRACTICE EXAM 13, SOLUTIONS TO QUESTIONS 5–8

1581

5. [Lesson 28] Those who enter at time 0 give a full year exposure, and policies #3–#5 give a half-year exposure. Thus we start with a exposure of 3.5. For the mortality study, the two surrenders leave in the middle of the year and remove a half-year 0( d ) exposure apiece. So the exposure is 2.5 and there is one death. The estimated mortality rate is qˆ40  1/2.5  0.4. For the surrender study, the death removes a half-year exposure, so we have exposure of 3 and the 0( w ) estimated withdrawal rate is qˆ40  2/3.    0( d ) 0( w ) The total probability of decrement is 1 − 1 − qˆ40 1 − qˆ40  1 − (1 − 0.4)(1 − 2/3)  0.8 . (B) 6. [Section 34.2] The mean is the center of the interval (0.85, 0.95) , or 0.90. 0.05 is 1.96 times the standard deviation, which we will denote by σ. So σ2 

0.05 1.96

!2

 0.0006508

We use equation (34.4). The transforming function is g ( x )  x 1/3 . Then g 0 ( x )  31 x −2/3 The mean of the transformed interval is 0.91/3  0.9655. The transformed variance is σ

2



g (0.9) 0

2

0.9−2/3  0.0006508 3

!2

 0.0006508 (0.1279)  0.000083214 The transformed interval is √ 0.9655 ± 1.96 0.000083214  (0.9476, 0.9834) We cube this interval (for the problem, it is only necessary to cube 0.9834) to obtain (0.8509, 0.9509 ) . (E) 7. [Lesson 48] Bühlmann and Bayes credibility are equal, so we can calculate a Bühlmann credibility factor. Since after 50 claims 21  20 (1 − Z ) + 25Z, Z  0.2. v  100 and a  c, so 50c  0.2 50c + 100 c  0.5 100c 1  100c + 100 3

After 100 claims, credibility Z  13 , and the expected claim is 31 (15) + 23 (20)  18 13 . (D) 8. [Lesson 17] For the Poisson, the probabilities of 0, 1, 2, 3, and 4 claims are e −1 , e −1 , e −1 /2, e −1 /6, and e −1 /24 respectively. The aggregate probabilities are g0  e −1 g1  e −1 (0.3) g2  e −1 C/4 Study Manual—17th edition Copyright ©2014 ASM

2 1 2 (0.3 )

 e −1 (0.045)

PRACTICE EXAM 13, SOLUTION TO QUESTION 9

1582

 3 ( 0.3 ) + 0.2  e −1 (0.2045) 6     1 g4  e −1 24 (0.34 ) + 12 (2)(0.3)(0.2)  e −1 (0.0603375) g 3  e −1

So

  1

FS (4)  e −1 (1 + 0.3 + 0.045 + 0.2045 + 0.0603375)  e −1 (1.6098375)  0.5922 9.

(C)

[Lessons 42 and 52] The hypothetical mean is q. We calculate the mean of q.

 3 f ( q )  F0 ( q )   1 µ  E (Q ) 



1 4

Z

0 4000) 

12,000  2 12,000+4,000

 0.5625, whereas here it is 0.4, so this Pareto’s density is

0.4/0.5625 times the normal density. Normally, Pr ( X > 25,000)  Pareto, Pr ( X > 25,000) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

0.4 0.5625 (0.105186)

12,000 2 37,000

 0.105186, so for the spliced

 0.0748, and Pr ( X < 25,000)  1 − 0.0748  0.9252 . (C)

SOLUTIONS TO CAS EXAM 3, FALL 2006, QUESTIONS 19–25

19.

1627

[Section 4.1] If N is the number of claims, 1 − Pr ( N  0)  1 −

mixing uniform distribution, which has density 1 1 − Pr ( N  0)  2

1 2. 2

Z

!

#2

1 1  *2 + 2 3 (1 + β ) 3

"

1+

. We integrate this over the

1 dβ 1− (1 + β ) 4

0

,

1

(1+β ) 4

+ 0-

1 1 1 −  0.839506 2 81 3





(A)

20. [Section 4.1] This is a mixture distribution. If we let F1 ( x ) be the standard hospital stay distribution and F2 ( x ) the accident stay distribution, we add ln 2 to scale the distribution up to F2 ( x ) , so F2 ( x ) is lognormal with µ  7 + ln 2 and σ  2. Equivalently, we could calculate S2 (7500) . By the Law of Total Probability, Pr ( X > 15,000)  0.75S1 (15,000) + 0.25S2 (15,000) ln 15,000 − 7  0.75 1 − Φ 2

"

!#

ln 7,500 − 7 + 0.25 1 − Φ 2

"

!#

 0.75[1 − Φ (1.31) ] + 0.25[1 − Φ (0.96) ]

 0.75 (1 − 0.9049) + 0.25 (1 − 0.8315)  0.1135 21–22. 23.

(A)

Questions 21–22 are not on the current Exam C/4 syllabus [Lesson 11] For a negative binomial, the variance is 1 + β times the mean, so β  3 and then the

mean is rβ so r  1. This is a geometric distribution, with p k  1 Pr (3 ≤ N < 5)  4

k 1  β  1+β 1+β .

Here, p 0 

!  !3 !4 ! 5  3 3 3   4 + 4 + 4   0.2439  

1 4

and

(B)

24. [Lesson 13] We know that to modify exposure, which means adding additional negative binomial variables, we adjust r and the distribution is still negative binomial. However, β must be the same. One way to see this is to look at the probability generating function, P ( z )  [1 − β ( z − 1) ]−r . Multiplying two of these with the same β has the same form, but if β varies the result doesn’t have this form. (B) 25.

[Lesson 1] Let X be the random variable. P 0 (1)  E[X]  2 P 00 (1)  E X ( X − 1)  6

f

g

E[X 2 ] − E[X]  6 E[X 2 ]  8

Var ( X )  E[X 2 ] − E[X]2 8−4 4

26–28.

Questions 26–28 are not on the current Exam C/4 syllabus

C/4 Study Manual—17th edition Copyright ©2014 ASM

(D)

SOLUTIONS TO CAS EXAM 3, FALL 2006, QUESTIONS 29–31

1628

29.

[Lesson 14] Using the compound variance formula,

!2

2 (5002 ) 500 −  187,500 2 2 500 Var ( N )  1000 (0.3)(0.7)  210 E[X]   250 2 Var ( S )  (300)(187,500) + (210)(2502 )  69,375,000 E[N]  1000 (0.3)  300

p

30.

Var ( S )  8329.17

Var ( X ) 

(D)

[Lesson 9] After inflation, θ goes to 3000 (1.064 )  3787.43. Then E[X] 

3787.43  1262.48 3

!3 3787.43 + *  1043.13 E[X ∧ 3000]  1262.48 1 − 6787.43 , !3 3787.43 + * E[X ∧ 500]  1262.48 1 −  392.19 4287.43 , -

E[X ∧ 3000] − E[X ∧ 500]  1043.13 − 392.19  650.94 The difference is 1262.48 − 650.94  611.53 . (D)

2

31. [Lesson 13] Let X be the severity random variable. Pr ( X > 500)  2000  0.64. We must 2500 adjust the β from the negative binomial: 4 (0.64)  2.56. The variance of the modified negative binomial is 3 (2.56)(3.56)  27.3408 . (A) 32–40.

Questions 32–40 are not on the current Exam C/4 syllabus

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO SOA EXAM M, FALL 2006, QUESTIONS 1–20

B.8

1629

Solutions to SOA Exam M, Fall 2006

1–5. Questions 1–5 are not on the current Exam C syllabus 6. [Lesson 9] Let X be loss size. Let Y L be the payment per loss. Notice that a payment of 60 means  that the loss at least is 60/0.8  75 above the deductible, or 95. So E[Y L ]  0.8 E[X ∧ 95] − E[X ∧ 20] . It is easiest to calculate this using equation 6.3. E[Y L ]  0.8

Z

 0.8

Z

95  20 95 20

1 − F ( x ) dx





1 − ( x/100) 2 dx



953 − 203  0.8 75 − 3 (1002 )

!

 0.8 (75 − 28.3125)  0.8 (46.6875)  37.35 Let Y P be the payment per payment. This is Y L / 1 − F (20) , and F (20)  (20/100) 2  0.04, so E[Y P ] 



37.35 1−0.04



 38.90625 . (B)

7. [Lesson 18] Let S be aggregate losses, and as usual N will be claim counts and X claim amounts. E[S]  E[N] E[X]  5 (0.6) 5 + (0.4) k  15 + 2k



E[S ∧ 5]  5 1 − e −5







because with a limit of 5, S ∧ 5 is 0 when there are no claims, otherwise it is 5, and the probability of no claims is e −5 . We are given E[S] − E[S ∧ 5]  28.03. 15 + 2k − 5 1 − e −5  28.03





10 + 5e −5 + 2k  28.03 10.03 + 2k  28.03 k 9

8–19.

(D)

Questions 8–19 are not on the current Exam C syllabus

20. [Lesson 9] The minimum crediting rate is effective when the equity index is below 4%. Let X be the equity index. We need 0.75 max ( X, 0.04) , which we translate into wedges as follows: max ( X, 0.04)  X + max (0, 0.04 − X )  X + 0.04 − min (0.04, X )  X + 0.04 − X ∧ 0.04 X has mean 8% and standard deviation 16%, so we use the entry E[X ∧ 4%]  −0.58% from the tables and get E[max ( X, 0.04) ]  0.08 + 0.04 − (−0.0058)  0.1258

and 0.75 (0.1258)  0.09435 . (B) The official solution sets Y  0.75X and calculates max ( Y, 0.03) . Y has mean 6% and standard deviation 12%, so E[Y ∧ 3%] can be looked up in the table, and (not surprisingly) is 0.75 E[X ∧ 4%], so the solution is equivalent. C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO SOA EXAM M, FALL 2006, QUESTIONS 21–31

1630

21. [Lesson 14] We use the compound variance formula for Poissons, equation (14.4). For severity X (remember that for positive integer n, Γ ( n )  ( n − 1) !) θΓ (1 + 2/γ ) Γ ( α − 2/γ ) Γ(α) 22 Γ (3) Γ (1) 22 (2)(1)   4 Γ (3) 2

E[X 2 ] 

Var ( S )  3 (4)  12

(A)

22. [Lesson 13] Adding 4 geometrics is an exposure modification which multiplies r by 4. Since for a geometric r  1, this results in a negative binomial with r  4, β  1.5. E[N]  6. Let N be the negative β binomial random variable. We need E[N ∧ 3]. We’ll use ( a, b, 0) to speed up the calculation; a  1+β  0.6, b  ( r − 1) a  1.8 1 p0  2.5

!4

 0.0256

p1  (0.6 + 1.8) p0  0.06144 p2  (0.6 + 0.9) p1  0.09216 E[N ∧ 3]  p1 + 2p2 + 3 (1 − p0 − p 1 − p2 )  2.70816 The answer is 100 E[N] − E[N ∧ 3]  100 (6 − 2.70816)  329.184 . (D)



23–28. 29.



Questions 23–28 are not on the current Exam C syllabus [Lesson 9] The original insurance has expected value E[X] − E[X ∧ 2]. E[X ∧ 2]  p1 + 2 (1 − p0 − p1 )  3e −3 + 2 1 − e −3 − 3e −3  2 − 5e −3





So the original insurance has expected value 3− 2 − 5e −3  1+5e −3 . The replacement has expected value



3α, so α 

1+5e −3 3



 0.41631 . (E)

30. [Lesson 14] Let N be the number of clubs accepted, X the number of members per club, S the aggregate number of persons. The mean of S is E[S]  E[N] E[X]  (1000)(0.2)(20)  4000. For the variance we use the compound variance formula, equation 14.2. N is binomial with mean 200, variance 1000 (0.2)(0.8)  160. Var ( S )  200 (20) + 160 (202 )  68,000 √ The budget is 10 (4000 + 68,000)  42,608 . (A) 31. [Lesson 9] For provision (iii), since Michael pays 1000 under (i) and 0.2 (5000)  1000 under (ii), he pays the next 8000, and 10% above 14,000 under (iv). Thus we need 0.8 E[X ∧ 6000] − E[X ∧ 1000] + 0.9 (E[X] − E[X ∧ 14,000])





For the given parameters, E[X ∧ d]  5000

(5000)(0.8)



d  5000+d .

So the expected annual insurance reimbursement is

6,000 1000 14,000 − + (5000)(0.9) 1 − 11,000 6000 19,000

C/4 Study Manual—17th edition Copyright ©2014 ASM







 5000 (0.30303) + 5000 (0.23684)  2699

(C)

SOLUTIONS TO SOA EXAM M, FALL 2006, QUESTIONS 32–40

1631

32. [Lesson 15] As usual, N is number of claims, X claim amount, and S aggregate losses. Using the compound variance formula, E[N]  (16)(6)  96

Var ( N )  (16)(6)(7)  672

E[X]  4

Var ( X ) 

E[S]  (96)(4)  384

82  5 13 12

Var ( S )  96 (5 13 ) + 672 (42 )  11,264 Premium  384 + 1.645 11,264  558.59

p

33–38.

(D)

Questions 33–38 are not on the current Exam C syllabus

39. [Section 4.1] Pr ( N  2) for the first component is 0.52  0.25, and for the second component 4 4 2 0.5  0.375, so mixing them, 0.25p + 0.375 (1 − p )  0.375 − 0.125p . (E) 40. 1.

[Lesson 17] 600 is achieved in only two ways: With 2 claims of 100 and 500, probability: e

−5

52 (0.80)(0.16)(2)  3.2e −5 2

!

where 2 is multiplied at the end because the claims can be in 2 orders. 2.

With six 100’s: e

−5

56 (0.806 )  5.688889e −5 6!

!

The total probability is 8.888889e −5  0.05989 . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 1–3

1632

B.9

Solutions to Exam C/4, Fall 2006

1. [Lesson 31] Reminiscent of C-F05:3, but now α  2 instead of 1, and you have to calculate the smoothed empirical percentile. Use p ( n + 1) , or 0.35 (16)  4.8 for the 30th percentile and 0.65 (16)  10.4 for the 65th percentile. For the 30th smoothed empirical percentile, we interpolate 0.2 (280) +0.8 (350)  336, and for the 65th smoothed√empirical percentile, we interpolate 0.6 (450) +0.4 (490)  466. We set these equal 1 to F ( u )  1 − u α , or u  α 1 − F ( u ) . For α  2 and F ( u )  0.3 and 0.65, and u  1+ ( x/θ ) γ , we get √ 1  0.7 γ 1 + (336/θ ) √ 1  0.35 γ 1 + (466/θ ) !γ 336 1 √ − 1  0.195229 θ 0.7 !γ 1 466 − 1  0.690309 √ θ 0.35

Dividing the first into the second to eliminate θ, 466 336



0.690309  3.53590 0.195229 ln 3.53590 γ  3.8614 ln (466/336) 

(E)

2. [Lesson 45] For each type of policy, the number of claims in 4 years is Poisson with mean 4λ and the expected number in Year 5 given λ is λ. Here is a table: Type I Prior probabilities Likelihood of experience

Type II

0.05 e −1

 0.367879

Type III

0.20 2e −2

 0.270671

0.75 4e −4

 0.073263

Joint probabilities

0.018394

0.054134

0.054947

Posterior probabilities

0.144295

0.424665

0.431041

0.25

0.50

1.00

0.036074

0.212332

0.431041

Hypothetical means Bayesian premium

0.127475

0.679447

(D) 3. [Section 1.5] The coefficient of skewness is unaffected when multiplying observations by a positive constant or adding a constant, so divide all claims by 100 and then subtract 8, which gives us 2 claims of −4, 7 claims of 0, and 1 claim of 8. This has a mean of 0, which is convenient when using formula (1.2), both for calculating σ2 and for calculating the numerator; with µ  0, the formula reduces to 2 (−4) 2 + 82  9.6 10 2 (−4) 3 + 83 µ03   38.4 10 µ02 

C/4 Study Manual—17th edition Copyright ©2014 ASM

µ03

( µ02 ) 3/2

.

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 4–6

γ

1633

38.4  1.29099 9.61.5

(B)

4. [Section 63.1] Since 0.656 ≤ 0.7654 < 0.773, 4 claims occur. To invert the Weibull: u  1 − e − ( x/200)

2

2

e − ( x/200)  1 − u p x  − ln (1 − u ) 200 p x  200 − ln (1 − u ) Plugging in the first 4 u’s, we get: u 0.2738 0.5152 0.7537 0.6481 Total

√ x  200 − ln (1 − u ) 113.12 170.18 236.75 204.39 724.44

The insurer pays the maximum of is 224.44 . (C)

P

y i and

P

y  max (0, x − 150) 0 20.18 86.75 54.39 161.32

x i − 500. In this case, the latter is higher and the answer

5. [Lesson 32] This problem features left censoring, demonstrating that left censoring (and right truncation) questions may be asked; they are only off the syllabus as far as estimation with Kaplan-Meier or Nelson-Åalen. −θ/x For an inverse exponential, the density is θex 2 , but x 2 is a constant we can ignore. So the likelihood of the given experience is e −θ/x for each value censored or not, times θ for uncensored values. L ( θ )  θ3 e Let K 

P

−θ (

P

1 xi

)

1 xi

l ( θ )  3 ln θ − θK dl 3  −K 0 dθ θ 3 θ K 1 1 1 7 K + + +  0.148184 186 91 66 60 3 θ  20.2452 0.148184 The mode is θ/2  10.13 . (A) 6. [Lesson 52] The hypothetical mean is 4θ. Therefore, µ  E[4θ]  4 (600)  2400 600 For 2 years, average experience is 1650, so Z satisfies 2400 + Z (1650 − 2400)  1800 implying Z  750  2 3 6 1400+1900+2763 0.8, and Z  2+k , so k  0.5. Then for 3 years, Z  3+k  7 and the experience mean is x¯   3 6 2021 so the credibility estimate is 2400 + 7 (2021 − 2400)  2075 . (D) C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 7–12

1634

7. [Lesson 26] The calculation is yi 4 8

ri 10 5

Sˆ ( y i ) 0.8 0.64

si 2 1

By formula 26.1 on page 437,

L Sˆ (11)  0.642 Var f

g



1 2  0.03072 + (10)(8) (5)(4)



(B)

8. Question 8 is not on the current Exam C/4 syllabus 9. [Section 1.3] The expectation of a sum is the sum of the expectations, even when the variables are not independent. (This is not true of variance, as students discovered on a similar question on the Spring 2007 exam.) So E[S m ] 

m X

E[X i ]  m E[X1 ]

i1

since f all X i ’sg are identically distributed. By the conditional mean formula, formula 1.3 on page 9, E[X1 ]  θa 1 E E[X1 | Q]  E[Q], and a the mean of a beta distribution is a+b  100  0.01 here. We want m (0.01) ≥ 50, making m  5000 . (E)

10. [Lesson 47] The prior is gamma with α  4, the exponent on λ, and θ  1/50, the reciprocal of the exponent on e, or γ  θ1  50. Add number of claims, 7 (1) + 2 (2) + 1 (3)  14, to α and number of risks, 100 to γ, so α → 18 and γ → 150. The new mean per risk is αγ and for 100 risks the expected number of claims is 100 11.

18  150

 12 . (D)

[Lesson 62] By limited fluctuation credibility, we need n0 CV2 . The coefficient of variation is the

standard deviation over the mean, or 1.2. n0  rounded up to 2213 . (E) 12.

Φ−1 (0.975)  0.05

2

 1536.64. The answer is 1536.64 (1.2) 2  2212.8,

[Section 35.3] The likelihood with m  2 is

Y 2!

(1 − q ) 2−x i q x i ! X X X 2 l (q )  ln + [ln (1 − q ) ] (2 − x i ) + (ln q ) xi

L (q ) 

xi

xi

Plugging in x i ’s, which are 5000 0’s and 5000 1’s, l ( q )  5000 ln 2 + [ln (1 − q ) ] (10,000 + 5,000) + (ln q )(5000) dl 15,000 5000 − + 0 dq 1−q q q

l

1

4

1 4

 5000 ln 2 + 15,000 ln 34 + 5000 ln 14  3465.74 − 4315.23 − 6931.47  −7780.96

C/4 Study Manual—17th edition Copyright ©2014 ASM

(B)

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 13–16

1635

ˆ ¯ ˆ 13. [Section 58.1] x¯  1+1+0+2+3 2+2+1+3+2  0.7, and by Poisson, µ  v  x  0.7, but since we don’t have individual data by vehicle, we must use formula (57.6) on page 1117 to estimate a. 1+1+0  0.4 5 3+2 m2  3 + 2  5 x¯ 2  1 2+3 5 (0.4 − 0.7) 2 + 5 (1 − 0.7) 2 − 0.7 aˆ   0.04 2 +52 10 − 5 10 5aˆ 5 (0.04) 2 Zˆ    5 aˆ + vˆ 5 (0.04) + 0.7 9 2 7 (C) PC  (0.4) + (0.7)  0.6333 9 9 m1  2 + 2 + 1  5

x¯ 1 

14. [Lesson 24] k is the number of distinct observation points, so X must be one of the other 4 values, eliminating (E). You want to make Hˆ (410) as high as possible by (iv), so you want to make the risk set as small as possible. Thus 100 offers the best opportunity, and works. (A) They had to state (ii) or else 200 would also work. I’m not sure why (iii) is needed. 15.

[Section 35.1] The maximum likelihood estimator is the sample mean, which here is 10 + 2 + 4 + 0 + 6 + 2 + 4 + 5 + 4 + 2  3.9 10

The variance of the sample mean is the distribution variance divided by the number of observations. For a Poisson, the distribution variance equals the mean, which we estimated as 3.9. So the estimated coefficient of variation is √ σˆ 3.9/10 1   √  0.1601 (B) µˆ 3.9 39 16.

[Lesson 45] Here’s the table: θ8 Prior probabilities Likelihood of experience

0.8 1 −5/8 8e

 0.066908

0.2 1 −5/2 2e

 0.041042

Joint probabilities

0.053526

0.008208

Posterior probabilities

0.867036

0.132964

8

2

6.93629

0.26593

Hypothetical means Bayesian premium (E) 17.

θ2

Question 17 is not on the current Exam C syllabus

C/4 Study Manual—17th edition Copyright ©2014 ASM

0.061734

7.20222

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 18–23

1636

18. [Subsection 33.4.1] First shift everything from 4 to 0. Then, as we discussed in Subsection 33.4.1, the shortcut is that the maximum likelihood estimate is the censoring point times total observation count divided by uncensored observation count. Here there are 3 uncensored and 5 total observations, and the shifted estimate is 29 − 4  25, so 5 3 p

 25

p  15

(D)

19. [Section 54.1] We do everything per month. There are n  100 + 150 + 250  500 observations with mean 10+11+14  0.07. The hypothetical mean and process variance are λ. 500 µ  v  E[λ]  θΓ 1 +



1 2 2



 0.1 (0.88623)  0.088623

a  E[λ2 ] − µ2  θ Γ (2) − θ 2 Γ (1.5)  0.01 (1 − 0.886232 )  0.0021460 500 (0.0021460)  0.92371 Z 500 (0.0021460 + 0.088623) PC  0.088623 + 0.92371 (0.07 − 0.088623)  0.071421 The credibility estimate for 12 months times 300 insureds is (12)(300)(0.071421)  257.11 . (B) 20. [Lesson 24] The difference between the two estimates is that the first one will have, in the sum, 3 terms for 2500 and 5000, while the second one will not. Those terms are 12 (at 2500, risk set is 12 and 3 events) and

2 6

(at 5000, risk set is 6 and 2 events). The sum is

1 4

+

1 3



7 12

 0.58333 . (D)

21. [Section 63.1] They put a fancy wrapper on the simulation, but since x  1 for 2005, y is the constant 0.801e −0.747 and we can hold out multiplying by it until the end. The following table derives the simulated lognormals. ui n i  Φ−1 ( u i ) 0.2877 −0.56 0.1210 −1.17 0.8238 0.93 0.6179 0.30 Average

x i  e 13.294+0.494n i 450,161 333,041 939,798 688,451 602,863

602,863 (0.801) e −0.747  228,788 . (A) The official solution by mistake has 330,041 instead of 333,041 above, but it doesn’t affect the answer choice. 22. [Section 40.2] The penalty function is ln 2260  2.78. A quick glance shows that adding parameters to the 1-parameter model never improves the loglikelihood by more than 2 per parameter, which is less than the required 2.78, so model I is favored. (A) 23.

[Lessons 45 and 52] 2.98 is arrived at through a formula of the form w 1 (1) + w 2 (3)  2.98 w1 + w2

2 1 which is essentially a weighted average of 1 and 3 with weights w1w+w and w1w+w . The second weight must 2 2 be 99 times the first weight in order to get 2.98. So w 2  99w 1 . But the weights are the prior distribution

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 24–27

1637

times the likelihood, so the equation comparing the weights is 99 0.75

e −1 r!

!!

 0.25

e −3 3r r!

!

Notice that e −1 /r! and e −3 (3r /r!) are the likelihoods of r claims under the 2 λ’s. Let’s solve equation this for r. 99 (0.75) e −1  0.25e −3 3r 3r  297e 2 2 + ln 297 r 7 ln 3 Now for the Bühlmann credibility estimate. The hypothetical mean and process variance are λ. µ  v  0.75 (1) + 0.25 (3)  1.5 a  Var ( λ )  (0.75)(0.25)(3 − 1) 2  0.75 0.75 1 Z  2.25 3 2 1 (E) PC  (7) + (1.5)  3 31 3 3

by the Bernoulli shortcut

24. [Lesson 27] We could use survival function kernels instead of distribution function kernels (and the official solution does this), but to avoid confusion we’ll use the usual distribution function kernels to estimate F (40) and then take the complement. As a function of 40, the distribution kernel K y (40) is a linear function of the observation y between 30 and 50, so it is 30 − 0.05 ( y − 30) in that range. K y (40)  1 for y ≤ 30 and 0 for y ≥ 50. So the values are y K y (40)

25 1

30 1

35 0.75

35 0.75

37 0.65

39 0.55

45 0.25

47 0.15

49 0.05

55 0

Adding up the kernels, 2 (1) + 2 (0.75) + 0.65 + 0.55 + 0.25 + 0.15 + 0.05  5.15. Divide by the number of points to get the estimate Fˆ (40)  0.515, so Sˆ (40)  1 − 0.515  0.485 . (E) 25.

Question 25 is not on the current Exam C/4 syllabus

26. [Lesson 21] Seems more like an Exam P/1 type of question. f ( x ) is exponential with mean θ. If you were totally stumped, you could try n  1 for which you know from the tables that E[X 2 ]  2θ2 , eliminating everything except (A) and (B). Anyhow, the second moment is the mean squared plus the ¯  θ, while the variance of the sample mean is the distribution variance over the sample variance. E[X]  2 2 size, or θ . So E[X¯ 2 ]  θ 2 + θ  n+1 θ 2 . (A) n

n

n

27. [Section 57.1] The means of the 3 policyholders are x¯ X  3, x¯ Y  5, and x¯ Z  4. The overall mean is x¯  3+5+4  4. 3 The sample variances of the 3 policyholders are (0 terms are omitted)

(2 − 3) 2 + (4 − 3) 2

2  3 3 (4 − 5) 2 + (6 − 5) 2 2 vY   3 3

vX 

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 29–31

1638

vZ − so vˆ 

2 2 4 3+3+3

3

(5 − 4) 2 + (5 − 4) 2 + (3 − 4) 2 + (3 − 4) 2 3



4 3

 89 . Then sample variance of the policyholder means is

(3 − 4) 2 + (5 − 4) 2 + (4 − 4) 2 2

1

By formula (57.3), the estimated variance of the hypothetical means is aˆ  1 − 28.

vˆ 8/9 7 1−  n 4 9

(C)

Question 28 is not on the current Exam C/4 syllabus

29. [Lesson 49] π ( q ) is a beta with a  2, b  2. There are 4 possible claims per year, 2 occur and 2 don’t, so add 2 to both a and b making them a  4, b  4. If you realize that this distribution is symmetric on [0, 1], or you happen to know the formula for the mode, you know the answer is 0.5 . (C) Otherwise, you’d have to derive the formula, as follows. We want to maximize the beta density, and as usual, logging makes it easier to differentiate. f ( x )  kq a−1 (1 − q ) b−1

ln f ( x )  ln k + ( a − 1) ln q + ( b − 1) ln (1 − q ) d ln f ( x ) a − 1 b − 1  − 0 dx q 1−q

( a − 1)(1 − q )  ( b − 1) q [ ( b − 1) + ( a − 1) ]q  a − 1 q and with a  b, this is 0.5.

a−1

( a − 1) + ( b − 1) zp

2

30. [Lesson 42] For the original credibility standard, n0  0.03  2000, where z p is the normal coefficient for probability 2p − 1. For the new standard, the full credibility standard in expected number zp

2 

of claims is λ F  0.05 1 + CV2s , where CVs is the coefficient of variation of the severity distribution. The severity distribution is uniform on [0, 10,000], so its mean is 10,000/2 and its variance is 10,0002 /12. Its coefficient of variation squared is 10,0002 /12 1 CV2s   3 100002 /22 The answer is

31.



0.03 λ F  2000 0.05

!2 

1 1+  960 3



(B)

[Lesson 24] The Nelson-Åalen estimate of H (10) is − ln 0.575  0.5534. Then 1 3 7 + +  0.4146 50 49 21

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTIONS 32–34

0.4146 +

1639

5  0.5534 k

5  0.5534 − 0.4146  0.1388 k 5  36 (C) k 0.1388 32.

[Section 63.1] For an exponential, the inversion formula is x  −θ ln (1 − u ) . The inversion and the

resulting benefit, which is max 0, min 0.8 ( x − 100) , 1000 , is shown in the following table.



ui 0.30 0.92 0.70 0.08





x i  −1000 ln (1 − u i ) 356.67 2525.73 1203.97 83.38

Reimbursement 205.34 1000.00 883.18 0

The average annual reimbursement is 205.34+1000.00+883.18+0  522.13 . (A) 4 Many students divided by 3 instead of by 4. However, the word “annual” means “per year”, not “per paid year”, so dividing by 3 is incorrect. 33. • • •

[Subsection 32.1.2] The likelihoods of the 3 intervals are For x ≤ 10, 1 −

For 10 < x ≤ 24,

For x > 25,

θ 25

10−θ 10 θ θ − 10 25 

θ 10



c1 θ

where c 1 is a constant. We can drop all constants when maximizing likelihood, so we end up with (10 − θ ) 9 θ 11 as the likelihood function. You may already recognize this as a beta distribution with θ  10, 11  a  12, b  10, and mode 10 11+9  5.50 (we just did this in question 29!), but if not: L ( θ )  (10 − θ ) 9 θ 11

l ( θ )  9 ln (10 − θ ) + 11 ln θ 9 11 dl − + 0 dθ 10 − θ θ 9θ  11 (10 − θ ) 110 θ  5.50 (B) 20

34. [Lesson 34] The maximum likelihood estimator for an exponential is the sample mean. The variance of the sample mean is the distribution mean squared divided by the size of the sample. The sample mean is 100 + 200 + 400 + 800 + 1400 + 3100  1000, 6 2 so the maximum likelihood estimate is θˆ  1000 and its estimated variance is 1000 6 . S (1500)  e −1500/θ , so the transformation function for the delta method is g ( x )  e −1500/x . The derivative is 1500 e −1500/x . Setting x  θˆ  1000 and squaring, the estimated variance of the maximum likelihood x2 estimator of S (1500) is

! ! ! −3   10002 15002 −3000/1000 2 e L Var S (1500)  e  1.5  0.01867 4 6

C/4 Study Manual—17th edition Copyright ©2014 ASM

1000

6

(A)

SOLUTIONS TO EXAM C/4, FALL 2006, QUESTION 35

1640

35.

[Lesson 22] From the ogive, we have 36 + 0.4x  0.21 n 36 + x + 0.6y  0.51 n



36 + 0.4x  0.21n



36 + x + 0.6y  0.51n

Subtracting the first equation from the second, 0.6 ( x + y )  0.3n x + y  0.5n Since the other numbers add up to 36 + 84 + 80  200, x + y  200 and n  400. Then Fn (90)  0.21 implies 1 84 observations below 90, or 48 in (50, 90]. Extending the line, x  0.4 (48)  120 . (A)

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 1–6

B.10

1641

Solutions to Exam C/4, Spring 2007

1. [Lesson 32] An exponential has no memory, so the conditional density of f ( x − 50|X > 50) has the same form as f ( x ) , (1/θ ) e − ( x−50)/θ , and the conditional cumulative distribution F ( x − 50|X > 50) has the same form as F ( x ) , e − ( x−50)/θ . We are given the payoffs, x − 50. The product of three densities for the first 3 payments and two distributions for the last 2 payments is 1 −50/θ −150/θ −200/θ −350/θ −350/θ e e e e e θ3

!

and 50 + 150 + 200 + 350 + 350  1100, making the product

1 −1100/θ e θ3

. (D)

2. [Section 56.2] The Bühlmann estimate must be linear, eliminating (D). The Bühlmann estimate must be both above and below the Bayesian, eliminating (B). The Bayesian estimate can’t be below 6 (0.1)  0.6 or above 6 (0.6)  3.6, eliminating (A) and (C). That leaves (E). 3. [Lesson 4.2] A common error on this question was to calculate the variance of each X i and multiply by 101. Since the X i ’s are not unconditionally independent, the variance of S101 is not the sum of the variances. We use conditional variance. Var ( S101 )  E Var ( S101 | Q ) + Var E[S101 | Q]

f

g





E[S101 | Q]  101 E[X i | Q]  101Q

Var ( S101 | Q )  101 Var ( X i | Q )  101Q (1 − Q ) 1 E[Q]   0.01 1 + 99 a ( a + 1) 2 E[Q 2 ]   ( a + b )( a + b + 1) 100 (101) ab 99 Var ( Q )   ( a + b ) 2 ( a + b + 1) 1002 (101)



E 101Q (1 − Q )  101 E[Q] − E[Q 2 ]  101 0.01 −

f

g





Var (101Q )  1012 Var ( Q )  0.9999

Var ( S101 )  0.99 + 0.9999  1.9899

2  0.99 100 (101)



(B)

4. Question 4 is not on the current Exam C/4 syllabus 5. [Lesson 39] Since the variates are uniform, the expected number in each interval is 1000 20  50. Using equation (39.2), 51,850 Q − 1000  37 50 For 19 degrees of freedom (one is lost because of breaking into intervals), the critical value at 1% significance is 36.191, so we reject at 0.01 significance. (E) 6. [Lesson 51] For a Poisson, the process variance equals the hypothetical mean. The expected value of the process variance is v  θ (0.50) + (1 − θ )(1.50)  1.5 − θ C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 7–9

1642

The variance of the hypothetical means (from the Bernoulli shortcut) is a  θ (1 − θ )(1.50 − 0.50) 2  θ − θ 2 The credibility factor is Z

a θ − θ2 θ − θ2   a + v θ − θ 2 + 1.5 − θ 1.5 − θ 2

(A)

7. [Subsection 25.1.2] Total number of claims is 30 + 36 + 18 + 16  100, so the probabilities of the intervals are 0.3, 0.36, 0.18, 0.16. Break (200, 400] up into (200, 350] and (350, 400]; the probability of (200, 350] is 0.75 (0.16)  0.12. For the four intervals up to 350, we calculate the second moment as the probability of the interval times the second moment of the uniform distribution within the interval. 0.3 (502 ) + 0.36

1003 − 503 2003 − 1003 3503 − 2003 + 0.18 + 0.12  47,550 100 − 50 200 − 100 350 − 200 47,550  15,850 3

To this we add the probability of being above 350 times 3502 , since in this interval the minimum of X and 350 is 350. 15,850 + 0.04 (3502 )  20,750 (E) 8. [Lesson 17] We will do this directly rather than using the recursive formula. e −3 appears in every aggregate probability of k, g k , so we will ignore it until the end. If p k is the Poisson probability of k, let q k  p k e 3 and let h k  g k e 3 . q0  1

q1  3

h0  1

q2 

32  4.5 2!

q3 

33  4.5 3!

h 1  3 (0.4)  1.2 h 2  3 (0.3) + 4.5 (0.42 )  1.62 h 3  3 (0.2) + 4.5 (2)(0.4)(0.3) + 4.5 (0.43 )  1.968 On the last line, we computed the probabilities of 1 claim of 3, 2 claims of 1 and 2, and 3 claims of 1. The total of the probabilities is e −3 (1 + 1.2 + 1.62 + 1.968)  5.788e −3  0.2882

(B)

9. [Section 63.1] 0.981 is between the probabilities of 3 and 4, so there are 4 simulated claims. To invert the Pareto,

! 2.8

36 1−u  36 + x √ 36 2.8  1−u 36 + x 36 x  2.8 − 36 √ 1−u C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 10–12

1643

The following table derives the four claims amounts. We could suspect that the lowest one will be below 5 and the highest one above 35, but with so few observations it doesn’t pay to use the shortcut of calculating F (5) and F (35) . u 0.571 0.932 0.303 0.471

x 12.7044 58.0294 4.9535 9.1927

max[0, min (30, x − 5) ] 7.7044 30 0 4.1927

The sum is 7.7044 + 30 + 0 + 4.1927  41.8971 . (B) 10.

[Lessons 30 and 31] The median of an exponential is x such that e −x/θ  0.5 x  −θ ln 0.5  θ ln 2

Shifting by δ adds δ to the mean and median, so that the 2 equations we have are θ + δ  300 θ ln 2 + δ  240 Substitute θ  300 − δ into the second equation. 300 ln 2 + δ (1 − ln 2)  240 240 − 300 ln 2  104.47 δ 1 − ln 2 11.

(E)

[Section 57.1] We can use the uniform formula. 1+3+1 5  3 3 5 + 9 + 6 20  x¯  3 3 !!     1 20 2 20 2 20 2 vˆ aˆ  5− + 9− + 6− − 2 3 3 3 3

vˆ 

1 5  * 2 3

!2

7 + 3

!2

!2

2 + 5 34 −  + 3 9 9

,

-

3aˆ 34/3 34    0.8718 Zˆ  3aˆ + vˆ 34/3 + 5/3 39

(D)

12. [Lesson 26] Sˆ (8) is extraneous. Since we start off with 200, Sˆ (9)  0.16, and there’s no censoring, the risk set at time 10 is 200 (0.16)  32. By the Greenwood approximation formula, equation (26.1) on page 437, the difference in the c S2 is s 10 r10 ( r10 − s10 ) s 10 0.04045 − 0.02625  32 (32 − s 10 ) c S2 (10) − c S2 (9) 

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 13–16

1644

s10  0.0142 (32)(32 − s10 )  0.4544 (32) − 0.4544s10

s10 (1.4544)  14.5408 14.5408  10 s10  1.4544

(A)

13. [Lesson 9 or Section 14.2] The coefficient of variation is scale-free, so let’s make the exponential have mean 1 and make the deductible 3. Let p be Pr ( X > 3)  e −3 . Then Y is a mixture of 0, weight 1 − p, and an exponential with mean 1, weight p. The mean is E[Y]  p and the second moment is

E[Y 2 ]  p (2)

The coefficient of variation is the square root of variance over the mean, or

p

14.

2p − p 2  p

√ 2e −3 − e −6  6.2587 e −3

(C)

[Lesson 40] The loglikelihood function is L (θ) 

20 Y i1

2θ 2 (θ + xi )3

l ( θ )  20 ln 2 + 40 ln θ − 3

X

ln ( x i + θ )

The difference between l (7.0) and l (3.1) is 40 ln (7) − 40 ln (3.1) − 3

X

ln ( x i + 7) −

X

ln ( x i + 3.1)  32.58 − 3 (49.01 − 39.30)  3.45



The test statistic is twice the difference, or 6.90. There is one degree of freedom (only θ is being fixed at 3.1). The critical value at 1% significance is 6.635, so we reject at 1% significance. (E) 15. [Lesson 49] After one year, since there are 8 potential claims per year, a → a + 2 and b → b + 6  15, a+2   28 so (since 2.54545  28/11) 8 a+17 11 which implies a  5. After two years, a → a + 2 + k  7 + k and  112 b → b + 6 + 8 − k  23 − k, and 3.73333  112/30, so 8 7+k 30  30 making k  7 . (D) 16. [Lesson 27] F5 (150) is the proportion of observations below 150, or 0.4. For Fˆ (150) , 82 has a kernel of 1 since it is more than 50 less than 150. For 126, 150 is 0.74 of the way into the kernel (which goes from 76 to 176), so its kernel is 0.74. For 161, 150 is 0.39 of the way from 111 to 211, so its kernel is 0.39. The other two observations have a kernel of 0 at 150. We sum up the probabilities and divide by 5, the number of observations. 1 + 0.74 + 0.39  0.426 Fˆ (150)  5 The difference is |0.426 − 0.4|  0.026 . (C)

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 17–24

1645

17. [Lesson 15] To make the calculations easier, we’ll scale single-losses by 10,000: make the mean 2 and the standard deviation 0.5. This has no effect on the result which is scale-free (i.e., aggregate claims being 150% times expected costs). Unfortunately, only severity may be scaled (this is like translating it into a different currency); frequency may not be scaled! Using compound variance, E[S]  100 (2)  200 Var ( S )  100 (0.52 ) + 252 (22 )  2525 300 − 200  1 − Φ (1.99)  0.0233 Pr ( S > 300) ≈ 1 − Φ √ 2525

!

18.

(A)

[Lesson 35] Total number of policies is 3000. The MLE for a Poisson is the sample mean: 1200 (1) + 600 (2) + 200 (3) 1 3000

Since it is Poisson, the variance of one observation is the mean, and the variance of the average √ of 3000 observations is the mean over 3000. So the lower end-point of the confidence interval is 1−1.645 1/3000  0.97 . (C) They carefully said “large-sample confidence interval that is symmetric around the mean” so that you would not use the method of Example 21I. 19.

Question 19 is not on the current Exam C/4 syllabus

20. [Lesson 37] We must find the largest difference between the F ∗ ( x ) column and the last two columns. Since F ∗ ( x ) > Fn ( x ) in this range, the largest difference is always with the Fn ( x − ) column. Observed x 4.26 4.30 4.35 4.36 4.39 4.42

F ∗ ( x ) Fn ( x − ) Fn ( x ) Max difference 0.584 0.505 0.510 0.079 0.599 0.510 0.515 0.089 0.613 0.515 0.520 0.098 0.621 0.520 0.525 0.101 0.636 0.525 0.530 0.111 0.638 0.530 0.535 0.108 √ The largest difference is 0.111. 0.111 200  1.57, which is between 1.52 and 1.63. (D) 2a 21. [Lesson 51] We have Z  2a+v  0.25. The hypothetical mean is αθ, so a  Var ( αθ )  θ2 Var ( α ) . The process variance is αθ 2 , so v  E[αθ2 ]  θ 2 E[α]  50θ 2 . Then

2 Var ( α )  0.25 2 Var ( α ) + 50 2 Var ( α )  0.5 Var ( α ) + 12.5 12.5 Var ( α )   8.33 (A) 1.5 22–23.

Questions 22–23 are not on the current Exam C/4 syllabus

24. [Lesson 31] The smoothed 20th percentile is the 0.2 (17)  3.4th element. Interpolating between 75 and 81, (0.4)(81) + (0.6)(75)  77.4. C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 25–26

1646

The smoothed 70th percentile is the 0.7 (17)  11.9th element. Interpolating between 122 and 125, we get (0.9)(125) + (0.1)(122)  124.7. τ The distribution function for a Weibull is F ( x )  1 − e − ( x/θ ) . We set it equal to the percentiles. e − (77.4/θ )  0.8 τ

e − (124.7/θ )  0.3 τ

77.4 θ



124.7 θ



124.7 77.4



 − ln 0.8  − ln 0.3

ln 0.3  5.3955 ln 0.8 5.3955 5.3955   3.5342 τ ln (124.7/77.4) 0.47692 √ 77.4 √τ 3.5342  − ln 0.8  − ln 0.8  0.65416 θ 77.4  118.32 (E) θ 0.65416

25.



[Lesson 58] You are given that there are 100 policies. The sample mean and variance are 34 + 13 (2) + 5 (3) + 2 (4)  0.83 100 34 + 13 (22 ) + 5 (32 ) + 2 (42 ) µ2   1.63 100  100  aˆ + vˆ  s 2  1.63 − 0.832  0.95061 99 µˆ  vˆ 

As usual, you would get almost the same answer if you didn’t unbias the sample variance. The credibility factor is aˆ 0.95061 − 0.83 Zˆ    0.12687 aˆ + vˆ 0.95061 Since Year 6 is 1/5 of the observation period, the estimate for the number of claims is 1/5 the credibility calculation, which is 1 5



µˆ + Zˆ (3 − µˆ ) 



1 5



0.83 + 0.12687 (3 − 0.83)  0.2211



(A)

26. [Lesson 28] This question is based on the notation in Loss Models Third Edition. P j is the initial population before new entrants at the start of the year. d j are new entrants, u j are withdrawals, and x j are events. α  1 means that it is assumed that all new entrants enter at the beginning of the year, and β  0 means that it is assumed that all withdrawals are at the end of the year. Since we’re not interested in claims below 500, we can start our calculation at the second line, since the product-limit is always a conditional survival function. α  1 and β  0 the risk set is made as large as possible—all entrants are considered and no withdrawals are considered. A claim payment of 5500 corresponds to c j  6000. The calculation is

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 28–29

r j  Pj + d j

sj  xj

500

11

2

9 11

1000

16

4

3 4 (0.8182)

 0.6136

2750

11

7

4 11 (0.6136)

 0.2231

5500

3

1

2 3 (0.2231)

Question 27 is not on the current Exam C/4 syllabus

28.

[Lesson 31] The mean of the data is ∞ 0

Sˆ ( c j+1 | X > 500)

cj

27.

Z

1647

 0.8182

 0.1488

S ( x ) dx  10 (1) + (100 − 10)(0.8) + (1000 − 100)(0.4)  442

To determine the parameters of the lognormal model, we use the 20th and 60th percentiles and match the lognormal to them. Φ−1 (0.2)  −0.842 and Φ−1 (0.6)  0.25. µ − 0.842σ  ln 10

µ + 0.25σ  ln 100 ln 100 − ln 10 σ  2.109 0.25 + 0.842 µ  ln 10 + 0.842σ  ln 10 + 0.842 (2.109)  4.078

2

The mean is e 4.078+0.5 (2.109 )  545.2. The difference in means is 545.2 − 442  103.2 . (B) 29. [Lesson 64] A lot of calculation. We must consider 16 possibilities: fire losses of (3,3), (3,4), (4,3), (4,4) combined with wind losses of (0,0), (0,3), (3,0), (3,3). Switching the order of losses can be considered a single case with double weight, but that still leaves 9 possibilities, 3 for fire and 3 for wind. If wind losses are (0,0) there is no savings regardless of fire, so we have 7 cases. For each case, we take the difference of the percentage savings from 0.2 and square it. The 7 cases, with their weights, are Weight 4

Wind losses (0,0)

Fire losses any

Savings 0

Square difference 0.04

2

(0,3)

(3,3)

2 9

0.000494

4

(0,3)

(3,4)

0.2

0 0.000331

2

(0,3)

(4,4)

2 11

1

(3,3)

(3,3)

1 3

0.017778

2

(3,3)

(3,4)

4 13

0.011598

1

(3,3)

(4,4)

4 14

0.007347

C/4 Study Manual—17th edition Copyright ©2014 ASM

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 30–35

1648

The weighted sum is 4 (0.04) + 2 (0.000494) + 2 (0.000331) + 0.017778 + 2 (0.011598) + 0.007347  0.0131 16

(E)

30. [Lesson 50] The inverse gamma has parameters θ  c and α  2 (one less than the exponent on β). Adding one loss of x adds x to c and 1 to 2, so the posterior is an inverse gamma with parameters θ  c + x and α  3. The mean of an inverse gamma is 31.

θ α−1



x+c 2

. (C)

[Subsection 33.4.2] By the Pareto shortcut, K  ln Q αˆ  −

5007  ln 0.02267  −3.7866 (400 + x i )

7  1.8486 K

The expected loss with no deductible is θ 400  471.35  αˆ − 1 0.8486

(A)

32. [Lesson 54] None of these conditions are required. (B) and (C) are needed for certain types of limited fluctuation credibility, and (A) is needed for ungeneralized Bühlmann. (E) 33. [Lesson 26] Through time 12, events of interest only occur at times 8 and 12. The risk sets and events are yi 8 12

ri 7 5

si 1 2

Hˆ ( y i ) 0.1428 0.1428 + 0.4  0.5428

Note that although someone switches to diet program at time 12, withdrawers are part of the risk set when they withdraw at an event time. The Åalen variance is 712 + 522  0.1004. The upper limit of the symmetric 90% linear confidence interval √ is 0.5428 + 1.645 0.1004  1.0641 . (D) 34.

Question 34 is not on the current Exam C/4 syllabus

35.

[Lesson 45] Using Bayes’ Theorem: 1 Pr G  3

    ! Pr D  0 G  31 Pr G  13 D  0          Pr D  0 G  31 Pr G  13 + Pr D  0 G  53 Pr G  35  

C/4 Study Manual—17th edition Copyright ©2014 ASM

(1/3)(2/5) (1/3)(2/5) + (1/5)(3/5) 2/15 2 15

+

3 25



10 19

(E)

SOLUTIONS TO EXAM C/4, SPRING 2007, QUESTIONS 36–40

1649

36. [Lesson 52] The process variance is constant, 10002 , so v  10002 . θ is the hypothetical mean. Its mean is 0.6 (2000) + 0.3 (3000) + 0.1 (4000)  2500 and its variance is computed by calculating the second moment. 0.6 (20002 ) + 0.3 (30002 ) + 0.1 (40002 )  6,700,000 a  6,700,000 − 25002  450,000 There were 24 + 30 + 26  80 exposures, so the credibility factor is Z

80 (450,000) 80a   0.9730. 80a + v 80 (450,000) + 1,000,000

The experience mean aggregate losses are 24,000+36,000+28,000  1100. The Bühlmann-Straub credibility 80 expectation is 2500 + 0.9730 (1100 − 2500)  1137.84 . (B) [Lesson 62] This is like a limited fluctuation credibility situation, and the answer is n0 CV2 . n0   257.62  66,358. The probability of being below 300 is a Bernoulli distribution (either you are or 0.01   you aren’t) with mean F (300)  1 − e −3 and variance F (300) 1 − F (300)  (1 − e −3 ) e −3 , so the coefficient of 37.

z0.995  2

variation squared is e −3 / (1 − e −3 )  0.05240. The final answer is 66,358 (0.05240)  3477 , which is within rounding of (E) 38.

[Lesson 24] s3 is extraneous. From S n ( y3 ) and S n ( y4 ) , we have r4 − s 4 r4 10 r4 − 3  13 r4 r4  13

0.50  0.65

Then r5  r4 − s4 − 6  13 − 3 − 6  4. From S n ( y5 ) , we have

r5 − s 5 4 − s 5  r5 4 s5 1−  0.5 4 s5  2 (B)

0.25  0.50

0.3

39. [Lesson 13] The probability of a loss above 200 is e − (200/1000)  0.539542. The number of losses above 200 is the total number of losses (rβ  (3)(5)  15) times the probability that a loss is above 200, or (15)(0.539542)  8.0931 . (C) 400 40. [Lesson 23] The mean is 2000  0.2. The variance of the sample mean, since it’s binomial (there (0.2)(0.8) are only 2 possibilities), is 2000  0.00008. The upper bound of the 95% symmetric confidence interval √ is 0.2 + 1.96 0.00008  0.2175 . (D)

C/4 Study Manual—17th edition Copyright ©2014 ASM

Appendix C. Lessons Corresponding to Syllabus Portion of Loss Models Fourth Edition

C/4 Study Manual—17th edition Copyright ©2014 ASM

Loss Models section

Sections of this manual

3.1 3.2–3.3 3.4–3.5 4 5.1 5.2.1–5.2.3 5.2.4–5.2.6 5.3.1–5.3.3,5.4 5.3.4 6.1–6.2 6.3 6.4 6.5 6.6 8.1 8.2 8.3 8.4 8.5 8.6 9.1–9.2 9.3 9.4 9.5 9.6 introduction 9.6.2–9.6.4 9.6.5 9.7 9.8.1 9.8.2 10 11 12.1 12.2 12.3 12.4 13.1 13.2 13.3 13.4 14.1 14.2

1,5,6 1,5 8 2 2 2 4 2 8.5 11 11,12 11 11,33.6 11 5,6 6,10 7,10 5,10 9,10 13 14 14,15,18 19.1 18 16,17 16,17 19.2 16 14 15 21 22,25 24,25 23,25,26 27 28 30,31 32,33 34.1–34.3.1 34.3.2 35.1 35.2 1651

APPENDIX C. CROSS REFERENCE FROM LOSS MODELS

1652

C/4 Study Manual—17th edition Copyright ©2014 ASM

Loss Models section

Sections of this manual

14.3 14.4 15.1–15.2 15.3 16.1–16.3 16.4.1 16.4.2 16.4.3 16.4.4–16.5 17.2 17.3 17.4–17.6 18.1 18.2 18.3 18.4 18.5 18.6 18.7 19.1–19.2 19.3 20.1 20.2 20.3 20.4

35.3 35.4 45 45–50,55 36 37 38 39 40 42 42,43 44 45,46 4.2,45,46 45,46 51–53,56 51–53 54 55 57 58 60,63 61 62 63,64

Appendix D. Lessons Corresponding to Questions on Released and Practice Exams The following tables help you find where relevant released exam questions are discussed in this manual. The tables are: • Table D.1: Lessons corresponding to questions on released Exams C/4. This table lists the lessons in this manual corresponding to relevant released questions from old Exam C/4. Excluded are questions on regression and time series (pre 2005) and cubic spline questions (2005–2006) as well as a couple other topics no longer on the syllabus. Occasionally there were questions that did not relate directly to the syllabus at the time but which you were expected to answer based on your background knowledge or your knowledge from previous exams. Even though they are not discussed in this manual, I labeled them "GK" (general knowledge) rather than "NS". • Table D.2: Lessons corresponding to questions on released Exams M/3. This table lists the lessons in this manual corresponding to questions on material that used to be on Exam M/3 before 2007 that was moved to this exam, such as modeling frequency and severity and simulation. I’ve also included some questions from Exam 3 on statistics, even though that material is still on Exam 3 and wasn’t moved. Those questions (on topics such as estimator quality and maximum likelihood) are simpler than the ones likely to appear on this exam, but may still be useful. • Table D.3: Lessons corresponding to practice exam questions. This table lists the lessons in this manual corresponding to the practice exam questions. • Tables D.4–D.5: Released exam questions corresponding to the 306 sample questions, and the page in this manual where they are discussed. A few semesters ago, the SOA compiled a list of 306 sample questions for this exam (actually less, since they subsequently deleted a few that were not on the syllabus). Most of these questions come from the released exams, so they do not provide you with any new material. However, the last 10 questions are new material, related to the change in syllabus starting with the October 2013 exam. These tables list the corresponding exam questions and the page in this manual having either the question or the solution. In a few cases, the question is not included in this manual. For example, the list includes some simulation questions from old Exam 3 which involved simulating a life contingencies situation. It is unlikely a question on Exam C/4 would use a life contingencies situation, so I have not included these questions in this manual. • Table D.6: Sample questions from the 306 corresponding to old exam questions. This list is for someone looking for questions from released exams not in the 306 sample questions. None of the 306 questions are from the Spring 2000, Fall 2000, Spring 2001, or Spring 2007 exams. There are very few questions from the other released exams that are on the syllabus and not in the 306. Note that the 2005 and 2006 exams had 35 questions rather than 40.

C/4 Study Manual—17th edition Copyright ©2014 ASM

1653

APPENDIX D. EXAM QUESTION INDEX

1654

Table D.1: Lessons corresponding to questions on released Exams C/4

Exam C/4 Question Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

Spring 2000 NS 31 51 24 NS 9 45 NS NS 50 37 NS NS NS 57 NS 64 21 26 NS 33 45 NS NS 34 44 NS NS 39 47 NS 31 58 NS NS 30 53 26 NS 35

Fall 2000 NS 30 45 24 NS 32 58 NS NS NS 49 NS DF 43 NS 57 NS 9 52 26 NS 33 50 NS NS 64 57 45 NS NS NS 1 45 33 NS NS NS 54 31 NS

Spring 2001 NS 47 1 24 NS 51 33 NS NS 45 52 37 NS 26 NS 33 NS 46 39 40 NS NS 54∗ NS 34 NS NS 45 NS 33 NS 57 NS 35 NS NS 46 53 30 NS

Fall 2001 NS 25 47 NS NS 36 45 NS NS 33 52 NS NS 46 44 NS 62 53 24 NS NS 34 52 NS 39 54 28 NS 14 57 NS NS 30 46 NS 11 26 53 NS 32

GK=general knowledge (try these questions) NS=non-syllabus (skip these questions) DF=defective * Not covered in all three credibility options

C/4 Study Manual—17th edition Copyright ©2014 ASM

Fall 2002 NS 31 55 24 NS 35 56 26 NS 33 57 NS 30 43 NS NS 37 53 NS NS 46 NS 32 46 24 NS NS 39 52 NS 21 54 NS NS 64 14 31 NS 45 32

Fall 2003 NS 31 42 27 NS 33 49 30 NS NS 53 NS GK 45 57 39 NS 34 46 NS 26 26 52 30 NS 64 54 40 NS GK 49 35 NS 33 44 NS 25 NS 45 24

Fall 2004 55 31 NS 24 45 32 NS 35 54 39 NS 26 46 30 NS 64 57 33 NS 27 43 40 NS 30 52 33 NS NS 53 31 NS 35 46 35 NS 33 58 37 NS 21

Spring 2005 37 43 24 64 36 53 28 NS 33 34 51 60 35 46 26 21 53 NS 39 52 47 27 NS 30 57 22 33 58 NS NS 33 56 39 63 45

Fall 2005 22 55 31 NS 32 33 53 63 27 39 57 NS NS 34 45 62 26 34 52 34 30 58 NS NS 40 56 63 21 35 58 36 46 22 39 42

Fall 2006 31 45 25 63 32 52 26 NS 49 47 62 35 58 24 35 45 28 33 54 24 63 40 52 27 NS 21 57 NS 49 42 24 63 32 34 22

Spring 2007 32 56 4 NS 39 51 25 17 63 31 57 26 9 40 49 27 15 35 NS 37 51 NS NS 31 58 28 NS 31 64 50 33 54 26 NS 45 54 62 24 13 23

APPENDIX D. EXAM QUESTION INDEX

1655

Table D.2: Lessons corresponding to questions on released Exams M/3

Exam M/3 Joint Exams Spring Fall Spring Fall Q 2000 2000 2001 2001 1 NS NS NS NS 2 NS 15 NS NS 3 NS NS 12 NS 4 12 NS NS NS 5 NS NS NS NS 6 NS NS NS 13 7 NS NS NS 15 8 NS 14 NS NS 9 NS NS NS NS 10 NS NS NS NS 11 18 NS 63 NS 12 NS NS NS NS 13 NS 11 NS NS 14 NS NS NS NS 15 NS NS 12 NS 16 15 NS 15 NS 17 NS NS NS NS 18 NS NS NS 18 19 14 NS 18 NS 20 NS NS NS NS 21 NS 14 NS NS 22 NS NS NS NS 23 NS NS NS NS 24 NS NS NS NS 25 10 NS 11 NS 26 NS NS 13 NS 27 NS 10 NS 12 28 NS NS NS 5 29 NS NS 14 NS 30 9 NS 18 NS 31 NS NS NS NS 32 63 15 NS 63 33 NS NS NS NS 34 NS NS NS NS 35 NS NS NS 6 36 NS NS 14 18 37 11 NS NS 1 38 NS NS NS NS 39 NS NS NS NS 40 NS NS NS NS 41 6 42 6 43 NS 44 NS

Fall Fall 2002 2003 NS NS NS NS NS 10 NS 15 12 63 15 NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS 18 NS NS NS NS NS NS 13 NS NS NS NS NS NS 63 NS NS NS NS NS NS NS 14 NS 11 NS NS 7 NS NS NS NS NS NS NS 14 NS 7 NS NS 17 NS 10 NS NS NS NS NS NS 63

SOA exams Fall Spring Fall 2004 2005 2005 NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS 63 NS NS 9 NS NS 13 NS NS NS 6 NS NS 48 NS NS NS NS NS NS NS NS NS NS NS NS 6 15 NS NS NS NS NS 13 14 46 7 18 15 NS 11 18 NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS 6 NS NS 17 NS NS 7 NS NS NS NS NS NS NS 14 NS 14 6 NS 63 NS NS 63 4 14 NS NS 4 NS NS NS NS NS NS NS NS 14 NS 45 14 NS 15 NS

CAS exams Fall Fall Spring Fall Spring Fall Spring Fall 2006 2003 2004 2004 2005 2005 2006 2006 NS NS NS NS NS 32 30 NS NS NS NS NS NS NS 32 32 NS NS NS NS NS NS 21 30 NS NS NS NS 6 35 21 NS NS NS NS NS NS NS NS NS 9 NS NS NS 16 21 NS NS 18 NS NS NS 14 NS NS NS NS NS NS NS 14 NS NS NS NS NS NS NS 14 NS NS NS NS NS NS NS 12 NS NS NS NS NS NS NS NS NS NS NS NS 11 NS NS NS NS NS NS NS NS NS NS NS NS NS NS NS 11 NS NS NS NS NS NS NS 12 NS NS 11 NS NS NS NS 8 NS NS 11 NS NS NS NS 1 13 13 45 NS NS NS NS 11 NS NS 33 NS NS 4 NS 1 16 NS 30 2 NS 4 9 NS 7 NS 35 6 NS 4 14 9 6 NS 21 2 NS NS 11 9 14 11 NS 10 NS NS NS NS NS 11 NS NS NS 11 NS 15 NS 1 NS 13 NS 13 NS 15 NS 6 NS NS 1 1 NS NS NS 5 NS NS 2 NS NS NS NS 8 NS NS 2 NS NS NS 1 1 11 NS NS NS 9 NS 6 6 NS NS NS 14 14 15 NS NS NS 15 45 9 9 NS NS 14 NS NS 11 13 15 NS 11 15 NS NS 13 NS NS NS NS 9 NS 7 NS NS NS NS 2 NS NS 15 NS NS NS NS 9 NS 6 NS NS NS NS NS NS NS NS NS 19 NS NS NS 18 63 NS NS NS NS NS 60 14 NS NS NS NS NS 4 NS 15 60 NS NS 6 NS 17 NS 18 NS NS NS NS NS

NS=non-syllabus (skip these questions) Note: For CAS3 exams Spring 2005–Fall 2006 only, this table includes relevant statistics questions. C/4 Study Manual—17th edition Copyright ©2014 ASM

APPENDIX D. EXAM QUESTION INDEX

1656

Table D.3: Lessons corresponding to practice exam questions

Q # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35

C/4 Study Manual—17th edition Copyright ©2014 ASM

1 4 21 64 13 25 27 32 45 12 31 46 26 34 63 32 39 34 33 26 14 28 57 53 58 30 37 24 9 8 11 24 7 36 18 42

2 1 22 26 42 39 53 45 45 64 52 32 28 49 60 30 52 61 25 11 58 62 14 37 33 27 32 13 57 53 9 32 21 31 3 8

3 2 33 24 21 28 34 5 46 14 25 23 53 24 30 27 60 32 6 53 61 34 62 39 57 8 24 46 16 52 33 54 26 37 61 42

4 7 26 21 61 39 32 30 8 28 34 42 49 26 33 11 34 5 1 54 63 4 32 54 46 13 4 53 45 58 60 17 31 42 47 3

5 8 25 38 27 32 49 57 46 27 33 5 30 21 15 28 45 63 26 31 6 48 24 26 32 61 34 30 13 19 63 34 32 45 5 42

Practice Exams 6 7 8 1 42 25 16 51 6 25 5 53 30 39 15 8 1 55 28 8 40 60 13 21 2 19 10 50 28 28 25 32 18 45 45 54 26 57 35 57 34 56 19 39 46 21 30 64 31 5 32 32 32 61 15 40 53 32 63 19 46 45 53 36 38 34 37 62 33 40 37 46 27 26 33 64 32 57 11 27 35 4 53 45 48 5 11 34 24 7 52 21 6 35 24 62 25 61 58 63 46 46 9 56 36 43 25 15

9 8 37 40 24 9 45 12 2 39 9 33 27 57 64 30 32 63 58 34 28 15 16 4 34 25 46 49 50 32 53 61 24 18 21 43

10 42 63 7 55 28 30 26 32 4 8 27 31 15 46 23 34 58 46 9 6 11 57 50 26 4 21 45 47 33 61 18 35 24 33 37

11 5 58 39 45 33 30 24 6 63 33 31 27 14 11 64 9 46 23 53 45 34 25 42 40 21 26 35 61 40 57 1 49 33 34 62

12 7 47 33 4 57 54 28 25 18 27 63 62 46 9 45 23 32 30 46 8 64 46 31 11 21 30 44 15 24 61 36 53 26 1 44

13 6 47 64 25 28 34 48 17 52 40 63 11 57 32 52 39 8 56 63 44 17 18 25 7 52 9 33 61 24 27 46 60 21 49 31

APPENDIX D. EXAM QUESTION INDEX

1657

Table D.4: Exam questions corresponding to the 306 sample questions, and the page in this manual where they appear—Part I

Exam Exam Q Question Page Q Question Page 1 4-F03:2 567 41 4-F02:18 1059 2 4-F03:3 842 42 Deleted 3 4-F03:4 476 43 4-F02:21 921 4 4-F03:6 626 44 4-F02:23 583 5 4-F03:7 968 45 4-F02:24 921 6 4-F03:8 541 46 4-F02:25 408 7 Deleted 47 4-F02:28 782 8 4-F03:11 1059 48 4-F02:29 1021 9 Deleted 49 4-F02:31 367 10 Deleted 50 4-F02:32 1080 11 4-F03:14 890 51 Deleted 12 4-F03:15 1126 52 4-F02:35 1268 13 4-F03:16 779 53 4-F02:36 243 14 4-F03:18 668 54 4-F02:37 563 15 4-F03:19 921 55 4-F02:39 886 16 4-F03:21 446 56 4-F02:40 590 17 4-F03:22 447 57 4-F01:2 431 18 4-F03:23 1024 58 4-F01:3 948 19 4-F03:24 540 59 4-F01:6 719 20 4-F03:26 1268 60 4-F01:7 886 21 4-F03:27 1081 61 4-F01:10 618 22 4-F03:28 805 62 4-F01:11 1017 23 4-F03:30 591 63 Deleted 24 4-F03:31 968 64 4-F01:14 921 25 4-F03:32 702 65 4-F01:15 867 26 4-F03:34 620 66 4-F01:17 1231 27 4-F03:35 865 67 4-F01:18 1059 28 4-F03:37 431 68 4-F01:19 408 29 4-F03:39 884 69 4-F01:22 672 30 4-F03:40 409 70 4-F01:23 1020 31 4-F02:2 562 71 4-F01:25 778 32 4-F02:3 1094 72 4-F01:26 1080 33 4-F02:4 408 73* 4-F01:27 506 34 4-F02:6 698 74 Deleted 35 4-F02:7 1108 75 4-F01:33 540 36 4-F02:8 446 76 4-F01:34 919 37 4-F02:10 625 77 4-F01:37 446 38 4-F02:11 1125 78 4-F01:38 1049 39 4-F02:14 855 79 4-F01:40 589 40 4-F02:17 735 80 Deleted *Question 73A is a version of question 73.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Q 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120

Exam Question 3-F01:13 3-F03:5 3-F03:40 3-F03:3 3-F03:4 3-F03:19 3-F03:29 3-F03:33 3-F03:34 3-F02:5 3-F02:6 3-F02:16 3-F02:27 3-F02:28 3-F02:36 3-F02:37 3-F01:6 3-F01:7 3-F01:18 3-F01:28 3-F01:35 3-F01:36 3-F01:37 3-S01:3 3-S01:15 3-S01:16 3-S01:19 3-S01:25 3-S01:26 3-S01:29 3-S01:36 3-F00:2 3-F00:8 3-F00:13 3-F00:21 3-F00:27 3-F00:31 3-F00:32 3-F00:41 3-F00:42

Page 1212 1246 1247 181 265 225 125 265 125 216 264 317 264 196 300 181 284 265 315 90 105 317 17 215 213 264 315 196 279 244 244 263 243 198 245 181 91 264 281 281

Q 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160

Exam Question Deleted 3-F04:6 3-F04:7 3-F04:8 3-F04:15 3-F04:17 3-F04:18 Deleted Deleted 3-F04:32 3-F04:33 3-F04:34 4-F04:1 4-F04:2 4-F04:4 4-F04:5 4-F04:6 4-F04:8 4-F04:9 4-F04:10 4-F04:12 4-F04:13 4-F04:14 4-F04:16 4-F04:17 4-F04:18 4-F04:20 4-F04:21 4-F04:22 4-F04:24 4-F04:25 4-F04:26 Deleted 4-F04:29 4-F04:30 4-F04:32 4-F04:33 4-F04:36 4-F04:37 4-F04:38

Page 1247 166 226 266 284 125 240 1247 1248 1094 563 409 886 590 703 1081 782 446 919 541 1268 1127 627 474 855 804 541 1022 618 1060 567 698 922 615 1156 736

APPENDIX D. EXAM QUESTION INDEX

1658

Table D.5: Exam questions corresponding to the 306 sample questions, and the page in this manual where they appear—Part II

Q 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200

Exam Question 4-F04:40 M-S05:9 M-S05:10 M-S05:17 M-S05:18 M-S05:19 M-S05:31 M-S05:32 M-S05:34 M-S05:39 M-S05:40 C-S05:1 C-S05:2 C-S05:3 C-S05:4 C-S05:5 C-S05:6 Deleted C-S05:9 C-S05:10 C-S05:11 C-S05:12 C-S05:13 C-S05:14 C-S05:15 C-S05:16 C-S05:17 Deleted C-S05:19 C-S05:20 C-S05:21 C-S05:22 C-S05:24 C-S05:25 C-S05:26 C-S05:27 C-S05:28 Deleted C-S05:31 C-S05:32

Page 365 1601 1601 1601 1602 1602 1602 1602 1602 1603 1603 736 856 410 1269 717 1051 673 673 991 1192 704 923 445 368 1061 783 1023 948 477 534 1128 379 626 1157 622 1109

C/4 Study Manual—17th edition Copyright ©2014 ASM

Q 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240

Exam Question C-S05:33 C-S05:34 C-S05:35 M-F05:17 M-F05:18 M-F05:19 M-F05:26 M-F05:27 M-F05:28 M-F05:34 M-F05:35 M-F05:38 M-F05:39 C-F05:1 C-F05:2 C-F05:3 C-F05:5 C-F05:6 C-F05:7 C-F05:8 C-F05:9 C-F05:10 C-F05:11 Deleted C-F05:14 C-F05:15 C-F05:16 C-F05:17 C-F05:18 C-F05:19 C-F05:20 C-F05:21 C-F05:22 Deleted C-F05:25 C-F05:26 C-F05:27 C-F05:28 C-F05:29 C-F05:30

Page 783 1249 888 1608 1608 1608 1609 1609 1610 1610 1610 1611 1611 1612 1612 1612 1613 1613 1613 1614 1614 1614 1615 1615 1616 1616 1617 1617 1618 1618 1618 1619 1619 1619 1619 1620 1620 1620

Q 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280

Exam Question C-F05:31 C-F05:32 C-F05:33 C-F05:34 C-F05:35 C-F06:1 C-F06:2 C-F06:3 C-F06:4 C-F06:5 C-F06:6 C-F06:7 C-F06:9 C-F06:10 C-F06:11 C-F06:12 C-F06:13 C-F06:14 C-F06:15 C-F06:16 Deleted C-F06:18 C-F06:19 C-F06:20 C-F06:21 C-F06:22 C-F06:23 C-F06:24 C-F06:26 C-F06:27 Deleted C-F06:29 C-F06:30 C-F06:31 C-F06:32 C-F06:33 C-F06:34 C-F06:35 M-F06:6 M-F06:7

Page 1620 1621 1621 1621 1621 1632 1632 1632 1633 1633 1633 1634 1634 1634 1634 1634 1635 1635 1635 1635 1636 1636 1636 1636 1636 1636 1637 1637 1637 1638 1638 1638 1639 1639 1639 1640 1629 1629

Q 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305

Exam Question M-F06:20 M-F06:21 M-F06:22 M-F06:29 M-F06:30 M-F06:31 M-F06:32 M-F06:39 M-F06:40

Page 1629 1630 1630 1630 1630 1630 1631 1631 1631 1212 1214 504 504 504 504 508 508 1249 1215 478 503 1250 675 1213 508

APPENDIX D. EXAM QUESTION INDEX

1659

Table D.6: Sample questions from the 306 corresponding to old exam questions

Q 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

Fall 2001 NS 57 58 NS NS 59 60 NS 61 62 NS 64 65 NS 66 67 68 NS NS 69 70 NS 71 72 73 NS NS NS 75 76 NS 77 78 NS 79

Fall 2002 NS 31 32 33 NS 34 35 36 NS 37 38 NS 39 NS NS 40 41 NS NS 43 NS 44 45 46 NS NS 47 48 NS 49 50 NS NS 52 53 54 NS 55 56

Fall 2003 NS 1 2 3 NS 4 5 6 NS NS 8 NS NS 11 12 NS 14 15 NS 16 17 18 19 NS 20 21 22 NS 23 24 25 NS 26 27 NS 28 NS 29 30

Fall 2004 133 134 NS 135 136 137 NS 138 139 140 NS 141 142 143 NS 144 145 146 NS 147 148 149 NS 150 151 152 NS NS 154 155 NS 156 157 NS 158 159 160 NS 161

Spring 2005 172 173 174 175 176 177 NS NS 179 180 181 182 183 184 185 186 187 NS 189 190 191 192 NS 193 194 195 196 197 NS NS 199 200 201 202 203

Fall 2005 214 215 216 NS 217 218 219 220 221 222 223 NS NS 225 226 227 228 229 230 231 232 233 NS NS 235 236 237 238 239 240 241 242 243 244 245

Fall 2006 246 247 248 249 250 251 252 NS 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 NS 269 270 NS 272 273 274 275 276 277 278

NS=not on syllabus, blank=on syllabus but not in 306 sample questions 2005–2006 exams had only 35 questions.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Index a (Bühlmann credibility), 982 A2 (Anderson-Darling), 749 ( a, b, 0) class, 188–191, 295, 1207 ( a, b, 1) class, 191–195, 223, 295 actuarial exposure, 491 adjoint method, 648 aggregate deductible, 307–318 aggregate loss, 2 aggregate loss models exact cclculation, 325 aggregate losses, 235 allosaur, see dinosaur Anderson-Darling, 749–755 table of characteristics, 767 anniversary-to-anniversary, 492 asymptotically unbiased, 354

calculator tip, 400, 438, 449, 532, 689, 727, 761, 1102, 1145, 1180, 1262 censored data, 393 censoring, 96 Central Limit Theorem, 52 chi-square, 757–793 degrees of freedom, 763 table of characteristics, 767 classical credibility, see credibility,limited fluctuation coefficient of variation, 5, 6, 17, 147 coherent risk measures, 135 coinsurance, 159, 163, 164, 166, 167 collective risk model, 235 compound model recursive formula, 295 compound variance, 236–247 Conditional probability, 8 balancing the estimators, 1117 Conditional Tail Expectation, 140 Balkema-de Haan-Pickands Theorem, 147 conditional variance, 63, 177, 194, 249, 274, 473, 711 bandwidth (kernel smoothing), 458 confidence intervals, 359 Bayes’ Theorem, 9, 873 linear, 439 Bayesian estimation, 915–917 log-transformed, 439 credibility interval, 916 maximum likelihood, 661 highest posterior density credibility set, 916 conjugate prior, 917, 1091 interval estimation, 916 Bernoulli/Beta, 1165, 1300, 1319 Bayesian premium, 874, 931, 1091 Bernoulli/beta, 961, 1595 Bernoulli shortcut, see variance, Bernoulli shortcut, exponential/inverse Gamma, 971 250, 357, 1569 normal/normal, 953 Bernoulli/beta, see conjugate prior Poisson/gamma, 939 beta distribution, 1604 consistency (property of estimator), 356 bias, 354 continuity correction, see normal approximation Bigskew Insurance Company, 318 convolution, 295 binomial coefficient correlation coefficient, 5, 17, 55 numerator not integral, 187 covariance, 5 binomial distribution, 188, 381 covariance matrix, 660 estimating parameters, 688 coverage modifications exposure & severity modifications, 221 aggregate loss, 277 bonuses, 179–182 frequency, 221 bootstrap approximation, 1261–1273 severity—deductibles, 95, 119 Box-Muller transformation, 1209 severity—policy limits, 85 Bühlmann credibility, 981–1072 severity—policy limits and coinsurance, 159 as least squares estimate of Bayes, 1101 Cràmer-Rao, 647 Bühlmann’s k, 982 credibility Bühlmann-Straub, 1073–1089 Bayesian, 873–913 Burr, 1612, 1632 classical, see credibility, limited fluctuation C/4 Study Manual—17th edition Copyright ©2014 ASM

1660

INDEX exact, 1091 limited fluctuation, 827–871 partial, 861 warning to always use aggregate distribution, 850 credibility interval, 916 credibility unit, 1115 credibility weighted average, 1117 critical region, 358 critical values, 358 CTE, 140 cumulative distribution function, 3 cumulative hazard rate function, 4 D ( x ) plot, 713 data-dependent distribution, 29 date-to-date study, 492 Dean study note, 824 deductible disappearing, 109, 115 franchise, 95, 106, 109, 163, 165, 167, 279, 281 ordinary, 95 variance of loss with, 160, 168, 237 degrees of freedom, 763 likelihood ratio test, 796 delta method, 439, 659, 673 dinosaur tyrannosaur, 226 discretizing, 329 method of local moment matching, 330 method of rounding, 330 distribution function, 3 d j , 494 double expectation formulas, 10, 64 E[ ( X − d )+ ], 95 e˚d , 98 efficient (property of estimator), 356 empirical Bayes estimation manual premium, 1123 method that preserves total losses, 1117 nonparametric, 1113 semi-parametric, 1143 empirical distribution, 13, 375 Epanechnikov kernel, 470 EPV (Bühlmann credibility), 982 equilibrium distribution, 146 Erlang distribution, 326, 681 ES, 140 estimator quality, 353 C/4 Study Manual—17th edition Copyright ©2014 ASM

1661

ETNB, 192 exact credibility, see credibility, exact exact exposure, 489, 601 e X ( d ) , 98 Expected Shortfall, 140 exponential distribution lack of memory, 237 method of moments, 526 percentile matching, 556 exponential/inverse gamma, see conjugate prior exposure modifications, 221 extreme value distributions, 147 F ∗ , 713 f ∗ , 713 factorial moment, 12, 189 Fisher’s information matrix, 647 Fisher-Tippett Theorem, 147 Fn , 713 special definition for p–p plot, 714 f n , 376, 713 f n (compound model), 295 frailty models, 62 franchise deductible, see deductible, franchise Fréchet distribution, 147 free parameter, 795 frequency, 2 GAAP operating earnings, 182 gamma distribution as sum of exponentials, 6 method of moments, 526 gamma function, 34 γ1 (skewness), 5 γ2 (kurtosis), 5 generalized Pareto, 147 geometric distribution, 187 g n (compound model), 295 Greenwood approximation, 437 hazard rate function, 4, 18, 19 decreasing and increasing, 146, 151 Herzog textbook, 825 histogram, 375–377 HPD credibility set, 916 Hunt N. Quotum, 181 H (x ), 4 h (x ), 4 hypothetical mean, 874, 981
improper prior distribution, see prior distribution, improper
individual risk model, 235
inflation, 29, 45–48, 89, 90, 159, 165, 167
    how to handle a lognormal, 29, 30
information matrix, 647
insuring age, 491
inverse exponential
    percentile matching, 559
inversion method (simulation), 1179
Kaplan-Meier estimator, 394
    variance, see Greenwood approximation
kernel smoothing, 457–488
    gamma kernel, 471
    spread, 471
    triangular kernel, 463
    uniform kernel, 458
Kolmogorov-Smirnov, 725–748
    for grouped data, 730
    table of characteristics, 767
kurtosis, 5, 6, 14, 17
large sample estimate, 53
Le Behemoth Insurance Company, 165
LER, 119
likelihood ratio algorithm, 795–800
limited expected value, 85, 90, 93
    empirical, 87
    special caution if support doesn't start at 0, 93
limited fluctuation credibility, see credibility, limited fluctuation
limited loss variable, 85
limiting distributions, 43
linear exponential family, 40, 917, 1091
logarithmic distribution, 193
loglogistic distribution, 110, 151, 227, 558, 567, 718, 805
lognormal distribution, 32
    fitting, 569, 604
    method of moments, 528
    percentile matching, 558
Loss Elimination Ratio, 119
loss functions (Bayesian estimation), 915
Lucky Tom, 212
M(t), 12
Mahler-Dean study note, 824
manual premium, see empirical Bayes estimation
manual premiums, 1123
marginal distribution, 5
maximum covered loss, 159, 167
maximum likelihood, 575–646
    Bernoulli technique, 609, 645, 646, 706, 707
    binomial, 688
    confidence intervals, 661
    equal to method of moments, 601
    estimating q_x, 612
    exponential, 601, 807
    gamma, 601
    negative binomial, 601, 688
    non-normal confidence intervals, 662
    normal, 601
    Poisson, 601, 687, 766
        adjusting for exposure, 692
    uniform distribution, 606
    variance, 647–685
    Weibull shortcut, 605
mean excess loss, 98, 104, 105, 120, 121, 146, 286
    empirical, 103, 104
mean residual life, 98, 150, 151
mean square error, 356
median, 6, 18–20
method of local moment matching, 330
method of moments, 258, 525–553
    gamma distribution, 526
    lognormal distribution, 528
    Pareto distribution, 527
    uniform distribution, 526, 529
method of rounding, 330
MGF, see moment generating function
minimum modified chi-square, 598
mixtures, 59, 90, 93, 110, 126, 133, 187, 211, 244, 268
    continuous, 61, 71
    two point, 54
mode, 6, 17, 18
moment
    central, 5
    raw, 5
moment generating function, 12, 62
    definition, 6
monotonicity (coherence), 136
MSE, 356
multinomial distribution, 381
µ_n (central moment), 5
µ′_n (raw moment), 5
µ_x (force of mortality), 4
negative binomial distribution, 187
    as mixture of Poissons, 211
    estimating parameters, 688
    exposure & severity modifications, 221
Nelson-Åalen estimator, 375, 398
    variance, 438
n_j, 494
non-normal confidence intervals, 662
normal approximation, 52, 70
    compound model, 258, 260–267, 284
    continuity correction, 53, 204, 257, 258, 263, 274
    percentile, 111
normal/normal, see conjugate prior
observed information, 656
ogive, 375–377
order statistics, 555, 725, 749, 1227
ordinary deductible, see deductible, ordinary
P(z), 12
p–p plot, 7, 714
p-value, 358, 767, 795
parametric distributions, 29–48
    inverse, 32
    scaling, 29
    transformed, 32
Pareto distribution
    method of moments, 527
partial expectation, 140
P_C (credibility premium)
    Bühlmann, 983
    limited fluctuation, 861
Pearson, 760
percentile
    definition, 6, 7
    smoothed empirical, 555
percentile matching, 555–574
PGF, 12
pharmaceutical company, 20
pizza delivery company, 18
P_j, 494
p_n (compound model), 295
Poisson distribution, 71, 74, 187, 313, 315, 318
    adjusting for exposures
        chi-square, 767
        maximum likelihood, 692
    compound, 301, 314
    estimating parameter, 687
    exposure & severity modifications, 221
    gamma mixture, 61, 187, 211, 274
Poisson/gamma, see conjugate prior
polar method, 1209
policy limit, 85, 159
positive homogeneity (coherence), 135
posterior distribution, 910
power of statistical test, 359
predictive distribution, 910
prescription drugs, 166, 317
preserving total losses, see empirical Bayes estimation
primary distribution, 235
prior distribution, 909
    improper, 909, 922
probability density function, 4
probability generating function, 12, 189
    definition, 6
probability parameter (limited fluctuation credibility), 828
process variance, 981
product limit estimator, see Kaplan-Meier estimator
pure premium, 2
_t p_x, 437
_n p_x, 489
quantile, 7
quiz show, 313, 316
q_x, 489
_t q_x, 437
_n q_x, 489
range parameter (limited fluctuation credibility), 828
recursive formula for compound model, 295
retailer of personal computers, 19
ρ_XY, 5
risk measures, 135
    coherent, 135
risk set, 394
r_j, 394
S(x), 4
scale family, 29
scale parameter, 29, 46
Schwarz Bayesian Criterion, 800
secondary distribution, 235
severity, 2
shifted random variable, 96
Sibuya distribution, 193
significance level, 358
simulation
    actuarial applications, 1237
    simulating chi-square tests, 1239
    Value at Risk, 1239
s_j, 394
skewness, 5, 6, 14, 16, 18, 19, 71
smoothed empirical percentile, 555
splicing, 66–69, 75, 285
standard deviation, 5
standard deviation principle, 135
stop-loss, 307, 311, 312, 315, 316
stop-loss premium, 307
subadditivity (coherence), 135
survival function, 4, 19
Tail Conditional Expectation, 140
tail weight, 144, 150, 151
Tail-Value-at-Risk, see TVaR
TCE, 140
transformations of random variables, 31
translation invariance (coherence), 135
true information, 656
truncated data, 393
truncated random variable, 98
TVaR, 140
    simulation, 1239
two point mixture, 54
Type I error, 358
Type II error, 359
tyrannosaur, see dinosaur
UMVUE, 356
unbiased estimator, 354
uniform distribution
    method of moments, 529
uniformly minimum variance unbiased estimator, 356
uniformly most powerful test, 359
v (Bühlmann credibility), 982
Value-at-Risk, see VaR
VaR, 7, 137, 556
    simulation, 1239
variance, 5, 51–56
    additivity, 51
    Bernoulli shortcut, 54, 177, 252
    mixtures, 59
Ventnor Manufacturing, 827, 861, 981
VHM (Bühlmann credibility), 982
watches, 17
Weibull distribution, 25, 75, 109, 216
    hazard rate, 146
    percentile matching, 557
w_j, 494
(X − d)+, 95
_{y−x} p_x, 382
_{y−x} q_x, 382
y_j, 394
Z (credibility factor)
    Bühlmann, 983
    limited fluctuation, 861
zero-modified, 192
zero-one loss function, 915
zero-truncated, 192