## Thesis Format

Integrated Article

# Generalized Poisson random variables: Their distributional properties and actuarial applications

## Degree

Doctor of Philosophy

## Program

Statistics and Actuarial Sciences

## Supervisor

Prof. Jiandong Ren

Prof. Shu Li (Joint Supervisor)

## Abstract

Estimating the expected number of claims per risk exposure period is essential to risk classification. The simple Poisson regression model usually cannot fit claim data well because the data often display over-dispersion. Other models, such as the negative binomial and Poisson-inverse Gaussian distributions, have been proposed to address over-dispersion. In addition, zero-inflated count distributions, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB), have been proposed to account for the large number of zeros observed in insurance loss data. The Generalized Poisson (GP) distribution, introduced in 1973, can model both over-dispersed and under-dispersed data and has found applications in many areas, including actuarial science. The principal purpose of this thesis is to study in detail the application of the GP and related distributions to modeling insurance claim numbers. In Chapter 2, we study the distributional properties of a family of distributions that includes the GP as a special case. We first derive recursive formulas for computing the corresponding compound distributions; we then show that the compound distributions can be evaluated using transform methods such as the Fast Fourier Transform (FFT). The results are used to compute risk measures of the compound distribution, e.g., Value-at-Risk (VaR) and conditional tail expected value (CTE). In Chapter 3, we show that the functional form of the GP model, combined with a zero-inflated (ZI) or hurdle structure, can fit insurance loss data well. In addition, we discuss approaches for incorporating exposure into zero-inflated models. In Chapter 4, we introduce a Sarmanov-type bivariate version of the GP distribution. We discuss its zero-inflated and hurdle versions and use them to fit bivariate insurance claim data.

## Summary for Lay Audience

This thesis studies the application of the Generalized Poisson (GP) and related distributions to analyzing the number of insurance claims. It has three main parts. In the first part, after reviewing the properties of the GP and related distributions, we derive recursive formulas for computing the distributions of their compound sums. We then show that the compound distributions can be evaluated using transform methods such as the Fast Fourier Transform (FFT). The results are used to compute risk measures of the compound distribution, e.g., Value-at-Risk (VaR) and conditional tail expected value (CTE). In the second part, we study models for insurance claim data that are often over-dispersed and contain an excessive number of zeros. This occurs because policyholders are reluctant to report small claims, so as to avoid possible increases in their insurance premiums. We show that the functional form of the GP distribution, GP-P, combined with a zero-inflated or hurdle model, is flexible and useful for modeling such data. In the third part, we study models for bivariate insurance claim number data. Such data are common because insurance claims are usually categorized into different types. For example, in auto insurance, we often have claim number data for both bodily injuries and property damages. We first introduce a bivariate GP distribution based on the Sarmanov structure and then study its zero-inflated and hurdle variants. By analyzing real insurance loss data, we show that the proposed model is useful for modeling bivariate insurance claim data.