Evaluating Risk Measures and Capital Allocations Based on Multi-Losses Driven by a Heavy-Tailed Background Risk: The Multivariate Pareto-II Model

Evaluating risk measures, premiums, and capital allocations based on dependent multi-losses is a notoriously difficult task. In this paper, we demonstrate how this can be successfully accomplished when losses follow the multivariate Pareto distribution of the second kind, which is an attractive model for multi-losses whose dependence and tail heaviness are influenced by a heavy-tailed background risk. Particular attention is given to distortion and weighted risk measures and allocations, as well as to their special cases such as the conditional layer expectation, the tail value at risk, and the truncated tail value at risk. We derive formulas that are either of closed form or follow well-defined recursive procedures. In either case, their computational use is straightforward.


Introduction
In the insurance literature, loss is usually viewed as a non-negative random variable, say X ≥ 0. Denote its cumulative distribution function (cdf) by F X . Its mean E[X] is known as the net premium, and every practically useful premium is obtained by adding a loading to the net premium. For more details on premium calculation principles, their construction and underlying axioms, we refer to, for example, Denuit et al. [1], Pflug and Römisch [2], Tsanakas and Desli [3], Wang et al. [4], Young [5], and references therein. Many of the premium calculation principles are related to non-expected utility theories. On the latter topic, we refer to the monographs by Puppe [6], Quiggin [7], Wakker [8], as well as to the review articles by Machina [9,10], and references therein.
A number of challenging issues arise when constructing premiums. In particular, we need to decide on an appropriate loading: should it reflect the mean-loss, volatility, or something else? Indeed, the loading is not just a mere reflection of the severity of the loss X, but it is also associated with factors such as risk or loss perception by those already insured or to be insured.
Naturally, the theory of decision making under risk and uncertainty plays a pivotal role when constructing premiums and risk measures. In particular, we can frequently find a set of axioms that lead to one premium (or risk measure) or another. Several axiomatic approaches have been developed in the actuarial literature, and they frequently take their roots in analogous considerations of economic theory (cf., e.g., Puppe [6], Quiggin [7], Wakker [8], Machina [10], and references therein).
In general, we modify the net premium in such a way that, in addition to the distribution of losses, the resulting premium or risk measure would reflect a number of factors such as the possible collateral damage due to, say, increased claim processing time, required additional human resources, etc. Hence, instead of X, we might need to deal with its transformation v(X) for some function v : [0, ∞) → [0, ∞), which can be viewed as a utility function as we do in classical economic theory or a value function as in behavioural economics. In our context, we find it appropriate to call v a loss-distortion function.
Importantly, the premium should also reflect the potential distortion of loss probabilities due to various (natural and artificial) factors. Mathematically, this could, for example, mean modifying the de-cumulative distribution function (ddf) $\bar F_X := 1 - F_X$ (Section 3 below) or the probability density function (pdf) $f_X$ (Section 4 below). To familiarize ourselves with some notation in advance, in the former case we shall use a probability-distortion function g : [0, 1] → [0, 1] and in the latter case a probability-weighting function w : [0, ∞) → [0, ∞).
The rest of the paper is organized as follows. In Section 2, we introduce and justify our choice of a model for insurance multi-losses. In Section 3, we recall the definition of the distortion risk measure and discuss how to calculate it for aggregate losses, given by our multi-loss model. Analogous considerations in the case of the weighted premium are presented in Section 4, and for the weighted risk allocation in Section 5. We specialize these results to the conditional-layer-expectation (CLE) premium and allocation, as well as to the classical and truncated tail-value-at-risk (TVaR) premiums and allocations in Section 6. Concluding notes are in Section 7. Most of the proofs, lemmas, and other technicalities are given in Appendix A.
Finally, here is some notation that we frequently use throughout the paper. Given a vector x := (x 1 , . . . , x n ), we write x (i) := (x 1 , . . . , x i−1 , x i+1 , . . . , x n ), and likewise x (i,m) if the two coordinates x i and x m have been deleted, etc. Furthermore, the sign ":=" means equality by definition.

Motivation
Suppose that we are dealing with the portfolio X := (X 1 , . . . , X n ) of n losses, which are generally dependent random variables. To make progress in modelling and analyzing insurance premiums and risk measures/allocations, we find it necessary to assume a dependence structure. Naturally, it should be reasonable from the practical point of view and lead to a mathematically tractable model.
Our choice for the underlying loss model is the multivariate Pareto distribution of the second kind, which has also been used by Vernic [21]. We shall next explain our rationale behind this choice.
To begin, the notion of background risk has played a crucial role in our choice of the model. In the actuarial, finance, and economic literature (cf., e.g., Finkelshtain et al. [22], Franke et al. [23,24], Nachman [25], Pratt [26], Tsanakas [27], and references therein), background risk has been modeled in a number of ways, such as additive, multiplicative, or in more complex ways that couple stand-alone risks/losses with the background risk. We also learn from these works that it is not easy to decide on the form of a coupling function, say h, that couples (unobservable) stand-alone risks ξ 1 , . . . , ξ n with (hardly observable) background risk η into the (observable) risks or losses h(ξ 1 , η), . . . , h(ξ n , η).
The coupling function h is frequently chosen by asking questions such as "what h could have possibly produced X i 's with such and such (observable) properties?" We have also asked this question and found that it would be reasonable to assume that each marginal loss X i follows the Pareto-II (also known as Lomax) distribution, usually denoted by PaII (µ, σ, α). Its decumulative distribution and probability density functions are given by, respectively,
$$\bar F_X(x) = \Big(1 + \frac{x-\mu}{\sigma}\Big)^{-\alpha} \quad\text{and}\quad f_X(x) = \frac{\alpha}{\sigma}\Big(1 + \frac{x-\mu}{\sigma}\Big)^{-(\alpha+1)} \tag{1}$$
for all x ≥ µ, where µ ∈ R is the location parameter (the left-hand point of the distribution support), σ > 0 is the scale parameter, and α > 0 is the shape parameter. Throughout this paper, we always assume α > 1, which ensures that X has at least one finite moment.

The Pareto-II Model
It is known and easily checked that X ∼ PaII (µ, σ, α) is equal in distribution to ξ/γ + µ, where ξ ∼ Exp(σ) and γ ∼ Ga(α, 1) are independent random variables: Exp(σ) denotes the exponential distribution with the mean σ, that is, its cdf is $1 - e^{-x/\sigma}$, and Ga(α, 1) denotes the gamma distribution with the shape parameter α and the scale parameter 1, that is, its pdf is $e^{-x} x^{\alpha-1}/\Gamma(\alpha)$. In view of these notes, we find it natural to work with the model
$$X_i = \frac{\xi_i}{\gamma} + \mu_i, \qquad \xi_i \sim \mathrm{Exp}(\sigma_i), \quad i = 1, \dots, n,$$
and γ ∼ Ga(α, 1), with all these n + 1 random variables being independent.
This makes X := (X 1 , . . . , X n ) follow the multivariate Pareto distribution of the second kind, denoted by MP (n) II (µ, σ, α) as per Arnold's [28] nomenclature of multivariate Pareto distributions, where µ := (µ 1 , . . . , µ n ) and σ := (σ 1 , . . . , σ n ). Since we only deal with non-negative losses, we shall always have µ ≥ 0 := (0, . . . , 0), with the inequality defined coordinate-wise. Under the above model, the losses X 1 , . . . , X n are dependent because of the "background" risk η := 1/γ. To gain further insight into the dependence between the X i 's, we note that if α > 2, which we do not generally assume in this paper unless noted otherwise, then the correlation coefficient Corr[X i , X j ] between X i and X j is equal to 1/α whenever i ≠ j. Hence, we may now wonder whether α should be interpreted as an index of the distribution tail fatness or as a dependence measure between the losses; perhaps both, which is natural. Indeed, note that the multiplicative background risk η follows the inverse-gamma distribution, whose pdf is $f_\eta(x) = x^{-\alpha-1} e^{-1/x}/\Gamma(\alpha)$. This function is asymptotically of the order $1/x^{\alpha+1}$ when x → ∞. Hence, the distribution of the (multiplicative) background risk η has a Paretian tail, which is a natural feature when modelling risks/losses: the heavier the tail of the background risk η, the more it distorts the values of the marginal stand-alone risks ξ i , and thus in turn creates more dependence between the (observable) risks X i and also makes them heavier tailed. An additional copula-focused insight will be given in the next subsection.
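The value 1/α of the correlation coefficient can be checked directly from the background-risk representation. The following is a short sketch of that calculation, assuming α > 2 and using only the moments of the exponential and inverse-gamma distributions; the intermediate steps are ours and are not taken from the original derivation.

```latex
% Sketch: X_i = \xi_i \eta + \mu_i with \eta = 1/\gamma, \gamma \sim \mathrm{Ga}(\alpha,1), \alpha > 2.
\begin{align*}
E[\eta] &= \frac{1}{\alpha-1}, \qquad
E[\eta^2] = \frac{1}{(\alpha-1)(\alpha-2)}, \qquad
\mathrm{Var}[\eta] = \frac{1}{(\alpha-1)^2(\alpha-2)}, \\
\mathrm{Cov}[X_i, X_j] &= E[\xi_i]\,E[\xi_j]\,\mathrm{Var}[\eta]
  = \frac{\sigma_i\sigma_j}{(\alpha-1)^2(\alpha-2)} \qquad (i \ne j), \\
\mathrm{Var}[X_i] &= E[\xi_i^2]\,E[\eta^2] - \big(E[\xi_i]\,E[\eta]\big)^2
  = \frac{2\sigma_i^2}{(\alpha-1)(\alpha-2)} - \frac{\sigma_i^2}{(\alpha-1)^2}
  = \frac{\alpha\,\sigma_i^2}{(\alpha-1)^2(\alpha-2)}, \\
\mathrm{Corr}[X_i, X_j] &= \frac{\mathrm{Cov}[X_i, X_j]}{\sqrt{\mathrm{Var}[X_i]\,\mathrm{Var}[X_j]}}
  = \frac{1}{\alpha}.
\end{align*}
```

Note that the scale parameters cancel, so the correlation depends on the shape parameter α only.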

Mathematical Properties
The ddf $\bar F_X(x) := P[X_1 > x_1, \dots, X_n > x_n]$ of X ∼ MP (n) II (µ, σ, α) is given by the formula
$$\bar F_X(x) = \Big(1 + \sum_{i=1}^{n} \frac{x_i-\mu_i}{\sigma_i}\Big)^{-\alpha}$$
for all x ≥ µ, with the latter inequality defined coordinate-wise, where µ, σ, and α > 0 are parameters, with the coordinates $\mu_i \in \mathbb{R}$ and $\sigma_i > 0$ for all i = 1, . . . , n. As noted earlier, we deal only with the case $\mu_i \ge 0$ because of the assumed non-negativity of losses. Furthermore, we always require α > 1 so that all the marginal risks/losses $X_i$ have at least one finite moment. The corresponding pdf is
$$f_X(x) = \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\prod_{i=1}^{n}\sigma_i}\Big(1 + \sum_{i=1}^{n} \frac{x_i-\mu_i}{\sigma_i}\Big)^{-(\alpha+n)}$$
for all x ≥ µ.
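As an informal sanity check, the background-risk construction can be simulated and compared with the joint ddf. The sketch below assumes the model X_i = ξ_i/γ + µ_i with ξ_i exponential with mean σ_i and γ ∼ Ga(α, 1), and the joint ddf in the form (1 + Σ_i (x_i − µ_i)/σ_i)^(−α); the function names, the seed, and the parameter values are illustrative choices of ours.

```python
import random

def sample_mp2(mu, sigma, alpha, rng):
    """Draw one vector with X_i = xi_i / gamma + mu_i, sharing a single gamma."""
    gamma = rng.gammavariate(alpha, 1.0)  # Ga(alpha, 1): shape alpha, scale 1
    # expovariate takes the rate, so rate 1/sigma_i gives mean sigma_i
    return [rng.expovariate(1.0 / s) / gamma + m for m, s in zip(mu, sigma)]

def joint_ddf(x, mu, sigma, alpha):
    """Joint ddf (1 + sum_i (x_i - mu_i)/sigma_i)^(-alpha), valid for x >= mu."""
    total = sum((xi - mi) / si for xi, mi, si in zip(x, mu, sigma))
    return (1.0 + total) ** (-alpha)

def empirical_joint_ddf(draws, x):
    """Fraction of simulated vectors that exceed x in every coordinate."""
    hits = sum(all(di > xi for di, xi in zip(d, x)) for d in draws)
    return hits / len(draws)

# Illustrative parameters (assumptions, not taken from the paper)
rng = random.Random(2024)
mu, sigma, alpha = [0.0, 1.0], [1.0, 2.0], 3.0
draws = [sample_mp2(mu, sigma, alpha, rng) for _ in range(100_000)]
x = [0.5, 2.0]
print(empirical_joint_ddf(draws, x), joint_ddf(x, mu, sigma, alpha))
```

With these parameters the theoretical value is 2^(−3) = 0.125, and the empirical frequency should agree with it up to Monte Carlo error.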
These formulas are needed for various mathematical derivations, but they also give important insights into the practical relevance of the distribution. Indeed, expressing the ddf $\bar F_X(x) = \bar C(\bar F_{X_1}(x_1), \dots, \bar F_{X_n}(x_n))$ in terms of a survival copula $\bar C$, we see that we are dealing with the Clayton survival copula
$$\bar C(u_1, \dots, u_n) = \Big(\sum_{i=1}^{n} u_i^{-1/\alpha} - n + 1\Big)^{-\alpha}.$$
Coupling this observation with the seminal work of Frees and Valdez [11] on understanding copulas and their actuarial relevance, we gain a powerful insight into the practical applicability of the MP (n) II (µ, σ, α) distribution in insurance, finance, and related areas. Furthermore, concentrating on the dependence structure provided by the copula, we immediately glean from its formula that the strength of dependence between the marginal risks increases when the parameter α decreases (e.g., Nelsen [29]). Specifically, when α ↓ 0, we approach the Fréchet-upper-bound copula, which is associated with co-monotonic risks (e.g., Dhaene et al. [30,31]), and when α ↑ ∞, we approach the independence copula. In addition, the asymptotic tail dependence, which is of the order $n^{-\alpha}$, increases when α decreases (e.g., Juri and Wüthrich [32]). For additional information on the Clayton and other copulas and their uses in insurance, we refer to Denuit et al. [1]. For mathematical properties, references, and a global view on copulas, we refer to the seminal work of Nelsen [29].
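The copula representation can be verified numerically. The sketch below assumes the Pareto-II marginal ddf (1 + (x − µ)/σ)^(−α), the joint ddf (1 + Σ_i (x_i − µ_i)/σ_i)^(−α), and the Clayton form (Σ_i u_i^(−1/α) − n + 1)^(−α) for the survival copula; all function names and parameter values are hypothetical.

```python
def pareto2_ddf(x, mu, sigma, alpha):
    """Marginal Pareto-II ddf, valid for x >= mu."""
    return (1.0 + (x - mu) / sigma) ** (-alpha)

def joint_ddf(x, mu, sigma, alpha):
    """Joint ddf of the multivariate Pareto-II vector, valid for x >= mu."""
    total = sum((xi - mi) / si for xi, mi, si in zip(x, mu, sigma))
    return (1.0 + total) ** (-alpha)

def clayton_survival_copula(u, alpha):
    """Clayton copula with dependence parameter 1/alpha, applied to survival probabilities u."""
    return (sum(ui ** (-1.0 / alpha) for ui in u) - len(u) + 1.0) ** (-alpha)

# Illustrative parameters (assumptions, not taken from the paper)
mu, sigma, alpha = [0.0, 1.0, 0.5], [1.0, 2.0, 3.0], 2.5
x = [0.7, 1.4, 2.0]

u = [pareto2_ddf(xi, mi, si, alpha) for xi, mi, si in zip(x, mu, sigma)]
via_copula = clayton_survival_copula(u, alpha)
direct = joint_ddf(x, mu, sigma, alpha)
print(via_copula, direct)
```

The two numbers coincide up to floating-point rounding because u_i^(−1/α) = 1 + (x_i − µ_i)/σ_i, so the sum inside the copula collapses to the argument of the joint ddf.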
We shall use these properties together with some of the results by Vernic [21] in our following considerations. For additional information on the multivariate Pareto distribution of the second kind, we refer to Arnold [28] and Yeh [33,34], where we also find statistical inferential results concerning the parameters µ, σ, and α. Such results are important because the expressions of premiums and allocations that we shall obtain later in this paper will be in terms of these parameters, and thus practical usefulness of the results will hinge on the ability to estimate the parameters.

Distortion Risk Measure
One of the most intuitive ways of introducing the distortion risk measure starts with rewriting the net premium E[X] as the integral $\int_0^\infty \bar F_X(x)\,dx$. Now, we observe that modifying the values of X means using a loss-distortion function v as the integrator: dv(x) instead of dx in the integral. Distorting probabilities, on the other hand, means integrating $g(\bar F_X(x))$ instead of $\bar F_X(x)$, where g : [0, 1] → [0, 1] is a probability-distortion function, that is, non-decreasing, g(t) ≥ t for all t ∈ [0, 1], g(0) = 0 and g(1) = 1. We have arrived at the quantity
$$\Delta_{g,v}[X] = \int_0^\infty g\big(\bar F_X(x)\big)\,dv(x), \tag{2}$$
which is known in the actuarial literature (mainly when v(x) = x) as the distortion risk measure or, alternatively, the Wang risk measure. The role of this risk measure in insurance is discussed and explored in great detail in a series of pioneering papers by Shaun Wang (e.g., Wang [35,36], and references therein). Tsanakas [37] and Tsanakas and Barnett [38] have developed capital allocation rules based on this risk measure. Jones and Zitikis [39] noted a connection between the distortion risk measure and the class of L-statistics, and in this way opened up a fruitful route for developing statistical inferential results in the area. A thorough mathematical treatment of integral (2) is given by Denneberg [40]. For recent advances in the area, we refer to Dhaene et al. [41]. The role of $\Delta_{g,v}[X]$ in economics has been revealed and discussed in great detail in the seminal works by Quiggin [7,42], Schmeidler [43], and Yaari [44].
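To make the definition concrete, the following sketch evaluates the distortion risk measure numerically for a single Pareto-II loss with v(x) = x. It assumes the measure has the form of the integral of g(ddf(x)) over [0, ∞), and it uses the proportional-hazards distortion g(t) = t^r with 0 < r ≤ 1 purely as an illustrative choice; for that choice the integral equals µ + σ/(αr − 1) whenever αr > 1, which gives a reference value. The function names and parameters are ours.

```python
def distortion_risk_measure(ddf, g, n_steps=200_000):
    """Midpoint-rule approximation of the integral of g(ddf(x)) dx over [0, inf).

    The substitution x = s / (1 - s) maps (0, 1) onto (0, inf),
    with dx = ds / (1 - s)^2.
    """
    h = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * h
        x = s / (1.0 - s)
        total += g(ddf(x)) / (1.0 - s) ** 2
    return total * h

# Illustrative parameters (assumptions, not taken from the paper)
mu, sigma, alpha, r = 1.0, 2.0, 3.0, 0.8

def pareto2_ddf(x):
    """Pareto-II ddf; equal to 1 below the location parameter mu."""
    return 1.0 if x < mu else (1.0 + (x - mu) / sigma) ** (-alpha)

def ph_distortion(t):
    """Proportional-hazards distortion g(t) = t^r, which satisfies g(t) >= t on [0, 1]."""
    return t ** r

numeric = distortion_risk_measure(pareto2_ddf, ph_distortion)
closed_form = mu + sigma / (alpha * r - 1.0)  # valid because alpha * r > 1
print(numeric, closed_form)
```

Taking r = 1 removes the distortion and recovers the net premium µ + σ/(α − 1), so the difference between the two values can be read as the loading induced by g.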
Here we are interested in calculating $\Delta_{g,v}[X_+]$ for the aggregate loss $X_+ := X_1 + \cdots + X_n$. The following theorem plays a pivotal role.
Hence, in order to obtain a formula for $\Delta_{g,v}[X_+]$, and assuming that all the marginal standard deviations σ 1 , . . . , σ n are equal, we just need to plug the right-hand side of Equation (3) into formula (2). When some of the marginal standard deviations are not equal, the task is more complex, and Equation (4) has to be employed. Interestingly, when all the marginal standard deviations are different, recursive formula (4) can be turned into a non-recursive one, as the following result shows.
For the marginal ddf's $\bar F_{X_i}$ on the right-hand side of Equation (5), we use formula (1). This produces a formula for $\Delta_{g,v}[X_+]$ in terms of the distribution parameters µ, σ, and α. Note that, for every n ≥ 2, which immediately follows from Equation (5).
Later in this paper, we shall need quantiles $x_p := F^{-1}_{X_+}(p)$ for specific fixed values of p ∈ (0, 1), where $F^{-1}_{X_+}$ is the inverse (or quantile) function of $F_{X_+}$. Even though we do not have closed-form or even recursive formulas for such quantiles, which are very difficult, if at all possible, to obtain (cf., e.g., Castellacci [45]), we can nevertheless use the above formulas for the cdf $F_{X_+}$ to numerically invert $F_{X_+}$, which is sufficient for practical purposes.
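The numerical inversion mentioned above can be sketched with plain bisection. The code assumes that all scale parameters σ_i are distinct and uses a mixture form of the aggregate ddf, Σ_i c_i (1 + (x − µ_+)/σ_i)^(−α) with c_i = Π_{j≠i} σ_i/(σ_i − σ_j) and µ_+ = Σ_i µ_i. This form is obtained here from the standard density of a sum of independent exponentials with distinct means, applied conditionally on γ; it is our working assumption and its notation may differ from the paper's own non-recursive formula. All names are hypothetical.

```python
import math

def mixture_weights(sigma):
    """Weights c_i = prod_{j != i} sigma_i / (sigma_i - sigma_j); requires distinct sigma_i."""
    return [math.prod(si / (si - sj) for j, sj in enumerate(sigma) if j != i)
            for i, si in enumerate(sigma)]

def aggregate_cdf(x, mu, sigma, alpha):
    """Cdf of X_+ = X_1 + ... + X_n under the assumed mixture representation."""
    y = x - sum(mu)
    if y <= 0.0:
        return 0.0
    weights = mixture_weights(sigma)
    ddf = sum(c * (1.0 + y / s) ** (-alpha) for c, s in zip(weights, sigma))
    return 1.0 - ddf

def aggregate_quantile(p, mu, sigma, alpha, n_iter=200):
    """Solve aggregate_cdf(x) = p for 0 < p < 1 by bisection."""
    lo = sum(mu)
    hi = lo + 1.0
    while aggregate_cdf(hi, mu, sigma, alpha) < p:   # expand until the root is bracketed
        hi = lo + 2.0 * (hi - lo)
    for _ in range(n_iter):                          # fixed number of halvings
        mid = 0.5 * (lo + hi)
        if aggregate_cdf(mid, mu, sigma, alpha) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameters (assumptions, not taken from the paper)
x_p = aggregate_quantile(0.95, [0.0, 1.0], [1.0, 2.0], 3.0)
print(x_p, aggregate_cdf(x_p, [0.0, 1.0], [1.0, 2.0], 3.0))
```

Bisection is chosen for robustness: the cdf is continuous and non-decreasing, so the bracket always contains the quantile, at the cost of slower convergence than Newton-type methods.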

Preliminaries
We start by rewriting the net premium E[X] as the integral $\int_0^\infty x\,dF(x)$, where F is the cdf of X. Modifying the values of X with a function v means integrating v(x) instead of x, whereas "weighting" the cdf F with a function w : [0, ∞) → [0, ∞) means replacing F by the weighted cdf $F_w$ (cf., e.g., Rao [46], Patil [47], and references therein), which is defined by
$$F_w(x) = \frac{\int_0^x w(z)\,dF(z)}{E[w(X)]}.$$
(Certainly, we assume that w is Borel-measurable and the expectation $E[w(X)] = \int_0^\infty w(z)\,dF(z)$ is finite and positive.) These transformations of the net premium E[X] lead to the weighted premium (cf. Furman and Zitikis [48,49])
$$\Pi_{v,w}[X] = \frac{E[v(X)\,w(X)]}{E[w(X)]}.$$
The weighted premium has appeared in the actuarial literature in several forms and contexts. For example, Heilmann [50] has shown how the premium naturally arises in the loss function approach to premium calculation. Kamps [51] views the premium as a natural extension of the Esscher premium. In general, the weighted premium has been extensively explored by Furman and Zitikis [48,49,52]. Quantities of the form $\Pi_{v,w}[X]$ have manifested prominently in the weighted utility theory. On the latter topic, we refer to the pioneering work of Chew [53] as well as to the survey papers by Machina [9,10] and the monograph by Puppe [6].
Later in this paper we shall be concerned with detailed calculations of the conditional layer expectation (CLE) premium, defined by
$$\mathrm{CLE}_{a,b}[X_+] = E[X_+ \mid a \le X_+ \le b],$$
where 0 ≤ a < b ≤ +∞ are some fixed numbers, which may or may not depend on the cdf of the aggregate loss $X_+$. Obviously, $\mathrm{CLE}_{a,b}[X_+]$ is a special case of the weighted premium $\Pi_{v,w}[X_+]$, obtained by setting v(x) = x and $w(x) = 1_{[a,b]}(x)$, where the latter denotes the indicator function of the interval [a, b], which is equal to 1 when x ∈ [a, b] and 0 otherwise.
The CLE plays an important role in layer-based (re)insurance (cf., e.g., Wang [35], Halliwell [54], and references therein). The layer boundaries a and b may be known quantities or, for example, the quantiles $x_p = F^{-1}_{X_+}(p)$ and $x_q = F^{-1}_{X_+}(q)$ for some 0 ≤ p < q ≤ 1. Under this set-up, the CLE premium becomes the truncated tail-value-at-risk (TrTVaR) premium
$$\mathrm{TrTVaR}_{p,q}[X_+] = E[X_+ \mid x_p \le X_+ \le x_q].$$
In the special case q = 1, the latter premium becomes the classical tail-value-at-risk premium (or risk measure)
$$\mathrm{TVaR}_{p}[X_+] = E[X_+ \mid X_+ \ge x_p].$$
Detailed calculations of $\mathrm{TVaR}_p[X_+]$ when X ∼ MP (n) II (µ, σ, α) are given by Vernic [21].
In general, to start calculating the weighted premium $\Pi_{v,w}[X_+]$ under the assumption X ∼ MP (n) II (µ, σ, α), we write
$$\Pi_{v,w}[X_+] = \frac{\int_{\mu_+}^{\infty} v(x)\,w(x)\,f_{X_+}(x)\,dx}{\int_{\mu_+}^{\infty} w(x)\,f_{X_+}(x)\,dx} \tag{6}$$
and then use the following lemma providing formulas for the pdf $f_{X_+}$ of the aggregate loss $X_+$.
Lemma 4.1 (Vernic [21]) Let n ≥ 2 and X ∼ MP (n) II (µ, σ, α). When all the marginal standard deviations are equal, that is, σ 1 = · · · = σ n =: σ, then the density $f_{X_+}(x)$ of the aggregate loss $X_+$ can be expressed by for all $x \ge \mu_+$. When, however, there are at least two unequal marginal standard deviations, say σ i ≠ σ j , then the density can be expressed by for all $x \ge \mu_+$.
Hence, when all the marginal standard deviations σ 1 , . . . , σ n are equal, then we can just plug in the right-hand side of Equation (7) into formula (6) and have Π v,w [X + ] in terms of the distribution parameters µ, σ, and α. More explicit formulas can be obtained, and we shall do so in the next subsection.
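For orientation, the two density formulas can be sketched directly from the background-risk representation in the case µ = 0, so that $f_{X_+}(x) = f_{Y_+}(x - \mu_+)$. The derivation below is ours: it conditions on γ and uses standard facts about sums of independent exponentials, and the notation $Y_+^{(j)} := \sum_{k \ne j} Y_k$ is introduced here only for illustration, so the expressions may differ in form from those of Vernic [21].

```latex
% Equal scales \sigma_1 = \dots = \sigma_n = \sigma: given \gamma, the sum Y_+ is gamma
% distributed with shape n and scale \sigma/\gamma; mixing over \gamma \sim \mathrm{Ga}(\alpha,1) gives
\[
f_{Y_+}(y) = \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)\,\Gamma(n)\,\sigma^{n}}\;
             y^{\,n-1}\Big(1+\frac{y}{\sigma}\Big)^{-(\alpha+n)}, \qquad y \ge 0 .
\]
% Two unequal scales \sigma_i \ne \sigma_j: the conditional Laplace transform of Y_+ contains
% the factor 1/((1+s\sigma_i)(1+s\sigma_j)) (after rescaling s by \gamma), and partial fractions give
\[
\frac{1}{(1+s\sigma_i)(1+s\sigma_j)}
  = \frac{1}{\sigma_i-\sigma_j}\Big(\frac{\sigma_i}{1+s\sigma_i}-\frac{\sigma_j}{1+s\sigma_j}\Big),
\]
% which, being linear in the densities, survives the mixing over \gamma:
\[
f_{Y_+}(y) = \frac{\sigma_i\, f_{Y_+^{(j)}}(y) - \sigma_j\, f_{Y_+^{(i)}}(y)}{\sigma_i-\sigma_j},
\qquad y \ge 0 .
\]
```

The second identity expresses the density for n losses through densities for n − 1 losses, which is the recursive structure exploited in the results that follow.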

Results
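Since X = Y + µ with Y ∼ MP (n) II (0, σ, α), we have $\Pi_{v,w}[X_+] = E[v(Y_+ + \mu_+)\,w(Y_+ + \mu_+)] / E[w(Y_+ + \mu_+)]$, where $Y_+ := Y_1 + \cdots + Y_n$ and $\mu_+ := \mu_1 + \cdots + \mu_n$. Hence, to calculate $\Pi_{v,w}[X_+]$, we only need to know how to calculate the expectation $E[u(Y_+)]$ for the functions
$$u(y) = v(y + \mu_+)\,w(y + \mu_+) \quad\text{and}\quad u(y) = w(y + \mu_+). \tag{9}$$
We shall also see soon that the form of the function u does not matter, and thus we can view it as a generic function. The following theorem is an immediate consequence of Lemma 4.1 when µ = 0.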
Theorem 4.1 Let n ≥ 2 and Y ∼ MP (n) II (0, σ, α) with σ 1 = · · · = σ n =: σ. Then When some of the marginal standard deviations σ i are not equal, then the task is more complex. We employ Equation (8) to obtain the following result.
Theorem 4.2 Let n ≥ 2 and Y ∼ MP (n) II (0, σ, α). When there are at least two unequal standard deviations, say σ i ≠ σ j , then
This recursive route for calculating the premium Π v,w [X + ] is useful as it shows how the premium of the aggregate loss evolves from lower-order aggregate losses and their premiums. For a numerical illustration of such a technique in the case of the TVaR premium and allocation, we refer to Vernic [21].
When all of the marginal standard deviations are unequal, then recursive formula (10) can be turned into a non-recursive one.
with the individual expectations on the right-hand side calculated by the formula Hence, given v and w, as well as µ, σ, and α, we can now use the above formulas to evaluate the premium Π v,w [X + ], which is the ratio of the expectations E [u (Y + )] corresponding to functions (9).
In a most natural way (cf. Furman and Zitikis [49,52]), the weighted premium $\Pi_{v,w}[X_+]$ extends to the weighted capital allocation
$$A_{v,w}[X_l, X_+] = \frac{E[v(X_l)\,w(X_+)]}{E[w(X_+)]}.$$
Indeed, letting v(x) = x, we can view $A_{v,w}[X_l, X_+]$ as the contribution of the loss $X_l$ to the aggregate loss $X_+$, because we have the additivity property $\Pi_{v,w}[X_+] = \sum_{l=1}^{n} A_{v,w}[X_l, X_+]$. An important special case of the weighted allocation $A_{v,w}[X_l, X_+]$ is the CLE allocation
$$\mathrm{CLE}_{a,b}[X_l, X_+] = E[X_l \mid a \le X_+ \le b].$$
Setting a and b to $x_p := F^{-1}_{X_+}(p)$ and $x_q := F^{-1}_{X_+}(q)$, respectively, for some 0 ≤ p < q ≤ 1, we obtain the truncated tail-value-at-risk allocation
$$\mathrm{TrTVaR}_{p,q}[X_l, X_+] = E[X_l \mid x_p \le X_+ \le x_q].$$
For calculations of the latter allocation when X ∼ MP (n) II (µ, σ, α), we refer to Vernic [21].
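The additivity of the allocation with v(x) = x can be illustrated on simulated data. The sketch below is a Monte Carlo illustration only, not the analytic route developed in this paper: it assumes the model X_i = ξ_i/γ + µ_i and estimates the CLE allocations E[X_l | a ≤ X_+ ≤ b] and the CLE premium E[X_+ | a ≤ X_+ ≤ b] by sample averages over the layer. Function names, the seed, and all parameter values are hypothetical.

```python
import random

def sample_mp2(mu, sigma, alpha, rng):
    """Draw one vector with X_i = xi_i / gamma + mu_i, sharing a single gamma."""
    gamma = rng.gammavariate(alpha, 1.0)  # Ga(alpha, 1): shape alpha, scale 1
    return [rng.expovariate(1.0 / s) / gamma + m for m, s in zip(mu, sigma)]

def empirical_cle(draws, a, b):
    """Sample versions of the CLE allocations and of the CLE premium."""
    layer = [d for d in draws if a <= sum(d) <= b]
    if not layer:
        raise ValueError("no simulated aggregate loss falls in [a, b]")
    m = len(layer)
    allocations = [sum(d[l] for d in layer) / m for l in range(len(layer[0]))]
    premium = sum(sum(d) for d in layer) / m
    return allocations, premium

# Illustrative parameters (assumptions, not taken from the paper)
rng = random.Random(7)
mu, sigma, alpha = [0.0, 1.0, 0.5], [1.0, 2.0, 3.0], 3.0
draws = [sample_mp2(mu, sigma, alpha, rng) for _ in range(50_000)]

allocations, premium = empirical_cle(draws, a=5.0, b=20.0)
print(allocations, sum(allocations), premium)
```

The allocations add up to the premium exactly (up to rounding) in every sample, because the conditional mean of a sum is the sum of the conditional means; what the analytic formulas add is the value of each term without simulation error.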

Results
Hence, we are interested in calculating A v,w [X l , X + ]. As before, we assume X ∼ MP (n) II (µ, σ, α) and write X = Y + µ with Y ∼ MP (n) II (0, σ, α). We note at the outset that our calculations will not depend on the form of the value and weight functions, v and w, and so our task reduces to calculating only the expectation E [v(X l )w (X + )]; note that setting v(x) ≡ 1 gives us the denominator E [w (X + )] in the definition of A v,w [X l , X + ]. In turn, our task reduces to calculating the expectation E [v(Y l + µ l )w (Y + + µ + )], and by redefining v and w in an obvious manner, our task further reduces to calculating E [v(Y l )w (Y + )]. We do this next in this subsection.
When all the marginal standard deviations σ 1 , . . . , σ n are equal, or perhaps with only one exception, say σ l , then we have closed-form expressions.
Theorem 5.2 Let n ≥ 3, l ∈ {1, . . . , n}, and Y ∼ MP (n) II (0, σ, α). When there are at least two unequal standard deviations, say σ i ≠ σ j , and none of them is equal to σ l , then
The above two theorems allow us to express the expectation E [v(Y l )w (Y + )] in terms of v and w, and the parameters µ, σ, and α. As we saw in the illustrative example of Subsection 5.1, in some important cases the function w may depend on the cdf F X + of the aggregate risk/loss X + . To express the latter in terms of the marginal cdf's and the parameters µ, σ, and α, we employ Theorem 3.1.
When all the marginal standard deviations are different, then formula (13) simplifies to a non-recursive one.
Corollary 5.1 Let n ≥ 3, l ∈ {1, . . . , n}, and Y ∼ MP (n) II (0, σ, α). When σ i ≠ σ j for all i ≠ j, then
To calculate the expectation E [v(Y l )w (Y l + Y i )] on the right-hand side of Equation (14), we can use Theorem 5.1 by setting n = 2 and using Y l and Y i instead of Y 1 and Y 2 , respectively. We formulate this observation as our next corollary.
In the next section, we shall specialize the above results to the important case when v(x) = x and w(x) = 1 [a,b] (x) for any pair 0 ≤ a < b ≤ ∞.

Preliminaries
Many researchers have worked on capital allocation rules based on the tail-value-at-risk (TVaR) measure. Some of the recent works include Landsman and Valdez [13], Furman and Landsman [15], Chiragiev and Landsman [20], Vernic [21], Dhaene et al. [62], Asimit et al. [65], and references therein. Due to the importance of the TVaR premium and allocation, in this section we specialize our earlier formulas by setting v(x) = x and w(x) = 1 [a,b] (x) for any pair 0 ≤ a < b ≤ ∞.
Hence, under these choices of v and w, we shall next derive expressions for the premium CLE a,b [X + ] and allocation CLE a,b [X l , X + ] in terms of the distribution parameters µ, σ, and α. To get desired formulas for TrTVaR p,q [X l , X + ] and TrTVaR p,q [X + ], we shall then replace a and b by x p := F −1 X + (p) and x q := F −1 X + (q), respectively. A number of the following results will extend and generalize those in the literature (e.g., Vernic [21]).

CLE Premium
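We start with the equation
$$\mathrm{CLE}_{a,b}[X_+] = \mu_+ + \frac{E\big[Y_+\,1_{[a-\mu_+,\,b-\mu_+]}(Y_+)\big]}{F_{Y_+}(b-\mu_+) - F_{Y_+}(a-\mu_+)}.$$
Formulas for the cdf $F_{Y_+}$ can be obtained from those in Section 3 by setting µ = 0. Hence, we only need to derive formulas for the expectation $E[Y_+\,1_{[a,b]}(Y_+)]$ in terms of the parameters σ and α.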
Corollary 6.1 Let n ≥ 2 and Y ∼ MP (n) II (0, σ, α). When all the marginal standard deviations are equal, that is, σ 1 = · · · = σ n =: σ, then When there are at least two unequal marginal standard deviations, say σ i ≠ σ j , then When σ i ≠ σ j for all i ≠ j, then with the individual expectations on the right-hand side calculated by the formula
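A minimal computational sketch of the non-recursive case (all σ_i distinct, µ = 0) is given next. It rests on two assumptions of ours rather than on the displayed formulas: the aggregate loss is treated as a signed mixture of the marginals with weights c_i = Π_{j≠i} σ_i/(σ_i − σ_j), and each individual expectation E[Y_i 1_{[a,b]}(Y_i)] is obtained by integration by parts as a·ddf(a) − b·ddf(b) plus the integral of the marginal ddf over [a, b]. Function names and parameter values are hypothetical.

```python
import math

def mixture_weights(sigma):
    """Weights c_i = prod_{j != i} sigma_i / (sigma_i - sigma_j); requires distinct sigma_i."""
    return [math.prod(si / (si - sj) for j, sj in enumerate(sigma) if j != i)
            for i, si in enumerate(sigma)]

def pareto2_layer_mean(a, b, sigma, alpha):
    """E[Y 1{a <= Y <= b}] for Y ~ PaII(0, sigma, alpha), 0 <= a < b <= inf, alpha > 1."""
    def boundary(c):
        # c * ddf(c); the limit at c = inf is 0 because alpha > 1
        return 0.0 if math.isinf(c) else c * (1.0 + c / sigma) ** (-alpha)
    def tail(c):
        # integral of the ddf over [c, inf)
        return 0.0 if math.isinf(c) else sigma / (alpha - 1.0) * (1.0 + c / sigma) ** (1.0 - alpha)
    return boundary(a) - boundary(b) + tail(a) - tail(b)

def aggregate_layer_mean(a, b, sigma, alpha):
    """E[Y_+ 1{a <= Y_+ <= b}] as a weighted sum of the marginal layer means."""
    return sum(c * pareto2_layer_mean(a, b, s, alpha)
               for c, s in zip(mixture_weights(sigma), sigma))

def aggregate_ddf(y, sigma, alpha):
    """P[Y_+ > y] under the assumed mixture representation."""
    if math.isinf(y):
        return 0.0
    return sum(c * (1.0 + y / s) ** (-alpha)
               for c, s in zip(mixture_weights(sigma), sigma))

def cle_premium(a, b, sigma, alpha):
    """CLE_{a,b}[Y_+] = E[Y_+ | a <= Y_+ <= b] for mu = 0."""
    prob = aggregate_ddf(a, sigma, alpha) - aggregate_ddf(b, sigma, alpha)
    return aggregate_layer_mean(a, b, sigma, alpha) / prob

# Illustrative parameters (assumptions, not taken from the paper)
sigma, alpha = [1.0, 2.0, 4.0], 3.0
print(cle_premium(1.0, 5.0, sigma, alpha))
print(aggregate_layer_mean(0.0, math.inf, sigma, alpha))  # equals E[Y_+]
```

Setting b to infinity and a to a quantile of the aggregate loss turns `cle_premium` into a TVaR calculation, and adding µ_+ (after shifting a and b by µ_+) carries the result over to a non-zero location vector.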

CLE Allocation
We start with the equation
$$\mathrm{CLE}_{a,b}[X_l, X_+] = \mu_l + \frac{E\big[Y_l\,1_{[a-\mu_+,\,b-\mu_+]}(Y_+)\big]}{F_{Y_+}(b-\mu_+) - F_{Y_+}(a-\mu_+)}.$$
The following corollaries tackle the expectation $E[Y_l\,1_{[a,b]}(Y_+)]$ for any pair 0 ≤ a < b ≤ ∞.
Corollary 6.2 Let n ≥ 2, l ∈ {1, . . . , n}, and Y ∼ MP (n) II (0, σ, α). When all the marginal standard deviations are equal, that is, σ 1 = · · · = σ n =: σ, then
Note that the first equation of Corollary 6.1 immediately follows from Corollary 6.2 by summing up $E[Y_l\,1_{[a,b]}(Y_+)]$ over all l = 1, . . . , n. We need to point out at this moment, however, that the second and third equations of Corollary 6.1 will not follow so easily from our following results. In fact, we have found that direct proofs of the aforementioned two equations are shorter and more transparent, and we shall thus follow this route in Appendix A.
When all the marginal standard deviations except σ l are equal, then we have a closed-form formula for the expectation $E[Y_l\,1_{[a,b]}(Y_+)]$.
Corollary 6.3 Let n ≥ 2, l ∈ {1, . . . , n}, and Y ∼ MP (n) II (0, σ, α). When σ i =: σ for all i ≠ l and σ l ≠ σ, then where
When σ l and some other two standard deviations, say σ i and σ j , are unequal, then we have a recursive formula.
Corollary 6.4 Let n ≥ 3, l ∈ {1, . . . , n}, and Y ∼ MP (n) II (0, σ, α). When there are at least two unequal standard deviations, say σ i ≠ σ j , and none of them is equal to σ l , then
When all the marginal standard deviations are unequal, then we have the following result.
Corollary 6.5 Let n ≥ 3, l ∈ {1, . . . , n}, and Y ∼ MP (n) II (0, σ, α). When σ i ≠ σ j for all i ≠ j, then
To calculate the expectation $E[Y_l\,1_{[a,b]}(Y_l + Y_i)]$ on the right-hand side of Equation (17), we apply Corollary 6.3 in the case n = 2 and obtain the following result. where

Conclusions
In this paper, we have argued that the multivariate Pareto distribution of the second kind is an attractive model for multidimensional risks/losses. We have demonstrated in particular that very general classes of insurance premiums and risk allocations can be successfully tackled for the aforementioned distribution. The formulas for the premiums and allocations are in terms of the utility, distortion, and weight functions, as well as in terms of the distribution parameters. This facilitates a straightforward computational implementation of the premiums and allocations when parameter estimates become available to the researcher.

A. Technicalities
In this appendix, we have collected proofs of our main results whose validity needs some explanation. To begin with, we note that Proposition 3.1 follows immediately from the following lemma.
Lemma A.1 Let n ≥ 2 and Y ∼ MP (n) II (0, σ, α). When σ i ≠ σ j for all i ≠ j, then
Proof The proof is by induction and relies on recursive formula (8), whose validity has been established by Vernic [21]. The case n = 2 is obvious. Hence, let Equation (19) hold for n − 1 losses, and our aim is to show that it holds for n losses as well. Hence, f Y + (y) is equal to Note that the coefficient next to f Y 1 (y) in quantity (20) is equal to which is the same as the corresponding coefficient in Equation (19). Next, for all i ∈ {2, . . . , n − 1}, the coefficient next to f Y i (y) in quantity (20) is equal to j∈{2,...,n}\{i} (σ i − σ j ) which is the same as the corresponding coefficient in Equation (19). Finally, the coefficient next to f Y n (y) in quantity (20) is equal to which is the same as the corresponding one in Equation (19). This proves Lemma A.1.

Proof of Equation
for all x, y ≥ 0.
Proof of Equation (12) of Theorem 5.1 Immediately follows from Lemma A.3.
Proof of Theorem 5.2 Immediately follows from Lemma A.4.
To proceed, we recall (cf. Subsection 2.3) that all lower-dimensional distributions of Y are also of the Pareto-II type.
Proof of Corollary 6.3. We first apply Theorem 5.1 with v(y) = y and w(y) = 1 [a,b] (y), and then proceed with the following calculations: