Dependent competing risks: Cause elimination and its impact on survival

The dependent competing risks model of human mortality is considered, assuming that the dependence between lifetimes is modelled by a multivariate copula function. The effect on overall survival of removing one or more causes of death is explored under two alternative definitions of removal, ignoring the causes and eliminating them. Under the ignoring definition, deaths from the causes are removed by simply ignoring them and considering the marginal distribution of the remaining causes. Under the alternative eliminating definition, deaths are eliminated by considering the limiting distribution of the competing risks' lifetimes, conditional on the corresponding eliminated lifetimes approaching infinity. Under the two definitions of removal, expressions for the overall survival functions in terms of the specified copula (density) and the net (marginal) survival functions are given. The net survival functions are obtained as a solution to a system of non-linear differential equations, which relates them through the specified copula (derivatives) to the crude (sub-) survival functions, estimated from data. The overall survival functions in a model with four competing risks, cancer, cardiovascular diseases, respiratory diseases and all other causes grouped together, have been implemented and evaluated, based on cause specific mortality data for England and Wales published by the Office for National Statistics, for the year 2007. We show that the two alternative definitions of removal of a cause of death have different effects on the overall survival and in particular on the life expectancy at birth and at age 65, and on life annuities at age 65, when one, two or three of the competing causes are removed.
An important conclusion is that the eliminating definition is better suited for practical use in actuarial and other applications, since it is more intuitive and allows for considering only positive dependence between the lifetimes, which is not the case under the alternative ignoring definition.


Introduction
In the competing risks model, a group of individuals (units) is subject to the simultaneous operation of a set of competing risks which cause death (failure). It is assumed that each individual can die from any one of the causes and that there are corresponding lifetime random variables attached to him/her at birth. This model has been widely studied in the statistical, actuarial and demographic literature, under the assumption of independence of the corresponding lifetimes. Important contributions to the subject, to mention only a few, are the books by Pintilie (2006), Crowder (2001), Bowers et al. (1997) and Elandt-Johnson and Johnson (1980), the recent overview by Lindqvist (2007) and papers by Solari et al. (2008), Salinas-Torres et al. (2002) and Bryant and Dignam (2004), and also by Zheng and Klein (1994, 1995), where statistical methods for estimating related survival functions are considered.
A considerable amount of work is devoted to competing risks models and their application in economics, reliability, medicine and actuarial science, under the assumption of dependence of the competing risks lifetimes. Important early contributions in this strand of literature are the papers by Elandt-Johnson (1976), and also by Yashin et al. (1986), who considered conditional independence of the times to death, given an assumed stochastic covariate process. A useful survey of statistical methods for dependent competing risks is provided by Moeschberger and Klein (1995). EM-based estimation of sub-distribution functions, under the assumption that some of the competing causes are masked, has been considered by Craiu and Reiser (2006). Bounds in a dependent competing risks model with interval outcome data have been derived by Honoré and Lleras-Muney (2006), who apply their model in estimating changes in cancer and cardiovascular mortality in the USA. Recently, Lindqvist (2009) has focused on modelling dependent competing risks in reliability, by considering first passage times of Wiener processes.
Dependent competing risks models of human mortality, based on copula functions, have been considered by Carriere (1994) and Escarela and Carriere (2003) and more recently by Kaishev et al. (2007). A copula model of competing risks applied to unemployment duration modelling has also been recently considered by Lo and Wilke (2009). Carriere (1994) and Escarela and Carriere (2003) have modelled dependence between two failure times by a two dimensional copula. In Escarela and Carriere (2003), the bivariate Frank copula was fitted to a prostate cancer data set. Carriere (1994) was the first to use a bivariate Gaussian copula in order to model the effect of complete removal of one of two competing causes of death on human mortality. However, the mortality data used by Carriere (1994) was not complete with respect to older ages and therefore, it was not possible to calculate such important survival characteristics as expected lifetimes and life annuities and draw relevant conclusions.
This deficiency has been overcome in the paper by Kaishev et al. (2007), who close the life table by applying a method of spline extrapolation up to a limiting age of 120. They have extended further the work of Carriere (1994), considering a multidimensional copula model for the joint distribution of the lifetimes. The model has been tested on the example of up to four competing causes of death (cancer, heart diseases, respiratory diseases and other causes grouped together), based on the US general population cause specific mortality data set, provided by the National Center for Health Statistics (NCHS) (1999) of the USA. Four alternative four dimensional copula models underlying the joint distribution of the lifetimes have been explored: the Gaussian copula, the Student t-copula, the Frank copula and the Plackett copula. The impact of removal of one, two or three of the causes on the life expectancy and life annuity functions, characteristics of utmost importance in medical and statistical applications, has been studied.
In the paper by Kaishev et al. (2007), as well as in the earlier paper by Carriere (1994), it has been assumed that deaths by a cause are removed by simply ignoring that cause, i.e., by omitting the corresponding lifetime random variable from the vector of lifetimes considered. For this reason, removal of a cause of death under this definition can be described more precisely as ignoring the cause. However, as pointed out by Kaishev et al. (2007) and also earlier by Elandt-Johnson (1976), an alternative definition of removal of a certain cause may be given by considering the limiting distribution of the vector of lifetimes, given that the lifetime with respect to the removed cause tends to infinity, or more realistically to the limiting age. In other words, under this definition, it is assumed that deaths from the removed cause would not occur and all individuals would survive infinitely long (in reality up to the limiting age) with respect to that cause. In what follows, we will call this type of removal of deaths from a particular cause elimination of that cause. As pointed out by Kaishev et al. (2007), this alternative definition is more intuitive and easy to interpret, but leads to more complex expressions for the limiting survival distribution, under the assumption that dependence is modelled by a suitable copula.
The purpose of this paper is to explore the two alternative definitions of ignoring a cause and eliminating that cause, within the multivariate copula dependent competing risks model. We compare and contrast the two definitions, based on UK cause specific mortality data for the year 2007, provided by the Office for National Statistics (2008), which includes deaths from cancer, heart disease, respiratory diseases and all other causes grouped together. We show that the choice of definition of cause removal has a significant effect on the overall survival, on life expectancy at birth and at age 65, and on the value of a life annuity, these variables being estimated given that one, two or three of the competing causes of death are simultaneously removed. It is demonstrated that the eliminating definition is easier and more intuitive to interpret and does not necessarily require the use of comprehensive copulas, and also that the complexity related to its implementation can be overcome without difficulty. Therefore, an important conclusion of the current work is that the eliminating definition is preferable for practical use compared to the ignoring definition, studied earlier in the papers by Carriere (1994) and Kaishev et al. (2007).
The paper is organized as follows. In section 2, we introduce the dependent competing risks model under the assumption that dependence between the competing risks lifetimes is modelled by a suitable copula function. We summarize the methodology for obtaining net survival functions, given estimates of the crude survival functions, considered earlier by Carriere (1994) and Kaishev et al. (2007). In section 3, we give two alternative definitions of removal of a cause of death, ignoring and eliminating, and provide expressions for the overall survival functions when one or more causes are removed. In section 4, we implement the definitions numerically and compare the effect they have on the overall survival and on selected actuarial functions. Section 5 provides some conclusions and comments.

The dependent competing risks model
As pointed out by a number of authors (see e.g., Hooker and Longley-Cook 1957, Carriere 1994, Valdez 2001, Fukumoto 2005, Lindqvist 2007, 2009), risks in many real life applications tend to be dependent. In particular, as established in studying disease interactions (see e.g., Kaput et al. 1994, Weir 2005, Lobo 2008), diseases may be jointly caused by the interaction of particular genes. For example, as pointed out by Kaput et al. (1994), high levels of dietary fat, regulated and characterized by certain genes, jointly enhance the severity of certain cancers, obesity and cardiovascular diseases. Therefore, successful treatment of obesity may lead to a considerable reduction in the number of deaths from certain types of cancer and atherosclerosis. Weir (2005) has studied the interaction between cardiovascular disease (CVD) and chronic kidney disease (CKD) in patients with CKD and has explained the increased risk for CVD in patients with CKD. The paper by Lobo (2008) is devoted to understanding epistatic interactions between genes as the key to understanding complex diseases, such as Alzheimer's disease, diabetes, cardiovascular disease, and cancer. These and other studies in the medical literature suggest that, by reducing (or completely removing) deaths from one disease, it is possible to significantly improve mortality rates from the related (interacting) disease. In terms of lifetimes, this means that the lifetimes of interacting diseases are related (mutually dependent), and this dependence, which characterizes the overall survival from such causes, can be represented and studied under the copula-dependent competing risks model considered in this section.
The copula-dependent competing risks model of human mortality has recently been considered by Kaishev et al. (2007), where a detailed account of its properties, model assumptions and parameter estimation can be found. For our purpose of considering the model uncertainty with respect to the definition of cause elimination, we will briefly introduce the model and recall its basic characteristics.
Consider a group of individuals exposed to $m$ competing causes of death. It is assumed that each individual may die from any single one of the $m$ causes. To make the problem more formally tractable it is assumed that, at birth, each individual is assigned a vector of potential lifetimes $T_1, \ldots, T_m$, $0 \le T_j < \infty$, $j = 1, \ldots, m$, if he/she were to die from each one of the $m$ causes. Obviously, the actual lifetime is the minimum of $T_1, \ldots, T_m$. Thus, under this model the lifetimes $T_1, \ldots, T_m$ are unobservable and we can only observe $\min(T_1, \ldots, T_m)$. In the classical competing risks model the random variables $T_1, \ldots, T_m$ are assumed independent, whereas here we will be interested in their (dependent) joint survival distribution function

(1) $S(t_1, \ldots, t_m) = \Pr(T_1 > t_1, \ldots, T_m > t_m)$,

which is assumed absolutely continuous and where $t_j \ge 0$, for $j = 1, \ldots, m$. In what follows, we will also need the marginal survival functions $S'^{(j)}(t) = \Pr(T_j > t)$, $j = 1, \ldots, m$, associated with $S(t_1, \ldots, t_m)$, which we call net survival functions. As we will see, the $S'^{(j)}(t)$ are the target quantities in our study since, if we know them, we can identify and calculate the joint survival function $S(t_1, \ldots, t_m)$ and hence evaluate the overall survival function $S(t, \ldots, t)$, under some appropriate assumptions on the dependence structure underlying (1). Note that $S'^{(j)}(t)$, $j = 1, \ldots, m$, are not observable. Let us recall that the classical model of independence of the random variables $T_1, \ldots, T_m$ implies that $S(t_1, \ldots, t_m) = S'^{(1)}(t_1) \times \cdots \times S'^{(m)}(t_m)$.
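The observability point made above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: two causes, independent exponential latent lifetimes with hypothetical rates 0.02 and 0.03, and a helper name `simulate_observed` that does not come from the paper. The sketch draws the latent vector for each individual but keeps only what would actually be observed, namely the minimum and the index of the cause attaining it.

```python
import random

def simulate_observed(rates, n, seed=0):
    """Return n pairs (observed lifetime, index of the cause of death).

    The latent lifetimes T_1, ..., T_m are drawn here as independent
    exponentials (an illustrative assumption); only their minimum and
    the cause attaining it are kept, as in real mortality data.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        latent = [rng.expovariate(r) for r in rates]  # never observed jointly
        t = min(latent)
        observed.append((t, latent.index(t)))
    return observed

sample = simulate_observed([0.02, 0.03], n=10000)
# Share of deaths attributed to cause 0; under independence it should be
# close to 0.02 / (0.02 + 0.03) = 0.4.
share_cause_0 = sum(1 for _, j in sample if j == 0) / len(sample)
```

Only the pairs in `sample` are available to the analyst, which is why the net survival functions have to be recovered indirectly.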
The overall survival of an individual, under the dependent competing risks model assumptions, is defined by the random variable $T = \min(T_1, \ldots, T_m)$, and we will be interested in modelling the overall survival function $S(t, \ldots, t) = \Pr(T_1 > t, \ldots, T_m > t) = \Pr(T > t)$, where $t \ge 0$. In order to do so, one can apply the celebrated theorem of Sklar and express the survival function $S(t_1, \ldots, t_m)$ in terms of the net (marginal) survival functions $S'^{(j)}(t)$ and a suitable copula function $C(u_1, \ldots, u_m)$, $0 \le u_i \le 1$, $i = 1, \ldots, m$, which captures the dependence structure underlying the multivariate survival distribution of the random vector $T_1, \ldots, T_m$.
Copula functions have become a well established tool for modelling stochastic dependence and their properties are well documented in the monographs by Nelsen (1999), Joe (1997) and Cherubini et al. (2004). There are numerous copula related papers scattered throughout the statistical, financial and actuarial journals and some relevant references can be extracted from the CopulaWiki web page http://140.78.127.5/mediawiki/index.php/Main_Page. For a concise summary of the main properties of copulas relevant to the multivariate dependent competing risks model of human mortality, see Kaishev et al. (2007).
Having fixed a suitable copula, we can write

(2) $S(t_1, \ldots, t_m) = C(S'^{(1)}(t_1), \ldots, S'^{(m)}(t_m))$,

from where we can also evaluate the overall survival function

(3) $S(t, \ldots, t) = C(S'^{(1)}(t), \ldots, S'^{(m)}(t))$,

if the net survival functions $S'^{(j)}(t_j)$, $j = 1, \ldots, m$, were known. In order to find them, we may use the relationship between $S'^{(j)}(t)$ and the so called crude survival functions, $S^{(j)}(t)$, $j = 1, \ldots, m$. The crude survival function $S^{(j)}(t)$ is defined as the survival function with respect to the $j$-th cause of death, due to which death actually occurs, i.e., $S^{(j)}(t) = \Pr(\min(T_1, \ldots, T_m) > t, \min(T_1, \ldots, T_m) = T_j)$, so that the overall survival function decomposes as

(4) $S(t, \ldots, t) = S^{(1)}(t) + \cdots + S^{(m)}(t)$.

Dimitrova, D.S., Haberman, S. and Kaishev, V.K.

Dependent competing risks: cause elimination
The survival function $S^{(j)}(t)$ is called crude since it reflects the observed mortality of an individual and hence may be estimated from the observed mortality data of a population, as will be illustrated in section 4. In the biostatistics literature the crude survival function $S^{(j)}(t)$ is sometimes called the sub-survival function or the cumulative incidence function (see e.g., Bryant and Dignam, 2004).
As shown by Carriere (1995), under the assumption of differentiability of $C(u_1, \ldots, u_m)$ with respect to $u_j \in (0, 1)$ and of $S'^{(j)}(t_j)$ with respect to $t_j > 0$, for $t > 0$, the following system of differential equations relates the crude and net survival functions:

(5) $\dfrac{d}{dt} S^{(j)}(t) = C_j(S'^{(1)}(t), \ldots, S'^{(m)}(t)) \times \dfrac{d}{dt} S'^{(j)}(t)$, $j = 1, \ldots, m$,

where $C_j(u_1, \ldots, u_m) = \partial C(u_1, \ldots, u_m) / \partial u_j$. It is important to note that (5) is a system of nonlinear differential equations which may be solved with respect to the net survival functions $S'^{(j)}(t)$, given a suitable copula and estimates of the crude survival functions $S^{(j)}(t)$, $j = 1, \ldots, m$. The numerical solution of (5) has been considered by Carriere (1994) in the two dimensional case and by Kaishev et al. (2007) in the multivariate case. For details of how this is done using Mathematica, see Kaishev et al. (2007).
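To give a feel for how such a system can be integrated numerically, here is a minimal Python sketch for $m = 2$. It is not the Mathematica implementation referred to above. It assumes that the system reads $dS'^{(j)}/dt = (dS^{(j)}/dt) / C_j(S'^{(1)}, S'^{(2)})$ with $C_j = \partial C / \partial u_j$, uses a bivariate Frank copula with a hypothetical parameter `THETA`, and applies a hand-written classical Runge-Kutta scheme. The names `frank_partial` and `solve_net`, and the synthetic exponential net survival functions used to build the inputs, are illustrative assumptions.

```python
import math

def frank_partial(u, v, theta):
    """Partial derivative dC/du of the bivariate Frank copula (theta != 0)."""
    a = math.expm1(-theta * u)
    b = math.expm1(-theta * v)
    d = math.expm1(-theta)
    return math.exp(-theta * u) * b / (d + a * b)

def solve_net(g1, g2, theta, t_max, steps):
    """Integrate y_j'(t) = g_j(t) / C_j(y_1, y_2), y(0) = (1, 1), by RK4.

    g1 and g2 are the time derivatives of the two crude survival
    functions; the return value approximates the net survival
    functions at t_max.
    """
    def rhs(t, y1, y2):
        # By symmetry of the Frank copula, dC/dv(u, v) = frank_partial(v, u).
        return (g1(t) / frank_partial(y1, y2, theta),
                g2(t) / frank_partial(y2, y1, theta))

    h = t_max / steps
    t, y1, y2 = 0.0, 1.0, 1.0
    for _ in range(steps):
        k1 = rhs(t, y1, y2)
        k2 = rhs(t + h / 2, y1 + h / 2 * k1[0], y2 + h / 2 * k1[1])
        k3 = rhs(t + h / 2, y1 + h / 2 * k2[0], y2 + h / 2 * k2[1])
        k4 = rhs(t + h, y1 + h * k3[0], y2 + h * k3[1])
        y1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return y1, y2

# Synthetic check: build crude derivatives from known exponential net
# survival functions (hypothetical rates A and B), then recover the nets.
A, B, THETA = 0.02, 0.03, 2.0

def g1(t):
    return frank_partial(math.exp(-A * t), math.exp(-B * t), THETA) * (-A * math.exp(-A * t))

def g2(t):
    return frank_partial(math.exp(-B * t), math.exp(-A * t), THETA) * (-B * math.exp(-B * t))

net1, net2 = solve_net(g1, g2, THETA, t_max=50.0, steps=500)
```

With real data, `g1` and `g2` would instead be derivatives of smoothed crude survival estimates, and the division by $C_j$ becomes delicate where the crude functions are close to zero.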
The derivatives with respect to time of the crude and net survival functions in (5) are, up to sign, the crude and net probability density functions of the random variables $T_1, T_2, \ldots, T_m$. We will denote these densities as $f^{(j)}(t)$ and $f'^{(j)}(t)$, $j = 1, \ldots, m$, respectively.
Let us also note that equality (4) can be used as a check on the solution of (5). For this purpose, we can apply (3) to express the overall survival function on the left-hand side of (4) as $C(S'^{(1)}(t), \ldots, S'^{(m)}(t)) = S^{(1)}(t) + \cdots + S^{(m)}(t)$, where $0 \le t \le 120$.
Once the net survival functions are obtained, one can use (3) to evaluate the overall survival function, which is of major interest in our investigation. More precisely, we will be interested in studying the effect of removal of a cause of death on the overall survival function, under two alternative definitions of removal, which will be introduced in the next section.

Removal of a cause of death
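Our main interest in the paper is to investigate the effect of removing a cause of death, say indexed $j$, on the overall survival function $S(t, \ldots, t)$. This effect depends on the definition of removal and, as mentioned in section 1, one can consider two alternative definitions: either ignore the cause or eliminate it. The two alternatives were already highlighted in the early paper by Elandt-Johnson (1976). Under the first approach, deaths arising from the $j$-th cause are removed by simply ignoring the $j$-th cause and considering a modified version of the lifetime random variable $T$, defined as $T^{(-j)} = \min(T_1, \ldots, T_{j-1}, T_{j+1}, \ldots, T_m)$, with survival function

(6) $S_{\text{ignore}}^{(-j)}(t) = \Pr(T^{(-j)} > t) = S(t, \ldots, t, 0, t, \ldots, t)$,

where $0$ appears in the $j$-th position. Similarly, ignoring two causes, say the $j$-th and the $k$-th ones, $j \ne k$, would lead to considering the survival function

(7) $S_{\text{ignore}}^{(-j,-k)}(t) = S(t, \ldots, t, 0, t, \ldots, t, 0, t, \ldots, t)$,

where $0$ appears in the $j$-th and the $k$-th positions.

Alternatively, the $j$-th cause of death may be eliminated by considering the limiting distribution, conditional on $T_j \to \infty$, of surviving from all other causes. Under this definition, the overall survival distribution function becomes

(8) $S_{\text{eliminate}}^{(-j)}(t) = \lim_{s \to \infty} \Pr(T_1 > t, \ldots, T_{j-1} > t, T_{j+1} > t, \ldots, T_m > t \mid T_j > s) = \lim_{s \to \infty} \dfrac{S(t, \ldots, t, s, t, \ldots, t)}{S'^{(j)}(s)}$.

Similarly, eliminating the $j$-th and the $k$-th cause, $j \ne k$, may be defined as considering the survival function

(9) $S_{\text{eliminate}}^{(-j,-k)}(t) = \lim_{t_j, t_k \to \infty} \dfrac{S(t, \ldots, t, t_j, t, \ldots, t, t_k, t, \ldots, t)}{S'^{(j,k)}(t_j, t_k)}$,

where $S'^{(j,k)}(t_j, t_k)$ is the marginal survival function with respect to the $j$-th and the $k$-th causes. Note that both expressions (7) and (9) directly generalize to the case of removing more than two competing risks.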
The elimination definition allows for a more natural interpretation of the dependence between lifetimes and of the elimination of their corresponding causes, as will be illustrated numerically in the next section. To see this, assume that the $j$-th cause is strongly positively correlated with, say, the $k$-th cause. In this case, eliminating the $j$-th cause will mean that an individual is much more likely to survive to a longer time-horizon with respect to the $k$-th cause and, more precisely, under perfect positive correlation, $T_j \to \infty$ would lead to $T_k \to \infty$, which is intuitive. At the other extreme, if $T_j$ and $T_k$ are perfectly negatively correlated, then $T_j \to \infty$ implies $T_k \to 0$, which could be described as follows: elimination of the $j$-th cause would lead to increased mortality with respect to the $k$-th cause and hence to decreased overall survival.
Clearly, this is of little practical relevance since removal of a cause of death usually leads to improvement of the overall survival and, for this reason, elimination should be considered only under non-negative correlation. Let us note that this is not the case for the alternative ignoring definition, under which both negative and positive correlations between lifetimes may produce improvements in the overall mortality, and worse mortality is not achievable, as confirmed numerically in section 4 on the example of the UK cause specific mortality data and also in Kaishev et al. (2007) for US data. Therefore, the requirements with respect to the copula functions are more stringent under the ignoring definition since, in order to cover the whole range from perfectly negative to perfectly positive correlation, only comprehensive copulas may be used. It is also more difficult to give a meaningful interpretation of ignoring a cause under both negative and positive correlation between competing lifetimes.
It has to be noted that the elimination approach is confronted with the difficulty that the limiting conditional distributions in (8) and (9) may not always exist and, if they exist, the evaluation of the overall survival function may in general be more complex. Based on a particular selection of copulas, we have shown in section 4 that the numerical complexity added due to the change of definition of elimination may be successfully overcome. Let us also note that in the case when $T_1, \ldots, T_m$ are assumed independent, the two approaches are equivalent (see Elandt-Johnson 1976).
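The equivalence under independence can be checked in one step from the two notions of removal. The following worked derivation is a sketch in notation introduced here for the purpose: $S'^{(i)}(t) = \Pr(T_i > t)$ are the net survival functions, $S_{\text{ignore}}^{(-j)}$ and $S_{\text{eliminate}}^{(-j)}$ denote the overall survival functions with cause $j$ ignored and eliminated, and it is assumed that $S'^{(j)}(s) > 0$ for every finite $s$.

```latex
\begin{align*}
S_{\text{ignore}}^{(-j)}(t)
  &= \Pr(T_i > t,\ i \neq j) = \prod_{i \neq j} S'^{(i)}(t), \\
S_{\text{eliminate}}^{(-j)}(t)
  &= \lim_{s \to \infty} \frac{\Pr(T_i > t,\ i \neq j,\ T_j > s)}{\Pr(T_j > s)}
   = \lim_{s \to \infty} \frac{S'^{(j)}(s) \prod_{i \neq j} S'^{(i)}(t)}{S'^{(j)}(s)}
   = \prod_{i \neq j} S'^{(i)}(t).
\end{align*}
```

The factor $S'^{(j)}(s)$ cancels before the limit is taken, so the limit exists trivially in the independent case; under dependence no such cancellation is available, which is the source of the difficulty noted above.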
While the somewhat simpler approach of ignoring a cause has been implemented and explored further in the papers by Carriere (1994) and more recently by Kaishev et al. (2007), to the best of our knowledge, the alternative approach of eliminating a cause of death has not been implemented and studied previously.
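Our major goal in this paper will be to find representations, in terms of a suitable copula, of the overall survival functions under the two definitions of removal. Under the ignoring approach, applying (2) to (6) and (7) and noting that $S'^{(j)}(0) = 1$, we have

(10) $S_{\text{ignore}}^{(-j)}(t) = C(S'^{(1)}(t), \ldots, S'^{(j-1)}(t), 1, S'^{(j+1)}(t), \ldots, S'^{(m)}(t))$

and

(11) $S_{\text{ignore}}^{(-j,-k)}(t) = C(S'^{(1)}(t), \ldots, 1, \ldots, 1, \ldots, S'^{(m)}(t))$, with $1$ in the $j$-th and the $k$-th positions,

where the marginal (net) survival functions $S'^{(j)}(t) = \Pr(T_j > t)$ due to cause $j$ are found as solutions to the system of differential equations (5), following the methodology described in section 2.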
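Alternatively, under the elimination approach, applying definition (8), the following expression for the overall survival function, given that the $j$-th cause has been eliminated, can be written:

(12) $S_{\text{eliminate}}^{(-j)}(t) = \int_t^{\infty} \cdots \int_t^{\infty} c(S'^{(1)}(t_1), \ldots, S'^{(j-1)}(t_{j-1}), 0, S'^{(j+1)}(t_{j+1}), \ldots, S'^{(m)}(t_m)) \prod_{i \ne j} f'^{(i)}(t_i) \, dt_i$,

where $c(u_1, \ldots, u_{j-1}, 0, u_{j+1}, \ldots, u_m)$ is the copula density, $f'^{(i)}(t)$, $i = 1, \ldots, m$, are the marginal (net) probability density functions corresponding to each cause of death, and the integral in (12) has dimension $m - 1$.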
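Similarly, following (9), an analogous expression (13) is obtained for the overall survival function when the $j$-th and the $k$-th causes are eliminated. It is easy to see that (13) generalizes directly to the case of eliminating more than two causes, therefore we will omit the corresponding formula.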
Comparing expressions (10) and (11) with (12) and (13), it is easy to see that the latter are more complex and difficult to evaluate. In order to evaluate (10) and (11), it is sufficient to compute the copula function $C$ whereas, in order to evaluate (12) and (13), one would need to compute a multiple integral of a relatively complex integrand function. In order to produce a simpler expression for $S_{\text{eliminate}}^{(-j)}(t)$ and $S_{\text{eliminate}}^{(-j,-k)}(t)$, one may consider either simplifying (12) and (13) or finding explicitly the limits in (8) and (9). In general, both approaches are confronted with difficulties. One of them is that the marginal densities $f'^{(i)}(t)$, $i = 1, \ldots, m$, are not in analytic form but are derived from the numerical solution of (5), so direct integration in (12) and (13) is not feasible even for copulas with a simpler representation, such as the Frank or Plackett copulas. Furthermore, directly finding the limits in (8) and (9) is difficult since the denominator, $S'^{(j)}(s)$, is obtained as a numerical solution of (5), and it tends to zero as $s \to \infty$. However, as has been established in section 4, definitions (8) and (9) lead to a more efficient numerical implementation than the more expensive integral expressions (12) and (13). The implementation of the competing risks model under both the ignoring and the eliminating definition is illustrated in section 4.
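As a small illustration of why working with the limit directly can be cheap, the following Python sketch treats the bivariate Frank copula, for which the limit of $C(\varepsilon, v)/\varepsilon$ as $\varepsilon \to 0$ is available in closed form as $(e^{-\theta v} - 1)/(e^{-\theta} - 1)$. That closed form is derived here for the purpose of the example and is not quoted from the paper. The function names and parameter values are hypothetical, and `expm1`/`log1p` are used to avoid cancellation when the first argument is tiny.

```python
import math

def frank_copula(u, v, theta):
    """Bivariate Frank copula C(u, v), theta != 0."""
    a = math.expm1(-theta * u)
    b = math.expm1(-theta * v)
    d = math.expm1(-theta)
    return -math.log1p(a * b / d) / theta

def eliminate_ratio(v, theta, eps=1e-10):
    """Finite-eps stand-in for the elimination limit: C(eps, v) / eps."""
    return frank_copula(eps, v, theta) / eps

def eliminate_limit(v, theta):
    """Closed-form limit of C(eps, v) / eps as eps -> 0 for the Frank copula."""
    return math.expm1(-theta * v) / math.expm1(-theta)

v = 0.6                                   # hypothetical net survival of the remaining cause
ignored = frank_copula(1.0, v, 5.0)       # ignoring the first cause gives back v
positive = eliminate_limit(v, 5.0)        # positive dependence: survival improves
negative = eliminate_limit(v, -5.0)       # negative dependence: survival worsens
approx = eliminate_ratio(v, 5.0)          # numerical ratio, close to `positive`
```

In this toy setting a single copula evaluation at a very small argument reproduces the limit, whereas the integral form would require quadrature over the net densities.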

Numerical results
In this section, we apply the methodology described earlier to UK cause specific mortality data for the year 2007, published by the Office for National Statistics (2008), which includes deaths from cancer, heart disease, respiratory diseases and all other causes grouped together.
The classification of causes of death is according to the International Classification of Diseases (ICD-10). For ease of presentation, we consider the two dimensional and the multidimensional competing risk models separately. The numerical implementation of the methodology has been performed using Mathematica 7.

Two causes of death
We consider here the simplest case of only two competing causes of death, one due to cancer (ICD-10 codes C00-D48), and a second one due to all other, non-cancer causes, pooled together. Thus, here $m = 2$ and we denote by $T_c$ and $T_o$ the lifetime random variables for the cancer and non-cancer causes of death and by $S^{(c)}(t)$, $S'^{(c)}(t)$ and $S^{(o)}(t)$, $S'^{(o)}(t)$ the crude and net survival functions for cancer and non-cancer respectively. As noted in section 2, it is possible to estimate crude survival functions based on an appropriate set of cause specific mortality data. In order to estimate the crude survival functions for cancer and other (non-cancer) causes, we have used a two decrement life table. We have fitted a cubic spline function to the observed crude survival data for ages from 0 to 100. In order to obtain a "closed" mortality model up to a limiting age of 120, we have extrapolated the fitted cubic spline functions $S^{(c)}(t)$ and $S^{(o)}(t)$, for the cancer and the other (non-cancer) causes, over the 100-120 age range, under the condition that $S^{(c)}(120) = S^{(o)}(120) = 10^{-10}$. For further details regarding the method and formulas used to obtain the observed and extrapolated values of the crude survival functions we refer to the Appendix.
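For readers who want a concrete picture of what "observed crude survival data" might look like, here is a minimal Python sketch using the textbook multiple-decrement estimator: the crude survival at age $x$ for cause $j$ is the share of the initial cohort that dies from cause $j$ at age $x$ or later. This is an assumption about the general approach, not the Appendix formulas, which are not reproduced here. The function name `crude_survival` and the toy death counts are hypothetical, and the sketch performs neither spline smoothing nor extrapolation.

```python
def crude_survival(deaths_by_cause):
    """Empirical crude survival functions from a closed multiple-decrement table.

    deaths_by_cause[j][x] is the number of deaths from cause j at age x.
    Returns S[j][x] for x = 0, ..., n, where S[j][x] estimates
    Pr(T > x, death is from cause j) and S[j][n] = 0.
    """
    radix = sum(sum(d) for d in deaths_by_cause)  # cohort size, assuming all eventually die
    result = []
    for deaths in deaths_by_cause:
        tail = [0.0] * (len(deaths) + 1)
        # Accumulate deaths from the oldest age downwards.
        for x in range(len(deaths) - 1, -1, -1):
            tail[x] = tail[x + 1] + deaths[x] / radix
        result.append(tail)
    return result

# Toy table with three age bands and two causes (made-up counts).
toy = crude_survival([[10, 20, 30], [5, 15, 20]])
```

At age 0 the crude survival functions sum to one, which is the empirical counterpart of the decomposition of the overall survival function into its crude components.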
The fitted cubic spline survival functions $S^{(c)}(t)$ and $S^{(o)}(t)$, $0 \le t \le 120$, and their densities are given in Fig. 1. Having estimated the crude survival functions $S^{(c)}(t)$ and $S^{(o)}(t)$, $0 \le t \le 120$, we obtain the net survival functions $S'^{(c)}(t)$ and $S'^{(o)}(t)$, $0 \le t \le 120$, by solving the system (5), using three different types of copulas, namely the Gaussian, Frank and Plackett copulas. The solutions $S'^{(c)}(t)$, $S'^{(o)}(t)$, $0 \le t \le 120$, obtained from (5) have been checked by applying equation (4) for the case $m = 2$. As can be seen from Fig. 1, the crude survival functions $S^{(c)}(t)$ and $S^{(o)}(t)$ are both close to zero in the age range $100 \le t \le 120$, therefore the numerical solutions of (5), $S'^{(o)}(t)$ and $S'^{(c)}(t)$, may not possess the built-in Mathematica precision in that range. Another important point is that both $S'^{(o)}(t)$ and $S'^{(c)}(t)$ are influenced by the extrapolated sections of the crude survival functions not only for $100 \le t \le 120$ but within the entire age range $0 \le t \le 120$. This in turn means that the results and conclusions with respect to survival under the dependent competing risks model, given later in this section, depend on the extrapolation that has been carried out.
The net survival functions, obtained as a solution of (5) using the Gaussian copula, $C^{Ga}(u_1, u_2)$, with values of $\rho$ corresponding to five different values of Kendall's $\tau$, are plotted in Fig. 2 (so that $\tau = 0.91$ corresponds to $\rho = 0.99$, $\tau = 0.35$ corresponds to $\rho = 0.52$ and so on). The linear correlation $\rho$ is considered as a free parameter, by means of which different degrees of association between the cancer and non-cancer modes of death are preassigned. Thus, the system (5) has been solved for values of $\rho$ equal to $-0.99$, $-0.52$, $0.00$, $0.52$, $0.99$ and the obtained net survival functions $S'^{(o)}(t)$ and $S'^{(c)}(t)$, $0 \le t \le 120$, are given in the left and right panels of Fig. 2. The corresponding densities, $f'^{(o)}(t)$ and $f'^{(c)}(t)$, are plotted in Fig. 3. Plots for $S'^{(o)}(t)$ and $S'^{(c)}(t)$ assuming Frank and Plackett copulas are very similar and therefore have been omitted.
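The correspondence between the correlation parameter and Kendall's tau quoted above can be reproduced with the standard identity for the Gaussian copula, tau = (2/pi) arcsin(rho). The short Python sketch below applies it to the five correlation values used here; the function name is illustrative, and the identity is a well-known property of elliptical copulas rather than something stated in the text.

```python
import math

def kendall_tau_gaussian(rho):
    """Kendall's tau implied by a Gaussian copula with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)

# Correlation values used for the Gaussian copula in this section.
taus = {rho: round(kendall_tau_gaussian(rho), 2)
        for rho in (-0.99, -0.52, 0.0, 0.52, 0.99)}
```

Rounded to two decimals, the computed values agree with the pairs quoted in the text (0.99 with 0.91 and 0.52 with 0.35).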
In the remainder of this section, we will compare and analyze the numerical results of survival under the two alternative definitions of removal of a cause of death, the ignoring and the eliminating definitions given by (6) and (8) in section 3.
In the bivariate case, under the ignoring definition, for fixed $\rho$, the net survival function $S'^{(o)}(t)$, $0 \le t \le 120$, coincides with the overall survival function $S_{\text{ignore}}^{(-c)}(t)$, $0 \le t \le 120$, when cancer has been removed, i.e., $S'^{(o)}(t) \equiv S_{\text{ignore}}^{(-c)}(t)$ and $f'^{(o)}(t) \equiv f_{\text{ignore}}^{(-c)}(t)$, where $f_{\text{ignore}}^{(-c)}(t)$ denotes the density corresponding to $S_{\text{ignore}}^{(-c)}(t)$. Obviously, if cancer is ignored in the bivariate decrement model, the overall survival will be determined entirely by the only remaining cause of death, that of non-cancer, and vice versa. Therefore, in order to study the overall survival when cancer is ignored, we may directly study the non-cancer net survival function $S'^{(o)}(t)$ and its corresponding density $f'^{(o)}(t)$, given in the left panel of Fig. 3. As can be seen from the left panel of Fig. 2, ignoring cancer affects survival most significantly when Kendall's $\tau = -0.91$ ($\rho_S = -0.99$), which corresponds to the case of extreme negative dependence. This effect of rectangularization of the overall survival function is seen even more clearly in the right panel of Fig. 2, where the 'other' cause of death has been removed. In addition, we note that in the case of negative dependence or even independence between $T_c$ and $T_o$, the trend of the overall survival curves suggests that the limiting age lies somewhere beyond 120 and it would not be natural to expect the old age survivors to die almost simultaneously at 120.
Survival under the eliminating definition of removal of a cause is illustrated for the three different choices of copula, Gaussian, Frank and Plackett, in Figs. 4-6. The overall survival function $S_{\text{eliminate}}^{(-j)}(t)$ has been computed based on definition (8) (see section 3) for the case $m = 2$, with $s \to \infty$ replaced by $s \to 120$, in which case $S'^{(j)}(s) \to 0$ has been replaced by $S'^{(j)}(s) \to 10^{-10}$, i.e., (8) simplifies to

(14) $S_{\text{eliminate}}^{(-j)}(t) = C(S'^{(1)}(t), \ldots, S'^{(j-1)}(t), 10^{-10}, S'^{(j+1)}(t), \ldots, S'^{(m)}(t)) \times 10^{10}$.

It has to be noted that expression (12) can be used as an alternative to (14); however, its evaluation is much more time consuming and, in the case of the Gaussian and t-copulas, for which considerable probability mass is located at the origin, leads to unstable computations. In contrast, the evaluation of (14) is stable and requires only a few seconds in the case of $m = 2$, for any of the three copulas selected.
As can be seen from Fig. 4-6, for negative values of $\tau$, the overall survival function $S_{\text{eliminate}}^{(-c)}(t)$, when cancer is eliminated, suggests poor survival from the remaining cause (all other causes pooled together) for all three copula choices. This is confirmed by the negative values of the gain in the life expectancy at birth, $\mathring{e}_0^{(-c)}$, and in the whole of life annuity at age 65, $a_{65}^{(-c)}$, calculated at 4% interest, for $\tau = -0.35$ and $\tau = -0.91$, presented in Tables 1-3. This phenomenon is observed because, under strong negative correlation, i.e. $\tau = -0.91$, individuals tend to survive to 120 with respect to cancer when it is eliminated, and hence they tend to die from the remaining competing cause already at birth, due to the assumed strong negative correlation of the corresponding lifetimes. Clearly, under the eliminating definition, such negative correlation makes little sense, since it suggests that improvement of mortality with respect to one cause would increase the mortality from the remaining cause. Such a setting is of little relevance when the competing risks are critical illnesses, since what is important in actuarial, demographic and medical applications is how life expectancy, life annuities and other actuarial and demographic characteristics are affected if mortality improves as a result of successful elimination of any of the main causes of death. Therefore, under the eliminating definition of removal of a cause of death, it is sufficient to study only the range of positive correlation between the competing lifetime random variables.

As can be seen from the left panel of Fig. 4, assuming almost perfect positive correlation and eliminating cancer, i.e. achieving perfect survival with respect to it, naturally leads to perfect survival with respect to the only remaining competing risk (all other causes pooled together), and hence leads to perfect overall survival, given cancer is eliminated. This is clearly illustrated by the curve $S_{\text{eliminate}}^{(-c)}(t)$ for $\tau = 0.91$, which is almost rectangular.
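The sign effect described above can be illustrated with a minimal sketch. It assumes a bivariate Frank copula $C$ with parameter $\theta$ and reads the eliminating definition as the limit $\lim_{u \to 0} C(u, v)/u = \partial C(0, v)/\partial u$, where $u$ and $v$ are the net survival functions of cancer and of the remaining cause. For the Frank copula this limit equals $(1 - e^{-\theta v})/(1 - e^{-\theta})$. This is our own derivation under that assumption, not an expression quoted from the paper. The Gompertz net survival function and its parameters are hypothetical and are held fixed here, whereas in the paper the net survival functions are re-estimated from the crude data for each level of dependence.

```python
import math


def net_survival_other(t, b=3e-5, c=0.095):
    """Hypothetical Gompertz net survival for the remaining cause (not fitted to ONS data)."""
    return math.exp(-(b / c) * math.expm1(c * t))


def frank_eliminate(v, theta):
    """Limit of dC/du as u -> 0 for the Frank copula, evaluated at net survival v."""
    if abs(theta) < 1e-12:
        return v  # independence: eliminating and ignoring coincide
    return math.expm1(-theta * v) / math.expm1(-theta)


def life_expectancy(surv, upper=120.0, steps=2400):
    """Trapezoidal integral of a survival function over [0, upper]."""
    h = upper / steps
    total = 0.5 * (surv(0.0) + surv(upper))
    for i in range(1, steps):
        total += surv(i * h)
    return total * h


def expectancies(theta):
    """Life expectancy at birth under (ignore, eliminate) for a fixed net survival."""
    ignore = life_expectancy(net_survival_other)
    eliminate = life_expectancy(
        lambda t: frank_eliminate(net_survival_other(t), theta)
    )
    return ignore, eliminate


if __name__ == "__main__":
    for theta in (-5.0, 0.0, 5.0):
        ig, el = expectancies(theta)
        print(f"theta={theta:+.1f}  ignore={ig:.2f}  eliminate={el:.2f}")
```

In this sketch a positive $\theta$ lifts the eliminated-cause survival curve above the net survival curve, a negative $\theta$ pushes it below, and $\theta = 0$ reproduces the net survival curve exactly.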

Dependent competing risks: cause elimination
Comparing Fig. 2 and Fig. 4, and also the columns "Ignore" and "Eliminate" of Table 1, which summarizes the values of $\mathring{e}_0^{(-c)}$ and $a_{65}^{(-c)}$ under both the ignoring and eliminating definitions, it can be seen that survival under the two alternative definitions is quite different. Under the ignoring definition, improvement in mortality is achieved for all values of $\tau \in (-1, 1)$, whereas under the eliminating definition, mortality improvement is achieved only for non-negative values, $\tau \in [0, 1)$. On the other hand, looking at Fig. 2 and Fig. 4, it can be seen that the overall survival function under the ignoring definition varies within a relatively small range and is bounded from above by the curve for $\tau = -0.91$, which is nearly the best possible mortality improvement, attained in the limit as $\tau \to -1$, in which case the Gaussian copula converges to the lower Fréchet-Hoeffding bound (see e.g., Kaishev et al. 2007). In contrast to the ignoring case, under the elimination definition survival is very sensitive to the value of $\tau$ and can vary within the entire range, from zero life span to a 120 year life span, as seen from Fig. 4 and the values of $\mathring{e}_0^{(-c)}$ and $a_{65}^{(-c)}$ presented in Table 1. Also, contrary to the ignoring definition, under elimination survival depends significantly on the choice of the copula modelling the dependence between the lifetimes, as can be seen by comparing the survival functions in Figures 4, 5 and 6, and the numbers for $\mathring{e}_0^{(-c)}$ and $a_{65}^{(-c)}$ presented in Tables 1-3. Comparing the curves in Fig. 2 and Fig. 4 for $\tau = 0$, which corresponds to the independent case, it can be verified that the two definitions are equivalent, as noted in section 3.
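Since results are reported by Kendall's $\tau$ while each copula has its own parameter, a short sketch of the conversion may help. It uses two standard relations that are not restated in this part of the text: $\rho = \sin(\pi \tau / 2)$ for the Gaussian copula, and $\tau = 1 - \frac{4}{\theta}\,(1 - D_1(\theta))$ for the Frank copula, where $D_1$ is the first Debye function. The function names, the Simpson integration and the bisection bracket are illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_rho_from_tau(tau):
    """Gaussian copula correlation implied by Kendall's tau."""
    return math.sin(math.pi * tau / 2.0)


def debye1(theta, steps=2000):
    """D1(theta) = (1/theta) * integral_0^theta t/(e^t - 1) dt (composite Simpson)."""
    def f(t):
        return 1.0 if t == 0.0 else t / math.expm1(t)

    h = theta / steps  # steps must be even for Simpson's rule
    total = f(0.0) + f(theta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0 / theta


def frank_tau(theta):
    """Kendall's tau of the Frank copula with parameter theta."""
    if theta == 0.0:
        return 0.0
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))


def frank_theta_from_tau(tau, lo=-100.0, hi=100.0, tol=1e-10):
    """Invert frank_tau by bisection; the bracket covers roughly |tau| < 0.96."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if frank_tau(mid) < tau:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("Gaussian rho for tau = 0.35:", gaussian_rho_from_tau(0.35))
    print("Frank theta for tau = 0.35:", frank_theta_from_tau(0.35))
```

Under this relation, the Frank parameter $\theta = 3.46$ used in the four-cause illustration gives a Kendall's $\tau$ of approximately 0.35.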
What can also be observed, comparing the survival curves in Fig. 4, 5 and 6, is that the curves corresponding to the Frank and Plackett copulas are much closer to each other than to the curves for the Gaussian copula case. This is also consistent with the numerical results for $\mathring{e}_0^{(-c)}$ and $a_{65}^{(-c)}$ presented in the columns "Eliminate" of Tables 2 and 3, which are close to each other for most of the values of $\tau$. It can also be seen from Fig. 5 and 6 that, for both copulas, improvement of survival is somewhat more limited and rectangularization for $\tau = 0.91$ is not achieved, in contrast to the case of the Gaussian copula, given in Fig.

Comparing the numerical values of $\mathring{e}_0^{(-c)}$ and $a_{65}^{(-c)}$ summarized in Table 1 with those given in Tables 2 and 3, one can conclude that, under the eliminating definition, results are more sensitive, both to the value of $\tau$ and to the choice of copula, than under the ignoring definition.

Table rows (gains in square brackets; the leading entries of the first row are missing):

tau    theta    e_0 Ignore      e_0 Eliminate     e_65 Ignore     e_65 Eliminate    a_65 Ignore     a_65 Eliminate
                                                                  16.42 [-3.59]     14.74 [1.55]    11.33 [-1.86]
0.35   5.022    84.17 [2.51]    92.42 [10.76]     21.32 [1.30]    27.94 [7.93]      13.87 [0.67]    16.62 [3.43]
0.91   735.8    82.12 [0.46]    105.28 [23.62]    20.1 [0.08]     40.29 [20.28]     13.24 [0.04]    20.2 [7.01]

Although our focus so far has been on the changes in the overall survival function $S(t)$, $0 \le t \le 120$, under the two alternative definitions of ignoring and eliminating a cause, the joint survival function of $T_c$ and $T_o$ is also of interest. However, since either one of the causes leads to death, and the other lifetime remains latent, probabilistic inference related to the joint distribution of $T_c$ and $T_o$ is somewhat artificial. Nevertheless, it is instructive, and in Fig. 7-9 we have plotted the joint density of $T_c$ and $T_o$ for the bi-variate Gaussian, Frank and Plackett copulas with Kendall's $\tau = 0.35$. For any bi-variate copula, the joint density of $T_c$ and $T_o$ can be calculated from (2) as given in (15).

Dimitrova, D.S., Haberman, S. and Kaishev, V.K.

As seen from Fig. 7-9, under this assumption of positive dependence, jointly increasing values of the lifetimes $T_c$ and $T_o$ are likely to occur. This is valid regardless of which copula has been assumed to model the dependence. There are, of course, some copula specific differences in the joint density functions, as is natural to expect in view of (15). As can be seen from Fig. 8 and 9, the plots of the joint density of $T_c$ and $T_o$ are similar for the Frank and Plackett cases, and are somewhat different from the density plots for the Gaussian copula given in Fig. 7. Another obvious characteristic of the joint density function for all three copulas is that it has two modes.
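For reference, the generic copula form behind a joint density such as (15) can be sketched as follows. The sketch assumes that (2), which is not reproduced in this part of the text, is the copula representation of the joint survival function. The symbols $c$ for the copula density and $f'^{(j)}$ for the net densities are our own labels.

```latex
S(t_c, t_o) = C\bigl(S'^{(c)}(t_c),\, S'^{(o)}(t_o)\bigr),
\qquad
f(t_c, t_o) = \frac{\partial^2 S(t_c, t_o)}{\partial t_c\, \partial t_o}
            = c\bigl(S'^{(c)}(t_c),\, S'^{(o)}(t_o)\bigr)\, f'^{(c)}(t_c)\, f'^{(o)}(t_o),
\quad\text{where}\quad
c(u, v) = \frac{\partial^2 C(u, v)}{\partial u\, \partial v},
\qquad
f'^{(j)}(t) = -\frac{d}{dt} S'^{(j)}(t).
```

The two minus signs produced by differentiating the decreasing net survival functions cancel, so the density is the copula density weighted by the product of the two net densities. This is why copula specific differences carry over directly to the plots.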

Multiple causes of death ($m = 4$)
We now illustrate the extension of the proposed methodology to the multivariate case by considering four competing causes of death: cancer (c) (ICD-10 codes C00-D48), heart diseases (h) (ICD-10 codes I00-I99), respiratory diseases (r) (ICD-10 codes J00-J99), and other causes (o), grouped together. As in the bi-variate case, we have constructed a four decrement life table using England and Wales cause specific female mortality data for year 2007, published by the Office for National Statistics (2008), see Table 5 therein. For more details on how the four decrement life table was obtained, see the Appendix. The interpolated crude survival functions $S^{(c)}(t)$, $S^{(h)}(t)$, $S^{(r)}(t)$, $S^{(o)}(t)$, $0 \le t \le 120$, and their derivatives are given in Fig. 10. For illustrative purposes we have used the multivariate Frank copula to model purely positive dependence between the lifetimes $T_c$, $T_h$, $T_r$ and $T_o$, which, as noted in the bi-variate case, is the meaningful range of dependence under the elimination definition of cause removal. The four net survival functions obtained as a solution to system (5), and their densities for the multivariate Frank copula with parameter $\theta = 3.46$, are presented in Fig. 11.

In the left panels of Fig. 12 and Fig. 14 we give the overall survival functions with each one of the three possible diseases removed, $j \in \{h, c, r\}$, under the ignoring and the eliminating definitions of removal, respectively, and compare them to the overall survival function with no disease removed, $S(t)$. As can be seen, improvement in survival is more significant under the eliminating definition than under the ignoring one. This is also confirmed by comparing the corresponding gains in the actuarial functions, summarized in the first three rows of Table 4.
As is also illustrated in Fig. 12 and Fig. 14, under both definitions the most significant improvement in survival for the age range $40 \le t \le 85$ is achieved if cancer is removed, whereas for $85 \le t \le 120$ the best improvement in survival is due to removal of heart disease. As expected, improvement in survivorship due to removal of respiratory disease is not as significant.
In the left panels of Fig. 13 and Fig. 15 we give the overall survival functions with all possible pairs of diseases removed, i.e., $j \equiv \{c, h\}$, $j \equiv \{c, r\}$, $j \equiv \{h, r\}$, under the ignoring and the eliminating definitions of removal, respectively, and again contrast them to the overall survival function with no disease removed, $S(t)$. As in the case of removing only one disease at a time, if a pair of diseases is removed, improvement in survival is more significant under the eliminating definition than under the ignoring one. This is also confirmed by comparing the corresponding gains in the actuarial functions, summarized in the second three rows of

Concluding remarks
In this paper, we have demonstrated how copula functions can be applied in modelling dependence between lifetime random variables in the context of competing risks. We have implemented the multivariate copula dependent competing risks model to study the impact of removing one or more causes of death on England & Wales 2007 cause specific mortality. In particular, we have focused on comparing and contrasting two alternative definitions of cause removal, namely ignoring and eliminating a cause, and their effect on life expectancy at birth and at age 65, and on annuity functions. For this purpose, we have provided expressions for the overall survival functions in terms of the specified copula (density) and the net (marginal) survival functions.
We have shown that there are substantial differences in the overall survival functions, given one or more risks are removed, under the two definitions, which is also reflected in the values of the life expectancy and annuity functions. An important conclusion derived from this work is that the elimination definition is more appropriate for actuarial applications, since it suffices to consider only positive dependence among the competing lifetimes and the model results are more intuitive and easily interpretable.
The methodology and results may be applied in: managing longevity risk; setting target levels for mortality rates that will assist with scenario testing and sensitivity analyses in the presence of dependence between causes of death; population forecasting and planning; life insurance business where the financial impact of mortality improvements on life insurance and annuities products may be investigated.
The question of how to estimate the (pairwise) correlations between causes of death via their associated lifetimes requires further research in close collaboration with the medical profession. In this regard, promising directions of research may be to look at estimation based on the so-called expectation-maximization (EM) algorithms, and also at quantitative methods for modelling expert opinion.

Appendix
Here we describe how we have constructed a two and a four decrement UK female population data set (FP), using " A.4.

As mentioned in section 4, cubic spline functions were fitted to these crude survival data and an extrapolation has been performed over the 100-120 age range by setting $S^{(j)}(k) = 10^{-10}$, $j \in \{c, h, r, o\}$. It has to be noted that the spline functions have been fitted to $\log S^{(j)}(k)$ data and then transformed back to the original scale. This avoids some unwanted wiggling of the spline curves that occurs when they are fitted directly to $S^{(j)}(k)$ data, for example the fit becoming negative at the very old ages. In order to obtain the observed values of the crude survival functions, the following quantities were calculated:

${}_\infty q_0^{(j)}$: the multiple-decrement probability that a newborn will die from cause of death $j$, $j \in \{c, h, r, o\}$

${}_\infty d_0^{(j)}$: the total number of deaths from cause of death $j$, $j \in \{c, h, r, o\}$, for all ages from 0 to $\infty$

The following formula was used to obtain the values of ${}_\infty q_0^{(c)} = 0.24$ and ${}_\infty q_0^{(o)} = 0.76$ in the two-decrement case, and ${}_\infty q_0^{(c)} = 0.24$, ${}_\infty q_0^{(h)} = 0.34$, ${}_\infty q_0^{(r)} = 0.15$ and ${}_\infty q_0^{(o)} = 0.27$ in the four-decrement case, based on the values given in

$$
\begin{aligned}
\frac{d}{dt} S^{(1)}(t) &= C_1\bigl(S'^{(1)}(t), \ldots, S'^{(m)}(t)\bigr) \times \frac{d}{dt} S'^{(1)}(t) \\
\frac{d}{dt} S^{(2)}(t) &= C_2\bigl(S'^{(1)}(t), \ldots, S'^{(m)}(t)\bigr) \times \frac{d}{dt} S'^{(2)}(t) \\
&\;\;\vdots \\
\frac{d}{dt} S^{(m)}(t) &= C_m\bigl(S'^{(1)}(t), \ldots, S'^{(m)}(t)\bigr) \times \frac{d}{dt} S'^{(m)}(t)
\end{aligned}
$$

where $C_j(u_1, \ldots, u_m) = \frac{\partial}{\partial u_j} C(u_1, \ldots, u_m)$, $j = 1, \ldots, m$.

Fig. 1. Interpolated crude survival functions (left panel) and their densities (right panel) for 'cancer' and 'other' causes of death.

Fig. 7. A 3D plot and a contour plot of the joint density of $T_c$ and $T_o$, expressed through the Gaussian copula, for Kendall's $\tau = 0.35$.

Fig. 8. A 3D plot and a contour plot of the joint density of $T_c$ and $T_o$, expressed through the Frank copula, for Kendall's $\tau = 0.35$.

Fig. 9. A 3D plot and a contour plot of the joint density of $T_c$ and $T_o$, expressed through the Plackett copula, for Kendall's $\tau = 0.35$.

Fig. 15. The overall survival functions with no disease eliminated and with only two diseases eliminated (left panel) and their densities (right panel).
This will allow us to quantify and compare the effect of removal of one or more causes, under the two alternative definitions, on life expectancy at birth and at age 65 and life annuities.
the survival functions $S_{\text{ignore}}^{(-j)}(t)$, $S_{\text{ignore}}^{(-j,-k)}(t)$, … and $S_{\text{eliminate}}^{(-j)}(t)$, $S_{\text{eliminate}}^{(-j,-k)}(t)$, … under the two alternative definitions of removal of a cause of death.
table, obtained on the basis of England and Wales cause specific female mortality data for year 2007, published by the Office for National Statistics (2008), see Table 5 therein. For more details on how the two decrement life table was obtained, see the Appendix. The two decrement life table data are presented in 5 year age intervals and cover the age range from 0 to 95+ years.
-6, respectively. It is worth noting that, contrary to the ignoring definition, the net survival function, $S'^{(o)}(t)$, $0 \le t \le 120$, does not coincide with the overall survival function, $S_{\text{eliminate}}$

Table 4 .
Another interesting conclusion is that maximum gain in e .e., j = c, contrary to the eliminating definition where maximum gain is attained if heart disease is eliminated.However, ignoring cancer or heart disease produces very similar gains in the life expectancy at birth, e

Table 4 .
Under the ignoring definition, maximum gain in e jL is attained if j = hr. However, as can be seen from Table 4, under the eliminating definition, the gains in e h< and $j \equiv \{h, r\}$, so one may argue that under both definitions, the removal of cancer and heart disease ($j \equiv \{c, h\}$) brings about the most significant gains in all the actuarial functions summarized in Table 4. It is also worth noting another way in which the two definitions differ. Comparing the gains obtained if one cause is removed to the gains resulting from the removal of two causes, one can see that gains nearly double under the ignoring definition, while they are nearly the same under the eliminating definition. This is natural to expect, since one and the same level of positive correlation between the lifetime random variables $T_c$, $T_h$, $T_r$ and $T_o$ has a different interpretation and numerical effect on the functions jL , under the two alternative definitions. Finally, we note that, regardless of the definition, maximum gain is achieved when all three diseases are removed, and this is illustrated by the last row of Table 4.

The overall survival functions with no disease eliminated and with only one disease eliminated (left panel) and their densities (right panel).

Table 5. Death: underlying cause, sex and age-group, 2007: summary" from ONS (2008) and the England & Wales 2005-07 Interim Life Table as published by ONS. Table 5 from ONS (2008) contains the number of deaths by cause of death and the total number of deaths, relating to five year age groups, e.g. 5-9, 10-14, 15-19 and so on. The first and the last age spans for which data are given in the table are 0-1 and 95+ respectively, and the causes of death are coded according to the International Classification of Diseases (ICD), 10th revision. From these data we have extracted the proportions in every age group of people dying from cancer (c) (ICD-10 codes C00-D48), from heart diseases (h) (ICD-10 codes I00-I99), from respiratory diseases (r) (ICD-10 codes J00-J99) and from all other causes of death (o), pooled together. For the purpose of constructing the two decrement table, we have combined the figures for heart and respiratory diseases with those for all other causes to create one group of 'other' causes of death. The proportions obtained in this way were applied to the number of deaths, $d_x$, given in the England & Wales 2005-07 Interim Life Table, and the resulting age-grouped, multiple decrement tables are given in Table A.1 and Table A.2. Based on the crude data presented in Table A.1 and Table A.2, we easily obtain the observed values at ages $k = 1, 5, 10, \ldots, 95, 100$ of the crude survival functions $S^{(c)}(k)$ and $S^{(o)}(k)$, see Table A.3, and $S^{(c)}(k)$, $S^{(h)}(k)$, $S^{(r)}(k)$ and $S^{(o)}(k)$, see Table

Table A.4.
Table A.1 and Table A.2: ${}_\infty q_0^{(j)} =$

The following formulae were used to calculate the values of ${}_k d_0^{(j)}$, $S_0^{(j)}(k)$ and $S(k)$ given in Table A.3 and Table A.4, based on the values given in Table A.1 and Table A.2:

$${}_k d_0^{(j)} = \sum_{x < k} d_x^{(j)}$$
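The cumulative sum above can be illustrated with a small sketch. Only ${}_k d_0^{(j)} = \sum_{x<k} d_x^{(j)}$ is taken from the text. The expressions used for ${}_\infty q_0^{(j)}$ and for the crude survival function are the standard multiple-decrement identities ${}_\infty q_0^{(j)} = {}_\infty d_0^{(j)} / l_0$ and $S^{(j)}(k) = ({}_\infty d_0^{(j)} - {}_k d_0^{(j)}) / l_0$, which we assume here because the corresponding formulae are not legible in this part of the text. The table below is a toy example with made-up counts, chosen only so that the lifetime proportions come out as the 0.24 and 0.76 quoted for the two-decrement case. It is not Table A.1.

```python
def cumulative_deaths(d_x, k):
    """k_d_0^(j): deaths from one cause in age groups whose starting age x is below k."""
    return sum(deaths for age, deaths in d_x.items() if age < k)


def radix(table):
    """l_0: total deaths over all causes and ages in a closed life table."""
    return sum(sum(d_x.values()) for d_x in table.values())


def lifetime_death_probability(table, cause):
    """Assumed identity: inf_q_0^(j) = inf_d_0^(j) / l_0."""
    return sum(table[cause].values()) / radix(table)


def crude_survival(table, cause, k):
    """Assumed identity: S^(j)(k) = (inf_d_0^(j) - k_d_0^(j)) / l_0."""
    total_j = sum(table[cause].values())
    return (total_j - cumulative_deaths(table[cause], k)) / radix(table)


# Toy two-decrement table: {cause: {starting age of group: deaths}}. Hypothetical counts.
toy_table = {
    "c": {0: 100, 1: 400, 5: 23500},
    "o": {0: 500, 1: 1500, 5: 74000},
}

if __name__ == "__main__":
    for cause in toy_table:
        print(cause, lifetime_death_probability(toy_table, cause),
              [crude_survival(toy_table, cause, k) for k in (0, 1, 5)])
```

In this sketch the crude survival functions at age 0 equal the lifetime probabilities of dying from each cause, and their sum over causes at any age $k$ gives the overall survival $S(k)$.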