Model Determination for Multivariate Survival Data - CiteSeerX

3 downloads 101 Views 332KB Size Report
data, using Metropolis-within-Gibbs (Metropolis, Rosenbluth, Rosenbluth, Teller ... available as prior information (see Leonard (1978) and Gamerman (1991)).
Model Determination for Multivariate Survival Data Helen Aslanidou and Dipak K. Dey, University of Connecticut  Abstract Frequently in the analysis of survival data, survival times within the same \group" are correlated due to unobserved covariates. One way these covariates can be included in the model is as frailties. These frailty random block e ects generate dependency between the survival times of the individuals which are conditionally independent given the frailty. Using the conditional proportional hazards model of Clayton and Cuzick (1985), in conjunction with the frailty, a whole new family of models is introduced. By considering a gamma frailty model, often the issue is to nd an appropriate model for the baseline hazard function. In this paper a exible baseline hazard model based on a correlated prior process is proposed and is compared with a standard Weibull model. Several model diagnostics methods are developed and model comparison is made using Bayes factors for the two models. The above methodologies are applied to the McGilchrist and Aisbett (1991) kidney infection data and the analysis is performed using Markov Chain Monte Carlo methods.

Key Words: Autocorrelated prior process, Bayes factors, credible regions, frailty, metropo-

lis algorithm, model determination, proportional hazards model, Weibull model.

 The research of the second author was supported in part by NSF SCREMS grant DMS-9506557. Cor-

respondence to: Dr. Dipak K. Dey, Department of Statistics, Box U-120, University of Connecticut, Storrs, CT 06269-3120. E-mail: [email protected], Fax: (860)486-4113.

1

1 Introduction To analyze multivariate survival data where the subjects of the same group are related to each other; for example, they come from the same family (related genetically), or they are exposed to the same conditions (related environmentally), Cox's (1972) proportional hazards model (see also Cox and Oakes (1984)) cannot be used. On the other hand, frailty models are very ecient. These models can formulate the variability of life times, coming from two distinct sources. The rst source is natural variability and it is explained by the hazard function and the second is variability common to individuals of the same group or variability common to several events of an individual and it is explained by the frailty. Examples of the above are survival times of brothers or sisters (group shares the same variability), times of the re-occurrence of an infection for a speci c patient (re-occurrence times share the same variability) etc. The frailty term in a model represents the common covariates that are not observed or are neglected. The most common distribution for the frailty is the gamma distribution (see Clayton (1978), Oakes (1982) and Clayton and Cuzick (1985)). Also Hougaard (1986b) used the stable distribution for the frailty along with arbitrary and Weibull hazards. Hougaard (1984,1986a,b,1987,1991), Hougaard, Harvald and Holm (1992a,b) and Hougaard, Myglegaard and Borch-Johnsen (1994) discuss the various families of frailty distributions. Hougaard (1995), presents some generalizations. Clayton (1991) was the rst to apply Bayesian methods to frailty models. He used the conditional proportional hazards frailty model (Clayton and Cuzick (1985); Oakes (1986, 1989); Shih and Louis (1992)) where given the unobserved frailty random variable (Vaupel et al. (1979)) corresponding to the random group-e ect, the hazard for each survival time follows a proportional hazards model (Cox (1972)) with frailty random variable, covariate e ect and the baseline hazard acting multiplicatively to form the conditional hazard. For the baseline hazard, Clayton (1991) used a Levy process (Kalb eisch (1978)), where the increments in the cumulative baseline hazard in disjoint intervals are assumed to be independent. In real life problems though, a smooth or correlated prior process, where hazards in adjacent intervals are correlated, may be more suitable for modeling the prior information of the baseline hazard function. We discuss that type of hazard function (correlated prior) here, obtained from 2

Arjas and Gasbarra's (1994) martingale process. In this paper we consider some Bayesian model diagnostic measure as presented in Aslanidou, Dey and Sinha (1995), for adequacy of two competing baseline hazards, the Weibull and the correlated prior process. Next we compare these two models under the framework of model selection. The remaining sections are as follows: Section 2 introduces the conditional proportional hazards model with frailty (Clayton and Cuzick (1985); Oakes (1989); Clayton (1991)), the Weibull baseline hazard model and the correlated prior process for the baseline hazard model. Section 3 presents some standard notations and the data likelihood for both models. Section 4 contains the conditional posteriors. Model determination is the content of section 5. In this section we use a Bayesian EDA type method to assess model diagnostics using exponential residuals (see Aslanidou, Dey and Sinha (1995)). In addition we compare the two models using Bayes factors. In section 6 we analyze the McGilchrist and Aisbett (1991) data, using Metropolis-within-Gibbs (Metropolis, Rosenbluth, Rosenbluth, Teller and Teller (1953); Gelfand and Smith (1990); Tanner (1993); Muller (1994)) algorithm to calculate some of the posteriors for both models, display some posterior features of the parameters, check model adequacy using Bayesian EDA methods and compare the models using Bayes factors.

2 The models 2.1 Frailty models Clayton (1978) and Oakes (1982) considered rst frailty models for multivariate survival data, using gamma distribution for the frailty. Hougaard (1986b) used the positive stable model. Lee (1979) and Lu and Bhattacharyya (1990) used Weibull distribution to model the frailty parameter while Whitmore and Lee (1991a) studied a model with inverse gamma frailties. We consider the gamma distribution which is most common to model the frailty. Assuming that the survival time of the j-th subject (j = 1; : : : ; m) in the i-th group (i = 1; : : : ; n) is denoted by Tij and given the unobserved frailty parameter denoted by wi (for the i-th

3

group), the model via the hazard function is as follows:

h(tij jwi; zij ) = o (tij ) exp( zij ) wi

(2.1)

where zij is the xed covariate vector, is the regression parameter and o () the baseline hazard function. Shih and Louis (1992) call the above \conditional proportional hazards model". The covariate zij can be time dependent. The frailty parameters wi; i = 1; : : : ; n are assumed to be independent and identically distributed for every group. For identi ability reasons we need the wi's to have mean one. In this paper we consider wi to follow a

wi  Gamma(? ; ); i = 1; : : : ; n

(2.2)

1

where  is the unknown variance of the wi's. Large values of  signify closer positive relationship between the subjects of the same group and greater heterogeneity among the groups. The natural prior for  will be Gamma. The regression parameter follows a normal distribution with relatively large variance, so that the data drive the inference.

2.2 Weibull modeling of the baseline hazard The Weibull distribution has been used extensively to model the baseline hazard function due to its simplicity and exibility. Introducing some notation, we say that

!o (tij )  Weibull( ; ); > 0;  > 0; i = 1; : : : ; n; j = 1; : : : ; m

(2.3)

where and  are unknown hyperparameters and the superscript \!" indicates that this baseline hazard is for the Weibull model. The gamma distribution is a suitable prior for the hyperparameters and considering large variances, the priors become as non-informative as possible. So the baseline hazard function is, !o (tij ) =  t ? and we assume that the priors for the hyperparameters are,   Gamma(m ; m ) with E () = m m = 1 and  Gamma( ;  ). 1

1

1

2

1

2

2

2.3 Correlated prior process modeling of the baseline hazard Modeling the baseline hazard using prior processes is very common. Clayton (1991) used a Levy process for the baseline cumulative hazard. But he assumed, unrealistically, independency between hazards of adjacent intervals, creating a very \course" path through time. 4

Often in real life problems, not the actual baseline hazard, but the smoothness of it, is available as prior information (see Leonard (1978) and Gamerman (1991)). A correlated prior process incorporates that in the model. Here we divide time into prespeci ed intervals Ik = (tk? ; tk ] for k = 1; 2; : : : ; g where 0 = t < t < : : : < tg < 1, tg being the last survival or censored time and assume the baseline hazard to be constant within intervals, and g being the number of prespeci ed time interval i.e., 1

0

1

po (tij ) = k ; for tij 2 Ik ;

(2.4)

and the regression parameter to be constant over time. The superscript \p" indicates that this hazard is for the piecewise exponential model. The model was rst introduced by Breslow (1974). The selection of the grid ft ; t ; : : : ; tg g, is problem speci c, but independent of the data (see Kalb eisch and Prentice (1973)). For the example in section 6 since no expert opinion was available, 20 intervals of equal length were used. To correlate the k 's in adjacent intervals, a discrete-time martingale process is used, similar to that of Arjas and Gasbarra's (1994) for univariate survival model. Given ( ; : : : ; k? ) we have that 1

2

1

1

k j ; : : : k?  Gamma( k ;  k? ); 1

(2.5)

1

1

k

for k = 1; 2; : : : ; g and E (k ) = k? . The parameter k in (2.5) monitors the amount of smoothness available i.e., small k indicates less information on the smoothing of k 's. If k = 0 then k and k? are independent. When k ! 1 then the baseline hazard is the same in the intervals Ik and Ik? i.e., k = k? . In the McGilchrist and Aisbett (1991) example we assume k = o ; 8k. Finally 1

1

1

1

  Gamma( o ; o ? )

(2.6)

1

1

where o is small i.e., var( ) = o ? is large to ensure non-informative prior for  . Notice though that if prior information is available, the shape and scale of the marginal prior of  will change accordingly and the prior will be more informative. A version of the above process which can also be used, was given by Leonard (1978) and Gamerman (1991). They modeled logj = cj and cj jcj  N (cj ;  ). 1

1

1

1

+1

2

5

3 Likelihood speci cation 3.1 Notation For the j-th subject of the i-th group, the triplet (tij ; ij ; zij ) is observed in the k-th time interval. Here, ij is the indicator variable taking value 1 if the subject fails in the k-th interval and value 0 otherwise. Also here, tij is the survival time if the subject fails in that interval, or the censoring time if censoring occurs in that interval, or just tk? if the subject has already failed or censored before that interval. Finally, zij is the covariate for each subject. Now, all such triplets, (tij ; ij ; zij )'s for each subject in each interval, are denoted as a whole by (Y; Z). The vector of unobserved wi's, denoted by W, is called the augmented data and the triplet (W; Y; Z) is called the complete data. We allow only right-censored survival data and assume that the censoring is noninformative. 1

3.2 Data likelihood for the Weibull model Using a Weibull hazard with parameters and , the data likelihood is the following:

L!c ( ! ; ; jW! ; Y; Z) where ij! = exp( ! zij ), and

8 >
:

/

n Y m Y i=1 j =1



(f tij ? wi! ij! gij expf?t ij ij! wi! g ; 1

(3.1)

1; subject fails in the k-th interval 0; otherwise.

The superscript \!" again indicates that these are the parameters of the Weibull model. For the derivation of the complete data likelihood L!c , see the appendix. The data likelihood of ( ! ; ; ) based on observed data (Y; Z) is denoted by L! . It can be obtained by computing the integral 1

L! ( ! ; ; jY; Z) 1

Z

/ W! L!c ( !; ; jW! ; Y; Z)  p(W! j!)dW! ;

where p(W! j! ) is given in (2.2) and ! is the variance of the wi! 's.

6

(3.2)

3.3 Data likelihood for the piecewise exponential model The complete data likelihood based on (Wp ; Y; Z) for the piecewise exponential version of the model in (2.1) is given as

Lpc ( p; ; jWp ; Y; Z) /

gij n Y m nY Y

i=1 j =1 k=1

o

exp(?k k wip ijp ) (gij wipijp )ij +1

expf?gij wip ijp (tij ? tgij )g;

(3.3)

+1

where ijp = exp( pzij ), k = tk ? tk? , gij is such that tij 2 (tgij ; tgij ] = Igij . The superscript \p" indicates that these are the parameters of the piecewise exponential model. Details of the derivation of the complete data likelihood Lpc , are given in the appendix. Once again the data likelihood of ( p; ) based on observed data (Y; Z) and denoted by Lp , can be obtained by computing the integral 1

+1

+1

1

Lp ( p; jY; Z) / 1

Z

Wp Lc ( ; jW ; Y; Z)  p(W j )dW ; p

p

p

p p

p

(3.4)

where p(Wp jp ) is given in (2.2) and p is the variance of the wip 's.

The nal forms of the data likelihoods Lp and L! are too complicated to work with. Thus, it is not easy to evaluate the marginal posteriors of ( ! ; ; ) and ( p; ) analytically or even numerically. To circumvent this problem, we use a Markov chain Monte Carlo method, the Metropolis-within-Gibbs algorithm (Metropolis et al. (1953); Muller (1994); Tanner (1993)) to generate samples from the marginal posteriors of ( ! ; ; ) and ( p; ). These samples will be used to perform complete Bayesian inference for the problem. 1

1

4 Conditional posteriors 4.1 Conditional posteriors for the Weibull model Using the data likelihood L!c and the priors we are able to obtain the conditional posteriors for the parameters. Each of the wi! 's given ( ! ; ; ; ; Y; Z) follow a gamma distribution i.e., [wi! j ! ; ; ; ! ; Y; Z] 

8 m tk g i.e., the risk set at tk , Dk = Rk? ? Rk and  (k ), the conditional prior, given as follows 1

   ( j ) = ? 0 exp ?  ( ) 1

2

0

1

2

1

1

n o  (k jk? ; k ) = ?k exp ? ( k + k ) ; for k = 2; : : : ; g ? 1 k? k 1

1

+1

+1

0

1

   (g jg? ) = g 0 ? exp ? g ; g? where ( ) is the prior of  . 1

1

1

0

1

1

For the conditional posteriors of both models we notice that not all of them can be sampled from standard distributions. Thus to sample from these posteriors we use the Metropolis-Hastings algorithm.

5 Model determination 5.1 Model adequacy using predictive distributions Model adequacy in survival analysis is very important issue. In the classical set up, there are two approaches. One is to assume a parametric function for the baseline hazard o , assess an empirical estimate ^o say, plot it against t or logt and compare the plot with the theoretical one. The other approach uses theoretical or empirical Q-Q plots of exponential or martingale based residuals (Lawless (1982), Schoenfeld (1982), Cox and Oakes (1984) and Therneau et al. (1990)). In the Bayesian framework very few papers address the issue of model adequacy: Shih and Louis (1992) and Gelfand and Mallick (1995) (for univariate survival data). Here, we present Bayesian EDA diagnostics methods to check the goodness of t of the proportional hazards frailty model. We, actually, extend the Gelfand and Mallick (1995) approach for the frailty models. Under the proportional hazards frailty model in (2.1) suppose we de ne o (t) as the cumulative baseline hazard function. Suppose further that

uij = uij (tjwi; zij ) = o(t)ij wi; i = 1; : : : ; n and j = 1; : : : ; m 10

(5.1)

where ij = exp( zij ). Then it follows (see Appendix B) that both for the Weibull as well as the piecewise exponential model, given wi ; uij has a standard exponential distribution. Further if vij (t) = exp(?uij (t)), then vij given wi has a standard uniform distribution. Also, conditional on wi , the vij 's are independent. Then the vij 's can be treated as standardized residuals. Now we can create a diagnostic plot based on the vij 's, using the output from the Markov chain Monte Carlo method. If the model is correct, the vij 's have a uniform distribution. Thus a boxplot of the Monte Carlo estimates of the vij 's can be used to assess model adequacy. Alternatively a Q-Q plot of the vij 's vs a standard uniform, produces similar ndings. The estimates of the vij 's, at the sth stage of iteration are of course obtained from (5.1) as

v^ijs = exp(?o (t)sijs wis); s = 1; : : : ; S;

(5.2)

where S is the number of iterations . For details, see Gelfand, Dey and Chang (1992). 1

5.2 Model selection In this subsection we consider the problem of accounting for uncertainty about model form. We are faced with many models that involve di erent assumptions, distributional forms, or sets of covariates or other model parameters. Although we may wish to summarize our ndings with a single model, there are usually many choices to be made. The most popular approach in Bayesian literature, for model comparison, is to report the posterior probabilities of each model, by comparing Bayes factors (Kass and Raftery (1995)). Let us present that approach: Suppose that there are m models M ; M ; :::; Mm in our consideration. Denote by f (jyobs; Mi) the posterior distribution under model Mi . Here, for the sake of simplicity we use the same notation , for the parameters under each model. However, in our application, the dimension of  will vary from model to model. Also, let f (yobsjMi) be the marginal likelihood for model Mi , that is, 1

2

Z

f (yobsjMi) = f (jyobs; Mi )d: A point estimate of vij , can be obtained by Monte Carlo method as v^ij = S ?1 features of the posterior distribution of vij 's can be obtained. 1

11

(5.3) PS

s s=1 vij .

Similarly other

Then, the Bayes factor for comparing models i and i is de ned as Mi ) : Bii = ff((yyobsjjM obs i ) The posterior probability of model Mi, can be calculated as jMi)P (Mi) P (Mijyobs ) = Pmf (yfobs (yobs jMj )P (Mj ) j =1

(5.4)

(5.5)

where P (Mi) denotes the prior probability of model Mi . When the prior information for model Mi is not available, P (Mi) is typically chosen to be 1=m. Therefore, model Mi is preferred over Mi when Bii > 1; see Kass and Raftery (1995) for more details. In the context of model comparison, we choose the model which yields the largest posterior probability P (Mijyobs): Recently many e ective simulation based Monte Carlo methods for computing the Bayes factor or model posterior probabilities have been developed, which include bridge sampling (Meng and Wong (1993)), path sampling (Gelman and Meng (1994)), a data-augmentationbased method to estimate the marginal density (Chib (1995)), ratio importance sampling (Chen and Shao (1994)) and computing Bayes factors for models with di erent dimensions in the parameter space. For our example we estimate the Bayes factor using posterior samples and importance sampling.

6 Kidney infection data example 6.1 The data set The data reproduced from McGilchrist and Aisbett (1991) in Table 1 displays the times of infection from the time of insertion of the catheter for 38 kidney patients using portable dialysis equipment. For each patient the rst and second columns show the survival time of the rst and second infection respectively. The third column provides binary variables representing the censoring indicators for the rst and second infections respectively. Occurrence of the infection is indicated by 1, and right censoring by 0. The last column represents the only covariate, sex (1 indicating male and 2 indicating female). The other covariates 12

such as age and disease types of a patient, are omitted in the analysis as they were found to have insigni cant e ect on the infection time in the previous works of McGilchrist and Aisbett (1991) and McGilchrist (1993). In this particular example, each subject (infection time) from each group (patient) shares the same value of the covariate.

6.2 Prior speci cation for the Weibull model The prior speci cation for wi! we chose to be Gamma((!)? ; ! ), so that E (wi! ) = 1 and V ar(wi! ) = ! : The prior for the hyperparameter ! , will be !  Gamma(6; 1) with E (! ) = V ar(! ) = 6, to assure enough heterogeneity among the patients. Since ! = (! )? , it follows that the prior for ! will be an inverse gamma, IGamma(6; 1). p The prior of  is Gamma(0:25; 4) and the prior of is Gamma(1; 2), both with large variances to ensure non-informativeness. Finally the prior of ! is Normal(?1:2; 1:2). There might be some criticism for choosing a rather informative prior but, we found that the data did not provide enough information to help outline the posterior. 1

1

6.3 Prior speci cation for the piecewise exponential model As stated earlier, we assume that wip  Gamma((p)? ; p ), so that E (wip) = 1 and V ar(wip ) = p. and again the prior for the hyperparameter p , will be p  Gamma(6; 1) i.e., p = (p)?  IGamma(6; 1). For the i's i = 1; : : : ; g we consider the following priors 1

1

  Gamma(0:7; 0:7) and i  Gamma(0:7; 0i:?7 ); i = 2; : : : ; g: 1

1

Finally, for p , we use a normal, N (?1:2; 1) as a prior. The survival or censored times of the patients, that ranged from 0-562 days, was divided in to 20 equal intervals.

6.4 Posterior inference For the posterior inference we need to have posterior samples of the parameters. If a posterior distribution has a nice closed form, then the Gibbs sampling algorithm, provides these samples. As we can see though, from section 4, not all the posteriors came to be of standard form. From the Weibull model the posteriors of ! ; ! and do not have a standard 13

form. From the piecewise exponential model the posteriors of p ; p and k ; k = 1; : : : ; g ? 1 also do not have a standard form. So, in order to draw samples from these posteriors the Metropolis-within-Gibbs algorithm was used. The proposal densities were all chosen to be normals, with means close to the prior means and appropriate standard deviations. For the parameters with positive densities, we applied the log transformation before using the algorithm. To ensure that the samples drawn were capable of correctly identifying the posteriors, we performed the Gelman-Rubin diagnostic (Gelman and Rubin (1992)). Table 2 shows the posterior mean, standard deviation, median and 95% credible intervals of all the parameters from both models. Both ! and p show that the female patients have a slightly lower risk for infection. Both ! and p show that there is a strong posterior evidence of high degree of heterogeneity in the population of patients. So, some patients are expected to be very prone to infection compared to other patients with the same covariate value. This is not very surprising, as in the data set there is patient number 1 with covariate value 1 (that is male) and infection times 8 and 16, and there is also another male patient with infection times 152 and 562 (patient 21). The high posterior means of ! and p also provide evidence of strong positive correlation between two infection times within the same patient. Table 3 gives the posterior mean and 95% credible interval for the frailties in the Weibull model, while table 4 gives the same quantities for the frailties in the piecewise exponential model. These frailty e ects measure the susceptibility of the patient to infections, after being adjusted for the covariate (sex of the patient here). Boxplots of the log(wi ) for both models, in gure 1, present the di erent frailty e ect the patients experience. Notice for example how di erent are the boxplots of log(wi ), in both models, for the rst and last patient. Figure 2 has the baseline hazard of the Weibull and the piecewise exponential model. Notice the smoothness of the second hazard in adjacent intervals. The boxplots of the vij ; i = 1; : : : ; 38(patients); j = 1; 2(infections) in the gures 3 and 4 show how good the patients t in the model or not. Finally, the evaluation of the Bayes factor for piecewise exponential model vs Weibull, using the posterior samples and importance sampling, favors the piecewise exponential model 160 times more over the Weibull baseline hazard model. 14

References [1] Aslanidou, H., Dey D. K. and Sinha, D. (1995). "Bayesian Analysis of Multivariate Survival Data Using Monte Carlo Methods", Tech. Report 95-05, Department of Statistics, The University of Connecticut. [2] Arjas, E. and Gasbarra, D. (1994). \Nonparametric Bayesian inference for right-censored survival data, using the Gibbs sampler", Statistica Sinica, 4, No 2, 505-524. [3] Breslow, N. E. (1974). \Covariance analysis of censored survival data", Biometrics, 30, 89-99. [4] Chen, M. H. and Shao, Q. M. (1994). \On Monte Carlo methods for estimating ratios of normalizing contants", Tech. Report 627, Department of Mathematics, The National University of Singapore. [5] Chib, S. (1995). \Marginal likelihood from the Gibbs output", J. Amer. Statist. Assoc., 90, 1313-1321. [6] Clayton, D. (1978). \A model for association in bivariate life tables and its application in epidemiological studies of familiar tendency in chronic disease incidence", Biometrica, 65, 141-151. [7] Clayton, D. (1991). \A Monte Carlo Method for Bayesian inference in frailty models", Biometrics, 47, 467-485. [8] Clayton, D. and Cuzick, J. (1985). \Multivariate generalizations of the proportional hazards model", J. Roy. Statist. Soc., A 148, 82-117. [9] Cox, D. R. (1972a). \Regression models and life tables", J. Roy. Statist. Soc., B, 34, 187-220. [10] Cox, D. R. (1972b). \The statistical analysis of dependencies in point processes", In: Stochastic Point Processes (ed P.A.W. Lewis). New York: Wiley. pp. 55-66. [11] Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data, London: Chapman & Hall. 15

[12] Gamerman, D. (1991). \Dynamic Bayesian models for survival data", Appl. Statist., 40, 63-79. [13] Gelfand, A. E. and Smith, A. F. M. (1990). \Sampling based approaches to calculating marginal densities", J. Amer. Statist. Assoc., 85, 398-409. [14] Gelfand, A. E., Dey, D.K. and Chang, H. (1992). \Model determination using predictive distributions with implementation via sampling-based methods", Bayesian Statistics, 4, 147-167. [15] Gelfand, A. E. and Mallick, B. K. (1995). \Bayesian Analysis of semiparametric proportional hazards models", Biometrics, 51, 843-852 . [16] Gelman, A. and Rubin, D. B. (1992). \Inference from iterative simulation using multiple sequences (with discussion) ", Statistical Science, 7 , 457-511. [17] Gelman, A. and Meng, X. L. (1994). \Path sampling for computing normalizing contants: Identities and theory", Tech. Report 377, Department of Statistics, The University of Chicago. [18] Hougaard, P. (1984). \Life table methods for heterogeneous populations: Distributions describing the heterogeneity", Biometrika, 71, 75-84. [19] Hougaard, P. (1986a). \Survival models for heterogeneous populations derived from stable distributions", Biometrika, 73, 387-396 (Correction, 75, pp.395). [20] Hougaard, P. (1986b). \A class of multivariate failure time distributions", Biometrika, 73, 671-678 (Correction, 75, pp.395). [21] Hougaard, P. (1987). \Modeling multivariate survival", Scand. J. Statist., 14, 291-304. [22] Hougaard, P. (1991). \Modeling heterogeneity in survival data", J. Appl. Prob., 28, 695-701. [23] Hougaard, P. (1995). \Frailty models for survival data", Lifetime Data Analysis, 1, 255-273, Kluwer Academic Publishers, Netherlands. 16

[24] Hougaard, P., Harvald, B. and Holm, N. V. (1992a). \Measuring the similarities between the lifetimes of adult Danish twins born between 1881-1930", J. Amer. Statist. Assoc., , 87, 17-24. [25] Hougaard, P., Harvald, B. and Holm, N. V. (1992b). \Assessment of dependence in the life of twins", Survival Analysis State of the Art, 77-97, (Klein J. P. and Goel P. K., eds), Kluwer Academic Publishers. [26] Hougaard, P., Myglegaard, P. and Borch-Johnsen, K. (1994). \Heterogeneity models of disease susceptibility, with application to diabetic nephropathy", Biometrics, , 50, 1178-1188. [27] Kalb eisch, J. D. (1978). \Nonparametric Bayesian analysis of survival time data", J. Roy. Statist. Soc., B, 40, 214-221. [28] Kalb eisch, J. D. and Prentice, R. L. (1973). \Marginal likelihoods based on Cox's regression and life model", Biometrika, 60, 267-278. [29] Kass, R. E. and Raftery, A. E. (1995). \Bayes factors", J. Amer. Statist. Assoc., , 90, 773-795. [30] Lawless, J.F.(1982), Statistical Models and Methods for Life Time Data. New York: John Wiley. [31] Lee, L. (1979). \Multivariate distributions having Weibull properties", J. Mult. Anal., 9, 267-277. [32] Leonard, T. (1978). \Density estimation, stochastic processes and prior information", J. Roy. Statist. Soc.,B, 40, 113-146. [33] Lu, J. C. and Bhattacharyya, G. K. (1990). \Some new constructions of bivariate Weibull models", Ann. Inst. Statist. Math., 42, 543-549. [34] McGilchrist, C. A. and Aisbett, C. W. (1991). \Regression with frailty in survival analysis", Biometrics, 47, 461-466. 17

[35] Meng, X. L. and Wong, W. H. (1993). \Simulating ratios of normalizing constants via a simple identity: A theoretical exploration", Tech. Report 365, Department of Statistics, The University of Chicago. [36] Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). \Equations of state calculations by fast computing machines", J. of Chemical Physics, 21, 1087-1092. [37] Muller, P. (1994). \A generic approach to posterior integration and Gibbs sampling", unpublished manuscript. [38] Oakes, D. (1982). \A model for association in bivariate survival data", J. Roy. Statist. Soc., B, 44, 414-422. [39] Oakes, D. (1986). \Semiparametric inference in a model for association in bivariate survival data", Biometrika, 73, 353-361. [40] Oakes, D. (1989). \Bivariate survival models induced by frailties", J. Amer. Statist. Assoc., 84, 487-493. [41] Schoenfeld, D. (1982). \ Partial residuals for proportional hazards regression model", Biometrika, 69, 239-241. [42] Shih, J. A. and Louis, T. A. (1992). \Models and analysis for multivariate failure time data", Tech. Report, Division of Biostatistics, The University of Minnesota. [43] Tanner, M. A. (1993). Tools for Statistical Inference: Observed Data and Data Augmentation Methods, Lecture Notes in Statistics: Second Edition, Springer-Verlag: New York. [44] Therneau, T. M., Grambsch, P. M., and Fleming, T. R. (1990). \ Martingale based residuals for survival models", Biometrika, 77, 147-160. [45] Vaupel, J. W., Manton, K. G. and Stallard, E. (1979). \The impact of heterogeneity in individual frailty on the dynamics of mortality", Demography, 16, 439-454. 18

[46] Whitmore, G. A. and Lee, M. L. T. (1991a). \ A multivariate survival distribution generated by an Inverse Gaussian mixture of exponentials", Technometrics, 33, 39-50.

19

Appendix A Derivation of the complete data likelihood L!c , for the Weibull model: The j-th (j = 1; : : : ; m) subject of the i-th (i = 1; : : : ; n) group has a hazard of h!ij =  tij ? ij! wi! ; given the unobserved frailty wi! and also all the subjects are independent (given the unobserved wi! for the subject). Now, for the j-th subject of the i-th group, the likelihood contribution is either exp(?wi! ij! t ij ) if the subject survived, or  tij ? ij! wi! exp(?wi! ij! t ij ) for failing or be censored. The complete data likelihood of (3.1) can be obtained by multiplying the likelihood contributions of every subject in every group. Derivation of the complete data likelihood Lpc , for the piecewise exponential model: The j-th (j = 1; : : : ; m) subject of the i-th (i = 1; : : : ; n) group has a constant hazard of hpij = wip ijp k in the k-th interval given the unobserved frailty wip and also all the subjects are independent (given the unobserved wip for the subject). So, if the j-th subject of the i-th group is at the k-th interval, the likelihood contribution is either exp(?k wip ijp (tk ? tk? )) for the subject surviving the k-th interval, or exp(?k wip ijp (tij ? tk? )) for failing or be censored inside the k-th interval. The complete data likelihood of (3.3) can be obtained by multiplying the likelihood contributions of every event (nm events total), times the hazard of that event in this interval, in each of the g intervals. 1

1

1

1

Appendix B De ne, uij = u(tjwi; zij ) = o (t)ij wi. Then the survival function will be: R R Sij (tjwi; zij ) = exp(? t hij ( jwi; zij )d ) = exp(?ij wi t o ( )d ) = exp(?ij wio (t)). Thus, the pdf of Tij is given as fij (tjwi; zij ) = o (t)ij wi exp(?ij wio (t)). Now, P (uij  u) = P (ij wio (t)  u). For the Weibull model: R Recall, !o (t) =  tij ? . Let !o (t) = t !o (u)du = t ij ?1 Then P (u!ij  u) = P (ij! wi! !o (t)  u) = P (Tij  !o ? (u=(ij! wi! )) ) = 1 ? exp(?ij! wi! (u=(ij! wi! ))) = 1 ? exp(?u). For the piecewise exponential model: R Recall, po (t) = k if t 2 Ik = (tk? ; tk ]. Let po (t) = t po (u)du, i.e., for t 2 Ik ; po (t) = Pk i i i , where i = ti ? ti? . 0

0

1

0

1

1

=1

0

1

20

Then P (upij  u) = P (ijp wip po (t)  u) = P (Tij  po ? (u=(ijp wip))) = 1?exp(?ijp wip (u=(ijp wip ))) = 1 ? exp(?u). 1

21

Table 1: Recurrence times of infections in 38 kidney patients using portable dialysis machine. Reproduced from McGilchrist and Aisbett (1991). Patient number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

times 8, 16 33, 13 22, 28 447, 318 30, 12 24, 245 7, 9 511, 30 53, 196 15, 154 7, 333 141, 8 96, 38 149, 70 536, 25 17, 4 185, 117 292, 114 22, 159

Recurrence Event Patient Recurrence types Sex number times Sex 1, 1 1 20 15, 108 1, 0 2 1, 0 2 21 152, 562 1, 1 1 1, 1 1 22 402, 24 1, 0 2 1, 1 2 23 13, 66 1, 1 2 1, 1 1 24 39, 46 1, 0 2 1, 1 2 25 12, 40 1, 1 1 1, 1 1 26 113, 201 0, 1 2 1, 1 2 27 132, 156 1, 1 2 1, 1 2 28 34, 30 1, 1 2 1, 1 1 29 2, 25 1, 1 1 1, 1 2 30 130, 26 1, 1 2 1, 0 2 31 27, 58 1, 1 2 1, 1 2 32 5, 43 0, 1 2 0, 0 2 33 152, 30 1, 1 2 1, 0 2 34 190, 5 1, 0 2 1, 0 1 35 119, 8 1, 1 2 1, 1 2 36 54, 16 0, 0 2 1, 1 2 37 6, 78 0, 1 2 0, 0 2 38 63, 8 1, 0 1

22

Table 2: Bayes Posterior estimates MODEL Weibull

Piecewise exponen.

PARAMETER MEAN STD.DEV MEDIAN 95% C.I. ! -0.898 0.203 -0.8988 (-1.241, -0.564) ! 2.016 0.678 1.917 (1.037, 3.29) ! 1.278 0.168 1.273 (1.012, 1.549) ! 0.009 0.006 0.008 (0.007, 0.021) p -1.049 0.311 -1.05 (-1.535, -0.567) p 6.582 1.149 6.504 (4.88, 8.678)

23

Table 3: Posterior mean and 95% credible interval for the frailty parameters in the Weibull model. PARAMETER w1! w2! w3! w4! w5! w6! w7! w8! w9! ! w10 ! w11 ! w12 ! w13 ! w14 ! w15 ! w16 ! w17 ! w18 ! w19

MEAN, 95% C.I. 3.612, (0.826,8.488) 1.051, (0.117,2.822) 2.724, (0.647,6.056) 0.097, (0.014,0.278) 3.02, (0.665,6.833) 0.292, (0.042,0.793) 4.266, (0.892,10.111) 0.129, (0.018, 0.351) 0.343, (0.047, 0.961) 0.988, (0.198, 2.358) 0.216, (0.033, 0.577) 0.338, (0.029, 0.989) 0.689, (0.121, 1.658) 0.089, (0.0001, 0.362) 0.077, (0.006, 0.242) 2.295, (0.271, 6.011) 0.281, (0.045, 0.737) 0.198, (0.027, 0.538) 0.104, (0.0001, 0.447)

PARAMETER ! w20 ! w21 ! w22 ! w23 ! w24 ! w25 ! w26 ! w27 ! w28 ! w29 ! w30 ! w31 ! w32 ! w33 ! w34 ! w35 ! w36 ! w37 ! w38

24

MEAN, 95% C.I. 0.433, (0.035,1.208) 0.233, (0.03,0.624) 0.113, (0.007,0.37) 1.095, (0.22,2.658) 0.662, (0.063,1.864) 2.796, (0.734,6.236) 0.166, (0.013,0.494) 0.313, (0.044,0.842) 1.391, (0.297,3.156) 3.519, (0.738,8.074) 0.554, (0.089,1.384) 1.06, (0.191,2.509) 1.028, (0.101,2.724) 0.457, (0.072,1.181) 0.252, (0.023,0.772) 0.658, (0.11,1.62) 0.286, (0.0005,1.095) 0.618, (0.071,1.714) 1.328, (0.151,3.477)

Table 4: Posterior mean and 95% credible interval for the frailty parameters in the piecewise exponential model. PARAMETER w1p w2p w3p w4p w5p w6p w7p w8p w9p p w10 p w11 p w12 p w13 p w14 p w15 p w16 p w17 p w18 p w19

MEAN, 95% C.I. 0.109, (0.015,0.303) 0.010, (0.0005,0.031) 0.052, (0.0064,0.138) 0.001, (0.00012,0.002) 0.0627, (0.008,0.167) 0.0026, (0.0004,0.006) 0.1523, (0.021,0.415) 0.0015, (0.0002,0.003) 0.0027, (0.0004,0.006) 0.0134, (0.002,0.035) 0.0022, (0.0004,0.005) 0.003, (0.0002,0.008) 0.0064, (0.0009,0.0166) 0.0002, (0,0.0014) 0.0007, (0,0.002) 0.067, (0.003,0.2101) 0.0025, (0.0004,0.006) 0.002, (0.0003,0.005) 0.0003, (0,0.0017)

25

PARAMETER p w20 p w21 p w22 p w23 p w24 p w25 p w26 p w27 p w28 p w29 p w30 p w31 p w32 p w33 p w34 p w35 p w36 p w37 p w38

MEAN, 95% C.I. 0.003, (0.0002,0.0117) 0.003, (0.0004,0.007) 0.001, (0,0.003) 0.0101, (0.0017,0.025) 0.005, (0.0002,0.015) 0.0453, (0.006,0.1239) 0.0013, (0,0.004) 0.003, (0.0005,0.007) 0.013, (0.002,0.0342) 0.0973, (0.013,0.265) 0.0055, (0.0009,0.014) 0.009, (0.0015,0.0236) 0.009, (0.0006,0.026) 0.004, (0.0007,0.01) 0.0018, (0,0.005) 0.007, (0.001,0.017) 0.0009, (0,0.005) 0.005, (0.0003,0.016) 0.0188, (0.001,0.058)

-10 -8 -6 -4 -2 0 2

weibull hazard

-15

-10

-5

0

26

Piecewise exponential hazard

Figure 1: Box plots of log(wi ); i = 1; : : : ; 38, for both models

0.07

Weibull model •

0.06

• •







• •

• 0.05

• •••

0.02

0.03

0.04



• •• •• • • •• •• •• • •• •• • • 0

• •• ••

•• •• •••

•••••



100

200

300

400

500

0

2

4

6

8

Piecewise exponential model

0

100

200

300

400

500

Figure 2: Estimated posterior mean of the baseline hazard, plotted against time. 27

0.0 0.2 0.4 0.6 0.8 1.0

Weibull model

0.0 0.2 0.4 0.6 0.8 1.0

28

Piecewise exponential model

Figure 3: Box plots of vi ; i = 1; : : : ; 38 (patient's rst infection) for both models 1

0.0 0.2 0.4 0.6 0.8 1.0

Weibull model

0.0 0.2 0.4 0.6 0.8 1.0

29

Piecewise exponential model

Figure 4: Box plots of vi ; i = 1; : : : ; 38 (patient's second infection) for both models 2