The Canadian Journal of Statistics
La revue canadienne de statistique
Vol. 42, No. 3, 2014, Pages 365-383

Semiparametric methods for survival analysis
of case-control data subject to dependent
censoring
Douglas E. SCHAUBEL1*, Hui ZHANG2, John D. KALBFLEISCH1 and Xu SHU1
1 Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
2 U.S. Food and Drug Administration, Silver Spring, MD, USA

Key words and phrases: Case-control study; Cox regression; dependent censoring; estimating equation;
inverse weighting
MSC 2010: Primary 62N01; secondary 62D05
Abstract: Case-control sampling can be an efficient and cost-saving study design, wherein subjects are
selected into the study based on the outcome of interest. It was established long ago that proportional hazards
regression can be applied to case-control data. However, each of the various estimation techniques available
assumes that failure times are independently censored. Since independent censoring is often violated in
observational studies, we propose methods for Cox regression analysis of survival data obtained through
case-control sampling, but subject to dependent censoring. The proposed methods are based on weighted
estimating equations, with separate inverse weights used to account for the case-control sampling and to
correct for dependent censoring. The proposed estimators are shown to be consistent and asymptotically
normal, and consistent estimators of the asymptotic covariance matrices are derived. Finite-sample properties
of the proposed estimators are examined through simulation studies. The methods are illustrated through an
analysis of pre-transplant mortality among end-stage liver disease patients obtained from a national organ
failure registry. The Canadian Journal of Statistics 42: 365-383; 2014 © 2014 Statistical Society of Canada
Résumé: Case-control sampling can be an efficient and economical study design in which subjects are
chosen for the study according to the outcome under study. It has long been established that the
proportional hazards regression model can be applied to case-control data. However, all existing
estimation techniques assume that failure times are independently censored. Since independence of
censoring is often violated in observational studies, the authors propose methods for Cox regression
of survival data subject to dependent censoring and obtained through case-control sampling. The
proposed methods are based on weighted estimating equations whose separate inverse weights account
for the case-control sampling and correct the bias due to dependent censoring. The authors show that
the proposed estimators are consistent and asymptotically normal. They also obtain consistent
estimators of the asymptotic covariance matrices. They examine the finite-sample properties of these
estimators by simulation and illustrate the methods through an analysis of data on pre-transplant
mortality among patients with end-stage liver disease drawn from a national organ failure registry.
La revue canadienne de statistique 42: 365-383; 2014 © 2014 Société statistique du Canada

* Author to whom correspondence may be addressed. E-mail: deschau@umich.edu

1. INTRODUCTION
Case-control and case-cohort sampling schemes are designed to over-sample subjects expected
to provide relatively large amounts of information in terms of parameter estimation. In the
case-control study, subjects observed to experience the event of interest (the cases) are over-sampled,
while only a fraction of subjects not experiencing the event of interest (the controls) are selected. In
the original formulation of the case-control study, all cases are selected but, more generally, cases are
over-sampled, such that their sampling fraction would be many times greater than that expected
through random sampling. In the context of proportional hazards regression (Cox, 1972), the
case-cohort design (Prentice, 1986) was proposed, in which one selects all cases, as well as a random
sample of the study population (the sub-cohort). The case-cohort design is especially well-suited
to Cox regression since the sub-cohort representing the risk set can be used to compute partial
likelihood contributions at each observed event.
A number of methods have been proposed for the regression analysis of case-control and
case-cohort studies under the proportional hazards model. Prentice & Breslow (1978) adapted
the Cox model for use in case-control studies, and studied in detail the correspondence between
Cox regression in the presence of prospective and retrospective sampling. Oakes (1981) provided
a partial likelihood interpretation for the likelihood function used by Prentice & Breslow
(1978) and, hence, a theoretical justification of the estimation procedure. Self & Prentice (1988)
provided a rigorous theoretical evaluation of the case-cohort design of Prentice (1986), including
detailed asymptotic derivations and various efficiency calculations. Kalbfleisch & Lawless (1988)
developed pseudo-likelihood methods for semiparametric estimation of the illness-death model,
applicable to case-cohort and other retrospective sampling designs. Lin & Ying (1993) considered
general missing data problems in the context of Cox regression, with the case-cohort design cast
as a special case. The asymptotic development by Lin & Ying (1993) leads to a different variance
estimator for the Prentice (1986) case-cohort estimator than that derived by Self & Prentice (1988).
Much of the subsequent research on the case-cohort design was focussed on modifications
to the sampling scheme and/or analysis in order to increase efficiency. For example, Chen & Lo
(1999) derived a class of estimators for case-cohort sampling; essentially, the methods involve
adding future cases to the risk set (in addition to the sub-cohort) in order to improve efficiency
relative to the original method by Prentice (1986). Borgan et al. (2000) proposed stratified
case-cohort designs to improve efficiency. Motivated by family-based studies, Moger, Pawitan, &
Borgan (2008) developed case-cohort methods for fitting frailty models to large clustered data sets.
The methods were inspired by the work of Kalbfleisch & Lawless (1988) and Borgan et al. (2000).
The motivation for the work of Moger, Pawitan, & Borgan (2008) was to reduce computational
burden, rather than dollar cost.
With respect to the case-control study, much of the development in the last 15 years has been
directed at clustered data. For instance, in the case-control family study, one selects a random
sample of cases, a random sample of controls, as well as the family members of the cases and
controls. Li, Yang, & Schwartz (1998) developed parametric methods for fitting copula models to
clustered survival data obtained through case-control family sampling. Shih & Chatterjee (2002)
extended this work to the semiparametric setting through quasi-likelihood procedures. Hsu &
Gorfine (2006) developed pseudo-likelihood methods for fitting frailty models to case-control
family data using an approach similar to that of Shih & Chatterjee (2002). Hsu et al. (2004)
developed semiparametric methods for frailty models based on case-control family sampling,
with the focus being on the conditional regression parameter. Gorfine, Zucker, & Hsu (2009)
further evaluated frailty models based on case-control family data through a pseudo-full likelihood
approach, and derived rigorous large-sample theory.
There have been several evaluations of cohort sampling in a general sense. Chen & Lo (1999)
described the close theoretical connection between the case-cohort and case-control study within
the framework of Cox regression. Chen (2001) considered generalized case-cohort sampling
(nested case-control, case-cohort and case-control, all within the same framework) in the context
of Cox regression. Gray (2009) studied various cohort sampling designs, focusing on Inverse
Probability of Selection Weighted (IPSW) estimators applied to the Cox model (e.g., Binder,
1992; Barlow, 1994; Borgan et al., 2000; Lin, 2000; Pan & Schaubel, 2008; Nan, Kalbfleisch, &
Yu, 2009).
The motivation for subsampling of large epidemiologic databases is well-established in the
literature. Nothing prevents such data from being subject to dependent censoring. However, each
of the subsampling methods cited thus far (and, to the best of our knowledge, all existing methods
for Cox regression analysis of case-control data) assumes that subjects are censored in a manner
independent of the failure rate. In most practical applications in which the independent censoring
assumption is violated, censoring is actually a mixture of independent (e.g., administrative, or
random loss to follow-up) and dependent (e.g., censoring of pre-treatment death by the initiation of
treatment). In such cases, it is necessary to distinguish between subjects who were independently
and dependently censored. The most popular method for handling dependent censoring is Inverse
Probability of Censoring Weighting (IPCW) proposed by Robins & Rotnitzky (1992). From this
perspective, case-control studies appear to extend nicely to the dependent censoring setting. In
particular, since IPCW entails modelling the dependent censoring hazard, there is an incentive to
over-sample dependently censored subjects.
We propose semiparametric methods for the analysis of failure time data generated by
case-control sampling and subject to dependent censoring. In the setting of interest, we over-sample
dependently censored subjects, since correcting for dependent censoring through IPCW requires
that the dependent censoring process be modelled. The dependent censoring hazard is modelled
using existing case-control survival analysis methods featuring Inverse Probability of Sampling
Weighting (IPSW). Parameter estimation for the failure time hazard model then proceeds through
weighted estimating equations, with the weights correcting for both the case-control sampling and
the dependent censoring. We derive asymptotic properties of the proposed estimator, in a manner
which accounts for the randomness in the estimated IPCW and IPSW components. Since the
derived asymptotic variance is complicated, we suggest an alternative variance estimator that is
much easier to program and much faster to compute. Finite-sample performance of the proposed
procedures is evaluated through simulation.
In the evaluation of a proposed subsampling method, it is useful to assess not only if the method
can be implemented in a real data set, but, in addition, whether the method gives reasonable answers
in practice. Along these lines, we apply the proposed methods to an analysis of pre-transplant
survival among end-stage liver disease (ESLD) patients. In this application, the full cohort is
available, meaning that the results based on the full-cohort can serve as a target for the results
obtained through the proposed case-control sampling methods. With respect to background, liver
transplantation is the preferred method of treatment. However, there are thousands more patients
needing a liver transplant than there are available donor organs. As such, patients clinically suitable
for receiving a deceased-donor liver transplant are put on a wait-list, with the pertinent time origin
(t = 0) then being the date of wait listing. The receipt of a liver transplant does not censor death
but, naturally, does censor pre-transplant death (i.e., the time until death in the absence of liver
transplantation). In liver transplantation, patients are ranked on the wait-list based on their most
recent Model for End-stage Liver Disease (MELD) score. MELD was chosen as the scoring system
largely because it is a very strong predictor of pre-transplant mortality (e.g., Wiesner et al., 2001;
Kremers et al., 2004; Huo et al., 2005; Merion et al., 2005; Basto et al., 2008; Subramanian et al.,
2010). Since MELD is time-dependent, an analysis of the effect of baseline risk factors (i.e.,
measured at time t = 0) on the pre-transplant death hazard could result in substantial bias if liver
transplantation were treated as independent censoring. This and other related issues in the liver
transplant setting are discussed by Schaubel et al. (2009).
In certain cases, special exceptions are made under which a wait-listed patient may be assigned
a MELD score which is higher than that calculated, in an attempt to reflect the patient's perceived
medical urgency. The most frequent occurrence of such MELD exceptions is for patients with
hepatocellular carcinoma (HCC), a form of liver cancer. HCC patients are typically assigned a
MELD score of 22, which is often considerably higher than the score based on their laboratory
measures. To our knowledge, no existing analyses in the liver transplant literature have quantified
whether the MELD score of 22 accurately reflects the true MELD-equivalent wait-list mortality
risk faced by HCC patients. As a primary example in this article, we compare HCC versus
non-HCC patients, with the latter group categorized by baseline MELD score. Since MELD (at
time 0 but, also, after time 0) affects both death and liver transplantation probabilities, liver
transplantation is handled as dependent censoring of pre-transplant death time.
It should be noted that MELD exception scores for HCC are usually assigned at initial wait
listing (time 0). Therefore, given our analytic objective and in the interests of interpretation, it is
appropriate to compare HCC patients to non-HCC patients, with the latter categorized specifically
by time 0 MELD score. To use MELD at time > 0 would be tantamount to having the HCC and
comparator patients established at different times, which would render the resulting parameter
estimates uninterpretable. More generally, the perils of adjusting for time-dependent factors (such
as MELD at time > 0) are well-described in the causal inference literature (e.g., Hernán, Brumback,
& Robins, 2000). Another general setting in which adjustment for time-dependent factors is
explicitly avoided is that in which there is a mediator. In evaluating a baseline factor
(e.g., treatment assignment), adjusting for the mediator (e.g., through a time-dependent indicator
covariate) would generally distort the resulting treatment effect; it might then be necessary to
treat the mediator as dependent censoring.
The remainder of this article is organized as follows. In Section 2, we describe the proposed
estimation procedures. In Section 3, we derive large sample properties for the proposed estimators.
We conduct simulation studies in Section 4 to investigate the finite-sample properties of the
proposed estimators. Section 5 provides an application of the methods to the afore-described
end-stage liver disease data. The article concludes with a discussion in Section 6.
2. METHODS
Let $Z_{1i}$ denote the $q_1$-vector of time-constant covariates for subject $i$ ($i = 1, \ldots, n$). Let $Z_{2i}(t)$
be the $q_2$-vector of time-dependent covariates at time $t$, $Z_i(t) = \{Z_{1i}^T, Z_{2i}(t)^T\}^T$, and
$\tilde{Z}_i(t) = \{Z_i(u) : 0 \le u \le t\}$ denote the history of $Z_i(\cdot)$ up to time $t$. Let $T_i$ and $C_i$ be the
potential failure and censoring times, respectively. We suppose that $C_i = C_{1i} \wedge C_{2i}$, where
$a \wedge b = \min\{a, b\}$, $C_{1i}$ is the censoring time due to mechanisms that are independent of $T_i$ given
$Z_i(0)$, and $C_{2i}$ denotes the dependent censoring time; that is, $C_{2i}$ is dependent on $T_i$ given $Z_i(0)$.
Let $X_i = T_i \wedge C_i$, $Y_i(t) = I(X_i \ge t)$, $\Delta_{1i} = I(T_i \le C_i)$, $\Delta_{2i} = I(C_{2i} \le C_{1i}, C_{2i} < T_i)$,
$\Delta_{3i} = (1 - \Delta_{1i})(1 - \Delta_{2i})$, $N_i(t) = I(X_i \le t, \Delta_{1i} = 1)$ and $N_i^C(t) = I(X_i \le t, \Delta_{2i} = 1)$, where
$I(\cdot)$ is the indicator function. The observable data are assumed to be $n$ independently and
identically distributed copies of $\{N_i(\cdot), N_i^C(\cdot), Y_i(\cdot), Z_i(\cdot)\}$. Let $\xi_i$ indicate whether or not subject $i$
is sampled. The variate $\xi_i$ is allowed to depend on $\Delta_{1i}$, $\Delta_{2i}$ and $\Delta_{3i}$ so that the sampling
probability can be different for subjects who fail, subjects who are dependently censored and those
who are independently censored. Let the cohort be divided into three strata according to the
outcome $(\Delta_1, \Delta_2, \Delta_3)$ such that $L_k = \{i : \Delta_{ki} = 1\}$, $k = 1, 2, 3$. Let $p_k = \mathrm{pr}(\xi_i = 1 \mid i \in L_k)$,
$p = (p_1, p_2, p_3)^T$ and $\rho_i(p) = \sum_{k=1}^{3} \Delta_{ki} \xi_i / p_k$. Note that $\rho_i(p)$ weights the $i$th subject by the
inverse probability that the subject is sampled.
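As a concrete illustration of the outcome strata and the inverse-probability-of-sampling weight just defined, the following Python sketch classifies each subject as a failure, a dependently censored subject or an independently censored subject, and assigns weight 1/p_k to sampled subjects and 0 to unsampled ones. The function names, the toy cohort and the sampling probabilities are hypothetical and are not taken from the article; the sketch also assumes, purely for illustration, that all three latent times are visible, whereas in practice the stratum would be read from the observed censoring type.

```python
def outcome_stratum(T, C1, C2):
    """Return k in {1, 2, 3}: 1 = failure, 2 = dependently censored,
    3 = independently censored."""
    if T <= min(C1, C2):
        return 1                      # I(T <= C)
    if C2 <= C1 and C2 < T:
        return 2                      # I(C2 <= C1, C2 < T)
    return 3                          # neither of the above


def ipsw_weights(strata, sampled, p):
    """Weight for subject i: sampling indicator divided by the sampling
    probability of the subject's stratum (zero if not sampled)."""
    return [xi / p[k] for k, xi in zip(strata, sampled)]


# Hypothetical toy cohort of (T, C1, C2) triples, for illustration only.
cohort = [(2.0, 5.0, 4.0), (6.0, 5.0, 3.0), (6.0, 4.0, 9.0), (1.0, 3.0, 2.0)]
strata = [outcome_stratum(*row) for row in cohort]
sampled = [1, 1, 0, 1]                 # sampling indicators (assumed)
p = {1: 1.0, 2: 0.5, 3: 0.25}          # assumed stratum sampling probabilities
rho = ipsw_weights(strata, sampled, p)
print(strata, rho)
```

In this toy example the strata are [1, 2, 3, 1] and the weights are [1.0, 2.0, 0.0, 1.0]: the sampled dependently censored subject stands in for two such subjects in the full cohort.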
We assume that the hazard of failure for individual $i$ is specified by the following proportional
hazards model (Cox, 1972),

$$\lambda_i\{t \mid Z_i(0)\} = \lambda_0(t) \exp\{\beta_0^T Z_i(0)\}, \qquad (1)$$

where $\lambda_0(t)$ is an unspecified baseline hazard function for failure time, and $\beta_0$ is a
$(q_1 + q_2)$-dimensional regression parameter. Note that we are chiefly interested in inferring the role of
$Z_i(0)$ on the failure time hazard, as opposed to $\{Z_i(t) : t > 0\}$, for reasons of interpretation. For
example, it is straightforward to predict survival probability from time 0 using a pre-specified value
of $Z_i(0)$ along with parameter estimates from model (1). To do so using a model based on $\tilde{Z}_i(t)$
would be much more complicated, and would generally require modelling the stochastic properties of
the process $Z_i(t)$.
If it were also the case that $C_{2i}$ was independent of $T_i$ given $Z_i(0)$ (unlike the setting of interest),
then $\beta_0$ could be consistently estimated by $\hat\beta$, the root of the estimating equation $U(\beta) = 0$, where

$$U(\beta) = \sum_{i=1}^{n} \int_0^{\tau} \rho_i(p) \{Z_i(0) - \bar{Z}(\beta, t)\}\, dN_i(t), \qquad (2)$$

where $\tau < \infty$ is the maximum follow-up time, $\bar{Z}(\beta, t) = S^{(1)}(\beta, t)/S^{(0)}(\beta, t)$,
$S^{(d)}(\beta, t) = \sum_{i=1}^{n} \rho_i(p) Y_i(t) Z_i(0)^{\otimes d} \exp\{\beta^T Z_i(0)\}$, with $a^{\otimes 0} = 1$, $a^{\otimes 1} = a$, and $a^{\otimes 2} = aa^T$.
Estimating equations of the same general structure as (2) and arising from IPSW have been proposed
by several previous authors, for example, Kalbfleisch & Lawless (1988), Binder (1992), Borgan et al.
(2000) and Lin (2000) for the Cox model; Kulich & Lin (2000) for the additive hazards model; and Nan,
Kalbfleisch, & Yu (2009) for the accelerated failure time model.
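To make the structure of the estimating equation in (2) concrete, the following Python sketch evaluates the sampling-weighted score for a single scalar baseline covariate and finds its root. It is a minimal illustration under stated assumptions, not the authors' implementation: the covariate is one-dimensional, the root is found by bisection (chosen here only for simplicity, since the scalar score is decreasing in the regression coefficient), and the function names and toy data are hypothetical.

```python
import math


def weighted_score(beta, X, event, Z, rho):
    """Sampling-weighted Cox score for a scalar baseline covariate Z.
    Each observed failure contributes rho_i * (Z_i - weighted risk-set mean)."""
    total = 0.0
    for i in range(len(X)):
        if not event[i] or rho[i] == 0.0:
            continue
        s0 = s1 = 0.0
        for j in range(len(X)):
            if X[j] >= X[i]:                      # subject j at risk at time X_i
                w = rho[j] * math.exp(beta * Z[j])
                s0 += w
                s1 += w * Z[j]
        total += rho[i] * (Z[i] - s1 / s0)
    return total


def solve_score(X, event, Z, rho, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection root of the score, assuming a sign change on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weighted_score(mid, X, event, Z, rho) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical toy data: follow-up times, failure indicators, covariate, weights.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
event = [1, 1, 0, 1, 1, 0]
Z = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
rho = [1.0, 1.0, 2.0, 1.0, 1.0, 2.0]
beta_hat = solve_score(X, event, Z, rho)
print(beta_hat, weighted_score(beta_hat, X, event, Z, rho))
```

Subjects with zero weight (those not sampled) drop out of both the outer sum and the risk-set averages, which is how the weighting restricts the computation to the case-control sample.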
Since $Z_i(t)$ affects both the event and censoring times, and since $Z_i(t)$ is not incorporated into
model (1), $C_{2i}$ would generally not be independent of $T_i$ given $Z_i(0)$. In this case, the estimate $\hat\beta$
derived from (2) could be substantially biased because (2) does not accommodate the dependence
between $C_{2i}$ and $T_i$. We assume that conditional on the covariate history $\tilde{Z}_i(t)$, the hazard of
dependent censoring $C_{2i}$ at time $t$ does not further depend on the possibly unobserved failure time
$T_i$, that is,

$$\lambda_i^C\{t \mid \tilde{Z}_i(T_i), C_i \ge t, T_i \ge t, T_i\} = \lambda_i^C\{t \mid \tilde{Z}_i(t), C_i \ge t, T_i \ge t\}. \qquad (3)$$

This fundamental assumption is called "no unmeasured confounders for censoring" (Rubin,
1977; Robins, 1993). Borrowing terminology from the competing risks literature (Kalbfleisch
& Prentice, 2002), assumption (3) allows us to identify the cause-specific hazard for $C_{2i}$. We
assume a time-dependent Cox proportional hazards model for the right-hand side of Equation (3),

$$\lambda_i^C\{t \mid \tilde{Z}_i(t), X_i \ge t\} = \lambda_0^C(t) \exp\{\alpha_0^T V_i(t)\}, \qquad (4)$$

where $\lambda_0^C(t)$ is an unspecified baseline hazard function for dependent censoring, $V_i(t)$ is an
$s$-vector consisting of functions of $\tilde{Z}_i(t)$, and $\alpha_0$ is an $s$-dimensional regression parameter.
We propose the following estimating function,

$$U_R(\beta) = \sum_{i=1}^{n} \int_0^{\tau} R_i(t) \{Z_i(0) - \bar{Z}_R(\beta, R, t)\}\, dN_i(t), \qquad (5)$$

where

$$\bar{Z}_R(\beta, R, t) = \frac{S^{(1)}(\beta, R, t)}{S^{(0)}(\beta, R, t)},$$

$$S^{(d)}(\beta, R, t) = n^{-1} \sum_{i=1}^{n} R_i(t) Y_i(t) Z_i(0)^{\otimes d} \exp\{\beta^T Z_i(0)\},$$

$$R_i(t) = \rho_i(p) W_i(t),$$

$$W_i(t) = e^{\Lambda_i^C(t)} \psi_i(t),$$

where $\Lambda_i^C(t) = \int_0^t \exp\{\alpha^T V_i(u)\}\, d\Lambda_0^C(u)$ and the function $\psi_i(t)$ in the weight $W_i(t)$ is a stabilization
factor. We consider three choices of $\psi_i(t)$. One choice is $\psi_{1i}(t) = 1$. However, when the censoring
is heavy, $e^{\Lambda_i^C(t)}$ could be quite large and lead to instability in the estimation. In this case, the
choice of $\psi_{2i}(t) = \exp[-\int_0^t \exp\{\alpha^T V_i(0)\}\, d\Lambda_0^C(u)]$ or $\psi_{3i}(t) = \exp[-\Lambda_i^{\dagger}\{t \mid Z_i(0)\}]$ may be more
appropriate, where $\Lambda_i^{\dagger}(t)$ is based on a time-to-censoring model that uses only the baseline
covariate values, $Z_i(0)$. Hereafter, we denote $W_{ji}(t) = e^{\Lambda_i^C(t)} \psi_{ji}(t)$, $j = 1, 2, 3$, and correspondingly
estimate $\beta_0$ with $\hat\beta_{W1}$, $\hat\beta_{W2}$ and $\hat\beta_{W3}$, the solutions to $U_R(\beta) = 0$ with weights $W_{1i}(t)$, $W_{2i}(t)$ and
$W_{3i}(t)$, respectively.
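The effect of stabilization can be seen in a small numerical sketch. The Python code below computes the unstabilized censoring weight and the version stabilized with the same censoring model evaluated at the baseline covariate value, for one subject with a scalar time-dependent covariate. Everything specific here is an assumption made for illustration: the baseline cumulative censoring hazard is treated as a known step function given by a list of (time, jump) pairs, the regression coefficient is fixed rather than estimated, and the function names are hypothetical. The third stabilizer is omitted because it requires fitting a separate baseline-only model.

```python
import math


def cum_censoring_hazard(t, jumps, alpha, v_of_u):
    """Subject-specific cumulative censoring hazard at time t: the sum over
    baseline jump times u <= t of exp{alpha * V(u)} times the baseline jump."""
    return sum(math.exp(alpha * v_of_u(u)) * d for u, d in jumps if u <= t)


def ipcw_weights(t, jumps, alpha, v_of_u):
    """Return (w1, w2): unstabilized weight and the weight stabilized by the
    same model evaluated at the baseline covariate value V(0)."""
    v0 = v_of_u(0.0)
    lam_tv = cum_censoring_hazard(t, jumps, alpha, v_of_u)         # uses V(u)
    lam_base = cum_censoring_hazard(t, jumps, alpha, lambda u: v0)  # uses V(0)
    return math.exp(lam_tv), math.exp(lam_tv - lam_base)


# Hypothetical inputs: baseline hazard jumps and a covariate that switches on at u = 2.
jumps = [(1.0, 0.1), (2.0, 0.1), (3.0, 0.2)]
alpha = 0.5
w1, w2 = ipcw_weights(3.0, jumps, alpha, lambda u: 1.0 if u >= 2.0 else 0.0)
print(w1, w2)
```

For a subject whose covariate never changes from its baseline value, the stabilized weight equals 1 at every time, so only departures from the baseline covariate path are up-weighted; this is the sense in which the stabilizer tempers large weights under heavy censoring.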
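The weight $W_{1i}(t)$ can be estimated using $\exp\{\hat\Lambda_i^C(t)\}$, where

$$\hat\Lambda_i^C(t) = \int_0^t \exp\{\hat\alpha^T V_i(s)\}\, d\hat\Lambda_0^C(s, \hat\alpha),$$

$$\hat\Lambda_0^C(t, \alpha) = \sum_{i=1}^{n} \int_0^t \Big[ \sum_{j=1}^{n} \rho_j(p) Y_j(s) \exp\{\alpha^T V_j(s)\} \Big]^{-1} \rho_i(p)\, dN_i^C(s),$$

where $\hat\alpha$, the partial likelihood estimate of $\alpha_0$, is the root of $U_C(\alpha) = 0$, where

$$U_C(\alpha) = \sum_{i=1}^{n} \int_0^{\tau} \{V_i(t) - \bar{V}(\alpha, p, t)\} \rho_i(p)\, dN_i^C(t),$$

is an IPSW-based estimating function, with $\bar{V}(\alpha, p, t) = S_C^{(1)}(\alpha, p, t)/S_C^{(0)}(\alpha, p, t)$ and
$S_C^{(d)}(\alpha, p, t) = n^{-1} \sum_{i=1}^{n} \rho_i(p) Y_i(t) V_i(t)^{\otimes d} e^{\alpha^T V_i(t)}$.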
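The baseline cumulative hazard for dependent censoring described above is a sampling-weighted Breslow-type estimator, and a short Python sketch may help fix ideas. The version below is simplified under assumptions that are not part of the article: the censoring covariate is scalar and time-constant, the regression coefficient is supplied rather than estimated, and the function name and toy data are hypothetical.

```python
import math


def weighted_breslow(t, X, dep_cens, V, rho, alpha):
    """Sampling-weighted Breslow-type estimate of the baseline cumulative hazard
    of dependent censoring at time t (scalar, time-constant covariate V)."""
    total = 0.0
    n = len(X)
    for i in range(n):
        # Only sampled subjects who were dependently censored by time t contribute.
        if dep_cens[i] and X[i] <= t and rho[i] > 0.0:
            denom = sum(rho[j] * math.exp(alpha * V[j])
                        for j in range(n) if X[j] >= X[i])   # weighted risk set
            total += rho[i] / denom
    return total


# Hypothetical toy data.
X = [1.0, 2.0, 3.0, 4.0]
dep_cens = [1, 0, 1, 0]
V = [0.0, 1.0, 0.0, 1.0]
unweighted = weighted_breslow(3.0, X, dep_cens, V, [1.0] * 4, alpha=0.0)
weighted = weighted_breslow(3.0, X, dep_cens, V, [1.0, 2.0, 1.0, 2.0], alpha=0.0)
print(unweighted, weighted)
```

With unit weights and a zero coefficient the estimator reduces to the Nelson-Aalen form (here 1/4 + 1/2 = 0.75); giving weight 2 to the two subjects who were not dependently censored enlarges the weighted risk sets and lowers the estimate to 1/6 + 1/3 = 0.5.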
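The weight $W_{2i}(t)$ can be estimated using $\hat\psi_{2i}(t) \exp\{\hat\Lambda_i^C(t)\}$, where
$\hat\psi_{2i}(t) = \exp[-\hat\Lambda_i^C\{t, \hat\alpha \mid Z_i(0)\}]$; that is, $\psi_{2i}(t)$ is estimated using the same model (4), but only
using baseline covariate values,

$$\hat\Lambda_i^C\{t, \hat\alpha \mid Z_i(0)\} = \int_0^t \exp\{\hat\alpha^T V_i(0)\}\, d\hat\Lambda_0^C(s, \hat\alpha).$$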

The weight $W_{3i}(t)$ can be estimated by $\hat\psi_{3i}(t) \exp\{\hat\Lambda_i^C(t)\}$, where $\hat\psi_{3i}(t) = \exp\{-\hat\Lambda_i^{\dagger}(t)\}$, with
$\psi_{3i}(t)$ estimated using an additional baseline model for $C_{2i}$,

$$\lambda_i^{\dagger}\{t \mid Z_i(0), C_i \ge t, T_i, T_i \ge t\} = \lambda_0^{\dagger}(t) \exp\{\alpha^{\dagger T} V_i(0)\},$$

such that we have

$$\hat\Lambda_i^{\dagger}(t) = \int_0^t \exp\{\hat\alpha^{\dagger T} V_i(0)\}\, d\hat\Lambda_0^{\dagger}(s, \hat\alpha^{\dagger}),$$

$$\hat\Lambda_0^{\dagger}(t, \alpha^{\dagger}) = \sum_{i=1}^{n} \int_0^t \Big[ \sum_{j=1}^{n} \rho_j(p) Y_j(s) \exp\{\alpha^{\dagger T} V_j(0)\} \Big]^{-1} \rho_i(p)\, dN_i^C(s),$$

and $\hat\alpha^{\dagger}$ is the partial likelihood estimate of $\alpha^{\dagger}$ under the model for dependent censoring with
hazard $\lambda_i^{\dagger}(t)$. Weight stabilizers analogous to $\psi_{3i}(t)$ have been suggested, for example, by Robins
& Finkelstein (2000) and Hernán, Brumback, & Robins (2000). We propose the stabilizer $\psi_{2i}(t)$
as an alternative. The performance of $W_{1i}(t)$, $W_{2i}(t)$ and $W_{3i}(t)$ is compared through
simulation studies described in Section 4.
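To summarize, $\hat\beta_{Wj}$ ($j = 1, 2, 3$) is computed as the root of the estimating equation in (5),
with $R_i(t)$ replaced by $\hat{R}_i(t)$. After estimating $\beta_0$, the cumulative baseline hazard, $\Lambda_0(t)$, can then
be estimated by

$$\hat\Lambda_0^W(t) = \sum_{i=1}^{n} \int_0^t \Big[ \sum_{\ell=1}^{n} \hat{R}_\ell(s) Y_\ell(s) \exp\{\hat\beta_{Wj}^T Z_\ell(0)\} \Big]^{-1} \hat{R}_i(s)\, dN_i(s).$$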

3. ASYMPTOTIC PROPERTIES
The following conditions are assumed throughout this section.
(a) $\{N_i(\cdot), N_i^C(\cdot), Y_i(\cdot), Z_i(\cdot)\}$, $i = 1, \ldots, n$ are independently and identically distributed.
(b) $P\{Y_i(\tau) = 1\} > 0$.
(c) $|Z_{ij}(0)| + \int_0^{\tau} |dZ_{ij}(t)| < B_Z < \infty$, where $Z_{ij}$ is the $j$th component of $Z_i$ and $B_Z$ is a constant.
(d) There exists a neighbourhood $B$ of $\beta_0$ such that
$\sup_{u \in [0, \tau], \beta \in B} \| S^{(d)}(\beta, R, u) - s^{(d)}(\beta, R, u) \| \to 0$ in probability for $d = 0, 1, 2$, where
$s^{(d)}(\beta, R, u) = E\{S^{(d)}(\beta, R, u)\}$ is absolutely continuous, for $\beta \in B$, uniformly in $u \in (0, \tau]$, and
$E(\cdot)$ denotes expectation. Moreover, $s^{(0)}(\beta, R, u)$ is assumed to be bounded away from zero.
(e) There exists a neighbourhood $B_C$ of $\alpha_0$ such that
$\sup_{u \in [0, \tau], \alpha \in B_C} \| S_C^{(d)}(\alpha, p, u) - s_C^{(d)}(\alpha, u) \| \to 0$ in probability for $d = 0, 1, 2$, where
$s_C^{(d)}(\alpha, u) = E\{S_C^{(d)}(\alpha, p, u)\}$ is absolutely continuous, for $\alpha \in B_C$, uniformly in $u \in (0, \tau]$.
Moreover, $s_C^{(0)}(\alpha, u)$ is assumed to be bounded away from zero.
(f) The matrices $A(\beta_0)$ and $A_C(\alpha_0)$ are positive definite, where

$$A(\beta) = \int_0^{\tau} \big\{ s^{(2)}(\beta, R, u)/s^{(0)}(\beta, R, u) - \bar{z}(\beta, R, u)^{\otimes 2} \big\}\, dF(u),$$

$$A_C(\alpha) = \int_0^{\tau} \big\{ s_C^{(2)}(\alpha, u)/s_C^{(0)}(\alpha, u) - \bar{v}(\alpha, u)^{\otimes 2} \big\}\, dF^C(u),$$

with $\bar{z}(\beta, R, u) = s^{(1)}(\beta, R, u)/s^{(0)}(\beta, R, u)$, $\bar{v}(\alpha, u) = s_C^{(1)}(\alpha, u)/s_C^{(0)}(\alpha, u)$,
$F(u) = E\{R_i(u) N_i(u)\}$, $F^C(u) = E\{\rho_i(p_0) N_i^C(u)\}$.
(g) $\Lambda_0(\tau) < \infty$, $\Lambda_0^C(\tau) < \infty$.
We describe the asymptotic properties of the proposed estimators in the following theorems.

Theorem 1. Under conditions (a)-(g), as $n \to \infty$, $n^{1/2}(\hat\alpha - \alpha_0)$ converges to a mean zero
Normal distribution with covariance $A_C(\alpha_0)^{-1} \Omega(\alpha_0) A_C(\alpha_0)^{-1}$, where $A_C(\alpha_0)$ is as defined by
Condition (f) and $\Omega(\alpha) = E\{\phi_i(\alpha, p)^{\otimes 2}\}$, with $\phi_i(\alpha, p)$ being asymptotically independent and
identically distributed for $i = 1, \ldots, n$; we defer the definition of $\phi_i(\alpha, p)$ to the Supplementary
Materials document.

In the Supplementary Materials document, we show that
$n^{1/2}(\hat\alpha - \alpha_0) = A_C(\alpha_0)^{-1} n^{-1/2} \sum_{i=1}^{n} \phi_i(\alpha_0, p_0) + o_p(1)$; hence, $n^{1/2}(\hat\alpha - \alpha_0)$ is essentially a scaled
sum of $n$ independent and identically distributed random quantities with mean zero and finite
variance. By the Multivariate Central Limit Theorem (MCLT) and empirical process theory, the
asymptotic normality is proved.
Theorem 2. Under conditions (a)-(g), as $n \to \infty$, $n^{1/2}(\hat\beta_{W1} - \beta_0)$ converges to a mean
zero Normal distribution with covariance $A(\beta_0)^{-1} \Sigma(\beta_0, R) A(\beta_0)^{-1}$, with $A(\beta)$ having been
defined in Condition (f) and where $\Sigma(\beta, R) = E\{\Phi_i(\beta, R)^{\otimes 2}\}$, with $\Phi_i(\beta, R)$ being independent
and identically distributed mean 0 variates ($i = 1, \ldots, n$) asymptotically. The explicit definition
of $\Phi_i(\beta, R)$ is provided in the Supplementary Materials.

The proof of Theorem 2 (provided in the Supplementary Materials) begins by decomposing
$n^{1/2}\{\hat\Lambda_0^C(t) - \Lambda_0^C(t)\}$ into

$$n^{1/2}\{\hat\Lambda_0^C(t; \hat\alpha, \hat{p}) - \hat\Lambda_0^C(t; \hat\alpha, p_0)\} + n^{1/2}\{\hat\Lambda_0^C(t; \hat\alpha, p_0) - \hat\Lambda_0^C(t; \alpha_0, p_0)\} + n^{1/2}\{\hat\Lambda_0^C(t; \alpha_0, p_0) - \Lambda_0^C(t)\}.$$

Then $n^{1/2}\{\hat\Lambda_0^C(t) - \Lambda_0^C(t)\}$ can be expressed
asymptotically as a sum of independent and identically distributed zero-mean variates, as $n \to \infty$.
Combining this result and the Functional Delta Method, we can show that $n^{1/2}\{\hat{R}_i(t) - R_i(t)\}$ can
be written asymptotically as a sum of independent and identically distributed zero-mean variates,
as $n \to \infty$. Finally, through the Functional Delta Method, the asymptotic normality of
$n^{1/2}(\hat\beta_{W1} - \beta_0)$ is obtained.
The expression for the asymptotic covariance of $\hat\beta_{W1}$ is very complicated and difficult to
implement numerically. A practical way to estimate the variance of the proposed estimators is to
treat the weights $R_i(t)$ as known rather than estimated. In the setting where the weight function
is known, results derived in the Supplementary Materials show that

$$n^{1/2}(\hat\beta - \beta_0) = A(\beta_0)^{-1} n^{-1/2} \sum_{i=1}^{n} U_i(\beta_0, R) + o_p(1) \qquad (6)$$

with $U_i(\beta_0, R) = \int_0^{\tau} \{Z_i(0) - \bar{z}(\beta_0, R, t)\} R_i(t)\, dM_i(t)$ and $dM_i(t) = dN_i(t) - Y_i(t)\, d\Lambda_i(t)$. Hence,
$n^{1/2}(\hat\beta - \beta_0)$ is asymptotically a scaled sum of independent and identically distributed
zero-mean random quantities with finite variance. Therefore, the variance of $\hat\beta_{W1}$ is estimated by
$\hat{A}(\hat\beta)^{-1} \hat\Sigma^{*}(\hat\beta, \hat{R}) \hat{A}(\hat\beta)^{-1}$, where $\Sigma^{*}(\beta, R) = E\{U_i(\beta, R)^{\otimes 2}\}$, and $\hat{A}(\hat\beta)$ and $\hat\Sigma^{*}(\hat\beta, \hat{R})$
are calculated by replacing limiting values with their corresponding empirical counterparts.
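The simplified variance estimator, which treats the weights as known, has the familiar sandwich form and can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the authors' implementation: the baseline covariate is scalar, the martingale increments are estimated by plugging in a weighted Breslow-type baseline hazard, the weights are supplied through a hypothetical callable `R(i, t)`, and the function name and toy data are invented for the example.

```python
import math


def sandwich_variance(beta, X, event, Z, R):
    """Sandwich variance estimate for beta-hat (scalar Z), treating R(i, t) as known.
    Returns (variance, U), where U holds the per-subject score contributions."""
    n = len(X)
    U = [0.0] * n
    A = 0.0                                       # empirical information term
    for k in range(n):
        if not event[k]:
            continue
        s = X[k]
        r_k = R(k, s)
        if r_k == 0.0:
            continue
        risk = [j for j in range(n) if X[j] >= s]
        w = {j: R(j, s) * math.exp(beta * Z[j]) for j in risk}
        s0 = sum(w.values())
        zbar = sum(w[j] * Z[j] for j in risk) / s0
        z2bar = sum(w[j] * Z[j] ** 2 for j in risk) / s0
        A += r_k * (z2bar - zbar ** 2) / n
        U[k] += r_k * (Z[k] - zbar)               # counting-process part
        jump = r_k / s0                           # baseline hazard increment at s
        for j in risk:                            # compensator part
            U[j] -= w[j] * (Z[j] - zbar) * jump
    sigma = sum(u * u for u in U) / n
    return sigma / (A * A * n), U


# Hypothetical toy data with constant unit weights.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
event = [1, 1, 0, 1, 1, 0]
Z = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
var_hat, U = sandwich_variance(0.0, X, event, Z, lambda i, t: 1.0)
print(var_hat, sum(U))
```

A useful check on such code is that the per-subject contributions sum to the value of the weighted score at the same coefficient, because the compensator terms cancel within each risk set; at the root of the estimating equation that sum is therefore zero.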
By similar arguments, the asymptotic normality holds for $n^{1/2}(\hat\beta_{W2} - \beta_0)$ and $n^{1/2}(\hat\beta_{W3} - \beta_0)$.
However, the covariance will be even more complicated than that of $n^{1/2}(\hat\beta_{W1} - \beta_0)$. Therefore,
similarly, we can treat $R_i(t)$ as fixed to calculate the variance of $\hat\beta_{W2}$ and $\hat\beta_{W3}$. Note that (6)
holds when using $W_2$ or $W_3$, such that the variances of each of $\hat\beta_{W2}$ and $\hat\beta_{W3}$ can be estimated by
$\hat{A}(\hat\beta)^{-1} \hat\Sigma^{*}(\hat\beta, \hat{R}) \hat{A}(\hat\beta)^{-1}$, with $R_i(t)$ being replaced by $\rho_i(\hat{p}) \hat{W}_{2i}(t)$ and $\rho_i(\hat{p}) \hat{W}_{3i}(t)$, respectively.
Our final asymptotic result pertains to the proposed cumulative baseline hazard function
estimator.

Theorem 3. Under conditions (a)-(g), $n^{1/2}\{\hat\Lambda_0^W(t) - \Lambda_0(t)\}$ converges to a zero-mean
Gaussian process as $n \to \infty$, with an explicit covariance function estimator.

A proof of Theorem 3 is provided in the Supplementary Materials, including definitions
pertinent to the limiting covariance function.
Table 1: Simulation study: data configurations.
Parameter

0.1

0.05

ß1

0.406

-0.406

ß2

0.406

?C0

0.1

0.05

-0.5

-0.406

0.693

1.010

0.095

51%

62%

27%

%C2

44%

28%

45%

%C1

10%

28%
