Survival Analysis with an Historical Control

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Survival Analysis with an Historical Control

Paul Miller
Hello All,

I'm trying to figure out how to perform a survival analysis with an historical control. I've spent some time looking online and in my boooks but haven't found much showing how to do this. Was wondering if there is a R package that can do it, or if there are resources somewhere that show the actual steps one takes, or if some knowledgeable person might be willing to share some code.

Here is a statement that describes the sort of analyis I'm being asked to do.

A one-sample parametric test assuming an exponential form of survival was used to test the hypothesis that the treatment produces a median PFS no greater than the historical control PFS of 16 weeks.  A sample median PFS greater than 20.57 weeks would fall beyond the critical value associated with the null hypothesis, and would be considered statistically significant at alpha = .05, 1 tailed.  

My understanding is that the cutoff of 20.57 weeks was obtained using an online calculator that can be found at:

http://www.swogstat.org/stat/public/one_survival.htm

Thus far, I've been unable to determine what values were plugged into the calculator to get the cutoff.

There's another calculator for a nonparamertric test that can be found at:

http://www.swogstat.org/stat/public/one_nonparametric_survival.htm

It would be nice to try doing this using both a parameteric and a non-parametric model.

So my first question would be whether the approach outlined above is valid or if the analysis should be done some other way. If the basic idea is correct, is it relatively easy (for a Terry Therneau type genius) to implement the whole thing using R? The calculator is a great tool, but, if reasonable, it would be nice to be able to look at some code to see how the numbers actually get produced.

Below are some sample survival data and code in case this proves helpful.

Thanks,

Paul

###################################
#### Example Data: GD2 Vaccine ####
###################################

connection <- textConnection("
GD2  1   8 12  GD2  3 -12 10  GD2  6 -52  7
GD2  7  28 10  GD2  8  44  6  GD2 10  14  8
GD2 12   3  8  GD2 14 -52  9  GD2 15  35 11
GD2 18   6 13  GD2 20  12  7  GD2 23  -7 13
GD2 24 -52  9  GD2 26 -52 12  GD2 28  36 13
GD2 31 -52  8  GD2 33   9 10  GD2 34 -11 16
GD2 36 -52  6  GD2 39  15 14  GD2 40  13 13
GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
GD2 48  28  9  GD2  2  15  9  GD2  4 -44 10
GD2  5  -2 12  GD2  9   8  7  GD2 11  12  7
GD2 13 -52  7  GD2 16  21  7  GD2 17  19 11
GD2 19   6 16  GD2 21  10 16  GD2 22 -15  6
GD2 25   4 15  GD2 27  -9  9  GD2 29  27 10
GD2 30   1 17  GD2 32  12  8  GD2 35  20  8
GD2 37 -32  8  GD2 38  15  8  GD2 41   5 14
GD2 43  35 13  GD2 45  28  9  GD2 47   6 15
")

hsv <- data.frame(scan(connection, list(VAC="", PAT=0, WKS=0, X=0)))
hsv <- transform(hsv, CENS=ifelse(WKS < 1, 1, 0), WKS=abs(WKS))
head(hsv)

require("survival")

survObj <- Surv(hsv$WKS, hsv$CENS==0) ~ 1

km <- survfit(survObj, type=c("kaplan-meier"))
print(km)

paraExp <- survreg(survObj, dist="exponential")
print(paraExp)

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Survival Analysis with an Historical Control

Andrews, Chris
The code is actually available at the websites you provide.  Try "View page source" in your browser.  The most cryptic code isn't needed because the math functions (e.g, incomplete gamma function) are available in R.


-----Original Message-----
From: Paul Miller [mailto:[hidden email]]
Sent: Tuesday, July 08, 2014 12:00 PM
To: [hidden email]
Subject: [R] Survival Analysis with an Historical Control

Hello All,

I'm trying to figure out how to perform a survival analysis with an historical control. I've spent some time looking online and in my boooks but haven't found much showing how to do this. Was wondering if there is a R package that can do it, or if there are resources somewhere that show the actual steps one takes, or if some knowledgeable person might be willing to share some code.

Here is a statement that describes the sort of analyis I'm being asked to do.

A one-sample parametric test assuming an exponential form of survival was used to test the hypothesis that the treatment produces a median PFS no greater than the historical control PFS of 16 weeks.  A sample median PFS greater than 20.57 weeks would fall beyond the critical value associated with the null hypothesis, and would be considered statistically significant at alpha = .05, 1 tailed.  

My understanding is that the cutoff of 20.57 weeks was obtained using an online calculator that can be found at:

http://www.swogstat.org/stat/public/one_survival.htm

Thus far, I've been unable to determine what values were plugged into the calculator to get the cutoff.

There's another calculator for a nonparamertric test that can be found at:

http://www.swogstat.org/stat/public/one_nonparametric_survival.htm

It would be nice to try doing this using both a parameteric and a non-parametric model.

So my first question would be whether the approach outlined above is valid or if the analysis should be done some other way. If the basic idea is correct, is it relatively easy (for a Terry Therneau type genius) to implement the whole thing using R? The calculator is a great tool, but, if reasonable, it would be nice to be able to look at some code to see how the numbers actually get produced.

Below are some sample survival data and code in case this proves helpful.

Thanks,

Paul

###################################
#### Example Data: GD2 Vaccine ####
###################################

connection <- textConnection("
GD2  1   8 12  GD2  3 -12 10  GD2  6 -52  7
GD2  7  28 10  GD2  8  44  6  GD2 10  14  8
GD2 12   3  8  GD2 14 -52  9  GD2 15  35 11
GD2 18   6 13  GD2 20  12  7  GD2 23  -7 13
GD2 24 -52  9  GD2 26 -52 12  GD2 28  36 13
GD2 31 -52  8  GD2 33   9 10  GD2 34 -11 16
GD2 36 -52  6  GD2 39  15 14  GD2 40  13 13
GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
GD2 48  28  9  GD2  2  15  9  GD2  4 -44 10
GD2  5  -2 12  GD2  9   8  7  GD2 11  12  7
GD2 13 -52  7  GD2 16  21  7  GD2 17  19 11
GD2 19   6 16  GD2 21  10 16  GD2 22 -15  6
GD2 25   4 15  GD2 27  -9  9  GD2 29  27 10
GD2 30   1 17  GD2 32  12  8  GD2 35  20  8
GD2 37 -32  8  GD2 38  15  8  GD2 41   5 14
GD2 43  35 13  GD2 45  28  9  GD2 47   6 15
")

hsv <- data.frame(scan(connection, list(VAC="", PAT=0, WKS=0, X=0)))
hsv <- transform(hsv, CENS=ifelse(WKS < 1, 1, 0), WKS=abs(WKS))
head(hsv)

require("survival")

survObj <- Surv(hsv$WKS, hsv$CENS==0) ~ 1

km <- survfit(survObj, type=c("kaplan-meier"))
print(km)

paraExp <- survreg(survObj, dist="exponential")
print(paraExp)


**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Survival Analysis with an Historical Control

Paul Miller

Hi Chris,

Thanks for pointing out the use of "View page source". Very helpful to know.

Do you happen to know anything about how to perform the analysis itself? I haven't been able to find anything confirming that the approach described in my original email (below) is correct.

Thanks,

Paul

--------------------------------------------
On Wed, 7/9/14, Andrews, Chris <[hidden email]> wrote:

 Subject: RE: [R] Survival Analysis with an Historical Control

r-project.org>
 Received: Wednesday, July 9, 2014, 11:26 AM

 The code is actually
 available at the websites you provide.  Try "View page
 source" in your browser.  The most cryptic code
 isn't needed because the math functions (e.g, incomplete
 gamma function) are available in R.


 -----Original Message-----


 Sent: Tuesday, July 08, 2014 12:00 PM
 To: [hidden email]
 Subject: [R] Survival Analysis with an
 Historical Control

 Hello
 All,

 I'm trying to
 figure out how to perform a survival analysis with an
 historical control. I've spent some time looking online
 and in my boooks but haven't found much showing how to
 do this. Was wondering if there is a R package that can do
 it, or if there are resources somewhere that show the actual
 steps one takes, or if some knowledgeable person might be
 willing to share some code.

 Here is a statement that describes the sort of
 analyis I'm being asked to do.

 A one-sample parametric test assuming an
 exponential form of survival was used to test the hypothesis
 that the treatment produces a median PFS no greater than the
 historical control PFS of 16 weeks.  A sample median PFS
 greater than 20.57 weeks would fall beyond the critical
 value associated with the null hypothesis, and would be
 considered statistically significant at alpha = .05, 1
 tailed. 

 My understanding
 is that the cutoff of 20.57 weeks was obtained using an
 online calculator that can be found at:

 http://www.swogstat.org/stat/public/one_survival.htm

 Thus far, I've been unable
 to determine what values were plugged into the calculator to
 get the cutoff.

 There's
 another calculator for a nonparamertric test that can be
 found at:

 http://www.swogstat.org/stat/public/one_nonparametric_survival.htm

 It would be nice to try doing
 this using both a parameteric and a non-parametric model.

 So my first question would be
 whether the approach outlined above is valid or if the
 analysis should be done some other way. If the basic idea is
 correct, is it relatively easy (for a Terry Therneau type
 genius) to implement the whole thing using R? The calculator
 is a great tool, but, if reasonable, it would be nice to be
 able to look at some code to see how the numbers actually
 get produced.

 Below are
 some sample survival data and code in case this proves
 helpful.

 Thanks,

 Paul

 ###################################
 #### Example Data: GD2 Vaccine ####
 ###################################

 connection <-
 textConnection("
 GD2  1   8
 12  GD2  3 -12 10  GD2  6 -52  7
 GD2 
 7  28 10  GD2  8  44  6  GD2 10  14  8
 GD2 12   3  8  GD2 14 -52  9 
 GD2 15  35 11
 GD2 18   6 13 
 GD2 20  12  7  GD2 23  -7 13
 GD2 24
 -52  9  GD2 26 -52 12  GD2 28  36 13
 GD2
 31 -52  8  GD2 33   9 10  GD2 34 -11 16
 GD2 36 -52  6  GD2 39  15 14  GD2 40  13
 13
 GD2 42  21 13  GD2 44 -24 16  GD2 46
 -52 13
 GD2 48  28  9  GD2  2  15  9 
 GD2  4 -44 10
 GD2  5  -2 12  GD2 
 9   8  7  GD2 11  12  7
 GD2
 13 -52  7  GD2 16  21  7  GD2 17  19 11
 GD2 19   6 16  GD2 21  10 16  GD2
 22 -15  6
 GD2 25   4 15  GD2
 27  -9  9  GD2 29  27 10
 GD2
 30   1 17  GD2 32  12  8  GD2 35  20  8
 GD2 37 -32  8  GD2 38  15  8  GD2
 41   5 14
 GD2 43  35 13  GD2
 45  28  9  GD2 47   6 15
 ")

 hsv
 <- data.frame(scan(connection, list(VAC="",
 PAT=0, WKS=0, X=0)))
 hsv <-
 transform(hsv, CENS=ifelse(WKS < 1, 1, 0),
 WKS=abs(WKS))
 head(hsv)

 require("survival")

 survObj <- Surv(hsv$WKS,
 hsv$CENS==0) ~ 1

 km <-
 survfit(survObj, type=c("kaplan-meier"))
 print(km)

 paraExp <- survreg(survObj,
 dist="exponential")
 print(paraExp)


 **********************************************************
 Electronic Mail is not secure, may not be read
 every day, and should not be used for urgent or sensitive
 issues

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Survival Analysis with an Historical Control

Therneau, Terry M., Ph.D.
In reply to this post by Paul Miller
You are asking for a one sample test.  Using your own data:

connection <- textConnection("
GD2  1   8 12  GD2  3 -12 10  GD2  6 -52  7
GD2  7  28 10  GD2  8  44  6  GD2 10  14  8
GD2 12   3  8  GD2 14 -52  9  GD2 15  35 11
GD2 18   6 13  GD2 20  12  7  GD2 23  -7 13
GD2 24 -52  9  GD2 26 -52 12  GD2 28  36 13
GD2 31 -52  8  GD2 33   9 10  GD2 34 -11 16
GD2 36 -52  6  GD2 39  15 14  GD2 40  13 13
GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
GD2 48  28  9  GD2  2  15  9  GD2  4 -44 10
GD2  5  -2 12  GD2  9   8  7  GD2 11  12  7
GD2 13 -52  7  GD2 16  21  7  GD2 17  19 11
GD2 19   6 16  GD2 21  10 16  GD2 22 -15  6
GD2 25   4 15  GD2 27  -9  9  GD2 29  27 10
GD2 30   1 17  GD2 32  12  8  GD2 35  20  8
GD2 37 -32  8  GD2 38  15  8  GD2 41   5 14
GD2 43  35 13  GD2 45  28  9  GD2 47   6 15
")

hsv <- data.frame(scan(connection, list(vac="", pat=0, wks=0, x=0)))
hsv <- transform(hsv, status= (wks >0), wks = abs(wks))

fit1 <- survreg(Surv(wks, status) ~ 1, data=hsv, dist='exponential')
temp <- predict(fit1, type='quantile', p=.5, se=TRUE)

c(median= temp$fit[1], std= temp$se[1])
   median    std
24.32723  4.36930

--
The predict function gives the predicted median survival and standard deviation for each
observation in the data set.  Since this was a mean only model all n of them are the same
and I printed only the first.
For prediction they make the assumption that the std error for my future study will be the
same as the std from this one, you want the future 95% CI to not include the value of 16,
so the future mean will need to be at least 16 + 1.96* 4.369.

A nonparmetric version of the argument would be

> fit2 <- survfit(Surv(wks, status) ~ 1, data=hsv)
> print(fit2)
records   n.max n.start  events  median 0.95LCL 0.95UCL
      48      48      48      31      21      15      35

Then make the argument that in our future study, the 95% CI will stretch 6 units to the
left of the median, just like it did here.  This argument is a bit more tenuous though.
The exponential CI width depends on the total number of events and total follow-up time,
and we can guess that the new study will be similar.  The Kaplan-Meier CI also depends on
the spacing of the deaths, which is less likely to replicate.

Notes:
  1. Use summary(fit2)$table to extract the CI values.  In R the print functions don't
allow you to "grab" what was printed, summary normally does.
  2. For the exponential we could work out the formula in closed form -- a good homework
exercise for grad students perhaps but not an exciting way to spend my own afternoon.  An
advantage of the above approach is that we can easily use a more realistic model like the
weibull.
  3. I've never liked extracting out the "Surv(t,s)" part of a formula as a separate
statement on another line.  If I ever need to read this code again, or even just the
printout from the run, keeping it all together gives much better documentation.
  4. Future calculations for survival data, of any form, are always tenuous since they
depend critically on the total number of events that will be in the future study.  We can
legislate the total enrollment and follow-up time for that future study, but the number of
events is never better than a guess.  Paraphrasing a motto found on the door of a well
respected investigator I worked with 30 years ago (because I don't remember it exaclty):

   "The incidence of the condition under consideration and its subsequent death rate will
both drop by 1/2 at the commencement of a study, and will not return to their former
values until the study finishes or the PI retires."


Terry T.

---------------------------------------------------------------------------

On 07/10/2014 05:00 AM, [hidden email] wrote:

> Hello All,
>
> I'm trying to figure out how to perform a survival analysis with an historical control. I've spent some time looking online and in my boooks but haven't found much showing how to do this. Was wondering if there is a R package that can do it, or if there are resources somewhere that show the actual steps one takes, or if some knowledgeable person might be willing to share some code.
>
> Here is a statement that describes the sort of analyis I'm being asked to do.
>
> A one-sample parametric test assuming an exponential form of survival was used to test the hypothesis that the treatment produces a median PFS no greater than the historical control PFS of 16 weeks.  A sample median PFS greater than 20.57 weeks would fall beyond the critical value associated with the null hypothesis, and would be considered statistically significant at alpha = .05, 1 tailed.
>
> My understanding is that the cutoff of 20.57 weeks was obtained using an online calculator that can be found at:
>
> http://www.swogstat.org/stat/public/one_survival.htm
>
> Thus far, I've been unable to determine what values were plugged into the calculator to get the cutoff.
>
> There's another calculator for a nonparamertric test that can be found at:
>
> http://www.swogstat.org/stat/public/one_nonparametric_survival.htm
>
> It would be nice to try doing this using both a parameteric and a non-parametric model.
>
> So my first question would be whether the approach outlined above is valid or if the analysis should be done some other way. If the basic idea is correct, is it relatively easy (for a Terry Therneau type genius) to implement the whole thing using R? The calculator is a great tool, but, if reasonable, it would be nice to be able to look at some code to see how the numbers actually get produced.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Survival Analysis with an Historical Control

Paul Miller
In reply to this post by Paul Miller
Hi Dr. Therneau,

Thanks for your response. This is very helpful.

My historical control value is 16 weeks. I've been having some trouble though determining how this value was obtained. Are you able to indicate how people normally go about determining a value for the historical control? Or do you have an view on how it ought to be done?

It does seem like selecting an appropriate value is extremely important. Otherwise the results obtained from the analysis are likely to be nonsense.

Thanks,

Paul

--------------------------------------------
On Thu, 7/10/14, Therneau, Terry M., Ph.D. <[hidden email]> wrote:

 Subject: Re: Survival Analysis with an Historical Control
 To: [hidden email], "Andrews, Chris" <[hidden email]>, pjmill
 Received: Thursday, July 10, 2014, 8:52 AM

 You are asking for a one sample
 test.  Using your own data:

 connection <- textConnection("
 GD2  1   8 12  GD2  3 -12
 10  GD2  6 -52  7
 GD2  7  28 10  GD2  8  44 
 6  GD2 10  14  8
 GD2 12   3  8  GD2 14 -52 
 9  GD2 15  35 11
 GD2 18   6 13  GD2 20  12 
 7  GD2 23  -7 13
 GD2 24 -52  9  GD2 26 -52 12  GD2 28  36
 13
 GD2 31 -52  8  GD2 33   9 10 
 GD2 34 -11 16
 GD2 36 -52  6  GD2 39  15 14  GD2
 40  13 13
 GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
 GD2 48  28  9  GD2  2  15 
 9  GD2  4 -44 10
 GD2  5  -2 12  GD2 
 9   8  7  GD2 11  12  7
 GD2 13 -52  7  GD2 16  21  7  GD2
 17  19 11
 GD2 19   6 16  GD2 21  10 16 
 GD2 22 -15  6
 GD2 25   4 15  GD2 27  -9 
 9  GD2 29  27 10
 GD2 30   1 17  GD2 32  12 
 8  GD2 35  20  8
 GD2 37 -32  8  GD2 38  15  8  GD2
 41   5 14
 GD2 43  35 13  GD2 45  28  9  GD2
 47   6 15
 ")

 hsv <- data.frame(scan(connection, list(vac="", pat=0,
 wks=0, x=0)))
 hsv <- transform(hsv, status= (wks >0), wks =
 abs(wks))

 fit1 <- survreg(Surv(wks, status) ~ 1, data=hsv,
 dist='exponential')
 temp <- predict(fit1, type='quantile', p=.5, se=TRUE)

 c(median= temp$fit[1], std= temp$se[1])
    median    std
 24.32723  4.36930

 --
 The predict function gives the predicted median survival and
 standard deviation for each
 observation in the data set.  Since this was a mean
 only model all n of them are the same
 and I printed only the first.
 For prediction they make the assumption that the std error
 for my future study will be the
 same as the std from this one, you want the future 95% CI to
 not include the value of 16,
 so the future mean will need to be at least 16 + 1.96*
 4.369.

 A nonparmetric version of the argument would be

 > fit2 <- survfit(Surv(wks, status) ~ 1, data=hsv)
 > print(fit2)
 records   n.max n.start  events 
 median 0.95LCL 0.95UCL
       48      48   
   48      31     
 21      15      35

 Then make the argument that in our future study, the 95% CI
 will stretch 6 units to the
 left of the median, just like it did here.  This
 argument is a bit more tenuous though.
 The exponential CI width depends on the total number of
 events and total follow-up time,
 and we can guess that the new study will be similar. 
 The Kaplan-Meier CI also depends on
 the spacing of the deaths, which is less likely to
 replicate.

 Notes:
   1. Use summary(fit2)$table to extract the CI
 values.  In R the print functions don't
 allow you to "grab" what was printed, summary normally
 does.
   2. For the exponential we could work out the formula
 in closed form -- a good homework
 exercise for grad students perhaps but not an exciting way
 to spend my own afternoon.  An
 advantage of the above approach is that we can easily use a
 more realistic model like the
 weibull.
   3. I've never liked extracting out the "Surv(t,s)"
 part of a formula as a separate
 statement on another line.  If I ever need to read this
 code again, or even just the
 printout from the run, keeping it all together gives much
 better documentation.
   4. Future calculations for survival data, of any
 form, are always tenuous since they
 depend critically on the total number of events that will be
 in the future study.  We can
 legislate the total enrollment and follow-up time for that
 future study, but the number of
 events is never better than a guess.  Paraphrasing a
 motto found on the door of a well
 respected investigator I worked with 30 years ago (because I
 don't remember it exaclty):

    "The incidence of the condition under
 consideration and its subsequent death rate will
 both drop by 1/2 at the commencement of a study, and will
 not return to their former
 values until the study finishes or the PI retires."


 Terry T.

 ---------------------------------------------------------------------------

 On 07/10/2014 05:00 AM, [hidden email]
 wrote:
 > Hello All,
 >
 > I'm trying to figure out how to perform a survival
 analysis with an historical control. I've spent some time
 looking online and in my boooks but haven't found much
 showing how to do this. Was wondering if there is a R
 package that can do it, or if there are resources somewhere
 that show the actual steps one takes, or if some
 knowledgeable person might be willing to share some code.
 >
 > Here is a statement that describes the sort of analyis
 I'm being asked to do.
 >
 > A one-sample parametric test assuming an exponential
 form of survival was used to test the hypothesis that the
 treatment produces a median PFS no greater than the
 historical control PFS of 16 weeks.  A sample median
 PFS greater than 20.57 weeks would fall beyond the critical
 value associated with the null hypothesis, and would be
 considered statistically significant at alpha = .05, 1
 tailed.
 >
 > My understanding is that the cutoff of 20.57 weeks was
 obtained using an online calculator that can be found at:
 >
 > http://www.swogstat.org/stat/public/one_survival.htm
 >
 > Thus far, I've been unable to determine what values
 were plugged into the calculator to get the cutoff.
 >
 > There's another calculator for a nonparamertric test
 that can be found at:
 >
 > http://www.swogstat.org/stat/public/one_nonparametric_survival.htm
 >
 > It would be nice to try doing this using both a
 parameteric and a non-parametric model.
 >
 > So my first question would be whether the approach
 outlined above is valid or if the analysis should be done
 some other way. If the basic idea is correct, is it
 relatively easy (for a Terry Therneau type genius) to
 implement the whole thing using R? The calculator is a great
 tool, but, if reasonable, it would be nice to be able to
 look at some code to see how the numbers actually get
 produced.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Survival Analysis with an Historical Control

Andrews, Chris
In reply to this post by Paul Miller
Hi Paul,
Sorry for the delayed reply.  I was away last week. I'm not clear what you want confirmed about your approach.

(a) "20.57"- computing the rejection region of the analysis.  The formulas implemented at the addresses you gave in your original post are from a reputable source - Lawless (1982) - at least that is claimed in the help file (http://www.swogstat.org/stat/Public/Help/survival1.html).  It seems that another person derived 20.57 from some combination of input.  I don't see how to back calculate what the input was from the information provided.  Perhaps elsewhere in your study protocol there is discussion of accrual time, et al. that can help you.

(b) "16" - the value of the parameter in the null hypothesis is very important as you noted in your response to Terry.  But this is not really a statistical question. It may be derived from historical data, expert opinion, regulatory mandate, or some combination of these and other factors.  Presumably this study was undertaken because 16 was an important number to somebody.

(c) performing the one-sample survival analysis itself.  This is what I did with the data you provided.

# Non-parametric
(km <- survfit(Surv(WKS, 1-CENS) ~ 1, data=hsv, type="kaplan-meier", conf.type="log", conf.int=0.9))
summary(km)

# Compare to median survival = 16
# (Used 90% CI above to get 0.05 one sided test here)
quantile(km, prob=0.5)$lower > 16

# Parametric
(paraExp <- survreg(Surv(WKS, 1-CENS) ~ 1, data=hsv, dist="exponential"))
summary(paraExp)

# Compare to median survival = 16
# That is, compare to beta0 = log(16/log(2)) = 3.1391...
# one sided test
pnorm(c((coef(paraExp) - log(16/log(2))) / sqrt(vcov(paraExp))), lower.tail=FALSE)


Chris

-----Original Message-----
From: Paul Miller [mailto:[hidden email]]
Sent: Thursday, July 10, 2014 8:59 AM
To: Andrews, Chris
Cc: [hidden email]
Subject: RE: [R] Survival Analysis with an Historical Control


Hi Chris,

Thanks for pointing out the use of "View page source". Very helpful to know.

Do you happen to know anything about how to perform the analysis itself? I haven't been able to find anything confirming that the approach described in my original email (below) is correct.

Thanks,

Paul

--------------------------------------------
On Wed, 7/9/14, Andrews, Chris <[hidden email]> wrote:

 Subject: RE: [R] Survival Analysis with an Historical Control
 To: "Paul Miller" <[hidden email]>, "[hidden email]" <[hidden email]>
 Received: Wednesday, July 9, 2014, 11:26 AM
 
 The code is actually
 available at the websites you provide.  Try "View page
 source" in your browser.  The most cryptic code
 isn't needed because the math functions (e.g, incomplete
 gamma function) are available in R.
 
 
 -----Original Message-----
 From: Paul Miller [mailto:[hidden email]]
 
 Sent: Tuesday, July 08, 2014 12:00 PM
 To: [hidden email]
 Subject: [R] Survival Analysis with an
 Historical Control
 
 Hello
 All,
 
 I'm trying to
 figure out how to perform a survival analysis with an
 historical control. I've spent some time looking online
 and in my boooks but haven't found much showing how to
 do this. Was wondering if there is a R package that can do
 it, or if there are resources somewhere that show the actual
 steps one takes, or if some knowledgeable person might be
 willing to share some code.
 
 Here is a statement that describes the sort of
 analyis I'm being asked to do.
 
 A one-sample parametric test assuming an
 exponential form of survival was used to test the hypothesis
 that the treatment produces a median PFS no greater than the
 historical control PFS of 16 weeks.  A sample median PFS
 greater than 20.57 weeks would fall beyond the critical
 value associated with the null hypothesis, and would be
 considered statistically significant at alpha = .05, 1
 tailed. 
 
 My understanding
 is that the cutoff of 20.57 weeks was obtained using an
 online calculator that can be found at:
 
 http://www.swogstat.org/stat/public/one_survival.htm
 
 Thus far, I've been unable
 to determine what values were plugged into the calculator to
 get the cutoff.
 
 There's
 another calculator for a nonparamertric test that can be
 found at:
 
 http://www.swogstat.org/stat/public/one_nonparametric_survival.htm
 
 It would be nice to try doing
 this using both a parameteric and a non-parametric model.
 
 So my first question would be
 whether the approach outlined above is valid or if the
 analysis should be done some other way. If the basic idea is
 correct, is it relatively easy (for a Terry Therneau type
 genius) to implement the whole thing using R? The calculator
 is a great tool, but, if reasonable, it would be nice to be
 able to look at some code to see how the numbers actually
 get produced.
 
 Below are
 some sample survival data and code in case this proves
 helpful.
 
 Thanks,
 
 Paul
 
 ###################################
 #### Example Data: GD2 Vaccine ####
 ###################################
 
 connection <-
 textConnection("
 GD2  1   8
 12  GD2  3 -12 10  GD2  6 -52  7
 GD2 
 7  28 10  GD2  8  44  6  GD2 10  14  8
 GD2 12   3  8  GD2 14 -52  9 
 GD2 15  35 11
 GD2 18   6 13 
 GD2 20  12  7  GD2 23  -7 13
 GD2 24
 -52  9  GD2 26 -52 12  GD2 28  36 13
 GD2
 31 -52  8  GD2 33   9 10  GD2 34 -11 16
 GD2 36 -52  6  GD2 39  15 14  GD2 40  13
 13
 GD2 42  21 13  GD2 44 -24 16  GD2 46
 -52 13
 GD2 48  28  9  GD2  2  15  9 
 GD2  4 -44 10
 GD2  5  -2 12  GD2 
 9   8  7  GD2 11  12  7
 GD2
 13 -52  7  GD2 16  21  7  GD2 17  19 11
 GD2 19   6 16  GD2 21  10 16  GD2
 22 -15  6
 GD2 25   4 15  GD2
 27  -9  9  GD2 29  27 10
 GD2
 30   1 17  GD2 32  12  8  GD2 35  20  8
 GD2 37 -32  8  GD2 38  15  8  GD2
 41   5 14
 GD2 43  35 13  GD2
 45  28  9  GD2 47   6 15
 ")
 
 hsv
 <- data.frame(scan(connection, list(VAC="",
 PAT=0, WKS=0, X=0)))
 hsv <-
 transform(hsv, CENS=ifelse(WKS < 1, 1, 0),
 WKS=abs(WKS))
 head(hsv)
 
 require("survival")
 
 survObj <- Surv(hsv$WKS,
 hsv$CENS==0) ~ 1
 
 km <-
 survfit(survObj, type=c("kaplan-meier"))
 print(km)
 
 paraExp <- survreg(survObj,
 dist="exponential")
 print(paraExp)
 
 
 **********************************************************
 Electronic Mail is not secure, may not be read
 every day, and should not be used for urgent or sensitive
 issues
 
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Survival Analysis with an Historical Control

Paul Miller
Hi Chris,

Thanks for your reply. Certainly no need to apologize for a delayed response. Appreciate your taking the time to answer my question.

My concern was about the value of "16". My understanding is it is based on one expert's opinion. I wondered if this is in keeping with whatever the standard practice might be. I feel uncomfortable with the idea of using this as the basis but perhaps I am alone in feeling this way. My preference would be to see citations from previous studies to justify the number. A meta analysis of results from previous studies might be even better.  

Your code is very interesting. Especially the parametric part. Nice to get an actual p-value for the test.

Thanks,

Paul

--------------------------------------------
On Mon, 7/21/14, Andrews, Chris <[hidden email]> wrote:

 Subject: RE: [R] Survival Analysis with an Historical Control

 Cc: "[hidden email]" <[hidden email]>
 Received: Monday, July 21, 2014, 10:33 AM

 Hi Paul,
 Sorry for the delayed reply.  I was away last
 week. I'm not clear what you want confirmed about your
 approach.

 (a)
 "20.57"- computing the rejection region of the
 analysis.  The formulas implemented at the addresses you
 gave in your original post are from a reputable source -
 Lawless (1982) - at least that is claimed in the help file
 (http://www.swogstat.org/stat/Public/Help/survival1.html). 
 It seems that another person derived 20.57 from some
 combination of input.  I don't see how to back
 calculate what the input was from the information
 provided.  Perhaps elsewhere in your study protocol there
 is discussion of accrual time, et al. that can help you.

 (b) "16" - the value
 of the parameter in the null hypothesis is very important as
 you noted in your response to Terry.  But this is not
 really a statistical question. It may be derived from
 historical data, expert opinion, regulatory mandate, or some
 combination of these and other factors.  Presumably this
 study was undertaken because 16 was an important number to
 somebody.

 (c) performing
 the one-sample survival analysis itself.  This is what I
 did with the data you provided.

 # Non-parametric
 (km <-
 survfit(Surv(WKS, 1-CENS) ~ 1, data=hsv,
 type="kaplan-meier", conf.type="log",
 conf.int=0.9))
 summary(km)

 # Compare to median survival =
 16
 # (Used 90% CI above to get 0.05 one
 sided test here)
 quantile(km,
 prob=0.5)$lower > 16

 #
 Parametric
 (paraExp <- survreg(Surv(WKS,
 1-CENS) ~ 1, data=hsv, dist="exponential"))
 summary(paraExp)

 # Compare to median survival = 16
 # That is, compare to beta0 = log(16/log(2)) =
 3.1391...
 # one sided test
 pnorm(c((coef(paraExp) - log(16/log(2))) /
 sqrt(vcov(paraExp))), lower.tail=FALSE)


 Chris

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.