Power calculation for survival analysis

Power calculation for survival analysis

Duke
useR's,
I am trying to do a power calculation for a survival analysis using a logrank test and I need some help properly doing this in R.  Here is the information that I know:
- I have 2 groups, namely HG and LG
- Retrospective analysis with subjects gathered from archival data over 20 years. No new recruitment of subjects and no estimated time to target accrual and accrual rate.
- Survival measured in both groups at 1 year, 3 years, 5 years.
- Assume 50% survival for LG and 30% survival for HG at 5 years.
- Assume a 6 month difference in overall survival to be statistically significant.
- Total sample size is ~ N=500 with 15% of subjects comprising the LG group; 85% make up the HG group.

The main hypothesis is that HG group has shorter overall survival than LG group.

Can someone please help me out with how to properly calculate the power for such a situation using R? This is new to me.

Thanks,
D  

Re: Power calculation for survival analysis

Marc Schwartz-3
On Sep 21, 2011, at 8:54 AM, Duke wrote:

> useR's,
> I am trying to do a power calculation for a survival analysis using a
> logrank test and I need some help properly doing this in R.  Here is the
> information that I know:
> - I have 2 groups, namely HG and LG
> - Retrospective analysis with subjects gathered from archival data over 20
> years. No new recruitment of subjects and no estimated time to target
> accrual and accrual rate.
> - Survival measured in both groups at 1 year, 3 years, 5 years.
> - Assume 50% survival for LG and 30% survival for HG at 5 years.
> - Assume a 6 month difference in overall survival to be statistically
> significant.
> - Total sample size is ~ N=500 with 15% of subjects comprising the LG group;
> 85% make up the HG group.
>
> The main hypothesis is that HG group has shorter overall survival than LG
> group.
>
> Can someone please help me out with how to properly calculate the power for
> such a situation using R? This is new to me.
>
> Thanks,
> D  



Short answer, look at the cpower() function in Frank's Hmisc package on CRAN.
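
As a rough sketch of what such a call might look like with the numbers from the original post: HG is treated as the "control" arm, and LG's lower mortality is expressed as a percent reduction. The accrual = 15 and tmin = 5 values are assumptions for illustration (a 20-year window with 5 years of minimum follow-up), and cpower() itself assumes exponential survival and uniform accrual. Please check ?cpower before relying on the argument names used here.

```r
# Sketch only: every input below is an assumption taken from the thread,
# not a recommendation.
mort_hg <- 1 - 0.30   # assumed 5-year mortality in HG (treated as "control")
mort_lg <- 1 - 0.50   # assumed 5-year mortality in LG
r_pct   <- 100 * (mort_hg - mort_lg) / mort_hg  # % reduction in mortality, LG vs HG

n_hg <- round(0.85 * 500)  # 85% of N = 500
n_lg <- 500 - n_hg         # remaining 15%

if (requireNamespace("Hmisc", quietly = TRUE)) {
  # accrual and tmin are illustrative guesses, not known study values
  pw <- Hmisc::cpower(tref = 5, mc = mort_hg, r = r_pct,
                      accrual = 15, tmin = 5,
                      nc = n_hg, ni = n_lg, pr = FALSE)
  print(pw)
} else {
  message("Install Hmisc from CRAN to run cpower()")
}
```

The unequal group sizes go in through nc and ni rather than n, since n assumes an even split.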

Longer answer:

Have you already performed the data collection and analysis? If so, then performing a post hoc power calculation is highly problematic. Do a Google search on "post hoc power" and you will find a myriad of resources/citations.

Given the sizable differences in the two samples and that this is a retrospective analysis, you are almost certainly going to have selection bias issues to deal with in comparing the two groups, since presumably they were not prospectively randomized to group, even with the ratio indicated.

Is the "HG" group High Grade Lymphoma and the LG group Low Grade Lymphoma? That would help to explain some of the issues here, since you have two groups with differing diagnoses, differing baseline characteristics and known material differences in prognosis.

With a retrospective analysis over this time frame, loss to follow up (LTFU) is likely to be another issue, impacting your available data over time, especially if there is a bias in LTFU between the groups. LTFU is hard enough to manage in a prospective study.

Using your numbers, you also have the potential for temporal issues impacting your comparison. If you are looking out to 5 years and the data were collected over a 20 year time frame, that suggests a possible 15 year difference between your first patient Time 0 and your last patient Time 0. What changes in patient and/or treatment profiles occurred over time that might impact your findings? Were the two groups treated concurrently or is there a stagger of some time window? Are the patients a consecutive series in each group or is there other selection bias involved as to why one patient is in the study and another is not?

If you are not comfortable with these issues, you have a lot of resources at Duke (e.g., DCRI) with some very experienced folks there.

HTH,

Marc Schwartz

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Power calculation for survival analysis

Duke
Thanks for your response, Marc.  HG and LG are high-grade/low-grade tumors.  The data has not been collected yet, but will be soon.  It's all archived data that will be pulled from computer records.  The IRB wants some mention of power or sample size, but doing it for this scenario has been a bit of a head scratcher for me.
If it's not really feasible to do a power analysis for this scenario, I can work to explain why to the IRB.

D

Re: Power calculation for survival analysis

Marc Schwartz-3

On Sep 21, 2011, at 12:37 PM, Duke wrote:

> Thanks for your response, Marc.  HG and LG are high-grade/low-grade tumors.
> The data has not been collected yet, but will be soon.  It's all archived
> data that will be pulled from computer records.  The IRB wants some mention
> of power or sample size, but doing it for this scenario has been a bit of a
> head scratcher for me.
> If it's not really feasible to do a power analysis for this scenario, I can
> work to explain why to the IRB.
>
> D


Hi Derek,

My guess is that the IRB wants to have some CYA in terms of the justification for the study. In a design such as this, safety is not the typical concern, since the patients have already been treated and nothing that you are going to do will affect that. More than likely, there may be privacy (e.g. HIPAA) and ethical issues, pertaining to your accessing the medical records of the patients and having a reasonable level of assurance that you will be able to offer some scientific value at the end of the day as a consequence of that access.

I don't know the particulars of your IRB, so it may be of value to approach others at Duke who have experience in dealing with them in the setting of a retrospective chart review. You may be able to get a sense for what they are open to in terms of justification and where they may or may not be amenable to a discussion of the pros/cons of this particular approach.

It is not uncommon, in my experience, to simply indicate that n = 500 is a "convenience sample", based upon some assessment of time/budget limitations and some attempt to assess the number of patients with some common set of characteristics that are likely to be available within a reasonable time frame. In that setting, power as a discrete quantity is not quoted and you don't have an explicit hypothesis to be tested. You "get what you get" and within the limitations of the study design, can offer some insight into the differences in the two groups. I have seen the same approach even with prospective, non-randomized designs.

All of that being said, you can use Frank's cpower() function in Hmisc, if they put a gun to your head. It would not be overly difficult to do that; you just need to be aware of your assumptions and how they can impact the resultant power calculation. Using the function itself is not overly complex.
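
To give a feel for how much the assumptions matter, here is a minimal base R sketch using Schoenfeld's approximation for the logrank test rather than cpower(). It assumes proportional hazards, a two-sided alpha of 0.05, the 75/425 split from the original post, and (a strong simplification) that every subject has exactly 5 years of follow-up. The function names are made up for illustration.

```r
# Schoenfeld approximation to the power of a two-sided logrank test.
# d = total number of deaths, p = fraction of subjects in one group,
# hr = hazard ratio under proportional hazards.
logrank_power <- function(d, p, hr, alpha = 0.05) {
  pnorm(sqrt(d * p * (1 - p)) * abs(log(hr)) - qnorm(1 - alpha / 2))
}

# Hazard ratio (HG vs LG) implied by survival at a common time point,
# assuming proportional hazards
hr_from_surv <- function(s_hg, s_lg) log(s_hg) / log(s_lg)

n_lg <- 75
n_hg <- 425

# Vary the assumed 5-year survival in HG, holding LG at 50%
s_hg <- c(0.30, 0.35, 0.40, 0.45)
s_lg <- 0.50
hr   <- hr_from_surv(s_hg, s_lg)

# Expected deaths if every subject had exactly 5 years of follow-up (assumption)
d <- n_lg * (1 - s_lg) + n_hg * (1 - s_hg)

power <- logrank_power(d, p = n_lg / (n_lg + n_hg), hr = hr)
print(data.frame(s_hg, hr = round(hr, 2), deaths = d, power = round(power, 2)))
```

Under these assumptions the power falls off quickly as the assumed HG survival moves toward the LG value, which is the kind of sensitivity worth showing alongside any single quoted power figure.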

HTH,

Marc
