Binary outcome sample size calculations (created 2000-08-23)

Dear Professor Mean, I have to calculate a sample size for a binary outcome variable. The research study is on breast feeding failures within 7 to 10 days of birth for mothers who intended to breast feed. The rate of failure overall is expected to be about 12%. What sample size do I need? -- Baffled Bob

Dear Baffled,

Breast feeding failure is an example of a binary outcome measure. There are only two possible values: the mother is successfully breast feeding at 7 to 10 days, or the mother is not successfully breast feeding at 7 to 10 days. Other examples of binary outcomes would be:

The sample size you need when your outcome is binary is different from the sample size you need when your outcome is continuous. For a continuous outcome, you need to specify the variability of your outcome measure and how much of a change you would consider clinically relevant. For a binary outcome, you still need to specify the clinically relevant change. But you don't need a separate measure of variability, because the variability of a binary outcome is determined by its probability. What you need instead is an estimate of the probability of one level of your binary outcome in your control group. You might also need to specify the distribution of your explanatory (independent) variable.
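
To see why the probability alone is enough: an outcome coded 0 or 1 that equals 1 with probability p has variance p(1 - p). Here is a minimal sketch in Python, using the 12% failure rate from the question. The function name is made up for illustration.

```python
def binary_variance(p):
    """Variance of a 0/1 outcome that equals 1 with probability p."""
    return p * (1 - p)


# The 12% failure rate fixes the variance; no separate estimate is needed.
print(round(binary_variance(0.12), 4))  # 0.1056
```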

Example

One of the factors that might influence breast feeding failure is whether the delivery was a vaginal birth or a C-section. Let's assume that roughly 20% of the mothers in the sample will have had a C-section. Put another way, the ratio of vaginal births to C-sections is 4 to 1.

Let's also assume that the rate of breast feeding failure is 15% in the C-section group and 30% in the vaginal birth group. You hypothesize that C-section babies fare better, because the mother stays in the hospital longer. The extra time in the hospital allows greater interaction with lactation consultants.

You wish to use a two-sided test at an alpha level of 0.05. You also want the power to be at least 0.80. Under these conditions, you would need a sample size of 435 mothers.

[Show some of the formulas and calculations.]
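
The page does not say which formula produced 435. As a sketch only, one standard approach happens to reproduce it: the normal-approximation formula for comparing two proportions with unequal group sizes, followed by the continuity correction of Fleiss, Tytun, and Ury (1980). The function and variable names below are hypothetical, and treating this as the method behind the number above is an assumption.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_sample_size(p_small, p_large, ratio, alpha=0.05, power=0.80):
    """Sample sizes for a two-sided test comparing two proportions.

    p_small: outcome probability in the smaller group (C-section here)
    p_large: outcome probability in the larger group (vaginal birth here)
    ratio:   size of the larger group divided by size of the smaller group
    Returns (n_small, n_large), both with the continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    delta = abs(p_large - p_small)

    # Pooled probability, weighted by group size.
    p_bar = (p_small + ratio * p_large) / (1 + ratio)

    # Uncorrected size of the smaller group (left unrounded for now).
    numerator = (
        z_alpha * sqrt((1 + 1 / ratio) * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_small * (1 - p_small) + p_large * (1 - p_large) / ratio)
    ) ** 2
    n = numerator / delta ** 2

    # Continuity correction, applied to the unrounded value (assumed method).
    n_cc = n / 4 * (1 + sqrt(1 + 2 * (ratio + 1) / (n * ratio * delta))) ** 2

    n_small = ceil(n_cc)
    n_large = ratio * n_small
    return n_small, n_large


n_csection, n_vaginal = two_proportion_sample_size(0.15, 0.30, ratio=4)
print(n_csection, n_vaginal, n_csection + n_vaginal)  # 87 348 435
```

Without the continuity correction, the same inputs give 79 and 316 mothers, for a total of 395, so the choice of formula matters.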

Summary

Baffled Bob wants to know how to calculate a sample size when his outcome variable is binary (has only two possible values). Professor Mean explains that you need to specify the probability of an outcome at two different values of your predictor or independent variable.

Further reading

  1. Binomial Program to Calculate Power or Sample Size. Brent Hostetler, Southwest Oncology Group Statistical Center. Accessed on 2003-05-08. "Two Arm Binomial is a program to calculate either estimates of sample size or power for differences in proportions. The program allows for unequal sample size allocation between the two groups." www.swogstat.org/Stat/Public/binomial/binomial.htm
  2. One sample binomial. Southwest Oncology Group Statistical Center. Accessed on 2003-05-08. "One Arm Binomial program calculates either estimates of sample size or power for one sample binomial problem. The first button calculates approximate power or sample size and critical values (reject if >= critical value). The second button calculates 'exact' power and alpha for the given null and alternative proportions and sample size. Note, sample size and null and alternative proportions can be changed before using the second button." www.swogstat.org/Stat/Public/one_binomial.htm
  3. Bayesian sample size determination for estimating binomial parameters from data subject to misclassification. Elham Rahme, Lawrence Joseph, Theresa W. Gyorkos. Accessed on 2003-05-08. "We investigate the sample size problem when a binomial parameter is to be estimated, but some degree of misclassification is possible. The problem is especially challenging when the degree to which misclassification occurs is not exactly known." Published November 29, 1999. www.med.mcgill.ca/epidemiology/Joseph/diagsmp.pdf