Binary outcome sample size calculations (created 2000-08-23)
This page is moving to a new website.
Dear Professor Mean, I have to calculate a sample size for a binary outcome variable. The research study is on breast feeding failures within 7 to 10 days of birth for mothers who intended to breast feed. The rate of failure overall is expected to be about 12%. What sample size do I need? -- Baffled Bob
Dear Baffled,
Breast feeding failure is an example of a binary outcome measure. There are only two possible values: the mother is successfully breast feeding at 7 to 10 days, or the mother is not successfully breast feeding at 7 to 10 days. Other examples of binary outcomes would be:
- the patient died (survived) within the first year of study,
- the patient experienced (did not experience) a specific side effect,
- the patient had a positive (negative) result on a diagnostic test.
The sample size you need when you outcome is binary is different than when your outcome is continuous. For a continuous outcome, you need to specify the variability of your outcome measure and how much of a change you would consider clinically relevant. For a binary outcome, you still need to specify the clinically relevant change. But you don't need a measure of variability. What you need instead is an estimate in your control group of the probability for one level of your binary outcome. You might also need to specify the distribution of your explanatory (independent) variable.
Example
One of the factors that might influence breast feeding failure is whether the delivery was a vaginal birth or a C-section. Let's assume that roughly 20% of the mothers in the sample had a C-section. Expressing it in a different way, the ratio of vaginal births to C-sections is 4 to 1.
Let's also assume that the rate of breast feeding failure is 15% in the C-section group and 30% in the vaginal birth group. You hypothesize that C-section babies fare better, because the mother stays in the hospital longer. The extra time in the hospital allows greater interaction with lactation consultants.
You wish to use a two sided test at an alpha level of 0.05. You also want the power to be at least 0.80. Under these conditions, you would need a sample size of 435 mothers.
[Show some of the formulas and calculations.]
Summary
Baffled Bob wants to know how to calculate a sample size when his outcome variable is binary (has only two possible values). Professor Mean explains that you need to specify the probability of an outcome at two different values of your predictor or independent variable.
Further reading