P.Mean: I can't find a standard deviation for my power calculations (created 2011-04-26).


On 4/26/2011 5:41 AM, Mehwish Hussain wrote:

> The organization from where she will take the data said her that
> they will provide her primary data if she will be able to tell them
> the power of the test for each objective. Then, she came to me to
> calculate the power of the test with each objectives. She only knows
> that sample size was 1314 for the primary data.

One could argue that a power calculation is irrelevant when the sample size is fixed. But the organization is unlikely to buy this.

As I describe in
* http://www.pmean.com/01/power.html
you need three things for a power calculation: a research hypothesis, a standard deviation of your outcome measure, and the minimum clinically important difference.

The research hypothesis is easy in your case. The standard deviation is harder because the papers that you have gotten, as I understand it, do not have a standard deviation for the outcome she is interested in, but rather for other outcome measures.

But it is a very rare outcome measure that has never had anything published about it. Surely some paper somewhere can provide you with that standard deviation. It's not like she's inventing totally new outcome measures that have never been studied by anyone before.

The population that these outcome measures were used on, of course, is probably quite different from the one she is proposing to use. That's always going to be an issue. Try to find a population that is not too radically different, but keep in mind that this will always be an imperfect fit.

As a worst case scenario, you can use a SWAG (look it up). If you know the range of your data, that can give you a rough idea of how big your standard deviation might be. It's impossible for a standard deviation to be 500, for example, if your data lies between 0 and 10.

Here's an example. Your outcome measure is birthweight. The tiniest babies are about 500 grams; anything smaller is not viable. The biggest babies are about 5000 grams, as human females are not big enough to give birth to babies much larger than this. The range is 4500 grams. Divide by 4 or 6 to get an approximate standard deviation. Here it would be 1125 or 750. Now your situation might produce a smaller standard deviation (maybe much much smaller), but it would be really hard to get a larger standard deviation, because physiology places limits on the variability of birthweights. What this means is that your estimate of sample size might be a bit too big if you use a standard deviation of 750 or 1125.
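The arithmetic in the birthweight example can be sketched in a few lines of Python. The function name sd_from_range is made up for illustration, and the limits of 500 and 5000 grams are the rough figures from the paragraph above, not measured values.

```python
def sd_from_range(low, high, divisors=(4, 6)):
    """Return rough standard deviation guesses: (high - low) / divisor."""
    data_range = high - low
    return {divisor: data_range / divisor for divisor in divisors}


# Rough physiological limits on birthweight, in grams (taken from the text).
birthweight_sd = sd_from_range(500, 5000)

for divisor, sd in birthweight_sd.items():
    print(f"range / {divisor} = {sd:.0f} grams")
```

Dividing by 4 gives the larger, more conservative guess; dividing by 6 gives the smaller one.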

Now I rarely use a SWAG, but keep in mind that there are not any serious ethical problems with a sample size that is too big (see below). It's not like you are asking a bunch of patients to undergo a needless medical test or forcing half of your patients to forgo the active medicine for a placebo. The data is already collected. So you can't argue that it is unethical to have too large a sample size here.

The minimum clinically important difference here is actually not too hard. You know that a t-test with 1,314 subjects can detect a pretty small difference (about a tenth of a standard deviation, assuming equal sample sizes in each group). So figure out whether a tenth of a standard deviation is "small enough". It's probably smaller than any difference you would care about clinically, so there is no problem here.
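If you want to check the size of the detectable difference yourself, here is a minimal sketch using a normal approximation to the two-sample t-test. The two-sided alpha of 0.05 and the power of 80% are my assumptions (the question did not state them), and the function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference_in_sd(n_total, fraction_group1=0.5,
                                alpha=0.05, power=0.80):
    """Smallest detectable difference, in standard deviation units.

    Normal approximation to a two-sample t-test with equal variances.
    alpha is two-sided; alpha and power are assumed values.
    """
    n1 = n_total * fraction_group1
    n2 = n_total - n1
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


difference = detectable_difference_in_sd(1314)
print(f"Detectable difference: {difference:.3f} standard deviations")
```

Under these assumptions the answer comes out at roughly 0.15 standard deviations, which is in the same ballpark as the tenth quoted above. A lower power requirement would move it closer to a tenth.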

The only issue that might come up is if you have a binary outcome and the event in question is extremely rare. There is a rule of 50 that says that if you are comparing the probability of an event in two groups, you want to have about 25 to 50 events in each group. You should be safe if all your rates are 5% or more.
* http://www.pmean.com/01/quick.html
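A quick way to apply the rule of 50 is to multiply the group size by the event rate and see whether you clear 25 events. This is a sketch only: the function names are hypothetical, and the 657 per group assumes the 1,314 subjects split evenly.

```python
def expected_events(n_per_group, event_rate):
    """Expected number of events in one group."""
    return n_per_group * event_rate


def meets_rule_of_50(n_per_group, event_rate, minimum_events=25):
    """True if the expected event count reaches the lower end of 25 to 50."""
    return expected_events(n_per_group, event_rate) >= minimum_events


n_per_group = 1314 // 2  # assumes an even split of the 1,314 subjects

for rate in (0.01, 0.05, 0.10):
    events = expected_events(n_per_group, rate)
    verdict = "OK" if meets_rule_of_50(n_per_group, rate) else "too few events"
    print(f"rate {rate:.0%}: about {events:.0f} events per group ({verdict})")
```

A 5% rate gives about 33 events per group, which falls inside the 25 to 50 window. A 1% rate gives fewer than 7, which does not.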

Seriously imbalanced sample sizes in the treatment and control groups (as could happen if you are looking at a rare subgroup) will complicate things. You lose a lot of power when the data is divided more extremely than an 80-20 split.
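To see how much an uneven split costs, compare the variance of a difference in two means, which is proportional to 1/n1 + 1/n2, against the balanced case. The ratio works out to 4p(1-p), where p is the fraction in one group. The sketch below assumes equal variances in the two groups, and the function names are made up for illustration.

```python
def relative_efficiency(fraction_group1):
    """Efficiency relative to a 50-50 split: 4 * p * (1 - p)."""
    return 4 * fraction_group1 * (1 - fraction_group1)


def effective_balanced_n(n_total, fraction_group1):
    """Total size of a balanced study with the same precision."""
    return n_total * relative_efficiency(fraction_group1)


for fraction in (0.5, 0.8, 0.9, 0.95):
    efficiency = relative_efficiency(fraction)
    n_effective = effective_balanced_n(1314, fraction)
    print(f"{fraction:.0%} split: efficiency {efficiency:.2f}, "
          f"like a balanced study of about {n_effective:.0f}")
```

An 80-20 split keeps 64% of the information, but a 90-10 split keeps only 36%, so 1,314 subjects behave like a balanced study of about 473.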

More complicated data analyses, such as ANOVA or regression, are unlikely to be a problem though.

This work is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2011-01-01.