P.Mean: Iowa talk on accrual (created 2012-04-03).


I will be giving a talk "Slipped deadlines and sample size shortfalls in clinical trials: a proposed remedy using a Bayesian model with an informative prior distribution." at the University of Iowa. Here is the handout for my talk.

Abstract: "Background: The most common reason why clinical trials fail is that they fall well below their goals for patient accrual. Researchers will frequently overpromise and underdeliver on the number of patients that they can recruit during the proposed time frame. The result is studies that take far longer than planned and/or that end with fewer patients than planned. This raises serious economic and ethical issues. Our research efforts have focused on (1) getting reliable data on the scope and magnitude of problems with slow patient accrual in clinical trials, and (2) developing a Bayesian model for accrual that will encourage careful planning of accrual rates as well as allow regular monitoring of accrual patterns during the conduct of the clinical trial. Methods: A random sample of 130 prospective studies approved by the Children's Mercy Hospital (CMH) IRB from 2001 through 2005 were retrospectively reviewed for the proposed and actual accrual rates. At the same time, a Bayesian model for accrual was developed and applied to a clinical trial at Kansas University Medical Center to produce monthly reports projecting estimated final sample sizes with uncertainty limits given the initial projection and currently available enrolment data. Results: 117 (90%) of the studies submitted to the IRB did not specify a start date, a completion date, or both, making it impossible to assess the accrual rate. Of the remaining studies, two failed to list actual start or end dates. Of the remaining 11 studies, 8 took more time than proposed and the average increase in duration in these 8 studies was 100%. Among the 109 studies that included both a target and an actual sample size, 59 (54%) fell short of the proposed sample size. The average shortfall across these 59 studies was 55%. The informative prior used in the Bayesian model was reasonable and produced early estimates of total sample size that were an accurate reflection of the end result. Conclusions: A large number of studies failed to meet the specified sample sizes and the average shortfall among these studies was considerable. The Bayesian model for accrual produced useful reports for a particular study and provided reassurance to the researchers that their accrual rates were on target. The Bayesian model, however, also has the capability of correcting an inaccurate prior distribution as the accumulated accrual patterns provide contradictory results. Future research should focus on collaborations with organizations that conduct large numbers of clinical trials to get more data on existing problems with slipped deadlines and sample size shortfalls and to test the Bayesian accrual model on a wide range of clinical trials."

For the past 20 years, I've been very interested in Evidence-Based Medicine (EBM). The foundation of EBM is critical appraisal of research studies, and successful appraisal requires an understanding of the credibility of published research. The development of critical appraisal tools has been done partly through expert opinion, but it has also been informed by empirical research about the research process. I've been fascinated by this "meta-research," or research about the research process. A good example of meta-research is the following study:

Hróbjartsson A, Gøtzsche PC: Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. N Engl J Med 2001, 344:1594-1602. Available at http://www.nejm.org/doi/full/10.1056/NEJM200105243442106

This study looked at 130 trials that included a placebo arm and a no treatment arm. In general, there was no difference between the two, which indicates, among other things, that alternative medicine approaches that try to harness the placebo effect may be pinning their hopes on a fading star.

By the way, I drew a cartoon about placebos several years ago. I'm sharing it here to keep things from getting too boring.

Cartoon about placebos

Anyway, in 2006 I decided to do my own meta-research. It was my observation that researchers would commonly promise to get a hundred patients into their clinical trial within a year. After two years, they'd have only a dozen patients, and they'd come to see me to ask my permission to shut down the trial early.

I already had a model for monitoring accrual in mind, so I thought it would be interesting to get some hard numbers on the tendency for studies to have slow accrual. So I went to the IRB of Children's Mercy Hospital and asked for a random sample of 130 studies that ended in the years 2001-2005. The studies had to involve humans, and they had to be prospective. I got the initial protocol, any amendments, all continuing review reports, and the final report. With the help of my administrative assistant, Judy Champion, and a doctor at Children's Mercy, Vidya Sharma, we got information on the proposed sample size, the actual sample size, the proposed and actual start dates, and the proposed and actual end dates. If the study was part of a multi-center trial, we calculated the planned and actual sample sizes for CMH only.

Studies of the Pediatric Oncology Group were also excluded. This group conducts a large number of multi-center trials where they expect to see only one or two patients at each center. We also collected information about whether the study had an external sponsor, whether it had a study coordinator, whether consent was required, and whether randomization was used.

The first surprise was that only 11 studies had enough information to assess whether the study started or ended on time. Researchers just did not mention this in the initial protocol and they did not comment on it during continuing review. This was partly the fault of the IRB, which did not demand this information. I viewed it as the IRB effectively signing a blank check, saying, in effect, that they didn't care if the study took two years or twenty years. Quite honestly, this issue is not even on the radar for most IRBs.

Table 1. Studies with sufficient data to compare actual and proposed accrual time frames.

  130 studies

 - 90 missing both the proposed start and proposed end date

 -  2 missing the proposed start date

 - 25 missing the proposed end date.

 = 13 with a proposed accrual time frame

 -  1 missing the actual start date

 -  1 missing the actual end date

 = 11 evaluable studies

For the record, the 11 studies had a mean planned duration of 18 months (range 4.6 to 45 months), and 8 (73%) took longer than the planned duration. The average relative increase in the duration of these 8 studies was 100% (range 6% to 286%).

There was sufficient information on most of the studies (n=109, to be precise) to look at sample size issues. The mean planned sample size was 49.5 (range 1 to 830). Just over half of the studies (54%) had an actual sample size that was less than the planned sample size. The average shortfall in these studies was 55%. There were 8 studies where no patients were recruited, including two studies that had planned sample sizes of 25 and 50. To be fair, about a quarter of the studies (24%) exceeded the proposed sample size, but even when you factor these in, the average sample size shortfall was still 18%.

Table 2. Studies with sufficient data to compare proposed and actual sample sizes.

   130 studies

 -   2 missing proposed sample size

 -   2 with ambiguous sample sizes ("up to 40" and "up to 5")

 = 126 studies with an unambiguous proposed sample size

 -  17 missing the actual sample size

 = 109 evaluable studies

That's not too extreme a discrepancy, but it is large enough to be of concern. You typically have to get prior IRB approval before you can increase the sample size, but you can curtail the sample size without hearing a peep from them. A smaller sample size has the possibility of upsetting the risk/benefit balance so it should be reviewed, but it is unclear what action an IRB could take to prevent a researcher from stopping a study with fewer patients than originally promised.

These results are reported in greater detail at my old website:

http://www.childrensmercy.org/stats/08/SlippedDeadlines.aspx

Byron Gajewski and I have developed a simple Bayesian model for patient accrual. Suppose that you are proposing a research study that will recruit a total of n patients by time T. The trial has been running for a while, and you have so far recruited a total of m patients. Let

t0=0

represent the start of the trial and let

t1<=t2<=...<=tm

represent the times that the m patients have entered your trial. You can model either the number of patients recruited on a given day as a Poisson distribution or the waiting time between successive patients as an exponential distribution. There are advantages and disadvantages to either approach, but for today, let's consider an exponential model of accrual. Define the m observed waiting times as

w(i) = t(i) - t(i-1)

Our job is to predict the remaining n-m unknown waiting times

W(m+1), W(m+2), ..., W(n)

which have an exponential distribution

f(w | theta) = (1 / theta) exp(-w / theta)

where the theta parameter represents the average waiting time. Place an informative prior distribution on theta using an inverse gamma prior with parameters k and V.

Inverse gamma distribution
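For reference, a standard shape-scale parameterization of the inverse gamma density (the one assumed in the sketches that follow) is

p(\theta) = \frac{V^{k}}{\Gamma(k)} \, \theta^{-(k+1)} \, e^{-V/\theta}, \qquad \theta > 0,

with mean V/(k-1) for k > 1. Because the exponential likelihood depends on the data only through the number of observed patients and their total waiting time, this prior is conjugate: observing m waiting times that sum to tm updates IG(k, V) to IG(k + m, V + tm).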

It is fairly standard to show that the posterior distribution of theta is also inverse gamma. Eliciting a prior distribution is tricky, but my colleague suggested a fairly simple approach to start the process. You ask two questions.

1. How long do you think this trial will take?

2. On a scale of 1-10, how certain are you of this result?

Take the answer to the second question and divide it by 10 to get P. Then a starting choice for the inverse gamma prior distribution would be k=nP and V=TP. This prior can then be used to forecast the expected duration of the trial along with various percentiles. If you look at these percentiles and think they are too narrow, you can decrease P. If you think they are too wide, you can increase P.
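Here is a quick Python sketch of that recipe (for illustration only, not the software we actually use), with the planning numbers from the trial described below: a goal of 350 patients in 1095 days and a certainty score of 5. The prior predictive simulation produces the percentiles you would inspect when deciding whether to adjust P.

import numpy as np
from scipy import stats

n = 350          # planned number of patients
T = 1095.0       # answer to question 1: projected accrual period, in days
certainty = 5    # answer to question 2: certainty on a 1-10 scale

P = certainty / 10.0    # weight given to the prior
k, V = n * P, T * P     # starting inverse gamma prior, IG(k, V)

# Prior predictive check: what total accrual time does this prior imply?
rng = np.random.default_rng(1)
theta = stats.invgamma.rvs(a=k, scale=V, size=10000, random_state=rng)
duration = rng.gamma(shape=n, scale=theta)   # sum of n exponential waiting times

print(np.percentile(duration, [2.5, 50, 97.5]))
# If these limits look too narrow, decrease P; if too wide, increase P.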

The posterior distribution of theta is inverse gamma with parameters k=nP+m and V=TP+tm. This posterior distribution has a mean of

Posterior mean

This posterior mean is a weighted average of the prior mean waiting time and the observed mean waiting time.
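Assuming the standard parameterization above, the posterior mean can be written as

E[\theta \mid \text{data}] = \frac{TP + t_m}{nP + m - 1} = \frac{(nP - 1)\,\hat{\theta}_{prior} + m\,(t_m / m)}{(nP - 1) + m}, \qquad \hat{\theta}_{prior} = \frac{TP}{nP - 1},

so the prior counts for roughly nP "pseudo-patients" and the data count for the m patients actually observed.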

Here's a graph illustrating the projected completion time. The gray region is a 95% credible region, and the white line represents the median time. The black line represents the observed accrual. Notice that the white line is bent to the left relative to the data. The prior was optimistic relative to the accrual data observed so far and it still pulls a fair amount of weight relative to the data.

Graph of projected study completion

This is data from an actual clinical trial. The researchers believed that it would take 3 years (1095 days) to recruit 350 patients (for an average waiting time of 3.1 days between patients). When asked how confident they were on a scale of 1 to 10, they replied with a 5. The prior distribution would be IG(k=175,V=547.5). After tm=239 days, the researchers had recruited 41 patients (for an average waiting time of 5.8 days per patient). The posterior distribution for the waiting time would be IG(k=175+41=216, V=547.5+239=786.5). The mean of the posterior distribution is 3.7, which is a weighted average of the data mean (5.8) and the prior mean (3.1).

You get a prediction for the duration of the trial by drawing a random value of theta from the posterior distribution and then drawing 309 exponential random variables with that theta. Add the 309 exponentials to the 239 days that you've already waited to get the trial duration. Repeat this about a thousand times. This estimate accounts for both the uncertainty associated with exponential waiting times and the uncertainty associated with imperfect knowledge of the mean of this exponential distribution.
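Here is a short Python sketch of that simulation (again, my own illustration rather than production code), using the numbers from this trial:

import numpy as np
from scipy import stats

n, T, P = 350, 1095.0, 0.5    # planned size, planned duration, prior weight
m, t_m = 41, 239.0            # patients recruited so far and days elapsed

k_post = n * P + m            # 216
V_post = T * P + t_m          # 786.5

rng = np.random.default_rng(1)
theta = stats.invgamma.rvs(a=k_post, scale=V_post, size=10000, random_state=rng)
remaining = rng.gamma(shape=n - m, scale=theta)   # waits for the other 309 patients
total_duration = t_m + remaining

print(np.percentile(total_duration, [2.5, 50, 97.5]))   # predicted trial duration, days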

There's a closed form solution, which is worth examining. Multiply the posterior distribution for theta by the distribution of the waiting time for the remaining n-m patients and integrate out theta.

Closed form solution, step 1

This integral looks tough, but you can handle it. First pull out anything not relating to theta. At the same time, regroup the terms left inside the integral.

Closed form solution, step 2

What remains inside sort of looks like an inverse gamma distribution. Put in the appropriate normalizing constants.

Closed form solution, part 3

You've done it! With the right normalizing constants, the integral equals 1. What is left outside the integral is the distribution function for the waiting time for the remaining n-m patients. Now plug in k=nP+m and V=TP+tm to get

Closed form solution, final step

This sort of looks like a twisted form of the beta distribution.

If you use a change of variable

Change of variable

and simplify the notation with

Simplifying notation

then you get the following density function

Inverse beta distribution

This is known as the inverse beta distribution or the beta-prime distribution. It is actually quite close to an F distribution as well.

If you go to Wikipedia, you will find a couple of interesting properties involving the beta-prime distribution.

Properties of the beta-prime distribution

This distribution represents the splitting point between two gamma distributions. One gamma distribution represents the remaining n-m patients. The second gamma distribution represents the first m patients plus the nP "pseudo-patients" created through the prior distribution. With the inverse beta distribution, you can now place exact probability bounds on the total duration of the clinical trial

Probability limits for trial duration

where B.025 and B.975 are percentiles from the beta (not beta-prime) distribution with parameters alpha=n-m and beta=nP+m. An equivalent confidence interval would be

Alternate form of confidence interval

where F.025 and F.975 are percentiles of the F distribution with 2(n-m) and 2(nP+m) degrees of freedom.
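As a rough check, here is a Python sketch that computes these limits both ways; the two forms agree, and both should be close to the simulation above. The sketch assumes the limits describe the total duration, that is, the tm = 239 days already elapsed plus the remaining waiting time.

import numpy as np
from scipy import stats

n, T, P, m, t_m = 350, 1095.0, 0.5, 41, 239.0
a, b = n - m, n * P + m       # alpha = 309, beta = 216
V_post = T * P + t_m          # 786.5

# Beta form: remaining waiting time = V_post * B / (1 - B), with B ~ Beta(a, b)
B = stats.beta.ppf([0.025, 0.975], a, b)
limits_beta = t_m + V_post * B / (1 - B)

# Equivalent F form with 2(n-m) and 2(nP+m) degrees of freedom
F = stats.f.ppf([0.025, 0.975], 2 * a, 2 * b)
limits_f = t_m + V_post * (a / b) * F

print(limits_beta)
print(limits_f)     # same limits either way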

There are some extensions to the accrual model that I want to work on. The accrual model shown above is a homogeneous model. There are several settings for clinical trials that can produce a more complex pattern of accrual. I call these heterogeneous models.

Some clinical trials have a "warm-up" period at the beginning when accrual is slower.

Graph of nonconstant accrual
Figure 3. Graph of nonconstant accrual rate.

There are several models that are worth examining here, such as an "elbow" model that allows a continuous transition in accrual (Figure 5).

An elbow function for modelling slow early accrual rates
Figure 5. An elbow function for modelling slow early accrual rates.

In the elbow model, the accrual rate starts at 0.11 and rises linearly over 90 days. After 90 days, the accrual rate levels off at 0.33. This model requires the specification of priors on three different parameters: the initial accrual rate, the final accrual rate, and the location of the elbow.
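A piecewise-linear rate with those three parameters can be simulated with a standard thinning algorithm. Here is a Python sketch; the functional form is my reading of the description above, not necessarily the exact model.

import numpy as np

r0, r1, elbow = 0.11, 0.33, 90.0   # starting rate, plateau rate, elbow (days)

def rate(t):
    """Accrual rate: rises linearly from r0 to r1 over [0, elbow], then stays flat."""
    return r0 + (r1 - r0) * min(t, elbow) / elbow

def simulate_arrivals(t_max, rng):
    """Thinning: propose arrivals at the maximum rate r1, keep each with prob rate/r1."""
    t, arrivals = 0.0, []
    while True:
        t += rng.exponential(1.0 / r1)
        if t > t_max:
            return np.array(arrivals)
        if rng.random() < rate(t) / r1:
            arrivals.append(t)

rng = np.random.default_rng(1)
times = simulate_arrivals(365, rng)
print(len(times), "patients accrued in the first year")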

Another extension for accrual is a hierarchical model. Different centers have different accrual rates and their staggered entry into the trial will also cause heterogeneity. Figure 7 shows a simulated example of accrual from three centers.

Graph of accrual in a multicenter trial
Figure 7. Accrual from a multicenter trial.
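Here is a small Python sketch of how accrual like that in Figure 7 might be simulated; the three entry dates and center-specific waiting times are made-up values for illustration.

import numpy as np

rng = np.random.default_rng(1)

centers = [
    {"entry_day": 0,   "mean_wait": 4.0},   # days between patients at this center
    {"entry_day": 60,  "mean_wait": 7.0},
    {"entry_day": 150, "mean_wait": 3.0},
]

t_max = 365
arrival_times = []
for center in centers:
    t = center["entry_day"]
    while True:
        t += rng.exponential(center["mean_wait"])
        if t > t_max:
            break
        arrival_times.append(t)

arrival_times.sort()
cumulative = np.arange(1, len(arrival_times) + 1)   # pooled accrual curve
print(len(arrival_times), "patients accrued across the three centers in one year")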

We believe that a hierarchical model can fit data well from a multi-center trial. An offset term would be needed to allow the size of a center to enter into the equation. A random center effect in the Bayesian model will allow each center to over-perform or under-perform relative to its size. Entry times for individual centers could be prespecified, or these times could be additional random components of the model.

One big advantage of a hierarchical approach is that the accrual pattern of a late-entering center would benefit not only from the prior distribution but also from accumulated accrual information from the centers that are already in the trial. The hierarchical model allows one center to borrow information on accrual patterns from other centers, with the amount of information being borrowed limited by the degree of heterogeneity among centers. This will allow IRBs and DSMBs to receive more precise and accurate predictions of study completion time, enhancing their ability to make critical decisions about modifying or terminating a study.

Another advantage is that a hierarchical model could accommodate mid-study additions of new centers intended to compensate for slow accrual. A DSMB, for example, could examine the projected completion date midway through a twenty-center study, but could also get a simulated answer to whether adding four additional centers at this late date would have any impact on the projected completion time.

Another source of heterogeneity may be caused by a variation in the waiting time distribution. The exponential distribution has a "memoryless" property. This means that if you've waited five days after a given patient has arrived and no one new has volunteered yet, the probability that someone will volunteer tomorrow is no different than it was on the day after the last patient had volunteered.

This assumption may not be too unreasonable, but there are some alternatives. The gamma distribution, for example, with a shape parameter less than 1 represents an accrual pattern of "feast or famine". In this setting, patients arrive in very rapid succession at times, with long stretches of no activity in between. Such a pattern is illustrated in Figure 8.

Graph of first alternative to exponential accrual
Figure 8. Accrual using a gamma distribution with shape parameter = 0.1.

It is possible that the opposite might occur and that patients appear at more or less regular intervals. This could be modelled by a gamma distribution with a shape parameter greater than 1. It might occur if there is a queue of patients waiting at a gate of some kind that retards progress to some extent. In this setting, patients appear more or less like the drips of a faucet, as is illustrated in Figure 9.

Graph of second alternative to exponential accrual
Figure 9. Accrual using a gamma distribution with shape parameter = 10.
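A quick Python sketch shows how the shape parameter controls this regularity; all three sets of waiting times below have the same mean but very different variability.

import numpy as np

rng = np.random.default_rng(1)
mean_wait, n_waits = 3.0, 200

exponential  = rng.gamma(shape=1.0,  scale=mean_wait,        size=n_waits)  # memoryless
feast_famine = rng.gamma(shape=0.1,  scale=mean_wait / 0.1,  size=n_waits)  # Figure 8
faucet_drip  = rng.gamma(shape=10.0, scale=mean_wait / 10.0, size=n_waits)  # Figure 9

for label, waits in [("shape = 1 (exponential)", exponential),
                     ("shape = 0.1", feast_famine),
                     ("shape = 10", faucet_drip)]:
    print(label, "coefficient of variation:", waits.std() / waits.mean())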

There are, of course, other alternative distributions for waiting times besides the gamma distribution. You can also induce similar patterns of irregularity or excess regularity through a correlational model. The hallmark of most of these alternatives, we suspect, will be how they control the irregularity of patient accrual.

Most clinical trials have one or more screening steps, and the loss of patients during screening will contribute to heterogeneity. Screening could involve a decision by the researcher that a subject who volunteers is ineligible for the trial, or it could involve a decision by the volunteer not to participate after the details of the study are explained to them.

An example of losses due to screening appears in Figure 10. The vertical lines represent the appearance of research volunteers; the red lines represent volunteers who refuse to sign the consent form, who are deemed ineligible by the recruiting physician, or who otherwise do not make it into the trial. Thus the cumulative number of patients jumps up at each black line but stays flat at each red line.

Graph of accrual with extra accept/reject step
Figure 10. Accrual with an extra screening step.

Losses due to screening are easily modelled using a beta-binomial model and this can be added onto the exponential waiting time. The actual waiting time for patients who make it into the trial will not be exponential, but rather a sum of a random number of exponential random variables reflecting the waiting time extension when a subject fails a screen.
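Here is a Python sketch of that idea; the screening pass rate and its beta prior are made-up values used only for illustration.

import numpy as np

rng = np.random.default_rng(1)

mean_wait = 2.0      # days between candidate volunteers
a, b = 8, 2          # assumed beta prior for the probability of passing screening
p = rng.beta(a, b)   # one draw of the pass probability

t, enrolled_times = 0.0, []
while len(enrolled_times) < 50:        # run until 50 patients are enrolled
    t += rng.exponential(mean_wait)    # next candidate volunteer appears
    if rng.random() < p:               # candidate passes the screening step
        enrolled_times.append(t)

waits = np.diff([0.0] + enrolled_times)   # waits between *enrolled* patients
print("mean wait between enrolled patients:", round(waits.mean(), 2))
# Each wait is a geometric sum of exponentials, so it is no longer exponential.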

This page was written by Steve Simon and is licensed under the Creative Commons Attribution 3.0 United States License. Need more information? I have a page with general help resources. You can also browse for pages similar to this one at Accrual Problems.