P.Mean: Archive organized by date (created 2011-01-01).

This page lists files created in calendar year 2011. Also look at the archives for 2013, 2012, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, and 1999. You can also browse through an archive of pages organized by topic.

December 2011

59. P.Mean: Unrealistic scenarios for sample size calculations (created 2011-12-20). I'm not a doctor, so when someone presents information to me about the clinically important difference (a crucial component of any sample size justification), I should just accept their judgement. After all, I've never spent a day in a clinic in my life (at least not on the MD side), so who am I to say what's clinically important? Nevertheless, sometimes I'm presented with a scenario where the clinically important difference is so extreme that I have to raise a question. Here's a recent example.

58. P.Mean: Discrepancies in the chisquare test (created 2011-12-16). I was working with two researchers on a project and they got different results for their chisquare tests. See if you can find out what went wrong.

57. P.Mean: Abstracts for teaching about p-values and confidence intervals (created 2011-12-08). I am giving a webinar to a group that is interested in applications of statistics to Alzheimer's disease and schizophrenia. I wanted to show some real world uses of p-values and confidence intervals, and did a few quick searches for open source articles. I also am including the abstracts of several articles that they sent me. The font size on the handout is a bit small, so I am including the abstracts here as well so you can view them with a reasonable font size.

November 2011

56. P.Mean: Justifying the sample size in a qualitative research study (created 2011-11-22). I was asked to help justify the sample size for a qualitative research study. When the goals of a study are qualitative, the sample size justification is also qualitative.

55. P.Mean: Why is a 20% dropout rate bad? (created 2011-11-21). Dear Professor Mean, How can we give an evidence based answer about why 20% loss of follow-up in a randomized trial is too much?

54. P.Mean: Possible webcast topics for a new client (created 2011-11-14). I was asked to develop a list of webcast topics for a client working in the pharmaceutical industry. They are listed roughly in priority order for the first half. I have some handouts already for some of these classes and I have included links to those handouts when possible. These handouts would be updated, of course, and I do want to target specific medical applications for this client, so I might swap out some of the examples used in the small group exercises. Still, they give a rough idea of topic coverage.

53. P.Mean: Looking at another grant opportunity (created 2011-11-07). It must be masochism on my part, but I'm looking at writing yet another grant. This grant would go to the Heartland Institute for Clinical and Translational Research and would be for a pilot study.

52. P.Mean: A bunch of univariate nonparametric tests versus a single parametric model (created 2011-11-03). Dear Professor Mean, I'm working on a study of weight loss during hospitalization. I'm measuring the percent change in the weight loss from admission to discharge and looking at the factors that influence it. I ran some non parametric tests and found a few factors that were associated with the weight loss. When I run a multivariate linear regression model, only one factor is associated with weight loss. The linear regression model assumes normative data, so I am not sure I can do that here. The data appears to be normally distributed but fails the test of normality. So, should I just report the non parametric tests? Is there a multivariable model for non normally distributed data?

October 2011

50. P.Mean: Meta-analysis with non-comparable procedures (created 2011-10-31). Dear Professor Mean, In publications on meta-analysis, vast numbers of papers must be culled from the analyzable dataset due to non-comparable procedures. The resulting smaller sample sizes can reduce power, which then limits the ability to detect significance. Isn't this a problem?

49. P.Mean: Lasagna's Law on patient recruitment (created 2011-10-24). I sent out a question to an email discussion group asking about documentation of sample size shortfalls in clinical research. Someone suggested that I google "Lasagna's Law." What a great suggestion. Here's what I found.

48. P.Mean: My entry into the Applications of R in Business Competition (created 2011-10-20). I recently heard about a contest sponsored by Revolution Analytics. Revolution Analytics is offering $20,000 in prizes to the best examples of applying R to business problems. This competition is designed to grow the collection of on-line materials describing how to use R, and to spur adoption of R and Revolution R for business applications. The contest is open to all R users worldwide. http://www.inside-r.org/howto/enter. I want to submit the R code that I developed for the accrual paper published by Byron Gajewski and me in 2008.

47. P.Mean: Net-accessible resources for group sequential designs (created 2011-10-19). Dear Professor Mean: I am trying to locate a good net-accessible resource for the group sequential designs, and sample size re-estimation for adaptive designs. Can you help?

46. P.Mean: How you can teach Bayesian methods in an introductory Statistics class and why you should (created 2011-10-17). There's been a discussion among members of the Statistics in Epidemiology Section of the American Statistical Association about what topics should be covered in an introductory Statistics class. Within that discussion there has been a polite but heated debate about whether it is worthwhile to teach Bayesian methods in such a class. Some people were for it, but others thought it would be too confusing. Here's what I wrote about the topic.

45. P.Mean: Conditional Frailty Models (created 2006-01-20, updated 2011-10-11). One of the people I am working with is interested in using gap time analysis with a conditional frailty model. I was impressed with this request and asked her to send any relevant references that she had. She gave me a pointer to the following PDF file.

September 2011

44. P.Mean: Draft grant submission on patient accrual (created 2011-09-13). Here's an early draft of a grant submission on patient accrual.

43. P.Mean: Sharing the same classroom for five consecutive years (created 2011-09-12). I needed to do a probability in my head. I think I did it wrong and can't figure it out. I'm trying to figure out the probability for two kids out of 50 that get divided into classes each year. What is the chance that they would be in the same class five years in a row? I thought it was 0.5 times 5, or 2.5. But that was not right because 6 years in a row would be 3. What did I do wrong?
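
The arithmetic in that question can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the answer from the linked page: it assumes class assignments are independent from year to year, and the function name is hypothetical. Probabilities of independent events multiply rather than add, which is why "0.5 times 5" runs past 1.

```python
# Sketch: chance that two children share a class every year for several years.
# Assumption (not stated in the source): assignments are independent across
# years, with the same per-year probability each time.

def prob_same_class_every_year(p_per_year: float, years: int) -> float:
    """Multiply the per-year probabilities of independent events."""
    return p_per_year ** years

# Using the questioner's per-year figure of 0.5:
five_years = prob_same_class_every_year(0.5, 5)  # 0.5 ** 5
six_years = prob_same_class_every_year(0.5, 6)   # 0.5 ** 6

# Assumption: 50 children split at random into two classes of 25. Then the
# per-year chance is 24/49, slightly below 0.5, because one of the 25 seats
# in the first child's class is already taken by that child.
p_two_classes = 24 / 49
five_years_two_classes = prob_same_class_every_year(p_two_classes, 5)

print(five_years, six_years, five_years_two_classes)
```

Under these assumptions the result always stays between 0 and 1 and shrinks with each added year.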

42. P.Mean: Should I report the univariate or the multivariate logistic regression analysis? (created 2011-09-07). Dear Professor Mean, I have results from four univariate logistic regression models and one multivariate logistic regression model with all four variables. In my univariate analysis, all the variables are significant. But, in multivariate analysis, only x1 is significant. Which results should I report?

41. P.Mean: Statistics is more than just cutting computer code (created 2011-09-06). Someone on LinkedIn asked how to react to a comment like "Statistics is just cutting computer code, right?" Here's how I responded.

40. P.Mean: I can't get SAS to model the cluster effects in the MEPS data set (created 2011-09-02). I'm trying to use SAS to analyze data from the Medical Expenditure Panel Survey (MEPS) but when I try to model the cluster effect using proc glimmix, I get an error message. What am I doing wrong?

39. P.Mean: Can I salvage my negative confidence interval (created 2011-09-02). I was involved in a small case-control project that was intended to explore some genotypes as predictors of disease progression. We had between 50 and 60 cases and controls (each). One particular predictor had an OR of 0.5 with 95% confidence limits of 0.2 and 1.2. We reported the negative results, but a long time ago, I did read some papers showing some different interpretations of confidence intervals. If I remember right, there were some statements like: it is less likely that a point estimate such as 0.5[0.2-1.2] be 1 than one such as 0.8[0.5-1.2], considering the proportion of the CI that is distant from 1. Even now, it sounds weird to me. Can I say something about this in my paper?

38. P.Mean: What are the assumptions of logistic regression (created 2011-09-01). Does anyone have a good reference for the assumptions of binary logistic regression? I have a client who has an anonymous reviewer who says his analysis doesn't meet one of the assumptions, but it doesn't make any sense in this situation, and I think the reviewer doesn't understand something.

August 2011

37. P.Mean: Positive statements about no conflicts of interest (created 2011-08-16). There is a lot of confusion about when you can report "No conflict of interest." You don't know whether this means that there is no financial relationship with any pharmaceutical product, with a pharmaceutical product named in the paper, or with competitors to pharmaceutical products named in the paper. You don't know if the person making the claim about no conflict has gotten money from a drug company, but believes that this does not influence his/her perspective. I believe that the no conflict statement should be replaced with something far more specific. Here are some examples.

36. P.Mean: Banning editorials and clinical reviews from authors with industry ties (created 2011-08-13). BMJ published a commentary on conflict of interest policies that ended with the question "should the BMJ ... ban editorials and clinical reviews from authors with ties to industry?" Here's my response to that question.

35. P.Mean: The advantages of IBM SPSS software (created 2011-08-11). You have a lot of choices for how to do your data analysis. I have found that the best option for most people I work with is to use IBM SPSS software. Here are the main reasons why IBM SPSS is your best choice.

33. P.Mean: Using Social Media to promote your consulting career (created 2011-08-01). I am leading a roundtable discussion of using Social Media to promote your consulting career. Here are some things that I plan to discuss.

July 2011

32. P.Mean: Validation of OpenEpi software (created 2011-07-27). I was asked to "validate" a software program called OpenEpi. If you want to validate software, you need to show that it produces correct answers for a variety of test cases. This webpage outlines the range of test cases and demonstrates validity for those cases by comparing them to an alternative program and to published peer-reviewed research sources.

31. Second invitation to talk about how independent consulting is different (created 2011-07-24). I have a second invitation to talk on how independent consulting is different in 2012 (send me an email if you're curious about when and where). I was asked to submit an abstract, so here it is.

30. P.Mean: There's more than one way to calculate a Fisher's exact p-value (created 2011-07-21). I was trying to check the calculations associated with a two by two table and I noticed an inconsistency in the reporting of results. One program reported a p-value of 0.4588 for the two-tailed Fisher's exact test, and the other package reported a p-value of 0.308088. The packages otherwise agreed with one another. So which package is right? Well it turns out that both of them are correct because there is more than one way to calculate a Fisher's exact p-value. To understand this, you need to recall the computational details of Fisher's exact test.
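
As a rough illustration of how two packages can disagree, here is a minimal Python sketch of two commonly used definitions of the two-tailed Fisher's exact p-value. Which definitions the two packages above actually used is not stated in this summary, so treat the pairing as an assumption. The function name and the example table are hypothetical.

```python
from fractions import Fraction
from math import comb

def fisher_two_sided(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Two definitions of a two-sided Fisher p-value for the table [[a, b], [c, d]].

    Returns (p_sum_small, p_doubled):
      p_sum_small: total probability of all tables no more likely than the observed one
      p_doubled:   twice the smaller one-sided tail, capped at 1
    """
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    lo, hi = max(0, c1 - r2), min(r1, c1)
    total = comb(n, c1)
    # Hypergeometric probability of each possible upper-left cell, margins fixed.
    prob = {x: Fraction(comb(r1, x) * comb(r2, c1 - x), total)
            for x in range(lo, hi + 1)}
    p_obs = prob[a]
    p_sum_small = sum(p for p in prob.values() if p <= p_obs)
    lower = sum(p for x, p in prob.items() if x <= a)
    upper = sum(p for x, p in prob.items() if x >= a)
    p_doubled = min(Fraction(1), 2 * min(lower, upper))
    return float(p_sum_small), float(p_doubled)

# Hypothetical table, chosen only because its null distribution is asymmetric.
p_sum_small, p_doubled = fisher_two_sided(2, 3, 6, 1)
print(p_sum_small, p_doubled)
```

Exact fractions are used so that ties in table probabilities are compared without floating-point tolerance. When the null distribution is symmetric, the two definitions agree; they separate only when it is skewed.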

29. P.Mean: Establishing accuracy for a statistical program (created 2011-07-14). I've been asked to establish that a particular program, OpenEpi, can accurately calculate test statistics for a two by two table. The package appears quite good, and it is written by some really smart people, which would have been good enough for me, but I do understand that others feel the need to go an extra step. So here is how you might establish accuracy.

28. P.Mean: Calculating a confidence interval for a standard deviation (created 2011-07-10). Suppose that you had a sample of 80 observations and you computed a standard deviation of those 80 observations. Like any other statistic, the standard deviation will have some sampling error associated with it. But how much sampling error? Not a whole lot, it seems, but you could quantify this easily using the Chi-square distribution.
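
For reference, the standard chi-square interval for a standard deviation can be written as follows. This is the textbook formula, which assumes the observations come from a normal distribution; the notation is introduced here and is not taken from the linked page. Here s is the sample standard deviation, n the sample size (80 in this example, so 79 degrees of freedom), and \chi^2_{p,\,n-1} the p-th quantile of the chi-square distribution with n-1 degrees of freedom.

```latex
\sqrt{\frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2,\;n-1}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{(n-1)\,s^2}{\chi^2_{\alpha/2,\;n-1}}}
```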

June 2011

27. P.Mean: Software that I use on my computers (created 2011-06-21). I thought it might be useful to list the software programs that I use on my various laptop computers. That way I can check to see if I have installed the programs that I used to use on my earlier laptop onto my new laptop. I also want to check (when the software license allows it) that I have installed a second copy of this software on my small laptop. This is one of those pages that is probably more useful to me than it is to you.

26. P.Mean: How much work does that second reviewer have to do in a meta-analysis (created 2011-06-20). Someone asked about the process of using a second reviewer in a meta-analysis to abstract data from studies. The rationale for a second reviewer, of course, is to establish that there is no serious subjectivity involved with the recording of information from individual studies. By showing that two independent reviewers produced roughly comparable data sets, you have established objectivity in the data abstraction step. The question arises, though: do you have to use the second reviewer on all studies, or can you just do this for a certain percentage of the studies? If so, is there a certain percentage that is generally accepted?

25. P.Mean: Looking for help to test software for monitoring accrual in a clinical trial (created 2011-06-16). For a grant I am writing, I need some collaborators who conduct prospective clinical trials. I am working on methods to monitor patient accrual in clinical trials. Accrual means how rapidly patients enter a clinical trial. In my experience, researchers overpromise and underdeliver on the time frame in which they expect to recruit a certain number of patients.

24. P.Mean: Small business grant? Maybe not (created 2011-06-16). I want to document on this webpage a general idea of where we might want to submit a grant to continue our work on accrual models. In particular, I was originally leaning towards an SBIR (Small Business Innovation Research) grant, but now I am not so sure. The impetus was attendance at a webinar on how to write a grant for SBIR.

23. P.Mean: How I became a skeptic (created 2011-06-15). I'm a big fan of the skeptic movement. If you're not familiar with this, it is a group of professional and amateur scientists who critically examine claims of fringe science areas like parapsychology, UFOs, and alternative medicine. So when a blog post on the James Randi Educational Foundation website called for people to share their stories of how people became skeptics, I wrote the following story.

22. P.Mean: Cartoon about placebos (created 2011-06-14). I drew a small cartoon about placebos. I know you think that this is drawn by a professional artist, but I did this. Really!

21. P.Mean: Which version of SPSS should you get (created 2011-06-03). I was showing a client how to use their version of SPSS to do a variety of different things and when I went to run a logistic regression model, it wasn't there. Apparently, there are several versions of SPSS (I knew this already) and some of the versions do not include logistic regression (that I was surprised to find out). I had to research all the options and offer a recommendation. Here's a quick guide to what I learned by browsing through the SPSS site.

20. P.Mean: When you're stuck writing major sections of another person's grant (created 2011-06-02). I was helping someone write a grant when I got that request that I always dread, "Can you write this section of the grant." I hate those requests for a personal reason--I'd much rather tell someone else what to do than to actually do it myself. One of the great joys of consulting is being able to boss other people around. But there's a serious reason why I dislike this. I believe that a grant should be written by one person, with guidance of course by other experts. But one person needs to have at least a passing level of familiarity with each and every aspect of the grant; enough familiarity that they can write the entire grant. It also assures consistency of tone and language. But there are often reasons why this can't be done, and if you're stuck writing major sections of someone else's grant, you need to write your section of the grant so that it fits in well with the rest of the grant. There's a famous saying that a camel is a horse designed by a committee. You want to make sure that the completed protocol does not come out looking like a camel. If certain sections have abrupt transitions, use different terms for the same thing, and have radical changes in writing style, you've got problems. You won't get things perfect, and I certainly didn't with this project. But the closer you get, the better the grant will be.

19. P.Mean: New shortened structure for NIH grants (created 2011-06-02). I am working on an NIH grant looking at various Bayesian models for accrual. NIH changed the grant proposal format last year to a much shorter proposal. Good for them, I say. Here are some of the details that I'm reviewing prior to writing my grant proposal.

18. P.Mean: It's been a quiet year (created 2011-06-01). I was asked by my boss to document my scholarly activities for the past year. I used July 1, 2010 through June 30, 2011 as the time frame (I don't anticipate a lot happening in the next 30 days). Here's what I wrote.

May 2011

17. P.Mean: A simple segmented linear regression model, borrowed from the BUGS manual (created 2011-05-25). I am interested in various extensions to the simple Bayesian model for accrual that Byron Gajewski and I derived and published in Statistics in Medicine. An important extension would be a segmented regression model for accrual that would allow for slow accrual at the start of the study, gradually rising to a steady state of accrual. Before I tackle that extension, I want to see how a simpler segmented regression model works in BUGS. I'm borrowing an example from the BUGS manual.

16. P.Mean: Using a binary coding trick illustrated by a Car Talk puzzler (created 2011-05-21). I often need to see how often certain variables and combinations of those variables appear in a data set. If the variable is binary, there is a trick for doing this that is illustrated by a Car Talk puzzler.
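
The summary above does not spell out the trick. One plausible reading, offered here purely as an assumption, is to pack several binary variables into a single integer using powers of two and then tabulate that one integer. A minimal Python sketch with made-up data and a hypothetical function name:

```python
from collections import Counter

def pattern_code(flags) -> int:
    """Pack a sequence of 0/1 values into one integer.

    The first flag is the 1s place, the second the 2s place, the third the
    4s place, and so on, so each combination gets a unique code.
    """
    return sum(bit << i for i, bit in enumerate(flags))

# Made-up rows of three binary variables (illustration only).
rows = [(1, 0, 1), (1, 0, 1), (0, 1, 0), (1, 1, 1)]

# One frequency table of the codes counts every combination at once.
counts = Counter(pattern_code(row) for row in rows)
print(counts)
```

With k binary variables there are 2^k possible codes, so a single frequency table replaces a k-way cross-tabulation.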

15. A simple hierarchical model for the Poisson distribution, borrowed from the BUGS manual (created 2011-05-20). I am interested in various extensions to the simple Bayesian model for accrual that Byron Gajewski and I derived and published in Statistics in Medicine. An important extension would be accrual in multi-center trials. A hierarchical model makes a lot of sense in this case, so I wanted to examine a simple hierarchical model that appears in the BUGS manual.

14. Promoting your consulting career in the era of Web 2.0 (created 2011-05-20). I was approached by a member of the planning committee for the American Statistical Association Conference on Statistical Practice about giving a talk at that conference. The talk would be an extension of a roundtable discussion I am giving at the Joint Statistical Meetings in 2011, Using Email Newsletters, Webinars, Blogs, And Social Media To Promote Your Consulting Career. After a telephone call this morning, I offered to prepare an abstract of a talk that I might give at this conference. I'm very flexible on the content of this talk, but I thought it would be a good idea to put my thoughts down in writing.

13. P.Mean: Yet another biography (created 2011-05-16). I'm asked often to provide a short biography that can be used as an introduction to a talk I am giving. It helps to keep track of these on my website. I have versions written in 2009, 2008, 2004, and 2002. Here's the latest biography.

12. P.Mean: What I'd look for in a new computer (created 2011-05-16). I am hardly an expert on computing, but I do try to help out when someone asks me about what sort of computer they should buy for statistical analyses. Here are some general guidelines that I offer. I'm assuming that you want a system that can run Windows and the advice here is not all that helpful if you are using the MacOS or Linux.

11. Is it ethical to provide statistical consulting on a dissertation to a Ph.D. candidate (created 2011-05-11). Someone asked a hypothetical question about consulting assistance for a Ph.D. candidate. Clearly some assistance is okay and the question is when the work becomes so much that the work is no longer perceived as that of the Ph.D. candidate.

10. How independent consulting is different (created 2011-05-09). There's a huge difference between independent consulting and any of these other forms of consulting. I want to identify some of the major differences that I have experienced as an independent consultant.

9. P.Mean: Why is my standard deviation so small? (created 2011-05-02). I am helping someone with a project that involves (among other things) computing averages of many Likert scale items. A Likert scale has different interpretations, but I use the term to mean a scale that has five items with a logical ordering. So the scale 1=Strongly disagree, 2=Disagree, 3=Neutral, 4=Agree, and 5=Strongly agree is a Likert scale. This person ran some descriptive statistics on the individual items and on the mean of those items. The results are shown below with generic names for the individual items. I was asked why the average had a standard deviation that was so much smaller than the standard deviations of the individual items.
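
The general reason can be sketched with a standard variance identity. The notation is introduced here, not taken from the linked page, and the second line uses the simplifying assumption that all k items share a common standard deviation \sigma and a common pairwise correlation \rho. Unless the items are perfectly correlated (\rho = 1), the standard deviation of the average is smaller than \sigma, and it falls to \sigma/\sqrt{k} when the items are uncorrelated.

```latex
\operatorname{Var}(\bar{X})
  = \frac{1}{k^2}\left(\sum_{i=1}^{k}\sigma_i^2
    + 2\sum_{i<j}\operatorname{Cov}(X_i, X_j)\right)

\operatorname{SD}(\bar{X})
  = \sigma\sqrt{\frac{1+(k-1)\rho}{k}} \;\le\; \sigma
```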

April 2011

8. P.Mean: Resources for Comparative Effectiveness Research (created 2011-04-13). I attended an interesting webinar on Comparative Effectiveness Research (CER). I always try to take notes during presentations like this, but my notes are often a poor amalgam of random thoughts and realizations. What I did find, though, during this webinar, were links to two important resources for CER.

7. P.Mean: Thinking about the title for my second book (created 2011-04-11). As I mentioned on an earlier webpage, Cambridge University Press has agreed to publish my second book. There were some suggestions, including a change in the proposed title to "Successful Research Projects: A Practical Guide for the Health and Social Sciences." I was not thrilled with this title and I was appreciative when my contact at Cambridge described that title as bland.

March 2011

6. P.Mean: Does a wide confidence interval mean that my conclusions are all wrong? (created 2011-03-24). Dear Professor Mean, My confidence intervals are very wide. I do not know how to explain this. Does this mean that my results are likely to be wrong?

5. P.Mean: Macros in Stata (created 2011-03-08). I have just started using macros in Stata. I like R better, but Stata has a pretty good set of macro facilities, once you get the hang of things. Here is a simple example.

4. The nature of advice on email discussion lists (created 2011-03-08). I participate on several email discussion lists, and someone complained a bit about the advice he was getting. "So my question is if this forum is open for people like me? Can I ask questions and get advice without being patronised?" Here's what I wrote in response.

February 2011

3. P.Mean: My special Zotero style (created 2011-02-22). I use Zotero to produce html code for the "Outside Resource" section of my category pages. It requires a special style which I have adapted. Here is the code for that style.

2. P.Mean: I won't serve on my IRB, but there is a way I can still help (created 2011-02-17). I had offered to help out our local IRB but was not careful to clarify how I could help. This resulted in me being appointed as an alternate member of the IRB. That's not a good role for me because of my conflict of interest. I work with a large number of clients through the Research and Statistical Consult Service (RSCS), and would have a hard time providing an impartial review of an IRB application that I had some role in developing. Also, for researchers who have not worked with the RSCS, if their protocols had major statistical flaws, I would be uncomfortable asking them to consult with the RSCS, as it might appear that I'm trying to artificially build demand for the RSCS. So I'll have to turn down this role. But there is another way I could help.

January 2011

1. P.Mean: Good news about my second book proposal (created 2011-01-01). I got an email three days ago from my contact at Cambridge University Press. It looks like they want to publish my book!