[The Monthly Mean] May/June 2012 -- Old and new websites reunited at last
The Monthly Mean is a newsletter with articles about Statistics with occasional forays into research ethics and evidence based medicine. I try to keep the articles non-technical, as far as that is possible in Statistics. The newsletter also includes links to interesting articles and websites. There is a very bad joke in every newsletter as well as a bit of personal news about me and my family.
Welcome to the Monthly Mean newsletter for May/June 2012. If you are having trouble reading this newsletter in your email system, please go to www.pmean.com/news/201205.html. If you are not yet subscribed to this newsletter, you can sign on at www.pmean.com/news. If you no longer wish to receive this newsletter, there is a link to unsubscribe at the bottom of this email. Here's a list of topics.
--> Old and new websites reunited at last
--> Justifying your sample size when you only have a median and an interquartile range
--> All or nothing
--> Monthly Mean Article (peer reviewed): Against quantiles: categorization of continuous variables in epidemiologic research, and its discontents
--> Monthly Mean Article (popular press): In Documents on Pain Drug Celebrex, Signs of Doubt and Deception
--> Monthly Mean Book: Thinking, Fast and Slow
--> Monthly Mean Definition: What is autocorrelation?
--> Monthly Mean Quote: Equipoise is ...
--> Monthly Mean Video: TEDxSwarthmore - Steve Wang - 180 Degrees
--> Monthly Mean Website: R versus SAS/SPSS in corporations
--> Nick News: Nicholas in Paris
--> Very bad joke: How many epidemiologists...
--> Tell me what you think.
--> Join me on Facebook, LinkedIn and Twitter
--> Permission to re-use any of the material in this newsletter
--> Old and new websites reunited at last. When I left Children's Mercy Hospital, I had to leave my webpages behind and start a new website from scratch. But this month, the old and the new websites are re-united, which is part of a long story.
Back on December 22, 1997, I convinced my employer, Children's Mercy Hospital (CMH) to put up a few webpages on their external website about statistics in medical research. It was an unusual thing for CMH to allow, and I am grateful that they gave me space on the server and time to work on this. I just put up informational handouts that I used as part of my consulting at CMH, as well as handouts from some of the short courses that I taught at CMH. Frequently a researcher at CMH would come in with a question and I'd show them some resources on my website that might help them. I also developed new pages, as I had time, typically one or two per week. After 11 years of this, I had well over a thousand pages.
I called the website StATS, which was short for Steve's Attempt to Teach Statistics. Over the years, the website morphed and reformed in many ways.
In 1999, I started an advice column, which I first called "Ask Doctor Data." I thought it was a cute turn on the concept of doctoring (fudging) your data, and I boasted about how only those with a Ph.D. in Statistics should be allowed to doctor their data. The name caused a problem, though. A lot of people on the Internet would Google the phrase "ask doctor" and my page would turn up near the top of the list. Then people would ask me some very strange medical questions without first taking the time to read my site. I put a disclaimer on my site pointing out that I was a Ph.D. and not an M.D. and even offering some legitimate sites where you could ask questions about medical issues. It didn't work, so I had to switch from "Ask Doctor Data" to "Ask Professor Mean." For those people who didn't get the joke, I pointed out that Professor Mean is not just your average professor. I've written hundreds of pages in the "Ask Professor Mean" format, sometimes in response to a real question, and sometimes an adaptation to protect people's identities, and sometimes a generic question that I was commonly asked but which was not prompted by any specific questioner. One of my favorite early pages in this format was advice to Harried Howard (a real person) who was going on a job interview and was worried about getting questions about interim data analysis.
I wrote almost all the content myself, but I did adapt a very nice email from Ronan Conroy (with his permission, of course) about where to develop new ideas for research. It is one of the best pages on my site and I have used it in many of my classes.
I also developed pages about critical appraisal of the literature. For example, I had a nice example of computation of the Number Needed to Treat (NNT) along with some examples taken from the literature.
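The NNT arithmetic is simple enough to sketch in a few lines of code. This is my own illustration, not a reproduction of the old page, and the event rates are made-up numbers:

```python
import math

def nnt(control_risk, treated_risk):
    """Number Needed to Treat: the reciprocal of the absolute risk
    reduction, conventionally rounded up to a whole patient."""
    arr = control_risk - treated_risk  # absolute risk reduction
    return math.ceil(1 / arr)

# Hypothetical example: 25% of controls have the event versus 12.5%
# of treated patients, so you treat 8 patients to prevent one event.
print(nnt(0.25, 0.125))  # 8
```

The rounding up is deliberate: an NNT of 7.2 means you cannot promise a prevented event with only 7 patients.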
This led to a book, Statistical Evidence in Medical Trials, which was an amalgamation of hundreds of web pages written between 2000 and 2006. I have a page bragging about my book with links to some of the old material and pointers to some reviews of my book, though this page probably has a bunch of broken links right now.
Probably the page I hand out the most is my general guidance on power calculations and sample size justification. It was written in 2001. I've added over 50 additional pages about power and sample size justification since then, but the material I wrote over a decade ago is still highly relevant.
In 2002, I took a foray into research ethics, and developed a handout on how to get IRB approval for your research study. It is one of the few pages where I used clip art. I'm not an expert on ethics, except for the fact that many ethical issues touch on statistical issues, especially issues about design of experiments.
In 2003, I developed some material on diagnostic testing. I included a small invention, the likelihood ratio slide rule, which is a portable version of the Fagan nomogram. There are many controversies about diagnostic testing (such as whether it is appropriate for women between the ages of 40 and 50 to get a yearly mammogram) and you need to understand some fairly sophisticated statistical arguments about false negative and false positive findings to get a full appreciation of these controversies.
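The arithmetic behind the Fagan nomogram is just a conversion between probabilities and odds. Here's a minimal sketch of that conversion (my own illustration; the pretest probability and likelihood ratio are hypothetical numbers):

```python
def posttest_probability(pretest_prob, likelihood_ratio):
    """Fagan nomogram arithmetic: convert the pretest probability to
    odds, multiply by the likelihood ratio, convert back."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# Hypothetical example: a 25% pretest probability and a positive test
# with a likelihood ratio of 3 give a 50% posttest probability.
print(posttest_probability(0.25, 3))  # 0.5
```

The nomogram (and the slide rule) simply does this multiplication graphically, which is why it needs odds rather than probabilities: odds multiply, probabilities don't.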
In 2004, I developed a weblog format for my site. My first entry, on February 4, was an annotated list of educational resources about statistics that you could share with your research team. The weblog format encouraged me to post semi-regularly, and not to keep mixing and mashing old pages into new ones.
On a personal note, 2004 was also the year that my wife and I adopted a 2 year old boy, Nicholas. I had the details about that on my web pages, but was told several years later to take that page down because the Children's Mercy website was not a place for personal information like this. Fair enough, but it should have served as a warning of things to come.
I've never worried too much about making my pages appear high in the Google page ranks. I did note once that my definition for retrospective research was ranked at the top of Google when you searched on the word "prospective." I'm still shocked at the irony. How backwards is it that my retrospective page beat out a bunch of pages devoted to defining the term "prospective"?
I've already shared my enema story in this newsletter, but if you are a relatively new subscriber, you might take a peek at a time when I wish that I wasn't so high on the Google page rank.
In 2006, I tried a different tack. I started writing Statistical Koans. I dropped the concept after a few months, as I wasn't quite sure where to go with it. But I still like the first koan that I wrote.
I've always tried to heavily crosslink inside my site, but in 2006, I decided to develop category pages for various statistical concepts, and make sure each page had a link to at least one of these category pages. It was an immense effort, but it brought a strong sense of structure to my entire website. I also took the time to track down the creation dates of some of the pages on my website that were developed prior to the use of my weblog format.
I gave a talk on the philosophy of open source publication in October 2007. I was a big fan of open source publication largely because my website had benefited so much from being able to use and link to so many open source publications. After the talk I decided to put my money where my mouth was, and designated all of my webpages as open source. Everything was fine for six months, but then all hell broke loose. Someone noticed, after all that time, that I was designating something on the Children's Mercy website as open source. I was not allowed to do this. Fair enough, and I reverted to the original copyright notice (which allowed for some uses, but was far more restrictive than an open source license). It was pretty easy to do the global search and replace, but the incident made me realize that "my" website was not really mine.
In June 2008, I requested and got a leave of absence to explore the possibility of starting a new career as an independent consultant. I got enough encouragement that I made the jump. I had to return to Children's Mercy to finish up a few things in September and October, but by November 2008 I was on my own. One of the first things I did on my leave was purchase my own domain name and set up a website to advertise my consulting service. I had a backup copy of the stuff I had written while at CMH, and I toyed with the idea of putting it all up on the new site. In fact, my first weblog entry at the new site talked about the 1,192 pages that I was bringing with me. But CMH kept my pages up, even after November, so I thought it was best to link to the pages at CMH, rather than reproduce all the old pages at the new site. So I just wrote new pages and linked back to the old pages, until I had over a thousand links to the old website. That's the way things stood for the next four years.
There weren't a lot of big changes at my new site. Any new page that I wrote started out with P.Mean instead of StATS. As I found errors or broken links at my old website, I would fix things and place the new material at my new site, with a link back to the original file. I was never sure if someone at CMH would send some lawyers after me, but nothing ever happened.
There were two big changes at the new website. The first was the development of an email newsletter, The Monthly Mean. You're reading a copy right now, of course, though it turns out that The Monthly Mean is more of a bimonthly newsletter.
A second big change was presenting webinars, which I started in October 2009. This was a lot of work, and I gave away quite a few of these webinars to build up some publicity and goodwill and to test the waters for possibly charging for future webinars. It became too much work to present these regularly, so I stopped doing webinars in September 2010. I might restart them if there were enough demand, but for now, there are plenty of other things to keep me busy.
Then in June 2012, I found out that CMH was no longer keeping my material up on their website. To be honest, it never did fit in well with the image they were trying to present with the rest of their website. I'm excited to have an excuse to put all this old material up again on my new website. There's always the chance that someone at CMH will raise an objection to what I'm doing, but I seriously doubt they will. I'm also confident that if someone does have a problem, we can find an accommodation that will make everyone happy.
If you're curious what the 1,052 pages have added to this website, start at the 1999 archive page and work your way forward. I've mentioned a few highlights in this article, but there's lots of really good stuff to browse through.
Did you like this article? Visit http://www.pmean.com/category/WebsiteDetails.html for related links and pages.
--> Justifying your sample size when you only have a median and an interquartile range. Dear Professor Mean, I have been asked to estimate a sample size for a parameter where the only data in the literature is the median and the interquartile range (IQR). Can I estimate a sample size from this data?

You didn't say whether you wanted to justify your sample size using the desired width of the 95% confidence interval or using a power calculation for a particular hypothesis test. But the simple answer is to pretend that the median is actually the mean and pretend that the data is approximately normally distributed. It may not be perfect, but I doubt you could improve on things without more information about the distribution of your data. Besides, you're probably going to be able to rely on the central limit theorem. So here's what you do. Notice that the standard normal has its 25th and 75th percentiles about 2/3 of a standard deviation away from zero (at plus or minus 0.6745, to be more precise). So the interquartile range for a normal distribution is going to be approximately 4/3 of a standard deviation. In other words, estimate the standard deviation as roughly three-quarters of the IQR, and plug that into your confidence interval or sample size formula.
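Here's that recipe as a few lines of code. It's a sketch under the assumptions above (approximate normality, a 95% confidence interval for a mean), and the IQR of 12 and half-width of 2 are made-up numbers for illustration:

```python
import math

def sd_from_iqr(iqr):
    """For a normal distribution the 25th and 75th percentiles sit at
    about +/- 0.6745 standard deviations, so IQR is about 1.349 * sd."""
    return iqr / 1.349

def n_for_ci_halfwidth(sd, halfwidth, z=1.96):
    """Smallest n so a 95% CI for a mean has the desired half-width,
    using the large-sample formula n = (z * sd / halfwidth)**2."""
    return math.ceil((z * sd / halfwidth) ** 2)

sd = sd_from_iqr(12.0)           # hypothetical IQR of 12 from the literature
n = n_for_ci_halfwidth(sd, 2.0)  # estimate the mean to within +/- 2
print(n)
```

The same converted standard deviation would slot into a power calculation just as easily; the conversion step is the only new piece.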
Did you like this article? Visit http://www.pmean.com/category/SampleSizeJustification.html for related links and pages.
--> All or nothing. This is one of the earliest Ask Professor Mean pages that I could find. It's a fictional question, but it provides a rational basis for why 6 to 8 subjects is probably a floor for the sample size, even in a setting where you expect an "all or nothing" response.
Dear Professor Mean, I would like to know the minimum number of patients needed in order to achieve statistical significance. I am assuming a perfect research situation where all of the patients who got a treatment lived and all the patients who got the placebo died. What would the proper sample size for an all or nothing response be?-- Hesitant Harrison
Dear Hesitant, There are some experimental situations, usually involving animal research or in vitro systems, that tend to show an all or nothing response. An all or nothing response could mean 100% survival in one group and 0% survival in another group. Or it could mean no overlap between two groups. In other words, the smallest value in one group is much larger than the largest value in another group.
Some simple probability arguments can show that you can achieve statistical significance with six to eight subjects total. Still, you should consult with a professional statistician face-to-face to define an appropriate sample size, even for an all or nothing response.
More details. Let's conceptualize an experiment where we measure thyroid hormone in eight mice, four with the thyroid gland removed and four with a sham surgery. We get measurable thyroid hormones in four sham surgery mice and nothing in the thyroidectomy mice.
That's a pretty extreme result. If the thyroid gland had nothing to do with thyroid hormones, then it would be quite a rare event for all four zeros to land in the thyroidectomy group. This is quite a relief, because we didn't want to have to rename the thyroid gland to the "it doesn't produce thyroid hormone" gland.
There are seventy possible ways to distribute the four zero values among the eight mice, and only one other way leads to results as extreme as we have seen: the result where the four zeros all occur in the sham surgery group. So a two-sided p-value for this data would be 2/70 = 0.029.
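You can check this counting argument by brute force. Here's a short sketch of the enumeration (my own illustration of the argument above, not part of the original page):

```python
from itertools import combinations
from math import comb

# Hormone values for the eight mice: 1 = measurable hormone (the four
# sham surgery mice), 0 = none (the four thyroidectomy mice).
values = [1, 1, 1, 1, 0, 0, 0, 0]

# Every way to pick which four of the eight mice had the thyroidectomy.
assignments = list(combinations(range(8), 4))
assert len(assignments) == comb(8, 4)  # 70 ways

# An assignment is as extreme as the observed result when the four
# chosen mice all share the same value (all zeros or all ones).
extreme = sum(1 for picked in assignments
              if len({values[i] for i in picked}) == 1)

p_two_sided = extreme / len(assignments)  # 2 / 70
```

This is exactly the logic of Fisher's exact test applied to the most extreme possible table.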
Let's consider a different experiment, where we measured thyroid levels in six mice, before and after removing the thyroid gland. All six mice had lower levels of thyroid hormone after surgery.
Again, this is an extreme event. If the thyroid gland had no influence on thyroid hormone, then this would be like flipping a coin six times and getting the same result each time. If we ignore the possibility that thyroid levels remain the same, then there are 64 possible outcomes to this experiment, and only one other outcome is as extreme as the results we saw: the event where all six mice showed an increase in thyroid hormone after surgery. So a two-sided p-value would be 2/64 = 0.031.
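This second calculation is just the two-sided sign test, and it's even easier to verify in code (again, my own illustration):

```python
from math import comb

n = 6              # six mice, each measured before and after surgery
outcomes = 2 ** n  # 64 equally likely up/down patterns under the null

# Only two patterns are as extreme as what we saw: all six levels
# decreased, or all six levels increased.
extreme = comb(n, 0) + comb(n, n)

p_two_sided = extreme / outcomes  # 2 / 64 = 0.03125
```

Note that five mice would not be enough: 2/32 = 0.0625, which misses the conventional 0.05 cutoff. That's where the floor of six subjects comes from in the paired design.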
I would hate to plan a study that relied on an all or nothing response. Maybe one of our mice recently returned from a vacation at Chernobyl. A cautious researcher should plan for a few extra mice.
Now don't go telling all your friends that 6 or 8 is a magic sample size. Every research problem is different, and a careful sample size justification requires a face-to-face consultation with a professional statistician.
Summary. Hesitant Harrison wants to know the minimum number of patients that you would need under a perfect research situation where all of the treated patients survive and none of the controls do. Professor Mean explains that some research situations can lead to an "all or nothing" response. With an all or nothing response, you need about six to eight subjects total to achieve statistical significance. Such a small sample size, however, leaves you with no room for error if one of your subjects produces an unexpected response. You should always consult with a professional statistician before starting a research study with such a small sample size.
Did you like this article? Visit http://www.pmean.com/category/SampleSizeJustification.html for related links and pages.
--> Monthly Mean Article (peer reviewed): Caroline Bennette, Andrew Vickers. Against quantiles: categorization of continuous variables in epidemiologic research, and its discontents. BMC Medical Research Methodology. 2012;12(1):21. Abstract: "Background: Quantiles are a staple of epidemiologic research: in contemporary epidemiologic practice, continuous variables are typically categorized into tertiles, quartiles and quintiles as a means to illustrate the relationship between a continuous exposure and a binary outcome. Discussion: In this paper we argue that this approach is highly problematic and present several potential alternatives. We also discuss the perceived drawbacks of these newer statistical methods and the possible reasons for their slow adoption by epidemiologists. Summary: The use of quantiles is often inadequate for epidemiologic research with continuous variables." [Accessed on July 5, 2012]. http://www.biomedcentral.com/1471-2288/12/21/abstract
Did you like this article? Visit http://www.pmean.com/category/ModelingIssues.html for related links and pages.
--> Monthly Mean Article (popular press): Katie Thomas. In Documents on Pain Drug Celebrex, Signs of Doubt and Deception. The New York Times. 2012. Description: This article describes some of the subtle and not so subtle ways that a drug company can manipulate research to its advantage. [Accessed on July 5, 2012]. http://www.nytimes.com/2012/06/25/health/in-documents-on-pain-drug-celebrex-signs-of-doubt-and-deception.html.
Did you like this article? Visit http://www.pmean.com/category/EthicsInResearch.html for related links and pages.
--> Monthly Mean Book: Daniel Kahneman. Thinking, Fast and Slow. ISBN: 0374533557. Dr. Kahneman won the Nobel prize for his work with Amos Tversky on Behavioral Economics. This is an accessible introduction to that work. I like this book because it shows how and why our intuition often fails us, which is a warning to those who would abandon Evidence-Based Medicine because it does not provide enough latitude for clinical judgement.
Did you like this book? Visit http://www.pmean.com/category/CriticalAppraisal.html for related links and pages.
--> Monthly Mean Definition: What is autocorrelation? If you know how the stock market behaves, you know that it meanders up and down much like a random walk. This means that two measurements of a stock's price close in time are likely to be more highly correlated than two measurements distant in time. The more distant the time point, the weaker the correlation. The technical term for this is positive autocorrelation. Like stock prices, serial blood pressure measurements are likely to exhibit positive autocorrelation. Blood pressure doesn't jump randomly about a mean but meanders up and down. Unless your measurements are taken far enough apart in time, you will see positive autocorrelation. If your first measurement is above average, the second is likely to be above average as well, because you haven't given the system enough time to meander around. How far apart do the measurements have to be? That depends a lot on how quickly the series meanders. I'm not an expert here, but I suspect that multiple measurements taken during the same doctor's visit would exhibit positive autocorrelation. There are other issues such as cyclical variation. I suspect that blood pressure measurements vary by the time of day, much like a sine wave. Your blood pressure might, for example, start off in a trough in the morning when you just wake up. Two measurements, both taken in the morning, are going to be correlated because both are likely to be in the trough. So what's the impact of positive autocorrelation? When you average a set of measurements that are positively correlated, you get a larger standard error than if those measurements were independent.
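A tiny simulation makes that last point concrete. The sketch below (my own illustration, with arbitrary parameters) compares the spread of sample means from independent data against data that meander like an AR(1) series:

```python
import random

random.seed(1)

def ar1_series(n, rho):
    """A series with positive autocorrelation: each value is pulled
    toward the previous one, so the series meanders rather than jumps."""
    x = [random.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + (1 - rho ** 2) ** 0.5 * random.gauss(0, 1))
    return x

def spread_of_means(draw, reps=2000):
    """Monte Carlo estimate of the standard error of the sample mean
    (the true mean is zero, so average the squared sample means)."""
    means = []
    for _ in range(reps):
        xs = draw()
        means.append(sum(xs) / len(xs))
    return (sum(m * m for m in means) / reps) ** 0.5

se_independent = spread_of_means(lambda: [random.gauss(0, 1) for _ in range(10)])
se_autocorrelated = spread_of_means(lambda: ar1_series(10, 0.8))
# se_autocorrelated comes out noticeably larger than se_independent
```

Both series have the same marginal variance, so the extra spread in the autocorrelated means comes entirely from the meandering: neighboring values carry redundant information, so ten correlated measurements are worth fewer than ten independent ones.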
Did you like this article? Visit http://www.pmean.com/category/ModelingIssues.html for related links and pages.
--> Monthly Mean Quote: Equipoise is an irrelevance. Patients are entered onto the trial because the trialists believe that the experimental treatment is better. The control group treatment should be as good as that available outside the trial. Experimentation continues until either the trialists are convinced they are wrong (equipoise is reached) or they convince Society they are right. Stephen Senn, as quoted at http://www.bmj.com/rapid-response/2011/10/28/placebo-confusion.
--> Monthly Mean Video: TEDxSwarthmore - Steve Wang - 180 Degrees. 15 minutes, 7 seconds. Excerpt: "Sometimes you're wrong. Sometimes you think you're right, but it turns out that you're wrong. Sometimes you're sure you're right because the answer is obvious, and yet you're still wrong anyway—maybe even dead wrong, 180 degrees wrong. This talk is about the last situation." Description: Steve Wang is a statistician who illustrates the problem of selection bias and failure to account properly for missing data with three amusing examples. http://www.youtube.com/watch?v=lRHa82vMPhU.
Did you like this video? Visit http://www.pmean.com/category/ExclusionsInResearch.html for related links and pages.
--> Monthly Mean Website: Allan Engelhardt. R versus SAS/SPSS in corporations. Excerpt: "A recent question on one of the LinkedIn groups about the advantages of using R over commercial tools like SAS or IBM SPSS Modeller drew lots of comments for R. We like R a lot and we use it extensively, but I also wanted to balance the discussion. R is great, but looking at commercial organizations near the end of 2011 it is not necessarily the right choice to make." [Accessed on May 31, 2012]. http://www.r-bloggers.com/r-versus-sasspss-in-corporations/.
Did you like this website? Visit http://www.pmean.com/category/RSoftware.html for related links and pages.
--> Nick News: Nicholas in Paris. In early June, the whole family took a trip to Paris. There's lots to write about, but not enough time. Here are a few quick pictures until I get a chance to write up a web page.
We rented an apartment that was advertised as having a view of the Eiffel Tower, but we were shocked at how spectacular a view it was. This apartment was literally two blocks away. We could watch the light show every night from our living room window.
We bought Nicholas a new digital camera just for the trip, and he has a pretty good eye for photography. His main problem is that he is not patient enough to hold the camera steady, causing many of his best pictures to be blurry. Here, he took a picture of Cathy and Steve at the Louvre.
There are a lot more pictures to show and a lot more stories to tell.
--> Very bad joke: How many epidemiologists does it take to change a light bulb? None. The light bulb already changed itself because it was a retrospective study.
--> Tell me what you think. How did you like this newsletter? Give me some feedback by responding to this email. Unlike most newsletters where your reply goes to the bottomless bit bucket, a reply to this newsletter goes back to my main email account. Comment on anything you like, but I am especially interested in answers to the following three questions.
--> What was the most important thing that you learned in this newsletter?
--> What was the one thing that you found confusing or difficult to follow?
--> What other topics would you like to see covered in a future newsletter?
I received feedback from three people. I got compliments on my articles on the Bonferroni correction, data cleaning, and surrogate outcomes. I got a request to elaborate a bit more on surrogate outcomes. I knew at the time that my commentary was way too brief, but I'm having a hard time putting together a longer summary. Surely it is needed, but I want to cite papers from some of the experts in the area, and I don't have those articles in front of me. I also got a request to elaborate more on propensity scores. Another excellent suggestion, though I must admit again that this is a tough topic to tackle.
--> Join me on Facebook, LinkedIn, and Twitter. I'm just getting started with social media. My Facebook page is www.facebook.com/pmean, my page on LinkedIn is www.linkedin.com/in/pmean, and my Twitter feed name is @profmean. If you'd like to be a Facebook friend, LinkedIn connection (my email is mail (at) pmean (dot) com), or tweet follower, I'd love to add you. If you have suggestions on how I could use these social media better, please let me know.
--> Permission to re-use any of the material in this newsletter. This newsletter is published under the Creative Commons Attribution 3.0 United States License, http://creativecommons.org/licenses/by/3.0/us/. You are free to re-use any of this material, as long as you acknowledge the original source. A link to or a mention of my main website, www.pmean.com, is sufficient attribution. If your re-use of my material is at a publicly accessible webpage, it would be nice to hear about that link, but this is optional.
Sign up for the Monthly Mean newsletter
Review the archive of Monthly Mean newsletters
Go to the main page of the P.Mean website