Frank Harrell’s Philosophy of Biostatistics

Blog post
2006
Modeling issues
Author

Steve Simon

Published

October 10, 2006

There are a lot of people in the world who are a lot smarter than I am, and it is always a humbling experience to recognize how little I really know.

Frank Harrell, chair of the Department of Biostatistics at Vanderbilt University, is one of those people. He has a book,

  • Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. Frank E. Harrell (2001) New York, NY: Springer.

that should be required reading for any statistician who is planning to develop a regression model. It covers all the new things that I know I should be using, but that I have been too lazy to check out. It’s not for a beginner, but if you have some experience with regression models and want to learn how to use the best state-of-the-art methods, there is no better place to look.

A reminder of how important all of this stuff is appears on Dr. Harrell’s website:

where he outlines his philosophy of biostatistics. It is well worth repeating here, and each bullet point deserves to be expanded on:

  • Biostatistics needs to be fully integrated into biomedical research; experimental design is all important
  • Don’t be afraid of using modern methods
  • Avoid categorizing continuous variables and predicted values at all costs
  • Don’t assume that anything operates linearly
  • Account for model uncertainty and avoid it when possible by using subject matter knowledge
  • Use the bootstrap routinely
  • Make the sample size a random variable when possible
  • Consider using Bayesian methods
  • Use excellent graphics, liberally
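For readers who have not tried the bootstrap, here is a minimal sketch of a percentile bootstrap confidence interval for a median. It uses only the Python standard library. The function name `bootstrap_ci`, its parameters, and the sample values are all made up for illustration; none of them come from Dr. Harrell's book or website, and real work would normally use a dedicated statistics package.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.median, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Illustrative helper only (hypothetical name and defaults).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical skewed sample (invented numbers, for illustration only).
sample = [1.2, 1.9, 2.1, 2.4, 2.8, 3.0, 3.3, 4.1, 5.6, 9.8]
low, high = bootstrap_ci(sample)
print(f"median = {statistics.median(sample)}, 95% bootstrap CI = ({low}, {high})")
```

The percentile interval is the simplest bootstrap interval and was chosen here only for brevity; other variants exist and may behave better in small samples.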

A good elaboration of the third bullet point appears at

which outlines the issue far better than anything I have written on the topic.
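The cost of categorizing can be seen in a small simulation. This is a rough sketch on simulated data, not an example taken from Dr. Harrell's material: it generates a continuous predictor `x`, an outcome `y` that depends linearly on it, and then compares the correlation of `y` with `x` against the correlation of `y` with a median split of `x`. The variable names, the sample size, and the noise level are all assumptions chosen for illustration.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 2000

# Simulated data: y depends linearly on x, plus independent noise.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

# Dichotomize x at its sample median ("high" versus "low").
cut = sorted(x)[n // 2]
x_split = [1.0 if xi >= cut else 0.0 for xi in x]

r_continuous = pearson(x, y)
r_dichotomized = pearson(x_split, y)

print(f"correlation using continuous x:  {r_continuous:.3f}")
print(f"correlation using median split:  {r_dichotomized:.3f}")
```

In this setup the median split yields a noticeably weaker correlation with the outcome than the original continuous variable does, which is one way to see the information that categorization throws away.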

Earlier versions are here and here.