P.Mean: Quotes and references from Dr. Harrell  (created 2011-04-07).


Too many studies spend ten million dollars on data collection and one hundred dollars on data analysis.

Nonparametric tests are a good starting point for simple models. The Wilcoxon, Spearman, and Kruskal-Wallis tests are special cases of the proportional odds ordinal logistic model.

"If you seek what doesn't exist, what you find will not agree with what someone else finds."

"When you dicohotomize a variable that has measurement error, you increase the impact of measurement error."

Put the knots where the data is.
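One common way to follow this advice is to place spline knots at quantiles of the predictor rather than at equally spaced values. The sketch below is only an illustration: the five quantile positions (0.05, 0.275, 0.5, 0.725, 0.95) and the helper names are assumptions, not something stated in these notes.

```python
import math
import random

# Assumed quantile positions for five knots (illustrative choice, not from the notes).
KNOT_PROBS = [0.05, 0.275, 0.5, 0.725, 0.95]

def quantile(sorted_xs, p):
    """Linearly interpolated quantile of an already sorted list."""
    h = (len(sorted_xs) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (h - lo) * (sorted_xs[hi] - sorted_xs[lo])

def quantile_knots(xs, probs=KNOT_PROBS):
    """Knots at quantiles: they land where the observations are dense."""
    s = sorted(xs)
    return [quantile(s, p) for p in probs]

def equally_spaced_knots(xs, k=5):
    """Knots spread evenly over the range, ignoring where the data lie."""
    lo, hi = min(xs), max(xs)
    return [lo + (hi - lo) * (i + 1) / (k + 1) for i in range(k)]

# A right-skewed predictor: most values are small, a few are large.
rng = random.Random(0)
x = [rng.expovariate(1.0) for _ in range(500)]

print("quantile knots:      ", [round(k, 2) for k in quantile_knots(x)])
print("equally spaced knots:", [round(k, 2) for k in equally_spaced_knots(x)])
```

With skewed data like this, equally spaced knots tend to fall in the sparse upper tail, while quantile-based knots stay in the region that holds most of the observations.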

Random forests, bagging, and boosting are coming into favor over classification and regression trees.

The lasso does variable selection and shrinkage at the same time.
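A small sketch can show both effects at once. It assumes the special case of an orthonormal design and the objective (1/2)||y - Xb||^2 + lam * ||b||_1, where the lasso solution is the soft-thresholded least squares estimate. The coefficient values and the penalty `lam` are made up for illustration.

```python
def soft_threshold(b, lam):
    """Move b toward zero by lam; set it exactly to zero if |b| <= lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

# Hypothetical least squares coefficients and penalty (illustrative values only).
ols = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

lasso = [soft_threshold(b, lam) for b in ols]
print(lasso)  # [2.5, -1.0, 0.0, 0.0]
```

The large coefficients are shrunk toward zero (shrinkage) and the small ones become exactly zero (variable selection). For a general, non-orthonormal design there is no such closed form, and the solution is found iteratively.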

The worst errors you can make are to underfit a strong relationship and to overfit a weak relationship.

We have a legacy of software that allows us to use

Bootstrap bumping (Tibshirani).

Backward stepwise selection works worst with the smallest alpha levels. The best alpha is 1 (that is, no variables are removed), and the next best is alpha = 0.5.

"All models should be as big as an elephant" L.J. Savage, as quoted in Draper 1995 and Greenland.

Differences in predicted values are more interpretable and understandable than least squares means.

Data splitting is now obsolete. It is too volatile: the result depends on the luck of the split.
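The volatility is easy to see in a small simulation. The sketch below is a hypothetical setup, not an analysis from the talk: it simulates 40 observations from a simple linear model, then repeats a random 50/50 split 200 times, fitting a line on one half and computing the mean squared error on the other half. The sample size, coefficients, and seed are arbitrary assumptions.

```python
import random

def fit_line(xs, ys):
    """Least squares intercept and slope for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def mse(intercept, slope, xs, ys):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def split_sample_errors(xs, ys, n_splits, rng):
    """Test-half MSE for repeated random 50/50 splits of the same data."""
    idx = list(range(len(xs)))
    half = len(xs) // 2
    errors = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        train, test = idx[:half], idx[half:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(mse(a, b, [xs[i] for i in test], [ys[i] for i in test]))
    return errors

# Simulated data (arbitrary illustrative values): y = 1 + 0.5 x + noise.
rng = random.Random(2011)
xs = [rng.gauss(0, 1) for _ in range(40)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

errors = split_sample_errors(xs, ys, 200, rng)
print("smallest test MSE:", round(min(errors), 3))
print("largest test MSE: ", round(max(errors), 3))
```

The data set and the model are the same in every repetition, so the spread between the smallest and largest error comes only from which observations happened to land in each half.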

Pfizer pays $14 million a year in SAS licenses.

This work is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2011-01-01.