Data masking in R

*Recommendation
2024
R programming
Author: Steve Simon

Published: October 5, 2024

One major innovation of the tidyverse is its use of non-standard evaluation. It lets you refer to the columns of a data frame by name, so you can avoid a lot of repetition of data frame names in your R code. I wrote a page about non-standard evaluation about a year ago, and referenced some key websites that explain things. It was not a very good explanation, and the references that I included, although a bit better, were still difficult to follow.

I ran across this page, which tries to clarify things. It uses a simpler term, data masking, instead of non-standard evaluation, and it explains why it is difficult to distinguish between programming variables (env-variables) and statistical variables (data-variables) inside R functions and loops.
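Here is a minimal sketch of the distinction, assuming the dplyr package is installed. The data frame `df`, its columns, and the helper function `mean_of` are hypothetical names that I made up for illustration; they do not come from the page I am recommending.

```r
library(dplyr)

# Hypothetical example data; the column names are illustrative only.
df <- data.frame(group = c("a", "a", "b"), x = c(1, 2, 10))

# Data masking: x is a data-variable, so dplyr looks for it inside df.
# There is no need to write df$x.
mean_direct <- df %>% summarise(m = mean(x)) %>% pull(m)

# Inside a function, var is an env-variable that stands for a column.
# Embracing it with {{ }} asks dplyr to treat it as a data-variable.
mean_of <- function(data, var) {
  data %>% summarise(m = mean({{ var }})) %>% pull(m)
}
mean_embraced <- mean_of(df, x)

# When the column name arrives as a string (for example, inside a loop),
# the .data pronoun does the lookup.
col <- "x"
mean_string <- df %>% summarise(m = mean(.data[[col]])) %>% pull(m)

# If an env-variable and a column share a name, the column wins.
# The .env pronoun asks explicitly for the env-variable.
x <- 100
n_below <- df %>% filter(x < .env$x) %>% nrow()
```

All three approaches compute the same mean of the column `x`. The last line keeps every row, because each value in the column is less than the env-variable `x`, which is 100.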

The topic is still not easy to follow, but this page seems to be better than both my own description and the earlier resources on this topic.

  • Lionel Henry, Hadley Wickham. Argument type: data-masking. Available in html format.

An earlier version of this page was published on new.pmean.com.