P.Mean: Files for my R class (created 2015-07-29, updated 2017-08-31).


I am teaching a class on R. This page gives links to files that you may need as we go along. Please note that this class was originally given as a five-day class, but is now being offered as an online class that you can work on at your own pace and schedule. Each "day" is now a "part", and "Part 0" covers the introductory information that you need to install R on your own computer.

You can find this file at http://www.pmean.com/15/r.html.

You can complete the work at your own pace. Finish everything by the second week if you like, or wait until the semester is almost over. I am listing some suggested completion dates. If you prefer to turn in the material early, just go ahead. If you plan to turn in the work later, please give me a heads up so I don't panic.

Part 0 (Installing R and, optionally, RStudio) Please try to complete all this material by the third week of class (for fall semester 2017, that would be Friday, September 8, 2017).

You do not need any special files for this part of the class. Here are the videos that you should watch.

While you are watching these videos, have your computer ready to run the same things that you see on your video. Pause the video as needed to run what you need to. Go to the same website, download the same software, run the same commands. Compare your results to the results that you see in the video. If you see a major discrepancy, back up and try again. If they still don't match, send me an email to set up an appointment to look at these together.

The total time for all the videos in Part 0 is 42 minutes.

You can read a written summary of these topics as well:

These videos are optional viewing. The following quiz, however, is required, and will be graded on a pass/fail basis.

There's one more required assignment. Please go to the discussion board in

Submit your answers by email. Use the subject line "Introduction to R, (your name), Part 0 Quiz".

Part 1 (Introduction and data sets with mostly continuous variables) Please try to complete all this material by the fifth week of class (for fall semester 2017, that would be Friday, September 22, 2017).

All the files you need (except the videos) are available at

The R programming statements are inserted into an R Markdown document. Look for files: part1a.Rmd, part1b.Rmd, etc. You can also review the output from these R commands: part1a.html, part1b.html, etc. The data sets that you need to proceed from one section to another, should you wish to jump around, are available as part1.RData.
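If you do jump around, a quick check along these lines shows what part1.RData contains once it is loaded. This is only a sketch, and it assumes that you have downloaded part1.RData into your current working directory. The names of the objects inside the file are not listed on this page, so the code simply prints whatever it finds.

```r
# Assumption: part1.RData was downloaded into the current working directory.
rdata_file <- "part1.RData"

if (file.exists(rdata_file)) {
  # load() restores the saved objects and returns their names.
  loaded_names <- load(rdata_file)
  print(loaded_names)
} else {
  # getwd() shows where R is looking, which helps track down a misplaced file.
  message("Cannot find ", rdata_file, " in ", getwd())
}
```

If you get the message instead of a list of names, move the file or change your working directory with setwd().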

While you are watching these videos, have your computer ready to run the same things that you see on your video. Pause the video as needed to run what you need to. Go to the same website, download the same software, run the same commands. Compare your results to the results that you see in the video. If you see a major discrepancy, back up and try again. If they still don't match, send me an email to set up an appointment to look at these together.

Here are the videos that you should watch:

The total time for all videos is 3 hours, 4 minutes. Your listening time will probably be longer because you should pause the video at many places so you can run the same R commands that you see in the video.

Please note that there will be some very minor inconsistencies between the code you see in part1a.Rmd, etc. and the code that appears in the videos part1a.mp4, etc. These are almost always just fixes of typographical errors or a few extra R commands added to round out the program. Don't worry about these inconsistencies; they are trivial. If you happen across an inconsistency that does not look trivial to you, send me an email and I'll take a look at it.

Complete a short quiz at the end of most of these videos.

Submit your answers by email. Use the subject line "Introduction to R, (your name), Part xx quiz".

Complete the following homework assignment after having viewed the videos from Part 1a through Part 1i.

Place all the key results into a Word document, PowerPoint presentation, PDF file, or HTML file. Turn it in by email. Use the subject line "Introduction to R, (your name), Part 1 Homework".

Here are the data files you need.

Fitting Percentage of Body Fat to Simple Body Measurements

Sleep in Mammals

Exploring Relationships in Body Dimensions (if there is extra time)

Part 2 (Data sets with mostly categorical variables) Please try to complete all this material by the eighth week of class (for fall semester 2017, that would be Friday, October 13, 2017).

All the files you need (except the videos) are available at

The R programming statements are inserted into an R Markdown document. Look for files: part2a.Rmd, part2b.Rmd, etc.

You can also review the output from these R commands: part2a.html, part2b.html, etc.

The data sets that you need to proceed from one section to another, should you wish to jump around, are all stored in part2.RData.

The videos are available here. Before you watch them, please note the following minor changes.

1. Originally, I had you save files to part2a.RData, part2b.RData, etc. but I got tired of keeping track of everything. So I changed the programs (part2a.Rmd, part2b.Rmd, etc.) so that you always save to a single location, part2.RData, and always load from a single location (again, part2.RData). It's a trivial change, but this is inconsistent between the current program and the videos.

2. I have several sections labelled "On your own" but for the most part (excluding the large homework assignment at the end of the video part2g.mp4) you do NOT have to send me anything to prove that you did these sections on your own. I trust you.

3. At the end of every video, except part2g.mp4, I failed to mention that there is a short quiz that you should take. Send the answers to the quiz to me as an email, using the subject line "Introduction to R, (your name), Quiz2x". At the end of part2g.mp4, there is a homework assignment that you need to turn in. Send the results to me as a Word document, PowerPoint presentation, PDF file, or HTML file. Use the subject line "Introduction to R, (your name), Part 2 Homework". Links to the quizzes and homework assignment are shown below the video list.
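The single-file approach described in item 1 above can be sketched roughly as follows. The data frame `ti` and its contents are made up for illustration, and the file is written to a temporary folder so that the sketch does not overwrite anything of yours. The actual programs (part2a.Rmd, etc.) may save a different set of objects, or use a different command, than what is shown here.

```r
# Hypothetical data frame standing in for whatever a section creates.
ti <- data.frame(
  sex      = c("female", "male", "male"),
  survived = c(1, 0, 0)
)

# A temporary folder is used here; in the class the file is part2.RData.
rdata_file <- file.path(tempdir(), "part2.RData")

# End of one section: write the object to the single shared file.
# Note that save() replaces the file with only the objects listed.
save(ti, file = rdata_file)

# Start of the next section: clear the object, then restore it from the file.
rm(ti)
load(rdata_file)
str(ti)
```

Because save() overwrites the file each time, any object that a later section needs has to be included in the save() call.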

The total time for all videos is 1 hour, 54 minutes. Your listening time will probably be longer because you should pause the video at many places so you can run the same R commands that you see in the video.

Complete a short quiz at the end of most of these videos.

Submit your answers by email. Use the subject line "Introduction to R, (your name), Part xx quiz".

Complete the following homework assignment after having viewed the videos from Part 2a through Part 2g.

Place all the key results into a Word document, PowerPoint presentation, PDF file, or HTML file. Turn it in by email. Use the subject line "Introduction to R, (your name), Part 2 Homework".

The data sets that you need for this part of the class are listed below.

Mortality on the Titanic

You should find the data file on mortality among passengers of the Titanic at

Gardasil vaccine

Dietary fiber (if there is extra time)

Part 3 (Mixture of categorical and continuous variables) Please try to complete all this material by the eleventh week of class (for fall semester 2017, that would be Friday, November 3, 2017).

All the files you need (except the videos) are available at

The videos are available here. Before you watch them, please note the following minor changes.

1. Originally, I had you save files to part3a.RData, part3b.RData, etc. but I got tired of keeping track of everything. So I changed the programs (part3a.Rmd, part3b.Rmd, etc.) so that you always save to a single location, part3.RData, and always load from a single location (again, part3.RData). It's a trivial change, but this is inconsistent between the current program and the videos.

2. I have several sections labelled "On your own" but for the most part (excluding the large homework assignment at the end of the video part3d.mp4) you do NOT have to send me anything to prove that you did these sections on your own. I trust you.

3. At the end of every video, except part3d.mp4, I failed to mention that there is a short quiz that you should take. Send the answers to the quiz to me as an email, using the subject line "Introduction to R, (your name), Quiz3x". At the end of part3d.mp4, there is a homework assignment that you need to turn in. Send the results to me as a Word document, PowerPoint presentation, PDF file, or HTML file. Use the subject line "Introduction to R, (your name), Part 3 Homework". Links to the quizzes and homework assignment are shown below the video list.

While you are watching these videos, have your computer ready to run the same things that you see on your video. Pause the video as needed to run what you need to. Go to the same website, download the same software, run the same commands. Compare your results to the results that you see in the video. If you see a major discrepancy, back up and try again. If they still don't match, send me an email to set up an appointment to look at these together.

The videos are available here:

The total time for all videos is 45 minutes. Your listening time will probably be longer because you should pause the video at many places so you can run the same R commands that you see in the video.

Complete a short quiz at the end of most of these videos.

Submit your answers by email. Use the subject line "Introduction to R, (your name), Part xx quiz".

Complete the following homework assignment after having viewed the videos from Part 3a through Part 3d.

Place all the key results into a Word document, PowerPoint presentation, PDF file, or HTML file. Turn it in by email. Use the subject line "Introduction to R, (your name), Part 3 Homework".

The data files that you need for this part are listed below.

Forced Expiratory Volume (FEV) Data

Home prices in Albuquerque

Part 4 (Longitudinal data) Please try to complete all this material by the fifteenth week of class (for fall semester 2017, that would be Friday, December 1, 2017).

In this part of the class, you will see two different ways to store longitudinal data. These storage methods (tall and thin, short and fat) have different advantages and disadvantages. You will learn how to convert from a short and fat format to a tall and thin format and vice versa.
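As a rough preview, here is one way to do the conversion in both directions with the reshape function that comes with R. This is only a sketch. The data frame and its variable names (id, y1, y2, y3) are made up for illustration, and the videos may well use different functions or packages to do the same job.

```r
# Hypothetical short and fat data: one row per subject,
# one column per measurement time.
wide <- data.frame(
  id = 1:3,
  y1 = c(10, 12, 9),
  y2 = c(11, 15, 8),
  y3 = c(13, 14, 7)
)

# Short and fat to tall and thin: one row per subject per time point.
tall <- reshape(
  wide,
  direction = "long",
  varying   = c("y1", "y2", "y3"),
  v.names   = "y",
  timevar   = "time",
  times     = 1:3,
  idvar     = "id"
)

# Sort so that each subject's measurements sit together.
tall <- tall[order(tall$id, tall$time), ]
print(tall)

# Tall and thin back to short and fat.
# sep = "" rebuilds the column names as y1, y2, y3.
wide_again <- reshape(
  tall,
  direction = "wide",
  v.names   = "y",
  timevar   = "time",
  idvar     = "id",
  sep       = ""
)
print(wide_again)
```

The tall and thin version has nine rows (three subjects times three time points), while the short and fat version has three rows.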

The information for this part of your class has been moved to an archive:

While you are watching these videos, have your computer ready to run the same things that you see on your video. Pause the video as needed to run what you need to. Go to the same website, download the same software, run the same commands. Compare your results to the results that you see in the video. If you see a major discrepancy, back up and try again. If they still don't match, send me an email to set up an appointment to look at these together.

The videos are available here:

The total time for all videos is 2 hours, 19 minutes.

Complete a short quiz at the end of most of these videos.

There is no homework assignment for Part 4. Start on your project instead.

Submit your answers by email. Use the subject line "Introduction to R, (your name), Part xx quiz".

The data files that you need for this part are listed below.

Effect of Surface and Vision on Balance

Resins Rid Termites from Trees

Depression Before and After an Earthquake

Energy Requirements Running, Walking and Cycling

Cholesterol Levels after Heart Attack

Runners with Low Back Pain

Recovery of Patients from Stroke

Part 5 (Analysis of your own data set) This must be completed by the end of the semester (for fall semester 2017, this would be Friday, December 8, 2017).

You will bring in a data set of your own and work on it in class.

Find a data set that interests you.

  1. Read it into R and compute some summary statistics. Are there any missing values?
  2. Create a factor for at least one categorical variable.
  3. Draw at least one graph appropriate for your data.
  4. Calculate a correlation or an odds ratio (or both) depending on what type of data you have.
  5. Produce at least two other statistics or graphs that you think might be interesting and informative.

Provide a brief interpretation in the context of your data set for everything that you produce above.
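To give you a feel for the scale of what is expected, here is a minimal sketch of steps 1 through 4. It uses mtcars, a data set that comes with R, as a stand-in for your own data. The choice of variables is mine and is purely for illustration. With your own data you would start with a command such as read.csv instead, and you would still need to add step 5 and your interpretations.

```r
# Stand-in data: mtcars ships with R. With your own data, a command such as
# read.csv("mydata.csv") would go here (that file name is hypothetical).
dat <- mtcars

# 1. Summary statistics, plus a count of missing values for each variable.
summary(dat)
colSums(is.na(dat))

# 2. A factor for a categorical variable (am: 0 = automatic, 1 = manual).
dat$am_f <- factor(dat$am, levels = c(0, 1), labels = c("automatic", "manual"))

# 3. A graph: miles per gallon by transmission type.
boxplot(mpg ~ am_f, data = dat, ylab = "Miles per gallon")

# 4a. A correlation between two continuous variables.
r <- cor(dat$mpg, dat$wt)
print(r)

# 4b. An odds ratio from a two-by-two table
#     (vs: 0 = V-shaped engine, 1 = straight engine).
tb <- table(dat$am_f, dat$vs)
print(tb)
odds_ratio <- (tb[1, 1] * tb[2, 2]) / (tb[1, 2] * tb[2, 1])
print(odds_ratio)
```

The odds ratio here is computed by hand as the cross-product ratio of the two-by-two table, which only works when none of the four cells is zero.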

You may already have a data set that interests you, but if you do not, there are many data sources on the web that you can use. Here are just a couple of suggestions.

--> Gordon Smyth. Australasian Data and Story Library (OzDASL). Description: This website offers a library of data sets and associated stories. It is intended as a resource for teachers of statistics, and emphasis is given to data sets with an Australasian context. URL: www.statsci.org/data/

--> American Statistical Association. Journal of Statistics Education (JSE) Data Archive. Description: This website provides data sets used in the various articles in the Journal of Statistics Education. URL: www.amstat.org/publications/jse/jse_data_archive.htm

This page was written by Steve Simon and is licensed under the Creative Commons Attribution 3.0 United States License. Need more information? I have a page with general help resources.