Category: Analysis of variance (created 2007-06-20). Analysis of variance (ANOVA) is an approach that allows you to compare a continuous outcome variable across a factor representing three or more groups and to examine interactions among factors.
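As a rough illustration of the comparison described above, the sketch below computes the one-way ANOVA F statistic by hand for three groups. The data values and the function name `one_way_anova_f` are made up for this example; in practice a statistics package would also report the p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical outcome values for three groups (not from any real study).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group scatter would suggest.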

Articles are arranged by date with the most recent entries at the top. You can find outside resources at the bottom of this page. Other entries about analysis of variance can be found in the analysis of variance page at the StATS website.

2008

  1. P.Mean: Using ANOVA for a sum of Likert scaled variables (created 2008-10-09). I want to analyse data derived from a questionnaire. The range of possible values that my variable can take goes from 20 to 100. No evidence for rejecting the hypothesis of normality was found. I would therefore apply an ANOVA, but I still have some doubts whether this method of analysis is valid, since the range of my dependent variable is not [- infinity;+ infinity]. Is the ANOVA a valid method of analysis or are there other approaches I can apply?

Outside resources:

Assessing the Performance of a Sensory Panel - Panelist monitoring and tracking. Martin Kermit, Valérie Lengard. Journal of Chemometrics 2006: 19(3); 154-61. [PDF]. Description: This article shows a range of statistical analyses that can be used in a typical sensory panel experiment.

Multiple Comparisons with Repeated Measures. David C. Howell, University of Vermont. Excerpt: "One of the commonly asked questions on listservs dealing with statistical issue is 'How do I use SPSS (or whatever software is at hand) to run multiple comparisons among a set of repeated measures?' This page is a (longwinded) attempt to address that question. I will restrict myself to the case of one repeated measure (with or without a between subjects variable), but the generalization to more complex cases should be apparent." This website was last verified in 2008. URL: www.uvm.edu/~dhowell/StatPages/More_Stuff/RepMeasMultComp/RepMeasMultComp.html

Regression with SAS. Chapter 5: Additional coding systems for categorical variables in regression analysis. Xiao Chen, Phil Ender, Michael Mitchell, Christine Wells, UCLA Academic Technology Services. Excerpt: Categorical variables require special attention in regression analysis because, unlike dichotomous or continuous variables, they cannot be entered into the regression equation just as they are. For example, if you have a variable called race that is coded 1 = Hispanic, 2 = Asian, 3 = Black, 4 = White, then entering race in your regression will look at the linear effect of race, which is probably not what you intended. Instead, categorical variables like this need to be recoded into a series of variables which can then be entered into the regression model. There are a variety of coding systems that can be used when coding categorical variables. Ideally, you would choose a coding system that reflects the comparisons that you want to make. In Chapter 3 of the Regression with SAS Web Book we covered the use of categorical variables in regression analysis focusing on the use of dummy variables, but that is not the only coding scheme that you can use. For example, you may want to compare each level to the next higher level, in which case you would want to use "forward difference" coding, or you might want to compare each level to the mean of the subsequent levels of the variable, in which case you would want to use "Helmert" coding. By deliberately choosing a coding system, you can obtain comparisons that are most meaningful for testing your hypotheses. This website was last verified in 2008. URL: www.ats.ucla.edu/stat/sas/webbooks/reg/chapter5/sasreg5.htm
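The three coding schemes named in this excerpt can be sketched as small matrices, with one row per level of the categorical variable and one column per recoded variable. The sketch below is in Python rather than SAS, the function names are invented for illustration, and the fractions follow one common convention for these codings, so check them against the chapter itself before relying on them.

```python
from fractions import Fraction

def dummy_coding(k, reference=0):
    """Indicator columns for every level except the reference level."""
    cols = [lvl for lvl in range(k) if lvl != reference]
    return [[1 if i == c else 0 for c in cols] for i in range(k)]

def forward_difference_coding(k):
    """Column j compares level j with level j + 1."""
    return [[Fraction(k - j, k) if i <= j else Fraction(-j, k)
             for j in range(1, k)]
            for i in range(1, k + 1)]

def helmert_coding(k):
    """Column j compares level j with the mean of all later levels."""
    rows = []
    for i in range(1, k + 1):
        row = []
        for j in range(1, k):
            m = k - j + 1  # number of levels from j through k
            if i < j:
                row.append(Fraction(0))
            elif i == j:
                row.append(Fraction(m - 1, m))
            else:
                row.append(Fraction(-1, m))
        rows.append(row)
    return rows

# Four levels, matching the four-category example in the excerpt.
for name, coder in [("dummy", dummy_coding),
                    ("forward difference", forward_difference_coding),
                    ("Helmert", helmert_coding)]:
    print(name)
    for row in coder(4):
        print("  ", [str(v) for v in row])
```

With an intercept in the model, each regression coefficient under the second and third schemes estimates the comparison named in its docstring, which is why the choice of coding determines which hypotheses the output tests directly.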

Data sets:

Nambeware Polishing Times. DASL. Excerpt: Authorization: free use. Description: Nambe Mills manufactures a line of tableware made from sand casting a special alloy of several metals. After casting, the pieces go through a series of shaping, grinding, buffing, and polishing steps. In 1989 the company began a program to rationalize its production schedule of some 100 items in its tableware line. The total grinding and polishing times listed here were a major output of this program. Number of cases: 59. Variable Names: 1. BOWL: Bowl (1) or not (0); 2. CASS: Casserole (1) or not (0); 3. DISH: Dish (1) or not (0); 4. TRAY: Tray (1) or not (0); 5. DIAM: Diameter of item, or equivalent (inches); 6. TIME: Grinding and polishing time (minutes); 7. PRICE: Retail price ($). Note: Items not classed as bowl, casserole, dish, or tray are plates. This website was last verified in 2008. URL: lib.stat.cmu.edu/DASL/Datafiles/nambedat.html

All of the material above this paragraph is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2008-12-03. The material below this paragraph links to my old website, StATS. Although I wrote all of the material listed below, my ex-employer, Children's Mercy Hospital, has claimed copyright ownership of this material. The brief excerpts shown here are included under the fair use provisions of U.S. copyright law.

2008

  1. Stats: When does heterogeneity become a concern? (June 5, 2008). Dear Professor Mean, I have an ANOVA model and I am worried about heterogeneity--unequal standard deviations in each group. How should I check for this?
  2. Stats: What statistic should I use when? (January 4, 2008). Someone was asking about a multiple choice question on a test that reads something like this: A group of researchers investigating in patients with diabetes on the basis of demographic characteristics and the level of diabetic control. Select the most appropriate statistical method to use in analyzing the data: a t-test, ANOVA, multiple linear regression, or a chi-square test. This is one of the more vexing things that people face--what statistic should I use when.

2007
     
  3. Stats: Analyzing data from a simple crossover design (August 22, 2007). A doctor brought me some data from a crossover design and asked me to help analyze it. The analysis was a bit trickier than I had expected, so I reviewed some of the material in Stephen Senn's book.

2006
     
  4. Stats: (Seminar notes) Confidence intervals for a variance ratio (July 17, 2006). One of the talks at the 18th Annual Applied Statistics in Agriculture Conference, sponsored by Kansas State University, was "Selecting the Best Confidence Interval for a Variance Ratio (or Heritability)" by Brent Burch, Northern Arizona University. Here are my notes from that talk.
  5. Stats: (Seminar notes) Adjustments for multiple comparisons (July 17, 2006). One of the talks at the 18th Annual Applied Statistics in Agriculture Conference, sponsored by Kansas State University, was "A Comparison of Multiple Tests Procedures: Spinosad as a Treatment for Lice on Cattle" by Zhanglin Cui, Eli Lilly and Company. Daniel H. Mowrey, Alan G. Zimmermann, and Douglas E. Hutchens, also of Eli Lilly and Company, were co-authors.
  6. Stats: Post hoc comparisons (March 15, 2006). Dear Professor Mean, I need to run multiple comparisons among all possible pairs of means following an analysis of variance test. What is the best approach? Tukey? Scheffe? Bonferroni?

2005

  7. Stats: When the F test is significant, but Tukey is not (September 9, 2005). Someone asked me how to interpret a one factor analysis of variance where the overall F test was significant, but the Tukey follow-up test comparing all four group means was not significant for any pair of means.

2004
     
  8. Stats: Multiple degree of freedom tests (September 22, 2004). Someone sent me an email describing a situation where an interaction effect in SPSS had a large p-value, but one of the individual components of that interaction had a small and statistically significant p-value. This can occur in many statistical models where you are testing a factor or interaction that involves multiple degrees of freedom.

2003
     
  9. Stats: Guidelines for ANOVA models (June 20, 2003). Dear Professor Mean, I wanted to compare two groups in my research, those who completed every test battery, and those who completed only some of them. I ran ANOVAs on age, IQ, ADHD score, and so forth. My professor says that I should have used a t-test instead. Why can't I use ANOVA? Isn't ANOVA better than a t-test? --Angry Anastasia

2001
     
  10. Stats: Unequal group sizes (November 2, 2001). Dear Professor Mean: I am comparing several groups of subjects, but the number of subjects in each group differs quite a bit. How does this affect the assumptions in analysis of variance?
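Entry 9 in the list above asks whether ANOVA is better than a t-test for comparing two groups. One well-known fact bears on that question, though it is not necessarily the argument the linked entry makes: with exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic, so the two tests give the same p-value. The sketch below checks this numerically with made-up data and illustrative function names.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

def two_group_f(a, b):
    """One-way ANOVA F statistic for two groups (1 numerator df)."""
    grand_mean = mean(a + b)
    ss_between = (len(a) * (mean(a) - grand_mean) ** 2
                  + len(b) * (mean(b) - grand_mean) ** 2)
    ss_within = (sum((x - mean(a)) ** 2 for x in a)
                 + sum((x - mean(b)) ** 2 for x in b))
    return ss_between / (ss_within / (len(a) + len(b) - 2))

# Hypothetical scores for two groups of unequal size.
group_a = [10.0, 12.0, 14.0, 16.0]
group_b = [11.0, 15.0, 17.0, 21.0, 26.0]

t_stat = pooled_t(group_a, group_b)
f_stat_two = two_group_f(group_a, group_b)
print(f"t squared = {t_stat ** 2:.6f}, F = {f_stat_two:.6f}")
```

The identity holds for unequal group sizes as well, as long as the t-test uses the pooled variance rather than a separate-variance (Welch) version.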

This work is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2008-12-03.