# Statistics — ANOVA

1. What is ANOVA?

Analysis of variance (ANOVA) is a statistical method for determining whether the differences between the means of three or more groups are significant. The null hypothesis for ANOVA is that the population means of all groups are equal, while the alternative hypothesis is that at least one mean differs from the others. In other words, ANOVA helps us decide whether to reject the null hypothesis.
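In symbols, the two hypotheses can be written as follows. The notation is an assumption made here for illustration: $k$ is the number of groups and $\mu_j$ is the population mean of group $j$.

$$
H_0: \mu_1 = \mu_2 = \cdots = \mu_k
$$

$$
H_1: \text{not all } \mu_j \text{ are equal}
$$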

The name ‘ANOVA’ suggests that we are actually analyzing variances. We compute a measure of the variance between the group means and a measure of the variance within the groups, and examine a test statistic that is the ratio of these two measures. Under the null hypothesis, this test statistic follows an F-distribution. Thus, if the F-statistic we compute exceeds the critical value for the level of significance we chose, we reject the null hypothesis.
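The ratio described above can be written out explicitly. The symbols are assumptions introduced here, not notation from the original text: $k$ groups, $n_j$ observations in group $j$, $n$ observations in total, group means $\bar{x}_j$, and grand mean $\bar{x}$.

$$
SSB = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2, \qquad SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2
$$

$$
F = \frac{MSB}{MSW} = \frac{SSB / (k - 1)}{SSW / (n - k)}
$$

Here $SSB$ and $SSW$ are the between-group and within-group sums of squares. Under the null hypothesis, $F$ follows an F-distribution with $k - 1$ and $n - k$ degrees of freedom.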

2. Assumptions of ANOVA

In order to conduct ANOVA, we need several assumptions:

• The samples are randomly and independently obtained.
• The populations from which the samples were obtained must be normally or approximately normally distributed.
• The variances of the populations must be equal.
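As a rough illustration of the equal-variance assumption, the sketch below compares the sample variances of three groups. The grades are hypothetical, and the cutoff of 4 for the ratio of largest to smallest variance is an informal rule of thumb assumed here, not something stated in the sources for this article. It does not check normality or independence.

```python
from statistics import variance

# Hypothetical grades for three schools (illustrative data only).
groups = {
    "School A": [82, 85, 88, 90, 85],
    "School B": [75, 78, 80, 77, 80],
    "School C": [88, 92, 95, 90, 90],
}

# Sample variance (n - 1 denominator) of each group.
variances = {name: variance(scores) for name, scores in groups.items()}
ratio = max(variances.values()) / min(variances.values())

for name, var in variances.items():
    print(f"{name}: sample variance = {var:.2f}")
print(f"largest / smallest variance = {ratio:.2f}")

# Informal rule of thumb (an assumption here): a ratio above about 4
# suggests the equal-variance assumption deserves a closer look.
roughly_equal = ratio <= 4
print("variances roughly equal:", roughly_equal)
```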

3. Example

Suppose we have a data set of students from three different schools who took the same test. We might be interested in whether the mean grades differ significantly among the schools.

The whole idea behind the analysis of variance is to compare the between-group variance with the within-group variance by taking their ratio. If the variance between the group means is much larger than the variance that appears within each group, this is evidence that the population means are not all the same.
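A minimal sketch of this ratio in Python, using only the standard library. The grades are hypothetical values made up for illustration (they are not the data behind the result reported later), and `one_way_anova_f` is a helper name chosen here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Hypothetical grades for three schools.
school_a = [82, 85, 88, 90, 85]
school_b = [75, 78, 80, 77, 80]
school_c = [88, 92, 95, 90, 90]

f_stat, df_between, df_within = one_way_anova_f([school_a, school_b, school_c])
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
# F = 30.71 with (2, 12) degrees of freedom
```

For this made-up data the between-group mean square is 215 and the within-group mean square is 7, so the large ratio reflects group means that sit far apart relative to the spread inside each school.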

Unfortunately, although ANOVA can identify a difference among the means of multiple populations, it cannot determine which means differ from the rest. To find that out, we can use the Tukey-Kramer multiple comparison procedure.
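As a sketch of how the Tukey-Kramer procedure compares a pair of groups $j$ and $j'$, the critical range is commonly written as below. The notation is assumed here: $MSW$ is the within-group mean square from the ANOVA, $n_j$ and $n_{j'}$ are the two sample sizes, and $Q_\alpha$ is the upper-tail critical value of the Studentized range distribution with $k$ and $n - k$ degrees of freedom.

$$
\text{Critical range} = Q_\alpha \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)}
$$

Two group means are judged significantly different when $|\bar{x}_j - \bar{x}_{j'}|$ exceeds this critical range.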

4. Applying the Excel ANOVA Tool

Excel provides an efficient tool for ANOVA: ‘ANOVA: Single Factor’ under the ‘Data Analysis’ options. By specifying the input range of the data and the level of significance, we can easily obtain the result.

For our example, the tool reports a P-value of 0.011583, which is smaller than our chosen level of significance, 0.05. This leads to the same conclusion as comparing the F-statistic with its critical value: we reject the null hypothesis.
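The two decision rules (F-statistic versus critical value, and P-value versus level of significance) can be sketched in Python without extra libraries for the special case of exactly three groups. With three groups the between-group degrees of freedom equal 2, and the F-distribution then has a simple closed-form tail probability. The function names and the input values below are hypothetical, and the formulas are not valid for other numbers of groups.

```python
def f_p_value_three_groups(f_stat, df_within):
    """Upper-tail P-value of F(2, df_within); valid only for three groups."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

def f_critical_three_groups(alpha, df_within):
    """Critical value of F(2, df_within) at significance level alpha."""
    return df_within / 2 * (alpha ** (-2 / df_within) - 1)

alpha = 0.05
f_stat, df_within = 6.5, 12   # hypothetical F-statistic and within-group df

p_value = f_p_value_three_groups(f_stat, df_within)
f_crit = f_critical_three_groups(alpha, df_within)

# Both rules always agree: F > F crit exactly when P-value < alpha.
reject_by_f = f_stat > f_crit
reject_by_p = p_value < alpha
print(f"F crit = {f_crit:.4f}, P-value = {p_value:.4f}")
print("reject null hypothesis:", reject_by_f and reject_by_p)
```

For more than three groups the tail probability has no such simple form, so a statistics package or a tool such as Excel is the practical choice.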

References:

ArmstrongPSYC2190 (2013) ‘One way ANOVA’. https://youtu.be/kHwlB_j7Hkc.

Evans, J. (2017) Business Analytics. England: Pearson.