# ANOVA

The ANOVA (Analysis of Variance) test examines whether the means of two or more groups differ on a single variable. As an example, take the test scores of students in an internal semester exam and in an external final exam. ANOVA treats each exam as a group and asks whether the average score differs between the groups by more than the variation within each group would explain.

There are several types of ANOVA. The simplest is the one-way between-groups ANOVA. The example above corresponds to this type: there is one factor, namely the exam, and the score is the outcome being compared. The two samples are treated as independent of each other; that is, the scores in the internal exam are assumed to have no bearing on the scores in the final exam. The second type is the repeated-measures ANOVA, which is normally used to measure changes in the same subjects over a period of time. Suppose we administer a drug to a group of patients and record their response on Day 1, Day 2 and Day 3. In this case the samples are dependent on each other: the outcome on Day 1 could have a bearing on the outcome on Day 2.
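The one-way between-groups case can be sketched in a few lines of plain Python. This is a minimal illustration, not a full statistical routine: the function name `one_way_anova_f` and the scores are hypothetical, and the sketch computes only the F statistic and its degrees of freedom, leaving the p-value to a statistics library or an F table.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores, invented purely for illustration
internal = [62, 70, 68, 75, 66]
final = [71, 78, 74, 80, 77]
f_stat, df_b, df_w = one_way_anova_f([internal, final])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F means the group means differ by more than the within-group scatter would suggest. The sketch assumes every group has at least one score and that the scores are not all identical within every group (otherwise the within-group variation is zero and the ratio is undefined).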

A third and more complex version is the factorial ANOVA, which involves more than one factor, each with two or more levels. Suppose a drug is administered to men and women and their responses are recorded on Day 1, Day 2 and Day 3. Now we have two factors: sex, with two levels (men and women), and day, with three levels. The responses of the men and the women are independent of each other, but the response on one day can affect the response on the next. In this design, therefore, some observations are independent of each other (between groups) and others are dependent (repeated within the same subjects).

Like all statistical tests, ANOVA gives valid results only when certain assumptions hold. The first is that the sample means are normally distributed. The second is that the errors in one group have no bearing on those in another; they are all independent of each other. It is also assumed that the data contain no extreme outliers, so scores that lie far from the mean are usually removed from the sample to avoid skewing the results. The final assumption is that the population variances are equal across the groups being compared (homogeneity of variance).
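Two of these assumptions lend themselves to a quick informal check before running the test. The sketch below is an illustration only: the helper names `variance_ratio` and `flag_outliers`, the z-score cutoff of 2.5, and the data are all assumptions made for this example, and a formal analysis would normally use an established test of equal variances rather than a simple ratio.

```python
from statistics import mean, stdev, variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups.

    A ratio close to 1 is consistent with equal variances; a large ratio
    suggests the homogeneity assumption deserves a closer look.
    """
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

def flag_outliers(values, z_cutoff=2.5):
    """Return the scores lying more than z_cutoff standard deviations from the mean.

    The cutoff is a hypothetical default chosen for this sketch.
    """
    m, s = mean(values), stdev(values)
    return [x for x in values if abs(x - m) / s > z_cutoff]

# Hypothetical data, invented purely for illustration
group_a = [1, 2, 3]
group_b = [2, 4, 6]
ratio = variance_ratio([group_a, group_b])

scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 40]
suspects = flag_outliers(scores)
print(ratio, suspects)
```

Flagged scores are candidates for inspection, not automatic deletion; whether to remove them depends on why they are extreme. Both helpers assume each group has at least two scores and some variation, since the sample variance is otherwise undefined or zero.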