Tukey's test: what it consists of, example case, solved exercise

Charles McCarthy

The Tukey test is a method for comparing the individual means from an analysis of variance of several samples subjected to different treatments.

The test, presented in 1949 by John W. Tukey, allows us to discern whether the results obtained are significantly different or not. It is also known as Tukey's honestly significant difference test (Tukey's HSD test).

Figure 1. The Tukey test allows us to discern whether the results of three or more different treatments, applied to three or more groups with the same characteristics, have significantly and honestly different mean values.

In experiments that compare three or more different treatments applied to the same number of samples, it is necessary to discern whether the results are significantly different or not.

An experiment is said to be balanced when the sample size is the same for every treatment. When the sample sizes differ between treatments, the experiment is unbalanced.

Sometimes an analysis of variance (ANOVA) is not enough. ANOVA tells us whether the treatments (or experiments) applied to several samples satisfy the null hypothesis (Ho: "all treatments are equal") or, on the contrary, the alternative hypothesis (Ha: "at least one of the treatments is different"), but it does not tell us which treatments differ.

Tukey's test is not the only option; there are many other tests for comparing sample means, but it is one of the best known and most widely applied.

Article index

  • 1 Tukey comparator and table
    • 1.1 Unbalanced experiments
  • 2 Example case
  • 3 Solved exercise
  • 4 References

Tukey comparator and table

In applying this test, a value w called the Tukey comparator is calculated, defined as follows:

w = q √ (MSE / r)

Here the factor q is obtained from a table (Tukey's table) whose columns correspond to the number of treatments or experiments and whose rows correspond to the degrees of freedom of the error. The available tables are usually given for significance levels of 0.05 and 0.01.

In this formula, the square root contains the MSE (mean square of the error) divided by r, the number of repetitions per treatment. The MSE is normally obtained from an analysis of variance (ANOVA).

When the difference between two mean values exceeds w (the Tukey comparator), the two means are concluded to be different; if the difference is smaller than w, the two samples are considered to have statistically indistinguishable means.

The number w is also known as the HSD number (Honestly Significant Difference).

This single comparator can be used only when every treatment has the same number of samples, that is, when the experiment is balanced.
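As a minimal sketch of the balanced case, the comparator and the decision rule can be written in a few lines of Python. The function names are hypothetical, and the numbers in the example (including the value of q) are illustrative placeholders rather than a table lookup or data from this article.

```python
import math


def tukey_comparator(q: float, mse: float, r: int) -> float:
    """Return w = q * sqrt(MSE / r) for a balanced experiment."""
    if r <= 0:
        raise ValueError("r must be a positive number of repetitions")
    return q * math.sqrt(mse / r)


def significantly_different(mean_a: float, mean_b: float, w: float) -> bool:
    """Two means are declared different when |mean_a - mean_b| exceeds w."""
    return abs(mean_a - mean_b) > w


# Illustrative placeholders only: q would normally be read from Tukey's table.
w = tukey_comparator(q=4.0, mse=0.04, r=4)
print(round(w, 4))
print(significantly_different(2.9, 2.3, w))  # difference of about 0.6
print(significantly_different(2.9, 2.7, w))  # difference of about 0.2
```

Keeping q as an input reflects the fact that it comes from a table, not from the data themselves.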

Unbalanced experiments

When for some reason the sample size differs between the treatments to be compared, the procedure described above changes slightly and is known as the Tukey-Kramer test.

Now a comparator w is obtained for each pair of treatments i, j:

w (i, j) = q √ (½ MSE (1 / ri + 1 / rj))

In this formula, the factor q is obtained from Tukey's table. This factor q depends on the number of treatments and the degrees of freedom of the error. ri is the number of repetitions in treatment i, while rj is the number of repetitions in treatment j.
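The pairwise comparator for unbalanced groups can be sketched as follows. The sketch uses the standard Tukey-Kramer form, q √((MSE / 2)(1 / ri + 1 / rj)), which reduces to q √(MSE / r) when both groups have r repetitions. The function name and the example numbers are assumptions made for illustration.

```python
import math


def tukey_kramer_comparator(q: float, mse: float, r_i: int, r_j: int) -> float:
    """Return w(i, j) = q * sqrt((MSE / 2) * (1 / r_i + 1 / r_j))."""
    if r_i <= 0 or r_j <= 0:
        raise ValueError("group sizes must be positive")
    return q * math.sqrt((mse / 2.0) * (1.0 / r_i + 1.0 / r_j))


# Illustrative placeholders: q would be read from Tukey's table.
q, mse = 4.0, 0.04

# With equal group sizes the result matches the balanced comparator.
balanced = tukey_kramer_comparator(q, mse, 4, 4)

# With a smaller group the comparator grows, so a larger difference
# between the two means is required to declare them different.
unbalanced = tukey_kramer_comparator(q, mse, 3, 4)

print(round(balanced, 4), round(unbalanced, 4))
```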

Example case

A rabbit breeder wants to carry out a reliable statistical study to find out which of four brands of rabbit fattening food is the most effective. For the study, he forms four groups of six rabbits each, all a month and a half old, which until then had been fed under the same conditions.

In groups A1 and A4, deaths occurred due to causes not attributable to the food: one rabbit was bitten by an insect, and in the other case the death was most likely due to a congenital defect. The groups are therefore unbalanced, and the Tukey-Kramer test must be applied.

Solved exercise

To keep the calculations short, a balanced experiment will be used for the solved exercise. The following data will be used:

In this case there are four groups corresponding to four different treatments. All groups have the same number of data points, so this is a balanced case.

To carry out the ANOVA, the tool built into the LibreOffice spreadsheet was used. Other spreadsheets, such as Excel, include a similar data analysis tool. Below is the summary table produced by the analysis of variance (ANOVA):

From the analysis of variance we also have the P value, which for this example is 2.24E-6, well below the 0.05 significance level. This leads directly to rejecting the null hypothesis that all treatments are equal.

That is, some of the treatments have different mean values, but the Tukey test is needed to determine which ones are significantly and honestly different (HSD) from a statistical point of view.

To find the number w, also known as the HSD number, we need the mean square of the error, MSE. The ANOVA gives a sum of squares within the groups of SS = 0.2 and a number of degrees of freedom within the groups of df = 16. With these data we can find the MSE:

MSE = SS / df = 0.2 / 16 = 0.0125
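When no spreadsheet summary is at hand, the same quantities can be computed directly from the raw observations. The sketch below is an illustration under stated assumptions: the function name is hypothetical and the data are invented (four groups of five values, chosen only so that the degrees of freedom match the exercise); they are not the article's data.

```python
def mean_square_error(groups):
    """Return (SS_within, df_within, MSE) for a list of groups of observations."""
    ss_within = 0.0
    n_total = 0
    for values in groups:
        group_mean = sum(values) / len(values)
        # Squared deviations of each observation from its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in values)
        n_total += len(values)
    # Degrees of freedom within groups: total observations minus number of groups.
    df_within = n_total - len(groups)
    return ss_within, df_within, ss_within / df_within


# Invented data: 4 treatments with 5 repetitions each, so df = 20 - 4 = 16.
hypothetical_groups = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7],
    [4, 5, 6, 7, 8],
]
ss, df, mse = mean_square_error(hypothetical_groups)
print(ss, df, mse)

# The article's own figures follow the same last step: SS / df.
print(0.2 / 16)
```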

Tukey's factor q must also be found from the table. Look up column 4, which corresponds to the 4 groups or treatments being compared, and row 16, since the ANOVA yielded 16 degrees of freedom within the groups. This gives q = 4.05 at a significance level of 0.05, that is, 95% confidence. Finally, the value of the "honestly significant difference" is found:

w = HSD = q √ (MSE / r) = 4.05 √ (0.0125 / 5) = 0.2025

To determine which groups or treatments are honestly different, the mean value of each treatment must be known:

The differences between the mean values of each pair of treatments are also needed, as shown in the following table:

It is concluded that the best treatments, in terms of maximizing the result, are T1 and T3, which are statistically indistinguishable. To choose between T1 and T3, one would have to consider other factors unrelated to the analysis presented here, such as price or availability.
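The pairwise comparison behind a conclusion of this kind can be sketched as follows. The treatment means and the threshold are hypothetical placeholders, chosen only so that T1 and T3 come out indistinguishable; they are not the values from the exercise, and the function name is an assumption.

```python
from itertools import combinations


def pairwise_comparisons(means, hsd):
    """Return (a, b, |difference|, differs?) for every pair of treatments."""
    rows = []
    for (a, mean_a), (b, mean_b) in combinations(means.items(), 2):
        diff = abs(mean_a - mean_b)
        rows.append((a, b, diff, diff > hsd))
    return rows


# Hypothetical means and threshold; substitute the values for your own data.
hypothetical_means = {"T1": 2.9, "T2": 2.4, "T3": 2.8, "T4": 2.1}
hsd = 0.21

table = pairwise_comparisons(hypothetical_means, hsd)
for a, b, diff, differs in table:
    verdict = "different" if differs else "not significantly different"
    print(f"{a} vs {b}: |difference| = {diff:.2f} -> {verdict}")
```

With k treatments there are k(k − 1)/2 pairs to check, six in the case of four treatments.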

References

  1. Cochran, W. and Cox, G. 1974. Experimental designs. Trillas, Mexico. Third reprint. 661p.
  2. Snedecor, G.W. and Cochran, W.G. 1980. Statistical methods. Seventh Ed. Iowa, The Iowa State University Press. 507p.
  3. Steel, R.G.D. and Torrie, J.H. 1980. Principles and procedures of Statistics: A Biometrical Approach (2nd Ed.). McGraw-Hill, New York. 629p.
  4. Tukey, J.W. 1949. Comparing individual means in the analysis of variance. Biometrics, 5: 99-114.
  5. Wikipedia. Tukey's test. Retrieved from: en.wikipedia.com
