**Multiple Comparisons**

Often in the context of planning an experiment or analyzing data after an experiment has been completed, we find that comparison of specific pairs or larger groups of treatment means are of greater interest than the simple question posed by an analysis of variance - do at least two treatment means differ? It may be that embedded in a group of treatments there is only one "control" treatment to which every other treatment should be compared, and comparisons among the non-control treatments may be uninteresting. One may also, after performing an analysis of variance and rejecting the null hypothesis of equality of treatment means want to know exactly which treatments or groups of treatments differ. To answer these kinds of questions requires careful consideration of the hypotheses of interest both before and after an experiment is conducted, the Type I error rate selected for each hypothesis, the power of each hypothesis test, and the Type I error rate acceptable for the group of hypotheses as a whole.

**Comparisons or Contrasts**

If we let $\bar{y}_i$ represent a treatment mean and $c_i$ a weight associated with the $i$th treatment mean, then a contrast $\hat{\psi}$ takes the form

$$\hat{\psi} = \sum_{i=1}^{k} c_i \bar{y}_i,$$

where $\sum_{i=1}^{k} c_i = 0$. It can be seen that this contrast is a linear combination of treatment means (other contrasts such as quadratic and cubic are also possible). All of the following are possible comparisons:

$$\bar{y}_1 - \bar{y}_2$$

$$\bar{y}_1 - \tfrac{1}{2}(\bar{y}_2 + \bar{y}_3)$$

$$\tfrac{1}{2}(\bar{y}_1 + \bar{y}_2) - \tfrac{1}{2}(\bar{y}_3 + \bar{y}_4)$$

$$\bar{y}_2 - \bar{y}_3$$

because they are weighted linear combinations of treatment means and the weights sum to zero.
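The definition above can be checked numerically. The sketch below evaluates a few contrasts as weighted sums of treatment means; the means and weights are invented for illustration, and the only requirement enforced is that each set of weights sums to zero.

```python
# Hypothetical treatment means (invented for illustration).
means = [12.0, 15.0, 11.0, 14.0]            # ybar_1 .. ybar_4

# Each contrast is a list of weights c_i; a valid contrast's weights sum to zero.
contrasts = {
    "mean 1 vs mean 2":             [1, -1, 0, 0],
    "mean 1 vs average of 2 and 3": [1, -0.5, -0.5, 0],
    "first pair vs second pair":    [0.5, 0.5, -0.5, -0.5],
}

for name, c in contrasts.items():
    assert abs(sum(c)) < 1e-12, "weights must sum to zero"
    psi = sum(w * m for w, m in zip(c, means))  # psi_hat = sum of c_i * ybar_i
    print(f"{name}: psi_hat = {psi:g}")
```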

For example, previously we have performed comparisons between two treatment means using the *t*-statistic:

$$t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

with $(n_1 + n_2) - 2$ degrees of freedom, where $s_p^2$ is the pooled variance. The numerator of this statistic is a "contrast." It follows the general form of the contrast outlined above with the weights $c_1$ and $c_2$ equal to 1 and -1, respectively:

$$\hat{\psi} = (1)\bar{y}_1 + (-1)\bar{y}_2 = \bar{y}_1 - \bar{y}_2.$$
However, we also see that this contrast is divided by the pooled within-cell or within-group variation. So, a contrast test statistic is actually the ratio of a linear combination of weighted means to an estimate of the pooled within-cell or error variation in the experiment:

$$t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{MS_{\text{within}}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

with $n_1 + n_2 - 2$ degrees of freedom. For a non-directional null hypothesis *t* could be replaced by *F*:

$$F = t^2 = \frac{(\bar{y}_1 - \bar{y}_2)^2}{MS_{\text{within}}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$$

with 1 and $n_1 + n_2 - 2$ degrees of freedom. In general, a contrast test is the ratio of a linear combination of weighted means to the square root of the mean square within cells times the sum of the squared weights assigned to each mean, each divided by the sample size within cells:

$$t = \frac{\sum_i c_i \bar{y}_i}{\sqrt{MS_{\text{within}} \sum_i \dfrac{c_i^2}{n_i}}}$$

where the $c_i$'s are the weights assigned to each treatment mean, $\sum_i c_i = 0$, with the error degrees of freedom $N - k$, or

$$F = \frac{\left(\sum_i c_i \bar{y}_i\right)^2}{MS_{\text{within}} \sum_i \dfrac{c_i^2}{n_i}}$$

with 1 and $N - k$ degrees of freedom. More generally, where $\hat{\psi} = \sum_i c_i \bar{y}_i$ indicates the contrast,

$$F = \frac{\hat{\psi}^2}{MS_{\text{within}} \sum_i \dfrac{c_i^2}{n_i}}$$

with 1 and $N - k$ degrees of freedom.
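The contrast test described above can be sketched end to end. The data below are invented for three hypothetical treatment groups; the code computes the pooled within-group mean square, the contrast, its standard error, and the resulting *t* and *F* statistics.

```python
import math

# Hypothetical data for three treatment groups (all numbers invented).
groups = [
    [23.1, 25.4, 24.2, 26.0],   # treatment 1
    [27.5, 29.1, 28.3, 30.2],   # treatment 2
    [24.8, 26.2, 25.5, 27.1],   # treatment 3
]
c = [1, -0.5, -0.5]             # contrast: treatment 1 vs the mean of 2 and 3
assert abs(sum(c)) < 1e-12      # weights must sum to zero

k = len(groups)                 # number of treatments
N = sum(len(g) for g in groups) # total sample size
means = [sum(g) / len(g) for g in groups]

# Pooled within-group (error) mean square: SS_within / (N - k)
ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
ms_within = ss_within / (N - k)

# Contrast and its standard error
psi = sum(ci * m for ci, m in zip(c, means))
se = math.sqrt(ms_within * sum(ci ** 2 / len(g) for ci, g in zip(c, groups)))

t = psi / se                    # t with N - k degrees of freedom
F = t ** 2                      # F with 1 and N - k degrees of freedom
print(f"psi_hat = {psi:.4f}, t = {t:.3f}, F = {F:.3f}")
```

A p-value would then be read from the *t* or *F* distribution with the stated degrees of freedom (e.g., via `scipy.stats` if available).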

The *F*-statistic outlined above provides a parametric test of the null hypothesis that the contrasted means are equal. Similar statistics can be elaborated for rank-based non-parametric tests. Hollander and Wolfe (1973) outline several non-parametric contrast estimators.

**Experiment- and Comparison-Wise Error Rates**

In an experiment where two or more comparisons are made from the data there are two distinct kinds of Type I error. The comparison-wise error rate is the probability of a Type I error set by the experimenter for evaluating each comparison. The experiment-wise error rate is the probability of making at least one Type I error when performing the whole set of comparisons. If we let $\alpha_c$ represent the comparison-wise error rate, then the experiment-wise error rate $\alpha_e$ for $j$ independent comparisons is given by the Dunn-Sidak correction:

$$\alpha_e = 1 - (1 - \alpha_c)^j.$$

An approximate estimate of the relationship between $\alpha_c$ and $\alpha_e$ is given by the Bonferroni inequality:

$$\alpha_e \le j\,\alpha_c.$$

As $j$ increases the Bonferroni approximation departs markedly from the exact calculation given by the Dunn-Sidak correction. In the table below $\alpha_c = 0.05$ and the tabulated values represent estimates of $\alpha_e$:

| *j* | Dunn-Sidak | Bonferroni |
|-----|------------|------------|
| 1   | 0.05       | 0.05       |
| 2   | 0.0975     | 0.10       |
| 3   | 0.142625   | 0.15       |
| 4   | 0.1854     | 0.20       |
| 5   | 0.2262     | 0.25       |
| 10  | 0.40126    | 0.50       |
| 20  | 0.6415     | 1.0        |
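The tabulated values follow directly from the two formulas above and can be verified with a few lines of code:

```python
# Experiment-wise error rates for alpha_c = 0.05 under the Dunn-Sidak
# (exact, assuming independence) and Bonferroni (approximate) relationships.
alpha_c = 0.05
for j in (1, 2, 3, 4, 5, 10, 20):
    sidak = 1 - (1 - alpha_c) ** j          # Dunn-Sidak: 1 - (1 - a_c)^j
    bonferroni = min(j * alpha_c, 1.0)      # Bonferroni: j * a_c (capped at 1)
    print(f"j = {j:2d}: Dunn-Sidak = {sidak:.6f}, Bonferroni = {bonferroni:.2f}")
```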

Note that the value of $\alpha_e$ estimated under the Dunn-Sidak correction assumes that all contrasts performed are mutually independent. If some of the contrasts performed are dependent, then the value of $\alpha_e$ will generally be less than $1 - (1 - \alpha_c)^j$.

The above results apply for planned or *a priori* comparisons. When comparisons are performed after the data have been examined (*a posteriori*) or subjected to an analysis of variance, then controlling the experiment-wise error rate requires an even larger penalty. If we let *m* equal the number of possible contrasts of size *g* that can be formed from *k* treatments, then

$$m = \binom{k}{g},$$

and the error rate $\alpha_m$ for this set of possible contrasts is

$$\alpha_m = 1 - (1 - \alpha_c)^m.$$
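As a numerical sketch of this penalty, assume *m* counts the ways of choosing *g* treatments from *k* (a binomial coefficient) and that the Dunn-Sidak relationship is applied with *m* in place of *j*; the values of *k* and *g* below are hypothetical.

```python
import math

k, g = 6, 2                      # hypothetical: pairwise contrasts among 6 treatments
alpha_c = 0.05

m = math.comb(k, g)              # number of possible contrasts of size g
alpha_m = 1 - (1 - alpha_c) ** m # experiment-wise rate over all m possible contrasts
print(f"m = {m}, alpha_m = {alpha_m:.4f}")
```

Even for six treatments, the family of possible pairwise contrasts drives the experiment-wise rate above 0.5 at $\alpha_c = 0.05$, which is why *a posteriori* testing demands a much smaller per-contrast $\alpha_c$.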

Which error rate should we pay most attention to in planning and analyzing experiments? This again is a matter of judgment and must be balanced against the acceptable contrast- and experiment-wise Type II error rates. Since achieving a low experiment-wise Type I error rate requires an even lower contrast-wise Type I error rate, the contrast-wise Type II error rate will be high. If it is more costly to the researcher to permit even one Type I error in a set of contrasts, then the experiment-wise error rate should be minimized. On the other hand, if failing to detect a true treatment effect is more costly, then less emphasis should be placed on minimizing the experiment-wise Type I error rate. Although no rule of thumb exists regarding an acceptable value for $\alpha_e$, I recommend that the experiment-wise Type I error rate be set at 10 to 15%.

**Further Reading**

Jones, D. 1984. Use, misuse, and role of multiple-comparison procedures in ecological and agricultural entomology. Environmental Entomology 13: 635-649.

Chew, V. 1976. Comparing treatment means: a compendium. HortScience 11: 348-357.

Hays, W.L. 1981. Statistics. 3rd edition, Chapter 12. Holt, Rinehart, and Winston.

Hollander, M. and D.A. Wolfe. 1973. Nonparametric Statistical Methods. Wiley, New York.