POSTHOC Subcommand (UNIANOVA command)

POSTHOC produces multiple comparisons between the means of a factor. These comparisons are usually not planned at the beginning of the study but are suggested by the data in the course of the study.

Post hoc tests are computed for the dependent variable. The alpha value used in the tests can be specified by using the keyword ALPHA on the CRITERIA subcommand. The default alpha value is 0.05. The confidence level for any confidence interval constructed is (1−α) × 100. The default confidence level is 95.
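
As a minimal sketch of how the alpha setting interacts with POSTHOC, the command below requests Tukey comparisons at alpha = 0.01. The variable names score and group are hypothetical and stand in for your own dependent variable and factor.

```spss
* Hypothetical variables: score (dependent), group (factor).
* ALPHA(.01) gives 99% confidence intervals, since (1 - .01) x 100 = 99.
UNIANOVA score BY group
  /POSTHOC=group(TUKEY)
  /CRITERIA=ALPHA(.01)
  /DESIGN=group.
```

Omitting the CRITERIA subcommand leaves alpha at its default of 0.05, which corresponds to 95% confidence intervals.
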
Only factors appearing in the factor list are valid in this subcommand. Individual factors can be specified.
You can specify one or more effects to be tested. Only fixed main effects appearing or implied on the DESIGN subcommand are valid test effects.
Optionally, you can specify an effect defining the error term following the keyword VS after the test specification. The error effect can be any single effect in the design that is not the intercept or a main effect named on a POSTHOC subcommand.
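
The VS specification can be sketched as follows, assuming a two-factor design with hypothetical variables yield, A, and B. Here the interaction A*B is named as the error term; it qualifies because it is neither the intercept nor a main effect named on a POSTHOC subcommand.

```spss
* Hypothetical variables: yield (dependent), A and B (factors).
* Tukey tests for factor A use the A*B interaction as the error term.
UNIANOVA yield BY A B
  /POSTHOC=A(TUKEY) VS A*B
  /DESIGN=A B A*B.
```
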
A variety of multiple comparison tests are available. Some tests are designed for detecting homogeneity subsets among the groups of means, some are designed for pairwise comparisons among all means, and some can be used for both purposes.
For tests that are used for detecting homogeneity subsets of means, non-empty group means are sorted in ascending order. Means that are not significantly different are included together to form a homogeneity subset. The significance for each homogeneity subset of means is displayed. When the numbers of valid cases differ across groups, most post hoc tests use the harmonic mean of the group sizes as the sample size in the calculation. For QREGW or FREGW, individual sample sizes are used.
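
For reference, the harmonic mean of the group sizes mentioned above has the standard form shown here, where k is the number of non-empty groups and n_i is the number of valid cases in group i. The symbols and the group sizes in the worked line (10, 20, and 40) are chosen for illustration only.

```latex
\tilde{n} = \frac{k}{\sum_{i=1}^{k} \frac{1}{n_i}}
\qquad\text{e.g.}\qquad
\tilde{n} = \frac{3}{\frac{1}{10} + \frac{1}{20} + \frac{1}{40}}
          = \frac{3}{0.175} \approx 17.14
```

The harmonic mean is pulled toward the smaller groups, so it is lower than the arithmetic mean of the same sizes (about 23.33 in this illustration).
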
For tests that are used for pairwise comparisons, the display includes the difference between each pair of compared means, the confidence interval for the difference, and the significance. The sample sizes of the two groups being compared are used in the calculation.
Output for tests specified on the POSTHOC subcommand is available according to their statistical purposes. The following table illustrates the statistical purpose of the post hoc tests:

Table 1. Post hoc comparisons
Keyword	Homogeneity Subsets Detection	Pairwise Comparison and Confidence Interval
`LSD`		Yes
`SIDAK`		Yes
`BONFERRONI`		Yes
`GH`		Yes
`T2`		Yes
`T3`		Yes
`C`		Yes
`DUNNETT`		Yes^*
`DUNNETTL`		Yes^*
`DUNNETTR`		Yes^*
`SNK`	Yes
`BTUKEY`	Yes
`DUNCAN`	Yes
`QREGW`	Yes
`FREGW`	Yes
`WALLER`	Yes^†
`TUKEY`	Yes	Yes
`SCHEFFE`	Yes	Yes
`GT2`	Yes	Yes
`GABRIEL`	Yes	Yes

^* Only CIs for differences between test group means and control group means are given.

^† No significance for Waller test is given.

Tests that are designed for homogeneity subset detection display the detected homogeneity subsets and their corresponding significances.
Tests that are designed for both homogeneity subset detection and pairwise comparison display both kinds of output.
For the DUNNETT, DUNNETTL, and DUNNETTR keywords, only individual factors can be specified.
The default reference category for DUNNETT, DUNNETTL, and DUNNETTR is the last category. An integer greater than 0 within parentheses can be used to specify a different reference category. For example, POSTHOC = A (DUNNETT(2)) requests a DUNNETT test for factor A, using the second level of A as the reference category.
The keywords DUNCAN, DUNNETT, DUNNETTL, and DUNNETTR must be spelled out in full; using the first three characters alone is not sufficient.
If the REGWGT subcommand is specified, weighted means are used in performing post hoc tests.
Multiple POSTHOC subcommands are allowed. Each specification is executed independently so that you can test different effects against different error terms.
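
A minimal sketch of several POSTHOC subcommands in one command, assuming hypothetical variables yield, A, and B. Each subcommand is evaluated on its own, so the two tests below can use different error terms.

```spss
* Hypothetical variables: yield (dependent), A and B (factors).
* First POSTHOC: Dunnett test for A with level 2 as the reference category.
* Second POSTHOC: Scheffe test for B against the A*B interaction.
UNIANOVA yield BY A B
  /POSTHOC=A(DUNNETT(2))
  /POSTHOC=B(SCHEFFE) VS A*B
  /DESIGN=A B A*B.
```

Note that DUNNETT is spelled out in full and is applied to a single factor, as the rules above require.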

SNK. Student-Newman-Keuls procedure based on the Studentized range test.

TUKEY. Tukey’s honestly significant difference. This test uses the Studentized range statistic to make all pairwise comparisons between groups.

BTUKEY. Tukey’s b. Multiple comparison procedure based on the average of Studentized range tests.

DUNCAN. Duncan’s multiple comparison procedure based on the Studentized range test.

SCHEFFE. Scheffé’s multiple comparison t test.

DUNNETT(refcat). Dunnett’s two-tailed t test. Each level of the factor is compared to a reference category. A reference category can be specified in parentheses. The default reference category is the last category. This keyword must be spelled out in full.

DUNNETTL(refcat). Dunnett’s one-tailed t test. This test indicates whether the mean at any level (except the reference category) of the factor is smaller than that of the reference category. A reference category can be specified in parentheses. The default reference category is the last category. This keyword must be spelled out in full.

DUNNETTR(refcat). Dunnett’s one-tailed t test. This test indicates whether the mean at any level (except the reference category) of the factor is larger than that of the reference category. A reference category can be specified in parentheses. The default reference category is the last category. This keyword must be spelled out in full.

BONFERRONI. Bonferroni t test. This test is based on Student’s t statistic and adjusts the observed significance level for the fact that multiple comparisons are made.

LSD. Least significant difference t test. Equivalent to multiple t tests between all pairs of groups. This test does not control the overall probability of declaring some pairs of means different when they are in fact equal.

SIDAK. Sidak t test. This test provides tighter bounds than the Bonferroni test.
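
The relationship between the Bonferroni and Sidak adjustments can be illustrated with their standard textbook per-comparison levels. This is a sketch of the general idea, not a statement of the exact computation the procedure performs; m denotes the number of pairwise comparisons, and the values k = 3 and α = 0.05 are chosen for illustration.

```latex
m = \frac{k(k-1)}{2}, \qquad
\alpha_{\text{Bonferroni}} = \frac{\alpha}{m}, \qquad
\alpha_{\text{Sidak}} = 1 - (1 - \alpha)^{1/m}

% Illustration with k = 3 groups (m = 3) and \alpha = 0.05:
\alpha_{\text{Bonferroni}} = \frac{0.05}{3} \approx 0.01667, \qquad
\alpha_{\text{Sidak}} = 1 - 0.95^{1/3} \approx 0.01695
```

Because the Sidak per-comparison level is slightly larger than the Bonferroni level, the Sidak test is slightly less conservative, which is the sense in which its bounds are tighter.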

GT2. Hochberg’s GT2. Pairwise comparisons test based on the Studentized maximum modulus test. Unless the cell sizes are extremely unbalanced, this test is fairly robust even for unequal variances.

GABRIEL. Gabriel’s pairwise comparisons test based on the Studentized maximum modulus test.

FREGW. Ryan-Einot-Gabriel-Welsch’s multiple stepdown procedure based on an F test.

QREGW. Ryan-Einot-Gabriel-Welsch’s multiple stepdown procedure based on the Studentized range test.

T2. Tamhane’s T2. Pairwise comparisons test based on a t test. This test can be applied in situations where the variances are unequal. This test is invalid when there are multiple factors in the model, and the keyword is ignored with a warning in such cases.

T3. Dunnett’s T3. Pairwise comparisons test based on the Studentized maximum modulus. This test is appropriate when the variances are unequal. This test is invalid when there are multiple factors in the model, and the keyword is ignored with a warning in such cases.

GH. Games and Howell’s pairwise comparisons test based on the Studentized range test. This test can be applied in situations where the variances are unequal. This test is invalid when there are multiple factors in the model, and the keyword is ignored with a warning in such cases.

C. Dunnett’s C. Pairwise comparisons based on the weighted average of Studentized ranges. This test can be applied in situations where the variances are unequal. This test is invalid when there are multiple factors in the model, and the keyword is ignored with a warning in such cases.
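
Because T2, T3, GH, and C are valid only when the model contains a single factor, a request for them might look like the sketch below. The variable names score and group are hypothetical.

```spss
* Hypothetical variables: score (dependent), group (the only factor).
UNIANOVA score BY group
  /POSTHOC=group(T2 T3 GH C)
  /DESIGN=group.
```

Adding a second factor to the BY list would cause these four keywords to be ignored with a warning.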

WALLER(kratio). Waller-Duncan t test. This test uses a Bayesian approach. It is designed for equal sample sizes; when sample sizes are unequal, the harmonic mean of the sample sizes is used. The k-ratio is the Type 1/Type 2 error seriousness ratio. The default value is 100. You can specify an integer greater than 1 within parentheses.
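
As a brief sketch, the command below requests the Waller-Duncan test with a k-ratio of 50 instead of the default of 100. The variable names score and group are hypothetical.

```spss
* Hypothetical variables: score (dependent), group (factor).
UNIANOVA score BY group
  /POSTHOC=group(WALLER(50))
  /DESIGN=group.
```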