Testing Homogeneity of Covariance Matrices

The multivariate approach assumes that the vector of dependent variables follows a multivariate normal distribution and that the variance-covariance matrices are equal across the cells formed by the between-subjects effects. Box's M tests the null hypothesis that the observed covariance matrices of the dependent variables are equal across groups. The Box's M test statistic is transformed to an F statistic with df1 and df2 degrees of freedom. Here, the significance value of the test is less than 0.05, so the null hypothesis of equal covariance matrices is rejected; the assumption is not met, and thus the model results are suspect. Box's M is sensitive to sample size: when there are a large number of cases, it can detect even small departures from homogeneity. It can also be sensitive to departures from the assumption of normality. As an additional check of the diagonals of the covariance matrices, look at Levene's tests.
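To make the mechanics of Box's M concrete, here is a minimal sketch in Python. It is not the procedure's own implementation: the function name box_m and the input layout (one array per cell, rows as cases, columns as dependent variables) are assumptions made for illustration, and the sketch uses the simpler chi-square approximation rather than the F transformation described above.

```python
import numpy as np
from scipy import stats


def box_m(groups):
    """Box's M with a chi-square approximation (illustrative sketch).

    groups: list of 2-D arrays, one per cell, shape (n_i, p).
    Returns (M, chi2 statistic, degrees of freedom, p-value).
    """
    g = len(groups)                      # number of cells
    p = groups[0].shape[1]               # number of dependent variables
    n = np.array([x.shape[0] for x in groups])
    N = n.sum()

    # Per-cell covariance matrices and their pooled (weighted) average
    covs = [np.cov(x, rowvar=False) for x in groups]
    pooled = sum((ni - 1) * s for ni, s in zip(n, covs)) / (N - g)

    def logdet(a):
        # slogdet returns (sign, log|det|); covariance matrices have sign +1
        return np.linalg.slogdet(a)[1]

    # M compares the pooled log-determinant with the per-cell ones
    M = (N - g) * logdet(pooled) - sum(
        (ni - 1) * logdet(s) for ni, s in zip(n, covs)
    )

    # Box's scaling constant for the chi-square approximation
    c1 = (np.sum(1.0 / (n - 1)) - 1.0 / (N - g)) * (
        (2 * p**2 + 3 * p - 1) / (6.0 * (p + 1) * (g - 1))
    )
    chi2_stat = (1 - c1) * M
    df = p * (p + 1) * (g - 1) / 2
    p_value = stats.chi2.sf(chi2_stat, df)
    return M, chi2_stat, df, p_value
```

A small p-value from this function plays the same role as the significance value discussed above: it indicates that the cell covariance matrices differ. The same caution applies, since with many cases even minor differences will produce a small p-value.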

The Levene's test table tests equality of the error variances across the cells defined by the combination of factor levels. A separate test is performed for each dependent variable. The significance value for Length of stay is greater than 0.10, so there is no reason to believe that the equal variances assumption is violated for this variable. However, the significance value for Treatment costs is less than 0.05, indicating that the equal variances assumption is violated for this variable. Like Box's M, Levene's test can be sensitive to large data files, so look at the spread-versus-level plot for Treatment costs for visual confirmation.
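The per-variable structure of Levene's test can be sketched with scipy.stats.levene. The helper name levene_by_variable and the data layout (one array per cell, one column per dependent variable) are hypothetical. The sketch centers on the cell mean, which is the classic form of Levene's test; SciPy's default centers on the median instead, so the choice is passed explicitly.

```python
import numpy as np
from scipy import stats


def levene_by_variable(cells, names):
    """Run one Levene test per dependent variable (illustrative sketch).

    cells: list of 2-D arrays, one per cell, shape (n_i, p).
    names: list of p labels, one per column.
    Returns {name: (statistic, p-value)}.
    """
    results = {}
    for j, name in enumerate(names):
        # Collect column j from every cell and compare their spreads
        samples = [cell[:, j] for cell in cells]
        stat, p_value = stats.levene(*samples, center="mean")
        results[name] = (stat, p_value)
    return results
```

Each entry in the result corresponds to one row of the table described above: a small p-value for a variable suggests that its error variance differs across cells.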

The spread-versus-level plot is a scatterplot of the cell means and standard deviations. It provides a visual check of the equal variances assumption, with the added benefit of helping you to assess whether violations of the assumption are due to a relationship between the cell means and standard deviations. This plot agrees with the result of Levene's test, that the equal variances assumption is violated for Treatment costs. There is also a clear positive relationship in the scatterplot, showing that as the cell mean increases, so does the variability. This relationship suggests a possible solution to the problem. Since Treatment costs is a positive-valued variable, you could propose that the error term has a multiplicative, rather than additive, effect on cost. Instead of modeling Treatment costs, you will analyze Log-cost.
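The reasoning behind the log transformation can be illustrated with simulated data. The sketch below is not based on the case-study data: the cell levels, sample size, and error spread are made-up values, and spread_vs_level is a hypothetical helper that returns the numbers a spread-versus-level plot would display. Costs are generated with a multiplicative error, so the standard deviation grows with the cell mean on the raw scale and becomes roughly constant on the log scale.

```python
import math
import random
import statistics


def spread_vs_level(cells):
    """Return (means, standard deviations), one pair of values per cell."""
    means = [statistics.mean(c) for c in cells]
    sds = [statistics.stdev(c) for c in cells]
    return means, sds


rng = random.Random(0)
levels = [100.0, 400.0, 1600.0]  # hypothetical typical cost in each cell

# Multiplicative error model: cost = level * exp(noise)
cells = [[m * math.exp(rng.gauss(0, 0.3)) for _ in range(200)] for m in levels]

# Raw scale: the spread increases with the level
means, sds = spread_vs_level(cells)

# Log scale: log(cost) = log(level) + noise, so the spread is stable
log_cells = [[math.log(x) for x in c] for c in cells]
log_means, log_sds = spread_vs_level(log_cells)

for m, s, ls in zip(means, sds, log_sds):
    print(f"mean={m:9.1f}  sd={s:8.1f}  sd(log)={ls:.3f}")
```

Plotting sds against means would reproduce the positive relationship described above, while plotting log_sds against log_means should show no clear trend. That is the pattern to look for after switching the analysis to Log-cost.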