Influence test

The influence test is a chi-square test that determines whether the number of records in a group differs significantly from the expected frequency. A group can be a single category or a combination of categories. The test identifies influential groups based on both the significance value and the effect size.

The influence test calculates the chi-square value from the standardized Pearson residual. The formula depends on whether a single category or a combination of two categories is considered.

For a field summary and a single category, the chi-square value is (O - N/J)^2 / [(N/J) * (1 - 1/J)]. O is the observed (actual) frequency, N is the total count, and J is the number of categories. The expected probability of each category is 1/J, so the expected frequency is N/J.
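The single-category formula can be sketched in a few lines of Python. This is an illustration only: the function name and the counts are hypothetical, and the grouping of the denominator as (N/J) * (1 - 1/J) is an assumption based on the usual definition of a standardized Pearson residual.

```python
def single_category_chi_square(observed, total, n_categories):
    """Squared standardized Pearson residual for one category.

    Assumes the formula groups as (O - N/J)^2 / ((N/J) * (1 - 1/J)).
    """
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    expected = total / n_categories  # N/J, the expected frequency
    return (observed - expected) ** 2 / (expected * (1 - 1 / n_categories))


# Hypothetical data: 100 records spread over 4 categories, 40 of them in one category.
chi2 = single_category_chi_square(observed=40, total=100, n_categories=4)
print(chi2)
```

With these counts the expected frequency is 25, so a category holding 40 records deviates by 15 records from expectation.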

For two fields and a combination of two categories, the chi-square value is (O - E)^2 / [E * (1 - Ni/N) * (1 - Nk/N)]. E is the expected frequency, E = Ni * Nk / N, where Ni is the count of the i-th category of the first field and Nk is the count of the k-th category of the second field.
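A minimal Python sketch of the two-field case follows. The function name, argument names, and counts are hypothetical, and the denominator is assumed to group as E * (1 - Ni/N) * (1 - Nk/N).

```python
def category_pair_chi_square(observed, count_i, count_k, total):
    """Squared standardized Pearson residual for a pair of categories.

    count_i: records in the i-th category of the first field (Ni)
    count_k: records in the k-th category of the second field (Nk)
    """
    expected = count_i * count_k / total  # E = Ni * Nk / N
    denominator = expected * (1 - count_i / total) * (1 - count_k / total)
    return (observed - expected) ** 2 / denominator


# Hypothetical data: 100 records, 50 in category i, 40 in category k,
# and 30 records that fall in both.
chi2_pair = category_pair_chi_square(observed=30, count_i=50, count_k=40, total=100)
print(chi2_pair)
```

Here the expected frequency is 50 * 40 / 100 = 20, so the observed count of 30 exceeds expectation by 10 records.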

The chi-square value is compared to a theoretical chi-square distribution to determine the probability of obtaining the chi-square value by chance.

  • This probability is the significance value.
  • If the significance value after a Bonferroni adjustment is less than the significance level, the group is judged to be influential. The Bonferroni adjustment is necessary because multiple chi-square tests are conducted, one for each group.
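The significance check can be sketched as follows. Several things here are assumptions rather than documented behavior: the chi-square value is treated as having 1 degree of freedom (so the tail probability can be computed with the standard library), the Bonferroni adjustment multiplies each significance value by the number of tests and caps it at 1, and the significance level of 0.05 and the group names are illustrative.

```python
import math


def chi_square_p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def influential_groups(chi2_by_group, alpha=0.05):
    """Return {group: (adjusted significance value, is_influential)}."""
    n_tests = len(chi2_by_group)  # one chi-square test per group
    results = {}
    for group, chi2 in chi2_by_group.items():
        adjusted = min(1.0, chi_square_p_value_df1(chi2) * n_tests)
        results[group] = (adjusted, adjusted < alpha)
    return results


# Hypothetical chi-square values for three groups.
flags = influential_groups({"A": 12.0, "B": 1.0, "C": 4.0})
for group, (adjusted, influential) in flags.items():
    print(group, round(adjusted, 4), influential)
```

In this example, group C would pass an unadjusted test at the 0.05 level, but it is no longer judged influential once its significance value is multiplied by the three tests.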

The top group is the group whose frequency differs most from the expected frequency. The top high group is a group whose frequency is greater than the expected frequency. The top low group is a group whose frequency is less than the expected frequency.
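One way to pick out the top high and top low groups for a single field is sketched below. The ranking rule is an assumption: groups are ranked by the squared standardized residual within each direction, which the text does not spell out. The function name and counts are hypothetical.

```python
def top_high_and_low(observed_counts):
    """Return (top_high, top_low) category names for a single field.

    Assumes every category has the same expected frequency N/J and that
    groups are ranked by their squared standardized residual.
    """
    total = sum(observed_counts.values())
    n_categories = len(observed_counts)
    expected = total / n_categories

    def chi_square(observed):
        return (observed - expected) ** 2 / (expected * (1 - 1 / n_categories))

    high = {name: chi_square(o) for name, o in observed_counts.items() if o > expected}
    low = {name: chi_square(o) for name, o in observed_counts.items() if o < expected}
    top_high = max(high, key=high.get) if high else None
    top_low = max(low, key=low.get) if low else None
    return top_high, top_low


# Hypothetical counts; the expected frequency is 100 / 4 = 25.
top_high, top_low = top_high_and_low({"A": 45, "B": 20, "C": 10, "D": 25})
print(top_high, top_low)
```

Category D matches the expected frequency exactly, so it is neither a high nor a low group in this sketch.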

The effect size measures the strength of an influential target category. It is computed as the square root of the ratio of the chi-square value to the total count, sqrt(chi-square / N). Meaningful differences highlight the categories with the highest effect size.
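The effect size computation can be sketched as below, reading the description as the square root of (chi-square value / total count). The function name and the input values are hypothetical.

```python
import math


def effect_size(chi2, total):
    """Square root of the chi-square value divided by the total count."""
    if total <= 0:
        raise ValueError("total count must be positive")
    return math.sqrt(chi2 / total)


# Hypothetical values: chi-square of 12.0 computed from 100 records.
strength = effect_size(chi2=12.0, total=100)
print(round(strength, 4))
```

Because the chi-square value is divided by the total count, the same chi-square value yields a smaller effect size in a larger data set.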