ONESAMPLE Subcommand (NPTESTS command)

The ONESAMPLE subcommand produces one-sample nonparametric tests. The TEST keyword is required; all other keywords are optional. If ONESAMPLE is specified with none of the optional keywords, the following tests are performed automatically:

  • Categorical fields with two values are tested using a binomial test.
  • Categorical fields with more than two values are tested using a chi-square test with equal frequencies on category values found in the sample.
  • Continuous fields are tested using a Kolmogorov-Smirnov test against a normal distribution with the sample mean and standard deviation.
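
The default selection rules listed above can be sketched in Python as follows. This is only an illustration of the decision logic: the function name, the measurement-level labels, and the returned strings are hypothetical and are not part of NPTESTS.

```python
def default_onesample_test(measurement_level, values):
    """Pick the default one-sample test for a field (illustrative only).

    measurement_level is assumed to be "nominal", "ordinal", or "continuous";
    nominal and ordinal fields are treated as categorical.
    """
    if measurement_level == "continuous":
        return "kolmogorov-smirnov (normal, sample mean and SD)"
    distinct = set(values)
    if len(distinct) == 2:
        return "binomial"
    if len(distinct) > 2:
        return "chi-square (equal expected frequencies)"
    return None  # fewer than two categories: not covered by the rules above


print(default_onesample_test("nominal", ["a", "b", "a"]))
print(default_onesample_test("ordinal", [1, 2, 3, 1]))
print(default_onesample_test("continuous", [1.5, 2.5, 4.0]))
```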

TEST keyword

The TEST keyword lists the fields that you want to test.

  • Specify one or more fields. Note that certain tests are not applicable to fields of a particular measurement level; for example, the chi-square test is only performed for categorical fields. NPTESTS automatically determines which tests are applicable to which fields. See the individual keyword descriptions for details.

CHISQUARE keyword

     [CHISQUARE([EXPECTED={EQUAL**                      }])]
                          {CUSTOM(FREQUENCIES=valuelist
                                  CATEGORIES=valuelist) }

The CHISQUARE keyword produces a one-sample test that computes a chi-square statistic based on the differences between the observed and expected frequencies of categories of a field.

  • A separate chi-square test is performed for each and every categorical field specified on the TEST keyword.
  • The test specifications given on the CHISQUARE keyword apply to all chi-square tests performed.
  • If CHISQUARE is specified without any keywords, equal frequencies are expected in each category.

EXPECTED = EQUAL|CUSTOM(FREQUENCIES=valuelist CATEGORIES=valuelist). Expected frequencies.

  • The EXPECTED keyword defines how expected frequencies are derived. The default is EQUAL.
  • EQUAL produces equal frequencies among all categories in the sample. This is the default when CHISQUARE is specified without any other keywords.
  • CUSTOM allows you to specify unequal frequencies for a specified list of categories.
  • On the CATEGORIES keyword, specify a list of string or numeric values. The values in the list do not need to be present in the sample.
  • On the FREQUENCIES keyword, specify a value greater than 0 for each category, in the same order as the values on the CATEGORIES keyword. Custom frequencies are treated as ratios, so that, for example, FREQUENCIES=1 2 3 is equivalent to FREQUENCIES=10 20 30; both specify that 1/6 of the records are expected to fall into the first category on the CATEGORIES keyword, 1/3 into the second, and 1/2 into the third.
  • When CUSTOM is specified, the number of expected frequencies must match the number of category values; otherwise the test is not performed for that field.
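
The ratio interpretation of FREQUENCIES can be checked with a short Python sketch. It assumes the usual Pearson form of the chi-square statistic, the sum of (observed - expected)^2 / expected; the function names and the observed counts are hypothetical, and the code is not a description of how NPTESTS is implemented.

```python
def expected_counts(frequencies, n_records):
    """Scale custom frequencies (treated as ratios) to expected counts."""
    if any(f <= 0 for f in frequencies):
        raise ValueError("each frequency must be greater than 0")
    total = sum(frequencies)
    return [n_records * f / total for f in frequencies]


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("one expected frequency is needed per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [12, 18, 30]  # hypothetical counts for three categories
expected = expected_counts([1, 2, 3], sum(observed))

# FREQUENCIES=1 2 3 and FREQUENCIES=10 20 30 give the same expected counts.
print(expected == expected_counts([10, 20, 30], sum(observed)))
print(expected)
print(chi_square_statistic(observed, expected))
```

With 60 records, the ratios 1:2:3 translate to expected counts of 10, 20, and 30, which is the 1/6, 1/3, 1/2 split described above.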

BINOMIAL keyword

     [BINOMIAL([TESTVALUE={0.5**}]
                          {value}
         [SUCCESSCATEGORICAL={FIRST**               }]
                             {LIST(valuelist)       }
         [SUCCESSCONTINUOUS=CUTPOINT({MIDPOINT**})]
                                     {value     }
         [CLOPPERPEARSON] [JEFFREYS] [LIKELIHOOD]
      )]

The BINOMIAL keyword produces a one-sample test of whether the observed distribution of a dichotomous field is the same as what is expected from a specified binomial distribution. In addition, you can request confidence intervals.

  • A separate binomial test is performed for each and every field specified on the TEST keyword.
  • The test specifications given on the BINOMIAL keyword apply to all binomial tests performed.
  • If BINOMIAL is specified without any keywords, each categorical field is assumed to have only two values and each continuous field is dichotomized using the average of the minimum and maximum as a cut point. The distribution of each named field is compared to a binomial distribution with p (the proportion of cases expected in the first category) equal to 0.5.

TESTVALUE. Hypothesized proportion. The TESTVALUE keyword specifies the expected proportion of cases in the first category. Specify a value greater than 0 and less than 1. The default is 0.5.
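
To make the role of the test value concrete, the following Python sketch computes an exact binomial p-value for a hypothesized proportion. The two-sided rule used here (twice the smaller tail, capped at 1) is one common convention and is an assumption; the p-value that NPTESTS reports may be computed differently. All names are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_two_sided_p(successes, n, test_value=0.5):
    """Exact two-sided p-value: twice the smaller tail, capped at 1."""
    if not 0 < test_value < 1:
        raise ValueError("test_value must be greater than 0 and less than 1")
    lower = sum(binomial_pmf(k, n, test_value) for k in range(successes + 1))
    upper = sum(binomial_pmf(k, n, test_value) for k in range(successes, n + 1))
    return min(1.0, 2 * min(lower, upper))


# Hypothetical data: 3 of 20 records fall in the first category.
print(binomial_two_sided_p(3, 20))                   # against the default 0.5
print(binomial_two_sided_p(3, 20, test_value=0.25))  # against a custom value
```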

SUCCESSCATEGORICAL=FIRST|LIST(valuelist). Define success for categorical fields.

  • The SUCCESSCATEGORICAL keyword specifies how "success", the data value(s) tested against the test value, is defined for categorical fields.
  • FIRST performs the binomial test using the first value found in the sample to define "success". This option is only applicable to nominal or ordinal fields with only two values; all other categorical fields specified on a ONESAMPLE subcommand where FIRST is used will not be tested. This is the default.
  • LIST performs the binomial test using the specified list of values to define "success". Specify a list of string or numeric values. The values in the list do not need to be present in the sample.

SUCCESSCONTINUOUS=CUTPOINT (MIDPOINT|value). Define success for continuous fields. The SUCCESSCONTINUOUS keyword specifies how "success", the data value(s) tested against the test value, is defined for continuous fields.

  • CUTPOINT defines values that are equal to or less than the cut point as "success". MIDPOINT sets the cut point at the average of the minimum and maximum values. Alternatively, specify a value for the cut point. The default is MIDPOINT.
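
The cut point rule can be illustrated with a few lines of Python. The function name and the sample values are hypothetical; the sketch only shows how values equal to or less than the cut point are counted as "success" and how MIDPOINT is derived from the minimum and maximum.

```python
def dichotomize(values, cut_point=None):
    """Flag each value as success (True) when it is <= the cut point.

    When no cut point is given, use the midpoint of the minimum and maximum,
    mirroring the MIDPOINT default.
    """
    if cut_point is None:
        cut_point = (min(values) + max(values)) / 2
    return [v <= cut_point for v in values], cut_point


values = [2.0, 3.5, 9.0, 10.0, 4.0]   # hypothetical continuous field
flags, cut = dichotomize(values)
print(cut)                       # midpoint of 2.0 and 10.0
print(sum(flags), len(flags))    # successes and number of records

flags_custom, _ = dichotomize(values, cut_point=3.5)
print(sum(flags_custom))         # 3.5 itself counts as success
```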

CLOPPERPEARSON. Exact interval based on the cumulative binomial distribution.

JEFFREYS. Bayesian interval based on the posterior distribution of p using the Jeffreys prior.

LIKELIHOOD. Interval based on the likelihood function for p.
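
As an illustration of the first of these intervals, the sketch below computes a Clopper-Pearson interval by inverting the cumulative binomial distribution with bisection. It is a minimal version under stated assumptions: the `level` parameter and all function names are hypothetical, and the Jeffreys and likelihood intervals are not shown.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(func, target, increasing):
    """Bisection on [0, 1] for func(p) == target; func must be monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, level=0.95):
    """Exact interval for a binomial proportion (illustrative sketch)."""
    alpha = 1 - level
    # Lower bound solves P(X >= successes | p) = alpha / 2 (increasing in p).
    lower = 0.0 if successes == 0 else _solve(
        lambda p: 1 - binom_cdf(successes - 1, n, p), alpha / 2, True)
    # Upper bound solves P(X <= successes | p) = alpha / 2 (decreasing in p).
    upper = 1.0 if successes == n else _solve(
        lambda p: binom_cdf(successes, n, p), alpha / 2, False)
    return lower, upper


print(clopper_pearson(3, 20))
```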

KOLMOGOROV_SMIRNOV keyword

     [KOLMOGOROV_SMIRNOV(
         [NSAMPLES={1000**}{integer}]
         [MC_CILEVEL={99**}{value}]
         [NORMAL={SAMPLE**(SIMULATION={TRUE**}{FALSE})}
                 {CUSTOM(MEAN=value
                         SD=value   )}]
         [UNIFORM={SAMPLE**            }]
                  {CUSTOM(MIN=value
                          MAX=value   )}
         [EXPONENTIAL={SAMPLE**           }]
                      {CUSTOM(MEAN=value )}
         [POISSON=CUSTOM(MEAN=value )]
      )]

The KOLMOGOROV_SMIRNOV keyword produces a one-sample test of whether the sample cumulative distribution function for a field is consistent with a uniform, normal, Poisson, or exponential distribution.

  • A separate Kolmogorov-Smirnov test is performed for each and every continuous and ordinal field specified on the TEST keyword.
  • The test specifications given on the KOLMOGOROV_SMIRNOV keyword apply to all Kolmogorov-Smirnov tests performed.
  • If KOLMOGOROV_SMIRNOV is specified without any keywords, each field is tested against a normal distribution using its sample mean and sample standard deviation.
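
The default comparison against a normal distribution can be sketched in Python as the largest gap between the empirical distribution function and the normal cumulative distribution function. The sketch assumes the sample standard deviation uses the n - 1 denominator; the function names are hypothetical, and the code computes only the test statistic, not a p-value.

```python
from math import erf, sqrt
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def ks_statistic_normal(values, mu=None, sigma=None):
    """Largest distance between the empirical CDF and a normal CDF.

    When mu or sigma is omitted, the sample mean or sample standard
    deviation is used, mirroring the default described above.
    """
    xs = sorted(values)
    n = len(xs)
    mu = mean(xs) if mu is None else mu
    sigma = stdev(xs) if sigma is None else sigma
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


data = [4.1, 5.0, 5.3, 5.9, 6.4, 7.2, 9.8]         # hypothetical field values
print(ks_statistic_normal(data))                    # sample mean and SD
print(ks_statistic_normal(data, mu=5.0, sigma=1.0)) # user-specified parameters
```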

NSAMPLES=integer. Number of Monte Carlo samples. NSAMPLES resets the number of replicates that the Lilliefors test uses for Monte Carlo sampling. The default is 1000.

MC_CILEVEL=value. Monte Carlo confidence interval level. MC_CILEVEL resets the level of the confidence interval that is estimated by the Kolmogorov-Smirnov test. The default is 99.

NORMAL=SAMPLE (SIMULATION=boolean)|CUSTOM (MEAN=value SD=value). Normal distribution. SAMPLE uses the observed mean and standard deviation and is the default. SIMULATION controls whether Monte Carlo simulation is used to conduct the Lilliefors test for the normal distribution when the parameters are not specified; the default is TRUE. CUSTOM allows you to specify the mean and standard deviation.
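
The idea behind a Monte Carlo Lilliefors test can be sketched as follows: when the normal parameters are estimated from the sample, the reference distribution of the Kolmogorov-Smirnov statistic is approximated by simulating normal samples of the same size and re-estimating the parameters each time. This is a generic sketch of that technique, not the NPTESTS algorithm; `n_samples` plays the role of NSAMPLES, and the `seed` parameter and function names are hypothetical.

```python
import random
from math import erf, sqrt
from statistics import mean, stdev


def ks_normal_estimated(values):
    """K-S distance between the data and a normal fitted to the same data."""
    xs = sorted(values)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def lilliefors_monte_carlo_p(values, n_samples=1000, seed=0):
    """Share of simulated normal samples whose statistic reaches the observed one."""
    rng = random.Random(seed)
    d_obs = ks_normal_estimated(values)
    n = len(values)
    hits = 0
    for _ in range(n_samples):
        simulated = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if ks_normal_estimated(simulated) >= d_obs:
            hits += 1
    return hits / n_samples


skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]   # hypothetical, clearly non-normal
print(lilliefors_monte_carlo_p(skewed))
```

Because the estimate is a proportion over a finite number of replicates, it carries sampling error of its own, which is why a confidence level for the Monte Carlo estimate is a meaningful setting.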

UNIFORM=SAMPLE|CUSTOM (MIN=value MAX=value). Uniform distribution. SAMPLE uses the observed minimum and maximum; CUSTOM allows you to specify values.

POISSON=CUSTOM (MEAN=value). Poisson distribution. CUSTOM allows you to specify a mean value.

EXPONENTIAL=SAMPLE|CUSTOM (MEAN=value). Exponential distribution. SAMPLE uses the observed mean; CUSTOM allows you to specify a value.

RUNS keyword

     [RUNS([GROUPCATEGORICAL={SAMPLE**                  }]
                             {LIST(valuelist)           }
           [GROUPCONTINUOUS=CUTPOINT({SAMPLEMEDIAN**})]
                                     {SAMPLEMEAN    }
                                     {value         }
      )]

The RUNS keyword produces a one-sample test of whether the sequence of values of a dichotomized field is random.

  • A separate runs test is performed for each and every field specified on the TEST keyword.
  • The test specifications given on the RUNS keyword apply to all runs tests performed.
  • If RUNS is specified without any keywords, each categorical field is assumed to have only two values and each continuous field is dichotomized using the sample median as a cut point.

GROUPCATEGORICAL=SAMPLE|LIST(valuelist). Determine groups for categorical fields. SAMPLE is the default.

  • SAMPLE performs the runs test using the values found in the sample to define the groups. This option is only applicable to nominal or ordinal fields with only two values; all other categorical fields specified on a ONESAMPLE subcommand where SAMPLE is used will not be tested.
  • LIST performs the runs test using the specified list of values to define one of the groups. All other values in the sample define the other group. The values in the list do not all need to be present in the sample, but at least one record must be in each group.

GROUPCONTINUOUS=CUTPOINT (SAMPLEMEDIAN | SAMPLEMEAN | value). Determine groups for continuous fields. CUTPOINT defines values that are equal to or less than the cut point as the first group; all other values define the other group. SAMPLEMEDIAN sets the cut point at the sample median. SAMPLEMEAN sets the cut point at the sample mean. Alternatively, specify a value for the cut point. The default is SAMPLEMEDIAN.
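
How a continuous field is dichotomized and scanned for runs can be sketched in Python. The function name and data are hypothetical, and the check that both groups are non-empty mirrors the requirement stated for LIST; applying it to continuous fields is an assumption. The sketch only counts runs and does not compute a significance value.

```python
from statistics import median


def count_runs(values, cut_point=None):
    """Count runs after splitting values at the cut point.

    Values equal to or less than the cut point form the first group; the
    sample median is used when no cut point is given.
    """
    if cut_point is None:
        cut_point = median(values)
    groups = [v <= cut_point for v in values]
    if all(groups) or not any(groups):
        raise ValueError("at least one record must be in each group")
    # A new run starts wherever neighboring records fall in different groups.
    return 1 + sum(1 for a, b in zip(groups, groups[1:]) if a != b)


alternating = [1, 5, 2, 6, 3, 7]   # hypothetical sequence, median 4
clustered = [1, 2, 3, 7, 8, 9]     # hypothetical sequence, median 5
print(count_runs(alternating))     # many runs
print(count_runs(clustered))       # few runs
print(count_runs(alternating, cut_point=2))
```

Both very many and very few runs, relative to what the group sizes would allow, are evidence against a random sequence.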

WILCOXON keyword

     [WILCOXON(TESTVALUE=value)]

The WILCOXON keyword produces a one-sample test of the median value of a field.

  • A separate Wilcoxon test is performed for each and every continuous and ordinal field specified on the TEST keyword.
  • The test specifications given on the WILCOXON keyword apply to all Wilcoxon tests performed.
  • The TESTVALUE keyword is required.

TESTVALUE=value. Hypothesized median. The Wilcoxon test is performed using the specified value. The TESTVALUE keyword is required. There is no default.
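
The role of the hypothesized median can be seen in a short Python sketch of the signed-rank sums. It follows the textbook form of the Wilcoxon signed-rank test, dropping records equal to the test value and giving tied absolute differences their average rank; both conventions are assumptions here, the function name is hypothetical, and no p-value is computed.

```python
def wilcoxon_signed_rank(values, test_value):
    """Return (W+, W-): rank sums of positive and negative differences."""
    # Differences from the hypothesized median; zero differences are dropped.
    diffs = [v - test_value for v in values if v != test_value]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return w_plus, w_minus


print(wilcoxon_signed_rank([1, 2, 3, 4, 5], test_value=3))  # (5.0, 5.0)
print(wilcoxon_signed_rank([4, 5, 6], test_value=3))
```

When the test value is close to the true median, the two rank sums are similar; a test value far from the data pushes nearly all of the rank mass to one side.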