ONESAMPLE Subcommand (NPTESTS command)
The ONESAMPLE subcommand produces one-sample nonparametric tests. The TEST keyword is required; all other keywords are optional. If ONESAMPLE is specified with none of the optional keywords, the following tests are performed automatically:
- Categorical fields with two values are tested using a binomial test.
- Categorical fields with more than two values are tested using a chi-square test with equal frequencies on category values found in the sample.
- Continuous fields are tested using a Kolmogorov-Smirnov test against a normal distribution with the sample mean and standard deviation.
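The default selection rules above can be summarized in a short Python sketch. The function name `default_one_sample_test` and the string labels are hypothetical, chosen only for illustration; this is a reading of the rules as stated here, not a description of how NPTESTS is implemented.

```python
def default_one_sample_test(measurement_level, n_distinct_values):
    """Return the test that the default rules select for one field.

    measurement_level: "nominal", "ordinal", or "continuous" (labels assumed here).
    n_distinct_values: number of distinct values found in the sample.
    """
    if measurement_level == "continuous":
        return "kolmogorov-smirnov (normal, sample mean and sd)"
    # Nominal and ordinal fields are treated as categorical in this sketch.
    if n_distinct_values == 2:
        return "binomial"
    if n_distinct_values > 2:
        return "chi-square (equal expected frequencies)"
    # Fewer than two values: the text does not say what happens.
    return None


print(default_one_sample_test("nominal", 2))
print(default_one_sample_test("ordinal", 5))
print(default_one_sample_test("continuous", 100))
```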
TEST keyword
The TEST keyword lists the fields that you want to test.
- Specify one or more fields. Note that certain tests are not applicable to fields of a particular measurement level; for example, the chi-square test is only performed for categorical fields. NPTESTS automatically determines which tests are applicable to which fields. See the individual keyword descriptions for details.
CHISQUARE keyword
[CHISQUARE([EXPECTED={EQUAL**                                           }])]
                     {CUSTOM(FREQUENCIES=valuelist CATEGORIES=valuelist)}
The CHISQUARE keyword produces a one-sample test that computes a chi-square statistic based on the differences between the observed and expected frequencies of categories of a field.
- A separate chi-square test is performed for each categorical field specified on the TEST keyword.
- The test specifications given on the CHISQUARE keyword apply to all chi-square tests performed.
- If CHISQUARE is specified without any keywords, equal frequencies are expected in each category.
EXPECTED=EQUAL|CUSTOM(FREQUENCIES=valuelist CATEGORIES=valuelist). Expected frequencies.
- The EXPECTED keyword defines how expected frequencies are derived. The default is EQUAL.
- EQUAL produces equal frequencies among all categories in the sample. This is the default when CHISQUARE is specified without any other keywords.
- CUSTOM allows you to specify unequal frequencies for a specified list of categories.
- On the CATEGORIES keyword, specify a list of string or numeric values. The values in the list do not need to be present in the sample.
- On the FREQUENCIES keyword, specify a value greater than 0 for each category, in the same order as the categories on the CATEGORIES keyword. Custom frequencies are treated as ratios so that, for example, FREQUENCIES=1 2 3 is equivalent to FREQUENCIES=10 20 30, and both specify that 1/6 of the records are expected to fall into the first category on the CATEGORIES keyword, 1/3 into the second, and 1/2 into the third.
- When CUSTOM is specified, the number of expected frequencies must match the number of category values; otherwise the test is not performed for that field.
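The ratio behavior of custom frequencies can be checked with a small Python sketch. The helper names `expected_counts` and `chi_square_statistic` are hypothetical, and the observed counts are made-up illustration data. The statistic shown is the standard Pearson chi-square sum, which is assumed here to match what the test reports.

```python
def expected_counts(frequencies, n_records):
    """Scale custom frequencies (treated as ratios) to expected counts."""
    if any(f <= 0 for f in frequencies):
        raise ValueError("each frequency must be greater than 0")
    total = sum(frequencies)
    return [n_records * f / total for f in frequencies]


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# FREQUENCIES=1 2 3 and FREQUENCIES=10 20 30 give the same expected counts.
a = expected_counts([1, 2, 3], 60)
b = expected_counts([10, 20, 30], 60)
print(a, b)

# Illustration data: observed counts for the three listed categories.
observed = [12, 18, 30]
print(chi_square_statistic(observed, a))
```

With 60 records, both frequency lists yield expected counts of 10, 20, and 30, which are the 1/6, 1/3, and 1/2 shares described above.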
BINOMIAL keyword
[BINOMIAL([TESTVALUE={0.5**}]
                     {value}
          [SUCCESSCATEGORICAL={FIRST**        }]
                              {LIST(valuelist)}
          [SUCCESSCONTINUOUS=CUTPOINT({MIDPOINT**})]
                                      {value     }
          [CLOPPERPEARSON] [JEFFREYS] [LIKELIHOOD]
)]
The BINOMIAL keyword produces a one-sample test of whether the observed distribution of a dichotomous field is the same as what is expected from a specified binomial distribution. In addition, you can request confidence intervals.
- A separate binomial test is performed for each field specified on the TEST keyword.
- The test specifications given on the BINOMIAL keyword apply to all binomial tests performed.
- If BINOMIAL is specified without any keywords, each categorical field is assumed to have only two values and each continuous field is dichotomized using the average of the minimum and maximum as a cut point. The distribution of each named field is compared to a binomial distribution with p (the proportion of cases expected in the first category) equal to 0.5.
TESTVALUE. Hypothesized proportion. The TESTVALUE keyword specifies the expected proportion of cases in the first category. Specify a value greater than 0 and less than 1. The default is 0.5.
SUCCESSCATEGORICAL=FIRST|LIST(valuelist). Define success for categorical fields.
- The SUCCESSCATEGORICAL keyword specifies how "success", the data value(s) tested against the test value, is defined for categorical fields.
- FIRST performs the binomial test using the first value found in the sample to define "success". This option is only applicable to nominal or ordinal fields with only two values; all other categorical fields specified on a ONESAMPLE subcommand where FIRST is used will not be tested. This is the default.
- LIST performs the binomial test using the specified list of values to define "success". Specify a list of string or numeric values. The values in the list do not need to be present in the sample.
SUCCESSCONTINUOUS=CUTPOINT(MIDPOINT|value). Define success for continuous fields. The SUCCESSCONTINUOUS keyword specifies how "success", the data value(s) tested against the test value, is defined for continuous fields.
- CUTPOINT defines values that are equal to or less than the cut point as "success". MIDPOINT sets the cut point at the average of the minimum and maximum values. Alternatively, specify a value for the cut point. The default is MIDPOINT.
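The two ways of defining "success" can be sketched in Python as follows. The function names are hypothetical, and reading "the first value found in the sample" as the first value in data order is an assumption of this sketch, not something the text confirms.

```python
def successes_categorical(values, success_values=None):
    """Flag each categorical value as success (True) or not (False).

    success_values=None mimics FIRST, assumed here to mean the first value
    in data order; a list mimics LIST(valuelist).
    """
    if success_values is None:
        success_values = [values[0]]
    targets = set(success_values)
    return [v in targets for v in values]


def successes_continuous(values, cutpoint=None):
    """Flag values equal to or less than the cut point as success.

    cutpoint=None mimics MIDPOINT: the average of the minimum and maximum.
    """
    if cutpoint is None:
        cutpoint = (min(values) + max(values)) / 2
    return [v <= cutpoint for v in values]


flags = successes_continuous([1, 5, 9])       # midpoint is 5.0
print(flags, sum(flags) / len(flags))         # observed proportion of successes
print(successes_categorical(["a", "b", "a"]))
print(successes_categorical([1, 2, 3, 2], success_values=[2, 3]))
```

The observed proportion of successes is the quantity compared against TESTVALUE.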
CLOPPERPEARSON. Exact interval based on the cumulative binomial distribution.
JEFFREYS. Bayesian interval based on the posterior distribution of p using the Jeffreys prior.
LIKELIHOOD. Interval based on the likelihood function for p.
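As an illustration of the first interval type, the following Python sketch computes a Clopper-Pearson interval by inverting the cumulative binomial distribution with bisection. The function names and the 95% default level are assumptions of this sketch; the text does not state which confidence level or numerical method NPTESTS uses. The Jeffreys and likelihood intervals are not covered here.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, iterations=100):
    """Find p in [0, 1] where an increasing function crosses zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, level=0.95):
    """Exact interval for a proportion given k successes in n trials."""
    alpha = 1 - level
    # Lower bound solves P(X >= k | p) = alpha/2; the left side increases with p.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound solves P(X <= k | p) = alpha/2; negated so it increases with p.
    upper = 1.0 if k == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


print(clopper_pearson(5, 10))
```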
KOLMOGOROV_SMIRNOV keyword
[KOLMOGOROV_SMIRNOV(
   [NSAMPLES={1000**}{integer}]
   [MC_CILEVEL={99**}{value}]
   [NORMAL={SAMPLE**(SIMULATION={TRUE**}{FALSE})}
           {CUSTOM(MEAN=value SD=value)         }]
   [UNIFORM={SAMPLE**                   }]
            {CUSTOM(MIN=value MAX=value)}
   [EXPONENTIAL={SAMPLE**          }]
                {CUSTOM(MEAN=value)}
   [POISSON=CUSTOM(MEAN=value)]
)]
The KOLMOGOROV_SMIRNOV keyword produces a one-sample test of whether the sample cumulative distribution function for a field is homogeneous with a uniform, normal, Poisson, or exponential distribution.
- A separate Kolmogorov-Smirnov test is performed for each continuous and ordinal field specified on the TEST keyword.
- The test specifications given on the KOLMOGOROV_SMIRNOV keyword apply to all Kolmogorov-Smirnov tests performed.
- If KOLMOGOROV_SMIRNOV is specified without any keywords, each field is tested against a normal distribution using its sample mean and sample standard deviation.
NSAMPLES=integer. Number of Monte Carlo samples. NSAMPLES resets the number of replicates used by the Lilliefors test for Monte Carlo sampling. The default is 1000.
MC_CILEVEL=value. Monte Carlo confidence interval. MC_CILEVEL resets the confidence interval level that is estimated by the Kolmogorov-Smirnov test. The default is 99.
NORMAL=SAMPLE(SIMULATION=boolean)|CUSTOM(MEAN=value SD=value). Normal distribution. SAMPLE uses the observed mean and standard deviation. SIMULATION controls whether Monte Carlo simulation is used to conduct the Lilliefors test for the normal distribution when the parameters are not specified; the default is TRUE. CUSTOM allows you to specify the mean and standard deviation.
UNIFORM=SAMPLE|CUSTOM(MIN=value MAX=value). Uniform distribution. SAMPLE uses the observed minimum and maximum. CUSTOM allows you to specify the minimum and maximum values.
POISSON=CUSTOM(MEAN=value). Poisson distribution. CUSTOM allows you to specify a mean value.
EXPONENTIAL=SAMPLE|CUSTOM(MEAN=value). Exponential distribution. SAMPLE uses the observed mean. CUSTOM allows you to specify a mean value.
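The difference between SAMPLE and CUSTOM parameters can be illustrated with a Python sketch of the classical Kolmogorov-Smirnov statistic, the largest gap between the empirical and hypothesized cumulative distribution functions. The helper names and the three data values are hypothetical. Using the n-1 sample standard deviation for NORMAL=SAMPLE is an assumption, and the sketch does not cover the Poisson case or the Monte Carlo Lilliefors procedure.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(values, cdf):
    """Largest distance between the empirical CDF and a hypothesized CDF."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the gap just after and just before the step at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(lo, hi):
    """CDF of a uniform distribution on [lo, hi]."""
    def cdf(x):
        return min(1.0, max(0.0, (x - lo) / (hi - lo)))
    return cdf


data = [0.1, 0.5, 0.9]  # illustration data

# Analogous to UNIFORM=CUSTOM(MIN=0 MAX=1).
d_custom = ks_statistic(data, uniform_cdf(0.0, 1.0))
# Analogous to UNIFORM=SAMPLE: observed minimum and maximum.
d_sample = ks_statistic(data, uniform_cdf(min(data), max(data)))
# Analogous to NORMAL=SAMPLE: observed mean and (n-1) standard deviation.
d_normal = ks_statistic(data, NormalDist(mean(data), stdev(data)).cdf)

print(d_custom, d_sample, d_normal)
```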
RUNS keyword
[RUNS([GROUPCATEGORICAL={SAMPLE**       }]
                        {LIST(valuelist)}
      [GROUPCONTINUOUS=CUTPOINT({SAMPLEMEDIAN**})]
                                {SAMPLEMEAN    }
                                {value         }
)]
The RUNS keyword produces a one-sample test of whether the sequence of values of a dichotomized field is random.
- A separate runs test is performed for each field specified on the TEST keyword.
- The test specifications given on the RUNS keyword apply to all runs tests performed.
- If RUNS is specified without any keywords, each categorical field is assumed to have only two values and each continuous field is dichotomized using the sample median as a cut point.
GROUPCATEGORICAL=SAMPLE|LIST(valuelist). Determine groups for categorical fields. SAMPLE is the default.
- SAMPLE performs the runs test using the values found in the sample to define the groups. This option is only applicable to nominal or ordinal fields with only two values; all other categorical fields specified on a ONESAMPLE subcommand where SAMPLE is used will not be tested.
- LIST performs the runs test using the specified list of values to define one of the groups. All other values in the sample define the other group. The values in the list do not all need to be present in the sample, but at least one record must be in each group.
GROUPCONTINUOUS=CUTPOINT(SAMPLEMEDIAN|SAMPLEMEAN|value). Determine groups for continuous fields. CUTPOINT defines values that are equal to or less than the cut point as the first group; all other values define the other group. SAMPLEMEDIAN sets the cut point at the sample median. SAMPLEMEAN sets the cut point at the sample mean. Alternatively, specify a value for the cut point. The default is SAMPLEMEDIAN.
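The dichotomization and run counting described for continuous fields can be sketched in Python as follows. The function names and data are hypothetical; the expected number of runs uses the standard formula 2*n1*n2/(n1+n2) + 1 for a random sequence, which is assumed here to be the reference the test compares against.

```python
from statistics import median


def dichotomize(values, cutpoint=None):
    """True for values equal to or less than the cut point (first group).

    cutpoint=None mimics SAMPLEMEDIAN; pass statistics.mean(values) to mimic
    SAMPLEMEAN, or any number for a custom cut point.
    """
    if cutpoint is None:
        cutpoint = median(values)
    return [v <= cutpoint for v in values]


def count_runs(groups):
    """Number of maximal stretches of consecutive identical group labels."""
    if not groups:
        return 0
    return 1 + sum(1 for a, b in zip(groups, groups[1:]) if a != b)


def expected_runs(groups):
    """Expected number of runs if the order of the two groups is random."""
    n1 = sum(groups)
    n2 = len(groups) - n1
    return 2 * n1 * n2 / (n1 + n2) + 1


alternating = dichotomize([1, 5, 2, 6, 3, 7])  # median 4.0, groups alternate
clustered = dichotomize([1, 2, 3, 5, 6, 7])    # median 4.0, groups cluster
print(count_runs(alternating), count_runs(clustered), expected_runs(clustered))
```

Far more or far fewer runs than expected both suggest that the sequence is not random.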
WILCOXON keyword
[WILCOXON(TESTVALUE=value)]
The WILCOXON keyword produces a one-sample test of the median value of a field.
- A separate Wilcoxon test is performed for each continuous and ordinal field specified on the TEST keyword.
- The test specifications given on the WILCOXON keyword apply to all Wilcoxon tests performed.
- The TESTVALUE keyword is required.
TESTVALUE=value. Hypothesized median. The Wilcoxon test is performed using the specified value. The TESTVALUE keyword is required. There is no default.