ONESAMPLE Subcommand (NPTESTS command)
The ONESAMPLE subcommand produces one-sample nonparametric tests. The
TEST keyword is required; all other keywords are optional. If
ONESAMPLE is specified with none of the optional keywords, the following tests are
performed automatically:
- Categorical fields with two values are tested using a binomial test.
- Categorical fields with more than two values are tested using a chi-square test with equal frequencies on category values found in the sample.
- Continuous fields are tested using a Kolmogorov-Smirnov test against a normal distribution with the sample mean and standard deviation.
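The default selection rules above can be sketched as a small dispatch function. This is only an illustration of the three rules as stated; the function name, its arguments, and the `None` result for fields with fewer than two values are assumptions and are not part of NPTESTS.

```python
from typing import Optional


def default_one_sample_test(is_categorical: bool, n_values: int) -> Optional[str]:
    """Name the test that the default rules above would select.

    is_categorical and n_values are hypothetical inputs describing a field;
    NPTESTS itself derives them from the measurement level and the data.
    """
    if not is_categorical:
        # Continuous fields: Kolmogorov-Smirnov against a normal distribution.
        return "kolmogorov-smirnov"
    if n_values == 2:
        # Categorical fields with two values: binomial test.
        return "binomial"
    if n_values > 2:
        # Categorical fields with more than two values: chi-square test.
        return "chi-square"
    # Assumption: the text does not cover fields with fewer than two values.
    return None


print(default_one_sample_test(True, 2))
print(default_one_sample_test(True, 5))
print(default_one_sample_test(False, 40))
```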
TEST keyword
The TEST keyword lists the fields that you want to test.
- Specify one or more fields. Note that certain tests are not applicable to fields of a particular measurement level; for example, the chi-square test is only performed for categorical fields.
- NPTESTS automatically determines which tests are applicable to which fields. See the individual keyword descriptions for details.
CHISQUARE keyword
[CHISQUARE([EXPECTED={EQUAL** }])]
{CUSTOM(FREQUENCIES=valuelist
CATEGORIES=valuelist) }
The CHISQUARE keyword produces a one-sample test that
computes a chi-square statistic based on the differences between the observed and expected
frequencies of categories of a field.
- A separate chi-square test is performed for each categorical field specified on the TEST keyword.
- The test specifications given on the CHISQUARE keyword apply to all chi-square tests performed.
- If CHISQUARE is specified without any keywords, equal frequencies are expected in each category.
EXPECTED=EQUAL|CUSTOM(FREQUENCIES=valuelist CATEGORIES=valuelist). Expected frequencies.
- The EXPECTED keyword defines how expected frequencies are derived. The default is EQUAL.
- EQUAL produces equal frequencies among all categories in the sample. This is the default when CHISQUARE is specified without any other keywords.
- CUSTOM allows you to specify unequal frequencies for a specified list of categories.
- On the CATEGORIES keyword, specify a list of string or numeric values. The values in the list do not need to be present in the sample.
- On the FREQUENCIES keyword, specify a value greater than 0 for each category, in the same order as the categories on the CATEGORIES keyword. Custom frequencies are treated as ratios so that, for example, FREQUENCIES=1 2 3 is equivalent to FREQUENCIES=10 20 30, and both specify that 1/6 of the records are expected to fall into the first category on the CATEGORIES keyword, 1/3 into the second, and 1/2 into the third.
- When CUSTOM is specified, the number of expected frequencies must match the number of category values; otherwise the test is not performed for that field.
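To illustrate how custom frequencies act as ratios, the following minimal sketch converts a FREQUENCIES list into expected proportions and computes a chi-square statistic from them. The function names are hypothetical. Ignoring records whose value is not in the category list and raising an error on a length mismatch (where NPTESTS simply skips the test) are assumptions of this sketch, not documented behavior.

```python
from collections import Counter
from fractions import Fraction


def expected_proportions(frequencies):
    """Turn FREQUENCIES-style ratios into exact expected proportions."""
    if any(f <= 0 for f in frequencies):
        raise ValueError("each frequency must be greater than 0")
    parts = [Fraction(f) for f in frequencies]
    total = sum(parts)
    return [p / total for p in parts]


def chi_square_statistic(values, categories, frequencies):
    """Sum of (observed - expected)^2 / expected over the listed categories."""
    if len(categories) != len(frequencies):
        # Assumption: this sketch raises; NPTESTS does not perform the test.
        raise ValueError("number of frequencies must match number of categories")
    counts = Counter(values)
    # Assumption: only records in the listed categories are counted.
    n = sum(counts[c] for c in categories)
    statistic = 0.0
    for category, proportion in zip(categories, expected_proportions(frequencies)):
        expected = float(proportion * n)
        statistic += (counts[category] - expected) ** 2 / expected
    return statistic


# FREQUENCIES=1 2 3 and FREQUENCIES=10 20 30 give the same proportions.
print(expected_proportions([1, 2, 3]) == expected_proportions([10, 20, 30]))
print(chi_square_statistic(["a", "b", "b", "c", "c", "c"], ["a", "b", "c"], [1, 2, 3]))
```

In the second call the observed counts (1, 2, 3) match the expected counts exactly, so the statistic is zero.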
BINOMIAL keyword
[BINOMIAL([TESTVALUE={0.5**}]
{value}
[SUCCESSCATEGORICAL={FIRST** }]
{LIST(valuelist) }
[SUCCESSCONTINUOUS=CUTPOINT({MIDPOINT**})]
{value }
[CLOPPERPEARSON] [JEFFREYS] [LIKELIHOOD]
)]
The BINOMIAL keyword produces a one-sample test of whether
the observed distribution of a dichotomous field is the same as what is expected from a specified
binomial distribution. In addition, you can request confidence intervals.
- A separate binomial test is performed for each field specified on the TEST keyword.
- The test specifications given on the BINOMIAL keyword apply to all binomial tests performed.
- If BINOMIAL is specified without any keywords, each categorical field is assumed to have only two values and each continuous field is dichotomized using the average of the minimum and maximum as a cut point. The distribution of each named field is compared to a binomial distribution with p (the proportion of cases expected in the first category) equal to 0.5.
TESTVALUE. Hypothesized proportion. The TESTVALUE keyword specifies the expected proportion of cases in the first category. Specify a value greater than 0 and less than 1. The default is 0.5.
SUCCESSCATEGORICAL=FIRST|LIST(valuelist). Define success for categorical fields.
- The SUCCESSCATEGORICAL keyword specifies how "success", the data value(s) tested against the test value, is defined for categorical fields.
- FIRST performs the binomial test using the first value found in the sample to define "success". This option is only applicable to nominal or ordinal fields with only two values; all other categorical fields specified on a ONESAMPLE subcommand where FIRST is used will not be tested. This is the default.
- LIST performs the binomial test using the specified list of values to define "success". Specify a list of string or numeric values. The values in the list do not need to be present in the sample.
SUCCESSCONTINUOUS=CUTPOINT (MIDPOINT|value). Define success for continuous fields. The
SUCCESSCONTINUOUS keyword specifies how "success", the data value(s) tested against
the test value, is defined for continuous fields.
- CUTPOINT defines values that are equal to or less than the cut point as "success". MIDPOINT sets the cut point at the average of the minimum and maximum values. Alternatively, specify a value for the cut point. The default is MIDPOINT.
CLOPPERPEARSON. Exact interval based on the cumulative binomial distribution.
JEFFREYS. Bayesian interval based on the posterior distribution of p using the Jeffreys prior.
LIKELIHOOD. Interval based on the likelihood function for p.
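The CUTPOINT rule for continuous fields described under SUCCESSCONTINUOUS can be sketched as follows. The function names are hypothetical, and the sketch only shows how "success" is counted; it does not reproduce the binomial test or the confidence intervals.

```python
def midpoint_cut_point(values):
    """MIDPOINT: the average of the minimum and maximum values."""
    return (min(values) + max(values)) / 2


def count_successes(values, cut_point=None):
    """Count values that are equal to or less than the cut point.

    When cut_point is None, the MIDPOINT default is used.
    Returns (number of successes, number of values).
    """
    if cut_point is None:
        cut_point = midpoint_cut_point(values)
    successes = sum(1 for v in values if v <= cut_point)
    return successes, len(values)


data = [1, 2, 3, 10]
print(midpoint_cut_point(data))   # (1 + 10) / 2
print(count_successes(data))      # default MIDPOINT cut point
print(count_successes(data, 2))   # explicit cut point value
```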
KOLMOGOROV_SMIRNOV keyword
[KOLMOGOROV_SMIRNOV(
[NSAMPLES={1000**}{integer}]
[MC_CILEVEL={99**}{value}]
[NORMAL={SAMPLE**(SIMULATION={TRUE**}{FALSE})}
{CUSTOM(MEAN=value
SD=value )}]
[UNIFORM={SAMPLE** }]
{CUSTOM(MIN=value
MAX=value )}
[EXPONENTIAL={SAMPLE** }]
{CUSTOM(MEAN=value )}
[POISSON=CUSTOM(MEAN=value )]
)]
The KOLMOGOROV_SMIRNOV keyword produces a one-sample test of
whether the sample cumulative distribution function for a field is consistent with a uniform,
normal, Poisson, or exponential distribution.
- A separate Kolmogorov-Smirnov test is performed for each continuous and ordinal field specified on the TEST keyword.
- The test specifications given on the KOLMOGOROV_SMIRNOV keyword apply to all Kolmogorov-Smirnov tests performed.
- If KOLMOGOROV_SMIRNOV is specified without any keywords, each field is tested against a normal distribution using its sample mean and sample standard deviation.
NSAMPLES=integer. Number of Monte Carlo replicates. NSAMPLES sets the number of replicates that the Lilliefors test uses for Monte Carlo sampling. The default is 1000.
MC_CILEVEL=value. Monte Carlo confidence interval. MC_CILEVEL sets the confidence interval level used for the Monte Carlo estimate in the Kolmogorov-Smirnov test. The default is 99.
NORMAL=SAMPLE(SIMULATION=boolean)|CUSTOM(MEAN=value SD=value). Normal distribution.
SAMPLE uses the observed mean and standard deviation. SIMULATION controls whether
Monte Carlo simulation is used to conduct the Lilliefors test for the normal
distribution when the parameters are not specified; the default is TRUE. CUSTOM
allows you to specify the parameters.
UNIFORM=SAMPLE|CUSTOM(MIN=value MAX=value). Uniform distribution.
SAMPLE uses the observed minimum and maximum; CUSTOM allows you to
specify values.
POISSON=CUSTOM(MEAN=value). Poisson distribution.
CUSTOM allows you to specify a mean value.
EXPONENTIAL=SAMPLE|CUSTOM(MEAN=value). Exponential distribution.
SAMPLE uses the observed mean; CUSTOM allows you to specify a
value.
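As an illustration of what the test compares, the following sketch computes the Kolmogorov-Smirnov distance between the sample cumulative distribution function and a normal distribution. It covers both the SAMPLE case (observed mean and standard deviation) and the CUSTOM case (supplied MEAN and SD). The function names are hypothetical, the use of the n-1 standard deviation is an assumption, and no p-value or Lilliefors Monte Carlo step is attempted here.

```python
import math
import statistics


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic_normal(values, mean=None, sd=None):
    """Largest gap between the empirical CDF and the normal CDF.

    mean/sd left as None mimic SAMPLE; supplying them mimics CUSTOM.
    """
    xs = sorted(values)
    n = len(xs)
    if mean is None:
        mean = statistics.mean(xs)
    if sd is None:
        # Assumption: sample standard deviation with an n-1 denominator.
        sd = statistics.stdev(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mean, sd)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


sample = [4.1, 5.0, 5.3, 5.9, 6.4, 7.2]
print(ks_statistic_normal(sample))                  # SAMPLE parameters
print(ks_statistic_normal(sample, mean=5, sd=1))    # CUSTOM parameters
```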
RUNS keyword
[RUNS([GROUPCATEGORICAL={SAMPLE** }]
{LIST(valuelist) }
[GROUPCONTINUOUS=CUTPOINT({SAMPLEMEDIAN**})]
{SAMPLEMEAN }
{value }
)]
The RUNS keyword produces a one-sample test of whether the
sequence of values of a dichotomized field is random.
- A separate runs test is performed for each field specified on the TEST keyword.
- The test specifications given on the RUNS keyword apply to all runs tests performed.
- If RUNS is specified without any keywords, each categorical field is assumed to have only two values and each continuous field is dichotomized using the sample median as a cut point.
GROUPCATEGORICAL=SAMPLE|LIST(valuelist). Determine groups for categorical fields.
SAMPLE is the default.
- SAMPLE performs the runs test using the values found in the sample to define the groups. This option is only applicable to nominal or ordinal fields with only two values; all other categorical fields specified on a ONESAMPLE subcommand where SAMPLE is used will not be tested.
- LIST performs the runs test using the specified list of values to define one of the groups. All other values in the sample define the other group. The values in the list do not all need to be present in the sample, but at least one record must be in each group.
GROUPCONTINUOUS=CUTPOINT(SAMPLEMEDIAN|SAMPLEMEAN|value). Determine groups for
continuous fields.
CUTPOINT defines values that are equal to or less than the cut point as the first
group; all other values define the other group. SAMPLEMEDIAN sets the cut point at
the sample median. SAMPLEMEAN sets the cut point at the sample mean. Alternatively,
specify a value for the cut point. The default is SAMPLEMEDIAN.
WILCOXON keyword
[WILCOXON(TESTVALUE=value)]
The WILCOXON keyword produces a one-sample test of the median
value of a field.
- A separate Wilcoxon test is performed for each continuous and ordinal field specified on the TEST keyword.
- The test specifications given on the WILCOXON keyword apply to all Wilcoxon tests performed.
- The TESTVALUE keyword is required.
TESTVALUE=value. Hypothesized median. The Wilcoxon test is performed using the
specified value. The TESTVALUE keyword is required. There is no default.
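To show how the hypothesized median enters the calculation, the following sketch computes signed-rank sums in the usual textbook form of the Wilcoxon signed-rank procedure. The function name is hypothetical. Dropping values equal to the test value and giving tied absolute differences their average rank are common conventions assumed here, not behavior confirmed for NPTESTS.

```python
def signed_rank_sums(values, test_value):
    """Return (sum of positive ranks, sum of negative ranks).

    Differences are taken against the hypothesized median (test_value).
    """
    # Assumption: values equal to the test value are dropped.
    diffs = [v - test_value for v in values if v != test_value]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i..j-1 hold ranks i+1..j; ties share the average rank.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    positive = sum(r for r, d in zip(ranks, ordered) if d > 0)
    negative = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return positive, negative


print(signed_rank_sums([1, 2, 3, 4, 5], 3))
print(signed_rank_sums([4, 5, 6], 3))
```

When the sample is symmetric around the test value, as in the first call, the two sums are equal; when every value lies above it, as in the second call, the negative sum is zero.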