SIMINPUT subcommand (SIMPLAN command)
The SIMINPUT subcommand specifies input fields
whose values will be simulated. It is a required subcommand. When
the MODEL subcommand is used, each input in the associated
model file must be specified as either simulated or fixed. Use the FIXEDINPUT subcommand
to specify fixed inputs.
INPUT keyword
The INPUT keyword specifies names and optional
settings for one or more simulated inputs. The INPUT keyword
is required. The specification for each input is the field name with
an optional qualifier of the form (MAPTO=name FORMAT=format),
where the parentheses are required. Use a blank space to separate
specifications for multiple inputs. The keywords TO and ALL are
not supported for variable lists specified on the INPUT keyword.
For example:
INPUT=input1(MAPTO=field1 FORMAT=F,4) input2(FORMAT=DOLLAR)
- MAPTO. Maps a simulated field to a field in the active dataset. The MAPTO keyword is only needed when you are automatically fitting to data in the active dataset and the name of a simulated field differs from the name of the associated field in the active dataset. The value of MAPTO should be the name of the field in the active dataset.
- FORMAT. Specifies the output format of the field. The
format consists of a format type, such as
F, optionally followed by a comma and the number of decimal places. If the number of decimal places is omitted then it is assumed that there are 0 decimal places. The following formats are supported:
| Format Specification | Definition |
|---|---|
| F,d | Numeric |
| E,d | Scientific notation |
| N,d | Restricted numeric |
| DOT,d | Numeric with dots |
| COMMA,d | Numeric with commas |
| DOLLAR,d | Numeric with commas and dollar sign |
| PCT,d | Numeric with percent sign |
| CCA,d | Custom currency |
| CCB,d | Custom currency |
| CCC,d | Custom currency |
| CCD,d | Custom currency |
| CCE,d | Custom currency |
If no format is specified, then the Numeric format F is
used for numeric inputs.
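The rule for omitted decimal places can be illustrated with a short Python sketch. It is not part of SPSS; the helper name `parse_format` and the `SUPPORTED_TYPES` set are hypothetical and only mirror the table above, assuming the specification is written exactly as `type` or `type,d`.

```python
# Hypothetical helper (not an SPSS API): splits a FORMAT specification such as
# "F,4" or "DOLLAR" into its format type and number of decimal places.
SUPPORTED_TYPES = {"F", "E", "N", "DOT", "COMMA", "DOLLAR", "PCT",
                   "CCA", "CCB", "CCC", "CCD", "CCE"}


def parse_format(spec):
    """Return (format_type, decimals); decimals is 0 when it is omitted."""
    type_part, comma, decimals_part = spec.partition(",")
    format_type = type_part.strip()
    if format_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported format type: {format_type!r}")
    # No comma means no decimal places were given, so 0 is assumed.
    decimals = int(decimals_part) if comma else 0
    if decimals < 0:
        raise ValueError("number of decimal places cannot be negative")
    return format_type, decimals


print(parse_format("F,4"))
print(parse_format("DOLLAR"))
```

With the specifications from the INPUT example, `F,4` yields four decimal places and `DOLLAR` yields none.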
OUTPUT. Specifies whether the inputs listed on the
INPUT keyword are included in table and chart output. The default
is YES.
TYPE keyword
The TYPE keyword specifies whether the probability
distribution for this simulated input field is determined by automatically
fitting to the data for this field in the active dataset or by manually
specifying the distribution.
- MANUAL(LOCK = YES | NO SAVEASFITTED = YES | NO). Indicates that the probability distribution is manually specified. When MANUAL is specified, the DISTRIBUTION keyword must be used unless the global SOURCE keyword specifies a simulation plan file (for more information, see the topic SOURCE Keyword (SIMPLAN command)).

  The LOCK keyword specifies whether the distribution for this simulated input field will be locked. Locked distributions will not be modified when automatically fitting distributions interactively in the user interface, using the Simulation Builder or the Run Simulation dialog. If you are creating a simulation plan that you or someone else will work with in the user interface and you want to prevent refitting a simulated input to historical data, then be sure to specify LOCK=YES for that input. Users who open the plan in the Run Simulation dialog will not be able to make any modifications to the distribution of the input. Users who open the plan in the Simulation Builder will be able to make changes to the distribution once they unlock the input.

  The SAVEASFITTED keyword occurs when pasting syntax for a simulated input whose distribution was automatically fitted to historical data and then locked in the user interface. The keyword allows the state of the input to be stored in the plan file so that it can be restored in the user interface when re-opening the plan file. The default is SAVEASFITTED=NO.
- AUTOFIT. Indicates that the probability distribution
associated with this input field will be automatically determined
from the data for the associated field in the active dataset. By
default, the measurement level of the field is used to determine the
set of distributions that are considered.
For nominal input fields, the default set of distributions only includes the categorical distribution.
For ordinal input fields, the default set of distributions includes the following: binomial, negative binomial and Poisson. The chi-square test is used to determine the distribution that most closely fits the data.
For continuous input fields, the default set of distributions includes the following: beta, exponential, gamma, lognormal, normal, triangular, uniform and Weibull. By default, the Anderson-Darling test for goodness of fit is used to determine the distribution that most closely fits the data. Optionally, you can specify the Kolmogorov-Smirnov test for goodness of fit. See the topic AUTOFIT Subcommand (SIMPLAN command) for more information.
You can override the default set of distributions by explicitly specifying one or more distributions on the AUTOFIT keyword, but you cannot mix distributions belonging to different measurement levels. For example, you can specify one or more of the distributions for ordinal data (BINOM, NEGBIN and POISSON) but you cannot specify those in conjunction with distributions for continuous data such as NORMAL.
Note: For the negative binomial distribution (NEGBIN), AUTOFIT uses the form of the distribution that describes the probability of a given number of failures before a given number of successes occur. If you require the alternate parameterization that describes the probability of a given number of trials before a given number of successes occur, then manually specify the distribution with the DISTRIBUTION keyword. Also note that for the Weibull distribution, AUTOFIT only considers the case where the location parameter C equals 0.
DISTRIBUTION keyword
The DISTRIBUTION keyword specifies the probability
distribution for a simulated input field and is used when you want
to explicitly specify the probability distribution rather than have
it automatically determined from the data for the associated field
in the active dataset. The DISTRIBUTION keyword can
only be used with TYPE = MANUAL.
BERNOULLI(PROB=value). Bernoulli distribution.
BETA(SHAPE1=value SHAPE2=value). Beta distribution.
BINOM(N=value PROB=value). Binomial distribution.
CATEGORICAL(CATS=valuelist PROBS=valuelist CONTINGENCY=YES | NO**). Categorical distribution. The Categorical distribution describes an input field that has a fixed number of values, referred to as categories. Each category has an associated probability such that the sum of the probabilities over all categories equals one.
- The CATS keyword specifies a list of the categories and the PROBS keyword specifies a list of the probabilities associated with each category. The n-th item in each list specifies the associated value for the n-th category. The number of values in each of the lists must be the same.
- For string inputs with a Categorical distribution, the category values can be specified with or without quotation marks; however, numeric categories for such inputs must be enclosed in quotation marks.
- The CONTINGENCY keyword specifies whether the input is included in a multiway contingency table that is computed from the active dataset and that describes associations between inputs with categorical distributions. By default, the input is not included in a contingency table. When CONTINGENCY=YES, MULTIWAY=YES must be specified on the CONTINGENCY subcommand.

  Because the contingency table is computed from the active dataset, inputs with CONTINGENCY=YES must either exist in the active dataset or be mapped to a field in the active dataset with the MAPTO keyword. In addition, when CONTINGENCY=YES, values specified for the CATS and PROBS keywords are ignored because the categories and category probabilities are determined from the contingency table.

  Inputs specified as TYPE=AUTOFIT that are fit to a categorical distribution are automatically included in the contingency table when MULTIWAY=YES is specified on the CONTINGENCY subcommand.
EMPIRICAL([SOURCE=AUTOFIT** | 'filespec']). Empirical
distribution. The empirical distribution is calculated from the
data in the active dataset corresponding to the input field. EMPIRICAL is
only supported for inputs with a continuous or ordinal measurement
level.
- For continuous inputs, the empirical distribution is the cumulative distribution function of the data.
- For ordinal inputs, the empirical distribution is the categorical distribution of the data.
- For nominal input fields, use TYPE=AUTOFIT(CATEGORICAL).
- The SOURCE keyword is deprecated. Use the global SOURCE keyword instead, which has the same specifications but applies to both contingency tables and parameters for empirical distributions. See the topic SOURCE Keyword (SIMPLAN command) for more information.
EXP(SCALE=value). Exponential distribution.
GAMMA(SHAPE=value SCALE=value). Gamma distribution.
LNORMAL(A=value B=value). Lognormal distribution.
NEGBIN(TYPE=FAILURES | TRIALS THRESHOLD=value PROB=value). Negative Binomial distribution. Two parameterizations of the negative binomial distribution are supported.
- TYPE=FAILURES specifies a distribution that describes the probability of a given number of failures before a given number of successes occur.
- TYPE=TRIALS specifies a distribution that describes the probability of a given number of trials before a given number of successes occur, and is the parameterization used in the command syntax function PDF.NEGBIN.
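The relationship between the two parameterizations can be sketched in Python. This is an illustration of the standard textbook forms, not SPSS code: the function names are hypothetical, and the sketch assumes that THRESHOLD is an integer number of required successes and that PROB is the probability of success on each trial.

```python
from math import comb


def negbin_failures_pmf(failures, threshold, prob):
    """P(exactly `failures` failures occur before the `threshold`-th success)."""
    return (comb(failures + threshold - 1, failures)
            * prob ** threshold * (1 - prob) ** failures)


def negbin_trials_pmf(trials, threshold, prob):
    """P(the `threshold`-th success occurs on trial number `trials`)."""
    if trials < threshold:
        return 0.0  # at least `threshold` trials are needed for that many successes
    return (comb(trials - 1, threshold - 1)
            * prob ** threshold * (1 - prob) ** (trials - threshold))


# The two forms describe the same event, shifted by the number of successes:
# trials = failures + threshold.
threshold, prob = 3, 0.4
for failures in range(5):
    print(failures,
          negbin_failures_pmf(failures, threshold, prob),
          negbin_trials_pmf(failures + threshold, threshold, prob))
```

Under these assumptions a value simulated with TYPE=TRIALS corresponds to a TYPE=FAILURES value plus the threshold, which is why the choice matters when comparing results against PDF.NEGBIN.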
NORMAL(MEAN=value STDDEV=value). Normal distribution.
POISSON(MEAN=value). Poisson distribution.
TRIANGULAR(MIN=value MAX=value MODE=value). Triangular distribution.
UNIFORM(MIN=value MAX=value). Uniform distribution.
USER_RANGES(MIN=valuelist MAX=valuelist PROBS=valuelist). User-defined ranges. This distribution consists of a set of intervals with a probability assigned to each interval such that the sum of the probabilities over all intervals equals 1. Values within a given interval are drawn from a uniform distribution defined on that interval.
- The MIN keyword specifies a list of the left endpoints of each interval, the MAX keyword specifies a list of the right endpoints of each interval, and the PROBS keyword specifies a list of the probabilities associated with each interval. The n-th item in each list specifies the associated value for the n-th interval. The number of values in each of the lists must be the same. The specified endpoints are included in the intervals.
- Intervals can be overlapping. For example, you can specify MIN= 10 12 and MAX= 15 20, which defines the two intervals [10,15] and [12,20].
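The two-stage draw described for user-defined ranges (choose an interval by its probability, then draw uniformly within it) can be sketched in Python as follows. The function name `sample_user_ranges` is hypothetical, the probabilities 0.5 and 0.5 are made up for the overlapping-interval example, and the sketch makes no claim about how SPSS implements the draw internally.

```python
import random


def sample_user_ranges(mins, maxs, probs, n, seed=None):
    """Draw n values: pick an interval by probability, then sample uniformly in it."""
    if not (len(mins) == len(maxs) == len(probs)):
        raise ValueError("MIN, MAX and PROBS must have the same number of values")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("interval probabilities must sum to 1")
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Stage 1: choose an interval index according to its probability.
        i = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        # Stage 2: draw uniformly on the chosen interval.
        draws.append(rng.uniform(mins[i], maxs[i]))
    return draws


# Overlapping intervals [10,15] and [12,20] with assumed probabilities 0.5 each.
draws = sample_user_ranges([10, 12], [15, 20], [0.5, 0.5], n=1000, seed=1)
print(min(draws), max(draws))
```

Because the intervals overlap, values between 12 and 15 can come from either interval and are therefore more densely sampled than the rest of the range.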
WEIBULL(A=value B=value [C=value]). Weibull distribution. The
parameter C is an optional location parameter, which specifies
where the origin of the distribution is located. Omitting the value
of C is equivalent to setting its value to 0. When C equals
0, the distribution reduces to the Weibull distribution function in
command syntax (PDF.WEIBULL).
Iterating distribution parameters
For any of the above distributions, you can specify multiple values for one of the distribution parameters. An independent set of simulated cases--effectively, a separate simulation--is generated for each specified value, allowing you to investigate the effect of varying the input. This is referred to as Sensitivity Analysis, and each set of simulated cases is referred to as an iteration.
- The set of specified values for a given distribution parameter should be separated by spaces.
- For the CATEGORICAL distribution, the CATS parameter can only specify a single set of category values, but you can specify multiple sets of values for the PROBS parameter, allowing you to vary the set of probabilities associated with the categories. Each set of probabilities should be separated by a semicolon.
- For the USER_RANGES distribution, the MIN and MAX parameters can only specify a single set of intervals, but you can specify multiple sets of values for the PROBS parameter, allowing you to vary the set of probabilities associated with the specified intervals. Each set of probabilities should be separated by a semicolon.
- You can only iterate distribution parameters for a single simulated input. An error results if you specify iterations of distribution parameters and there are multiple fields on the INPUT keyword.
Example: Normal distribution with iterated parameters
The following specification results in two iterations, one with NORMAL(MEAN=15
STDDEV=2) and one with NORMAL(MEAN=15 STDDEV=3).
DISTRIBUTION= NORMAL(MEAN=15 STDDEV=2 3)
Example: Categorical distribution with iterated sets of probabilities
The example shows three iterations for the set of probabilities associated with the specified set of categories.
DISTRIBUTION= CATEGORICAL(CATS=1 2 3 PROBS=0.5 0.25 0.25; 0.4 0.3 0.3; 0.2 0.6 0.2)
The CATS keyword specifies the set of categories.
The probabilities for the first iteration are specified by the set
of values up to the first semicolon following the PROBS keyword.
Thus the category with value 1 has probability 0.5, the category with
value 2 has probability 0.25 and the category with value 3 has probability
0.25. The probabilities for the second and third iterations are (0.4,
0.3, 0.3) and (0.2, 0.6, 0.2) respectively.
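The way semicolons partition the PROBS values into iterations can be sketched in Python. The helper `split_prob_iterations` is hypothetical and only mimics the reading of the value list described above; it is not how SPSS parses the command.

```python
def split_prob_iterations(probs_spec, n_categories):
    """Split a PROBS value list into one list of probabilities per iteration."""
    iterations = []
    for chunk in probs_spec.split(";"):
        probs = [float(token) for token in chunk.split()]
        if len(probs) != n_categories:
            raise ValueError("each set needs one probability per category")
        if abs(sum(probs) - 1.0) > 1e-9:
            raise ValueError("each set of probabilities must sum to 1")
        iterations.append(probs)
    return iterations


cats = [1, 2, 3]
spec = "0.5 0.25 0.25; 0.4 0.3 0.3; 0.2 0.6 0.2"
for number, probs in enumerate(split_prob_iterations(spec, len(cats)), start=1):
    # Pair each category with its probability for this iteration.
    print(number, dict(zip(cats, probs)))
```

Each printed line corresponds to one iteration, that is, one independent set of simulated cases.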
Example: User Ranges distribution with iterated sets of probabilities
The example shows two iterations for the set of probabilities associated with the specified intervals.
DISTRIBUTION= USER_RANGES(MIN=10 13 17 MAX=12 16 20 PROBS=0.3 0.3 0.4; 0.2 0.3 0.5)
The MIN and MAX keywords specify
the three intervals [10,12], [13,16] and [17,20]. For the first iteration,
the interval from 10 to 12 has probability 0.3, the interval from
13 to 16 has probability 0.3 and the interval from 17 to 20 has probability
0.4. The probabilities for the second iteration are (0.2, 0.3, 0.5).
MINVAL keyword
The MINVAL keyword specifies the minimum allowed
value for the simulated input field. If MINVAL is
omitted, the minimum value is determined by the range of the associated
probability distribution. If the specified value is less than the
minimum allowed for the associated probability distribution, then
the minimum for the probability distribution is used. MINVAL is
not supported for the following distributions: Bernoulli, categorical,
empirical, triangular, uniform and user ranges.
MAXVAL keyword
The MAXVAL keyword specifies the maximum allowed
value for the simulated input field. If MAXVAL is
omitted, the maximum value is determined by the range of the associated
probability distribution. If the specified value is greater than the
maximum allowed for the associated probability distribution, then
the maximum for the probability distribution is used. MAXVAL is
not supported for the following distributions: Bernoulli, categorical,
empirical, triangular, uniform and user ranges.
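The interaction between MINVAL, MAXVAL and the range of the distribution can be summarized with a small Python sketch. The function `effective_bounds` is hypothetical and covers only the rule for which bound applies; how values outside the bounds are handled during simulation is not described here and is not modeled.

```python
import math


def effective_bounds(dist_min, dist_max, minval=None, maxval=None):
    """Return the (lower, upper) bounds that apply to the simulated input."""
    # An omitted MINVAL, or one below the distribution's minimum, falls back
    # to the distribution's own minimum.
    lower = dist_min if minval is None else max(minval, dist_min)
    # An omitted MAXVAL, or one above the distribution's maximum, falls back
    # to the distribution's own maximum.
    upper = dist_max if maxval is None else min(maxval, dist_max)
    return lower, upper


# An exponential distribution takes values from 0 upward with no upper limit.
print(effective_bounds(0.0, math.inf, minval=-5, maxval=100))
print(effective_bounds(0.0, math.inf))
```

In the first call the specified minimum of -5 lies below the distribution's minimum, so 0 is used, while the specified maximum of 100 takes effect.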