DISTRIBUTION Subcommand (SIMRUN command)

The DISTRIBUTION subcommand provides settings for displaying the distributions of predicted target values. For continuous targets, you can display cumulative distribution functions (CDF) and probability density functions (PDF). For categorical targets (targets with a measurement level of nominal or ordinal), a bar chart is generated that displays the percentage of cases that fall in each category of the target. Additional options for categorical targets of PMML models are available with the CATEGORICAL keyword. For Two-Step cluster models and K-Means cluster models, a bar chart of cluster membership is produced.

DISPLAY. Specifying DISPLAY=PDF displays the probability density function for continuous targets, the bar chart of the predicted values for categorical targets, and the bar chart of cluster membership for cluster models. You can display the cumulative distribution function for continuous targets by specifying CDF. Specifying DISPLAY=NONE suppresses all charts generated from the DISTRIBUTION subcommand.

CDFORDER. Specifies the default view for cumulative distribution functions. By default, cumulative distribution functions are displayed as ascending functions. Use CDFORDER=DESCENDING to display them as descending functions. When displayed as a descending function, the value of the function at a given point on the horizontal axis is the probability that the target lies to the right of that point. You can change the view in the Output Viewer by activating the chart. Specifying both ASCENDING and DESCENDING will generate two charts, one in ascending order and the other in descending order.
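
The relationship between the ascending and descending views can be illustrated outside of SIMRUN. The following Python sketch is not command syntax and does not reproduce how SIMRUN computes its charts. It assumes an empirical distribution built from a small, made-up list of simulated target values, and the function names are hypothetical.

```python
from bisect import bisect_right


def ascending_cdf(sorted_values, x):
    """Empirical P(target <= x) for a sorted list of simulated values."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def descending_cdf(sorted_values, x):
    """Empirical P(target > x): the share of values to the right of x."""
    return 1.0 - ascending_cdf(sorted_values, x)


# Made-up simulated target values, for illustration only.
simulated = sorted([12.0, 15.5, 9.0, 20.0, 17.5, 11.0, 14.0, 18.0, 13.0, 16.0])

for point in (15.0, 17.5):
    print(point, ascending_cdf(simulated, point), descending_cdf(simulated, point))
```

At any point on the horizontal axis the two views sum to 1, so the descending chart is the mirror image of the ascending chart about the 0.5 probability level.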

SCALE Keyword

The SCALE keyword specifies settings for continuous targets. With the exception of the PDFVIEW keyword, all keywords below apply to charts of both cumulative distribution functions and probability density functions.

MEAN. Displays a vertical reference line at the mean of the target.

MEDIAN. Displays a vertical reference line at the median of the target.

PCT. Displays fixed vertical reference lines at one or two specified percentiles of the target distribution. You can specify one or two values for PCT, as in PCT(10) or PCT(25 75). If PCT is specified without values, reference lines are displayed at the 5th and 95th percentiles.

SIGMAS. Displays vertical reference lines at the specified number of standard deviations above and below the mean of the target distribution. Specify a numeric value between 1 and 10.

CUSTOM. Displays vertical reference lines at the specified positions along the horizontal axis. Use spaces to separate values, for example: CUSTOM(50000 100000).
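
To make the positions of these fixed reference lines concrete, the following Python sketch computes where lines for the mean, median, percentiles, standard deviations, and custom values would fall for a list of simulated values. It is an illustration only: the function name, the labels, and the percentile method (Python's "inclusive" interpolation) are assumptions, and SIMRUN may compute percentiles differently.

```python
import statistics


def reference_lines(values, pct=(5, 95), sigmas=None, custom=()):
    """Return {label: x-position} for the vertical reference lines.

    pct holds whole-number percentiles from 1 to 99 (a limit of this sketch).
    """
    lines = {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
    }
    # 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    for p in pct:
        lines[f"pct_{p}"] = cuts[p - 1]
    if sigmas is not None:
        if not 1 <= sigmas <= 10:
            raise ValueError("sigmas must be between 1 and 10")
        spread = sigmas * statistics.stdev(values)
        lines["sigma_low"] = lines["mean"] - spread
        lines["sigma_high"] = lines["mean"] + spread
    for position in custom:
        # Custom lines are axis positions, used as given.
        lines[f"custom_{position}"] = float(position)
    return lines


# Made-up simulated values 1, 2, ..., 101.
sample_values = [float(v) for v in range(1, 102)]
example = reference_lines(sample_values, pct=(25, 75), sigmas=2,
                          custom=(50000, 100000))
for label, position in example.items():
    print(label, position)
```

Note that the percentile and standard-deviation lines depend on the simulated distribution, whereas custom lines are fixed axis positions and can fall outside the range of the data.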

PDFVIEW. By default, probability density functions are displayed as a continuous curve. Use PDFVIEW=HISTOGRAM to display them as histograms. You can change the view in the Output Viewer by activating the chart. Specifying both HISTOGRAM and CURVE will generate two charts, one for each of the two views.

OVERLAYTARGETS. When there are multiple continuous targets, this keyword specifies whether the distribution functions for all continuous targets are displayed on a single chart, with one chart for cumulative distribution functions and another for probability density functions. The default is NO, which produces a separate chart for each target.

  • If the plan file contains specifications for sensitivity analysis, OVERLAYTARGETS=YES overlays all targets for a given iteration of the analysis on a single chart, with a separate chart for each iteration. OVERLAYTARGETS=NO (the default) overlays all iterations for a given target on a single chart, with a separate chart for each target.
  • OVERLAYTARGETS is ignored for probability density functions when PDFVIEW=HISTOGRAM.
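
The chart layouts described for OVERLAYTARGETS can be summarized with a small Python sketch. It is not SIMRUN code. The function name and the target names are hypothetical, and the sketch deliberately leaves out the PDFVIEW=HISTOGRAM case, where the keyword is ignored. Each returned chart is a list of (target, iteration) curves, and the same layout would apply separately to CDF charts and to PDF charts.

```python
def plan_charts(targets, iterations, overlay_targets=False):
    """Return a list of charts; each chart is a list of (target, iteration) curves."""
    sensitivity = len(iterations) > 1
    if sensitivity:
        if overlay_targets:
            # One chart per iteration, all targets overlaid.
            return [[(t, i) for t in targets] for i in iterations]
        # One chart per target, all iterations overlaid.
        return [[(t, i) for i in iterations] for t in targets]
    only = iterations[0]
    if overlay_targets:
        # No sensitivity analysis: a single chart with every target.
        return [[(t, only) for t in targets]]
    # No sensitivity analysis: a separate chart for each target.
    return [[(t, only)] for t in targets]


# Hypothetical targets and three sensitivity-analysis iterations.
charts_yes = plan_charts(["revenue", "cost"], [1, 2, 3], overlay_targets=True)
charts_no = plan_charts(["revenue", "cost"], [1, 2, 3], overlay_targets=False)
print(len(charts_yes), "charts with OVERLAYTARGETS=YES")
print(len(charts_no), "charts with OVERLAYTARGETS=NO")
```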

REFLINES. Controls the initial positions of the moveable reference lines on PDF and CDF charts. Values specified for the lower and upper lines refer to positions along the horizontal axis, not percentiles. You can remove the lower line by specifying LOWER=LO, effectively setting the position to negative infinity, and you can remove the upper line by specifying UPPER=HI, effectively setting the position to positive infinity. You cannot specify both LOWER=LO and UPPER=HI. By default, the lines are positioned at the 5th and 95th percentiles. When multiple distributions are displayed on a single chart, the default refers to the distribution for the first iteration or first target.
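
The rules for REFLINES can be restated as a short Python sketch. It only models the rules given above (axis positions rather than percentiles, LO and HI as infinities, and percentile-based defaults). The function name is hypothetical, and the percentile method is an assumption that may not match SIMRUN.

```python
import math
import statistics


def resolve_reflines(values, lower=None, upper=None):
    """Return the (lower, upper) axis positions of the two moveable lines.

    lower / upper: None for the default, "LO" / "HI" to remove the line,
    or a number giving a position on the horizontal axis.
    """
    if lower == "LO" and upper == "HI":
        raise ValueError("LOWER=LO and UPPER=HI cannot both be specified")
    # Defaults are the 5th and 95th percentiles of the distribution.
    cuts = statistics.quantiles(values, n=100, method="inclusive")

    def resolve(setting, default, removed_token, removed_position):
        if setting is None:
            return default
        if setting == removed_token:
            return removed_position
        return float(setting)  # an axis position, not a percentile

    low = resolve(lower, cuts[4], "LO", -math.inf)
    high = resolve(upper, cuts[94], "HI", math.inf)
    return low, high


# Made-up simulated values 1, 2, ..., 101.
refline_values = [float(v) for v in range(1, 102)]
print(resolve_reflines(refline_values))
print(resolve_reflines(refline_values, lower="LO", upper=80))
```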

Note: If the plan file contains specifications for sensitivity analysis and there is only one continuous target, then results for all iterations of the analysis will be displayed on the same chart. This applies to cumulative distribution functions and to probability density functions when PDFVIEW=CURVE. In addition, when multiple distributions are displayed on a single chart, vertical reference lines will only be applied to the distribution for the first iteration or first target. You can add reference lines to the other distributions from the Chart Options dialog, accessed from the PDF or CDF chart.

CATEGORICAL Keyword

The CATEGORICAL keyword specifies settings for categorical targets (targets with a measurement level of nominal or ordinal) and cluster models (Two-Step and K-Means models). It is in effect when PDF is specified on the DISPLAY keyword or when the DISPLAY keyword is omitted. It is ignored otherwise.

PREDVAL. For categorical targets, this option generates a bar chart that displays the percentage of simulated cases that fall in each category of the target. If the plan file contains specifications for sensitivity analysis, a clustered bar chart is displayed. By default, results are clustered by each iteration of the sensitivity analysis, with results for all categories (for a given iteration) grouped together (GROUP=CATS). You can cluster by category by specifying GROUP=ITERS. You can change the grouping in the Output Viewer by activating the chart. Specifying GROUP=CATS ITERS will generate two charts, one for each of the two groupings.

For Two-Step cluster models and K-Means cluster models, a bar chart of cluster membership is produced. When the plan file contains specifications for sensitivity analysis, the result is a clustered bar chart. The same grouping choices as for categorical targets apply, except that categories are replaced by the clusters from the predictive model.
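
The two PREDVAL groupings amount to two arrangements of the same percentages. The Python sketch below illustrates this with made-up predicted categories for two sensitivity-analysis iterations. The function names and data are hypothetical, and the code is not part of SIMRUN. For cluster models, cluster labels would take the place of the categories.

```python
from collections import Counter


def category_percentages(predictions_by_iteration):
    """Map iteration -> {category: percentage of simulated cases}.

    This layout mirrors GROUP=CATS: all categories for one iteration together.
    """
    result = {}
    for iteration, predicted in predictions_by_iteration.items():
        counts = Counter(predicted)
        total = len(predicted)
        result[iteration] = {cat: 100.0 * n / total for cat, n in counts.items()}
    return result


def regroup_by_category(percentages):
    """Pivot to category -> {iteration: percentage}, mirroring GROUP=ITERS."""
    pivoted = {}
    for iteration, by_category in percentages.items():
        for cat, pct in by_category.items():
            pivoted.setdefault(cat, {})[iteration] = pct
    return pivoted


# Made-up predicted categories for two iterations.
predicted = {
    1: ["yes", "no", "yes", "yes"],
    2: ["no", "no", "yes", "no"],
}
by_iteration = category_percentages(predicted)
by_category = regroup_by_category(by_iteration)
print(by_iteration)
print(by_category)
```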

PREDPROB. For categorical targets of PMML models, this option displays histograms of the probability distribution (over the simulated cases) for each of the categories of the target (one histogram for each category). If the plan file contains specifications for sensitivity analysis, the histograms for each iteration of the sensitivity analysis are grouped together by default (GROUP=CATS). You can group all histograms (one for each iteration) for each category together by specifying GROUP=ITERS. Specifying GROUP=CATS ITERS will generate the charts for each of the two groupings.
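
As a rough illustration of what a PREDPROB histogram summarizes, the Python sketch below bins made-up predicted probabilities for each category into equal-width intervals over the range 0 to 1. The bin count, function name, and data are assumptions made for the example. SIMRUN's actual binning is not described here.

```python
def probability_histogram(probabilities, bins=10):
    """Count predicted probabilities in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
        # A probability of exactly 1.0 goes in the last bin.
        index = min(int(p * bins), bins - 1)
        counts[index] += 1
    return counts


# Made-up predicted probabilities over five simulated cases, per category.
predicted_probabilities = {
    "yes": [0.05, 0.15, 0.15, 0.95, 1.0],
    "no": [0.95, 0.85, 0.85, 0.05, 0.0],
}
histograms = {
    category: probability_histogram(values)
    for category, values in predicted_probabilities.items()
}
for category, counts in histograms.items():
    print(category, counts)
```

With sensitivity analysis, one such histogram would exist for each combination of category and iteration, and the GROUP setting determines which of them are shown together.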

Note: The DISTRIBUTION subcommand is ignored when data are generated in the absence of a predictive model.