Distributions
You can manually specify the probability distribution for any field by opening the Specify Parameters dialog box for that field, selecting the desired distribution from the Distribution list, and entering the distribution parameters in the Distribution parameters table. Following are some notes on particular distributions:
- Categorical. The categorical distribution describes an input field that has a fixed number of numeric values, referred to as categories. Each category has an associated probability such that the sum of the probabilities over all categories equals 1. Note: If the probabilities that you specify for the categories do not sum to 1, you receive a warning.
- Negative Binomial - Failures. Describes the distribution of the number of failures in a sequence of trials before a specified number of successes is observed. The parameter Threshold is the specified number of successes and the parameter Probability is the probability of success in any given trial.
- Negative Binomial - Trials. Describes the distribution of the number of trials that are required before a specified number of successes is observed. The parameter Threshold is the specified number of successes and the parameter Probability is the probability of success in any given trial.
- Range. This distribution consists of a set of intervals with a probability assigned to each interval such that the sum of the probabilities over all intervals equals 1. Values within a given interval are drawn from a uniform distribution defined on that interval. Intervals are specified by entering a minimum value, a maximum value, and an associated probability.
  For example, you believe that the cost of a raw material has a 40% chance of falling in the range of $10 to $15 per unit, and a 60% chance of falling in the range of $15 to $20 per unit. You would model the cost with a Range distribution consisting of the two intervals [10 - 15] and [15 - 20], setting the probability associated with the first interval to 0.4 and the probability associated with the second interval to 0.6. The intervals do not have to be contiguous, and they can even overlap. For example, you might specify the intervals $10 to $15 and $20 to $25, or $10 to $15 and $13 to $16.
- Weibull. The parameter Location is an optional location parameter, which specifies where the origin of the distribution is located.
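
The Range and Negative Binomial notes above can be illustrated with a short sketch. This is standalone Python using only the standard library, not the product's own implementation. The function names `sample_range` and `sample_negative_binomial` are hypothetical, the sketch chooses to reject interval probabilities that do not sum to 1, and it assumes the trial count includes the final success.

```python
import random


def sample_range(intervals, rng):
    """Draw one value from a Range-style distribution.

    `intervals` is a list of (minimum, maximum, probability) tuples.
    Intervals may be non-contiguous or overlapping.
    """
    total = sum(p for _, _, p in intervals)
    if abs(total - 1.0) > 1e-9:
        # Choice made for this sketch: refuse probabilities that do not sum to 1.
        raise ValueError(f"interval probabilities sum to {total}, expected 1")
    weights = [p for _, _, p in intervals]
    # Step 1: pick an interval according to its probability.
    low, high, _ = rng.choices(intervals, weights=weights, k=1)[0]
    # Step 2: draw uniformly within the chosen interval.
    return rng.uniform(low, high)


def sample_negative_binomial(threshold, probability, rng):
    """Run Bernoulli trials until `threshold` successes occur.

    Returns (failures, trials). Assumption: the final success counts as a trial.
    """
    if threshold < 0 or not 0 < probability <= 1:
        raise ValueError("need threshold >= 0 and 0 < probability <= 1")
    successes = 0
    failures = 0
    while successes < threshold:
        if rng.random() < probability:
            successes += 1
        else:
            failures += 1
    return failures, failures + successes


rng = random.Random(0)

# Raw-material cost example from the text: 40% in [10, 15], 60% in [15, 20].
cost_intervals = [(10, 15, 0.4), (15, 20, 0.6)]
costs = [sample_range(cost_intervals, rng) for _ in range(10_000)]
share_low = sum(c < 15 for c in costs) / len(costs)
print(f"share of costs below 15: {share_low:.3f}")

failures, trials = sample_negative_binomial(threshold=3, probability=0.5, rng=rng)
print(f"failures: {failures}, trials: {trials}")
```

With enough draws, the share of costs below 15 should settle near 0.4. Under the stated assumption, the two Negative Binomial variants differ only by the Threshold value: trials equals failures plus Threshold.
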
The following table shows the distributions that are available for custom distribution fitting, and the acceptable values for the parameters. Some of these distributions are available for custom fitting to particular storage types, even though they are not fitted automatically to these storage types by the Simulation Fitting node.
Distribution | Storage type supported for custom fitting | Parameters | Parameter limits | Notes |
---|---|---|---|---|
Bernoulli | Integer, real, datetime | Probability | 0 ≤ Probability ≤ 1 | |
Beta | Integer, real, datetime | Shape 1; Shape 2; Minimum; Maximum | Shape 1 ≥ 0; Shape 2 ≥ 0; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Binomial | Integer, real, datetime | Number of trials (n); Probability; Minimum; Maximum | Number of trials (n) > 0, integer; 0 ≤ Probability ≤ 1; Minimum < Maximum; Maximum > Minimum | Number of trials must be an integer. Minimum and maximum are optional. |
Categorical | Integer, real, datetime, string | Category name (or label) | 0 ≤ Value ≤ 1 | Value is the probability of the category. The values must sum to 1, otherwise a warning is generated. |
Dice | Integer, string | Sides | 2 ≤ Sides ≤ 20 | The probability of each category (side) is calculated as 1/N, where N is the number of sides. The probabilities cannot be edited. |
Empirical | Integer, real, datetime | | | You cannot edit the Empirical distribution or select it as a type. The Empirical distribution is only available when there is historical data. |
Exponential | Integer, real, datetime | Scale; Minimum; Maximum | Scale > 0; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Fixed | Integer, real, datetime, string | Value | | You cannot specify the Fixed distribution for every field. If you want every field in your generated data to be fixed, you can use a User Input node followed by a Balance node. |
Gamma | Integer, real, datetime | Shape; Scale; Minimum; Maximum | Shape ≥ 0; Scale ≥ 0; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. Distribution uses a rate parameter, with a shape parameter α = k and an inverse scale parameter β = 1/θ. |
Lognormal | Integer, real, datetime | Shape 1; Shape 2; Minimum; Maximum | Shape 1 ≥ 0; Shape 2 ≥ 0; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Negative Binomial - Failures | Integer, real, datetime | Threshold; Probability; Minimum; Maximum | Threshold ≥ 0; 0 ≤ Probability ≤ 1; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Negative Binomial - Trials | Integer, real, datetime | Threshold; Probability; Minimum; Maximum | Threshold ≥ 0; 0 ≤ Probability ≤ 1; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Normal | Integer, real, datetime | Mean; Standard deviation; Minimum; Maximum | Mean ≥ 0; Standard deviation > 0; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Poisson | Integer, real, datetime | Mean; Minimum; Maximum | Mean ≥ 0; Minimum < Maximum; Maximum > Minimum | Minimum and maximum are optional. |
Range | Integer, real, datetime | Begin(X); End(X); Probability(X) | 0 ≤ Probability(X) ≤ 1 | X is the index of each bin. The probability values must sum to 1. |
Triangular | Integer, real, datetime | Mode; Minimum; Maximum | Minimum ≤ Mode ≤ Maximum; Minimum < Maximum; Maximum > Minimum | |
Uniform | Integer, real, datetime | Minimum; Maximum | Minimum < Maximum; Maximum > Minimum | |
Weibull | Integer, real, datetime | Rate; Scale; Location; Minimum; Maximum | Rate > 0; Scale > 0; Location ≥ 0; Minimum < Maximum; Maximum > Minimum | Location, maximum and minimum are optional. |
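
The Gamma note in the table relates an inverse scale (rate) parameter β to a scale parameter θ through β = 1/θ. The following sketch shows that conversion in standalone Python, outside the product. The helper name `gamma_scale_from_rate` and the parameter values are hypothetical. Python's `random.gammavariate(alpha, beta)` expects a shape and a scale, so a rate must be inverted before it is passed in. The sketch makes no claim about which of the two forms the Scale field in the dialog box expects.

```python
import random
import statistics


def gamma_scale_from_rate(rate):
    """Convert an inverse scale (rate) β into a scale θ, using β = 1/θ."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate


shape = 2.0  # α = k (illustrative value)
rate = 0.5   # β (illustrative value)
scale = gamma_scale_from_rate(rate)  # θ = 1/β

rng = random.Random(0)
# random.gammavariate takes (shape, scale), not (shape, rate).
draws = [rng.gammavariate(shape, scale) for _ in range(20_000)]

sample_mean = statistics.fmean(draws)
theoretical_mean = shape / rate  # the same quantity as shape * scale
print(f"theoretical mean: {theoretical_mean}, sample mean: {sample_mean:.2f}")
```

Both parameterizations describe the same distribution. The mean is α/β in rate form and kθ in scale form, so comparing the sample mean with either expression is a quick check that the conversion was applied in the right direction.
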