Sample Sizes for Strata

When drawing a stratified sample, the default option is to sample the same proportion of records or clusters from each stratum. If one group outnumbers another by a factor of 3, for example, you typically want to preserve the same ratio in the sample. If you do not want to preserve these ratios, however, you can specify the sample size separately for each stratum.
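The default behavior described above can be illustrated outside the product with a short Python sketch. This is not how the Sample node is implemented; the function name proportional_stratified_sample, the group field, and the 10% fraction are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

def proportional_stratified_sample(records, key, fraction, seed=0):
    """Sample the same fraction of records from every stratum (hypothetical helper)."""
    rng = random.Random(seed)
    # Group the records by their stratum value.
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for members in strata.values():
        # Same proportion in each stratum, rounded to a whole number of records.
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

# Group A outnumbers group B by a factor of 3 (made-up data).
records = [{"group": "A"}] * 300 + [{"group": "B"}] * 100
sample = proportional_stratified_sample(records, lambda r: r["group"], 0.1)
counts = Counter(r["group"] for r in sample)
```

Because both strata are sampled at the same fraction, the 3:1 ratio between the groups carries over to the sample.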

The Sample Sizes for Strata dialog box lists each value of the stratification field, allowing you to override the default for that stratum. If multiple stratification fields are selected, every possible combination of values is listed, allowing you to specify the size for each ethnic group within each city, for example, or each town within each county. Sizes are specified as proportions or counts, as determined by the current setting in the Sample node.
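As a rough illustration of how the list of strata grows when several stratification fields are combined, the following Python sketch builds one row per combination of values and then overrides a single row. The field names, values, and the strata_rows helper are hypothetical and are not part of the product.

```python
from itertools import product

def strata_rows(field_values, default_size):
    """Return one entry per combination of stratification values (hypothetical helper)."""
    names = list(field_values)
    rows = {}
    # Every possible combination of values across the selected fields.
    for combo in product(*(field_values[name] for name in names)):
        rows[combo] = default_size
    return rows

# Two assumed stratification fields: 2 cities x 3 groups = 6 strata.
rows = strata_rows({"city": ["Leeds", "York"], "group": ["A", "B", "C"]}, 0.1)

# Override the default proportion for one stratum only.
rows[("York", "B")] = 0.25
```

Note that the number of rows is the product of the number of values in each field, so it can grow quickly as fields are added.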

To Specify Sample Sizes for Strata

  1. In the Sample node, select Complex, and select one or more stratification fields. See the topic Cluster and Stratify Settings for more information.
  2. Select Custom, and select Specify Sizes.
  3. In the Sample Sizes for Strata dialog box, click the Read Values button at the lower left to populate the display. You may first need to instantiate values in an upstream source or Type node. See the topic What is instantiation? for more information.
  4. Click in any row to override the default size for that stratum.

Notes on Sample Size

Custom sample sizes may be useful if different strata have different variances; for example, you can make sample sizes proportional to the standard deviation. (If the cases within a stratum are more varied, you need to sample more of them to get a representative sample.) Alternatively, if a stratum is small, you may want to use a higher sample proportion to ensure that a minimum number of observations is included.
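One way to arrive at such sizes before entering them in the dialog box is sketched below in Python. It uses Neyman allocation, which weights each stratum by its size multiplied by its standard deviation; this is an assumption about how you might compute the numbers, not something the Sample node does for you. The function name, the minimum parameter, and the data are hypothetical.

```python
from statistics import pstdev

def neyman_allocation(strata, total, minimum=0):
    """Split a total sample size across strata by size x standard deviation (sketch)."""
    weights = {name: len(vals) * pstdev(vals) for name, vals in strata.items()}
    weight_sum = sum(weights.values())
    sizes = {}
    for name, vals in strata.items():
        if weight_sum:
            share = total * weights[name] / weight_sum
        else:
            # All strata have zero variance: fall back to an equal split.
            share = total / len(strata)
        # Enforce the minimum, but never ask for more records than the stratum holds.
        sizes[name] = min(len(vals), max(minimum, round(share)))
    return sizes

# Two made-up strata of 100 records each, one far more varied than the other.
strata = {
    "low_var": [10, 11, 9, 10] * 25,
    "high_var": [0, 20, 5, 15] * 25,
}
sizes = neyman_allocation(strata, total=40, minimum=5)
```

Because the minimum is applied after the proportional split, the sizes can add up to slightly more than the requested total when a small or low-variance stratum is raised to the minimum.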

Note: If you stratify by a field that has missing values (null or system missing values, empty strings, white space, and blank or user-defined missing values), you cannot specify custom sample sizes for strata. To use custom sample sizes when stratifying by a field with missing or blank values, you need to fill those values upstream of the Sample node.
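The idea of filling missing values upstream can be sketched in Python as follows. In a stream you would do this with an upstream node rather than with code; the fill_missing helper, the UNKNOWN placeholder, and the region field are assumptions made for the example, and the sketch covers only nulls, empty strings, and white space.

```python
def fill_missing(records, field, placeholder="UNKNOWN"):
    """Replace null, empty, or whitespace-only values in one field (hypothetical helper)."""
    filled = []
    for rec in records:
        value = rec.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            # Copy the record so the original data is left untouched.
            rec = {**rec, field: placeholder}
        filled.append(rec)
    return filled

# Made-up records with a null, an empty string, and a whitespace-only value.
records = [
    {"region": "North"},
    {"region": None},
    {"region": ""},
    {"region": "   "},
]
cleaned = fill_missing(records, "region")
strata_values = sorted({r["region"] for r in cleaned})
```

After filling, the missing values form an ordinary stratum of their own, which can then be given a custom size like any other.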