# Optimal Binning

The Optimal Binning procedure discretizes one or more scale variables
(referred to henceforth as **binning input variables**) by distributing
the values of each variable into bins. Bin formation is optimal with
respect to a categorical guide variable that "supervises" the binning
process. Bins can then be used instead of the original data values
for further analysis.
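
As a rough illustration of what "supervised" means here, the sketch below picks a single cutpoint for one binning input variable by minimizing the weighted entropy of the guide variable across the two resulting bins. This criterion is an assumption chosen for illustration, not a description of the algorithm the procedure actually uses; the function names `entropy` and `best_cutpoint` and the sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of category labels."""
    n = len(labels)
    counts = Counter(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def best_cutpoint(values, guide):
    """Return (cutpoint, score) for the split that minimizes the weighted
    entropy of the guide variable; cutpoint is None if no split exists."""
    pairs = sorted(zip(values, guide), key=lambda p: p[0])
    n = len(pairs)
    best, best_score = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cutpoint fits between two equal values
        left = [g for _, g in pairs[:i]]
        right = [g for _, g in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best, best_score = (lo + hi) / 2, score
    return best, best_score


# Hypothetical data: age as the binning input, loan default as the guide.
ages = [22, 25, 31, 38, 45, 52, 60, 64]
default = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
cut, score = best_cutpoint(ages, default)
print(cut)  # 34.5
```

The cutpoint falls where the guide categories change, which is the sense in which the guide variable steers bin formation.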

**Examples.** Reducing the number of distinct values a variable
takes has a number of uses, including:

- Data requirements of other procedures. Discretized variables can be treated as categorical for use in procedures that require categorical variables. For example, the Crosstabs procedure requires that all variables be categorical.
- Data privacy. Reporting binned values instead of actual values can help safeguard the privacy of your data sources. The Optimal Binning procedure can guide the choice of bins.
- Performance. Some procedures are more efficient when working with a reduced number of distinct values. For example, Multinomial Logistic Regression can run faster when it uses discretized variables.
- Uncovering complete or quasi-complete separation of data.
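
The last use in the list above can be sketched as a crosstabulation of the binned variable against the guide variable. A bin in which only one guide category occurs is "pure"; if every bin is pure, the binned variable separates the guide categories completely, and if only some are, that hints at quasi-complete separation. This is a rough diagnostic under those assumptions, and the helper names `crosstab` and `pure_bins` and the data are hypothetical.

```python
from collections import defaultdict


def crosstab(bins, guide):
    """Count cases for each (bin, guide category) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for b, g in zip(bins, guide):
        table[b][g] += 1
    return {b: dict(row) for b, row in table.items()}


def pure_bins(bins, guide):
    """Return the sorted bins in which only one guide category occurs."""
    return sorted(b for b, row in crosstab(bins, guide).items() if len(row) == 1)


# Hypothetical binned values and guide categories for eight cases.
bins = [1, 1, 1, 2, 2, 2, 3, 3]
guide = ["no", "no", "no", "no", "yes", "no", "yes", "yes"]
print(pure_bins(bins, guide))  # [1, 3]
```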

**Optimal versus Visual Binning.** The Visual Binning dialog
boxes offer several automatic methods for creating bins without the
use of a guide variable. These "unsupervised" rules are useful for
producing descriptive statistics, such as frequency tables, but Optimal
Binning is superior when your end goal is to produce a predictive
model.
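
For contrast, here is a minimal sketch of one possible unsupervised rule, equal-width intervals. It is offered only as an example of a rule that needs no guide variable, not as a statement of which methods Visual Binning provides; `equal_width_cutpoints` is a hypothetical name.

```python
def equal_width_cutpoints(values, n_bins):
    """Return the n_bins - 1 cutpoints that split the observed range
    of values into n_bins intervals of equal width."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * k for k in range(1, n_bins)]


# The cutpoints depend only on the range of the data, not on any outcome.
print(equal_width_cutpoints([20, 30, 40, 60], 4))  # [30.0, 40.0, 50.0]
```

Because the rule never looks at an outcome, its cutpoints need not line up with where the guide categories change.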

**Output.** The procedure produces tables of cutpoints for
the bins and descriptive statistics for each binning input variable.
Additionally, you can save new variables to the active dataset containing
the binned values of the binning input variables and save the binning
rules as command syntax for use in discretizing new data.
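
The two kinds of output can be pictured with a small sketch: a table of bin endpoints built from a list of cutpoints, and a new variable holding the bin number of each case. The sketch assumes that a value equal to a cutpoint belongs to the lower bin and that the outermost bins are unbounded; both are illustrative choices, and `bin_table` and `assign_bins` are hypothetical names rather than part of the procedure.

```python
from bisect import bisect_left


def bin_table(cutpoints):
    """Return (bin number, lower endpoint, upper endpoint) rows for
    sorted cutpoints, with unbounded first and last bins."""
    edges = [float("-inf")] + list(cutpoints) + [float("inf")]
    return [(i + 1, edges[i], edges[i + 1]) for i in range(len(edges) - 1)]


def assign_bins(values, cutpoints):
    """Map each value to a 1-based bin number; a value equal to a
    cutpoint goes to the lower bin (assumed convention)."""
    return [bisect_left(cutpoints, v) + 1 for v in values]


cutpoints = [30, 50]  # hypothetical cutpoints, sorted ascending
ages = [22, 30, 31, 50, 64]
print(assign_bins(ages, cutpoints))  # [1, 1, 2, 2, 3]
```

Applying the same cutpoints to a new dataset reproduces the binning, which is the role the saved rules play.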

## Optimal Binning Data Considerations

**Data.** This procedure expects the binning input variables
to be scale, numeric variables. The guide variable should be categorical
and can be string or numeric.

## To Obtain Optimal Binning

- From the menus choose:
- Select one or more binning input variables.
- Select a guide variable.

Variables containing the binned data values are not generated by default. Use the Save tab to save these variables.

This procedure pastes OPTIMAL BINNING command syntax.