Hyperparameter search algorithms

Hyperparameter search algorithms are the engine that proposes hyperparameter combinations for a model to use during training. Some search algorithms are included with IBM Watson Machine Learning Accelerator by default. You can also add other search algorithms if needed.

The following parameters control the overall hyperparameter search process:
  • Max run time: The length of time (in minutes) that a tuning task runs. Set this value to -1 to let the task run indefinitely.
  • Max experiment number: The total number of training experiments that run in the tuning task.
  • Max parallel experiments: The maximum number of training experiments that can be submitted at the same time.
  • Objective: The search algorithm optimization goal. The value can be set to minimize or maximize.
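As an illustration, these control parameters might be collected in a configuration like the following sketch (the key names here are assumptions for illustration, not the product's exact API fields):

```python
# Illustrative tuning-task configuration; key names are assumptions,
# not Watson Machine Learning Accelerator's exact API fields.
tuning_config = {
    "max_run_time": 60,           # minutes; -1 lets the task run indefinitely
    "max_experiment_number": 20,  # total training experiments in the task
    "max_parallel_experiments": 4,
    "objective": "minimize",      # or "maximize"
}

# Sanity check: no more experiments can run in parallel than exist in total.
assert tuning_config["max_parallel_experiments"] <= tuning_config["max_experiment_number"]
```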

Supported search algorithms

Supported search algorithms include: Random, Hyperband, Bayesian, Tree-structured Parzen Estimator (TPE), and ExperimentGridSearch.

The following algorithm-specific parameters control each search algorithm:

The Random algorithm samples hyperparameter combinations uniformly from the input search space. It supports one parameter to control the search process:
  • RandomSeed: The random seed that is used to sample hyperparameters in uniform distribution. Optional.
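Conceptually, uniform random search can be sketched in a few lines (the search-space format below is hypothetical, not the product's):

```python
import random

# Sketch of uniform random search: each hyperparameter is drawn
# independently from its declared range (illustrative, not product code).
def propose_random(search_space, seed=None):
    rng = random.Random(seed)  # the RandomSeed parameter
    combo = {}
    for name, spec in search_space.items():
        if spec["type"] == "float":
            combo[name] = rng.uniform(spec["low"], spec["high"])
        elif spec["type"] == "int":
            combo[name] = rng.randint(spec["low"], spec["high"])
        else:  # categorical choice
            combo[name] = rng.choice(spec["values"])
    return combo

# Hypothetical search space for illustration.
space = {
    "learning_rate": {"type": "float", "low": 1e-4, "high": 1e-1},
    "batch_size": {"type": "int", "low": 16, "high": 128},
    "optimizer": {"type": "choice", "values": ["sgd", "adam"]},
}
combo = propose_random(space, seed=42)
```

Fixing the seed makes the proposed sequence reproducible across tuning runs.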
The Hyperband search algorithm evaluates many hyperparameter combinations within a fixed resource budget each round, promoting the best performers to larger budgets. The resource budget can be training epochs, dataset size, training time, or any similar resource that affects the training process. It supports the following parameters that control the search process:
  • RandomSeed: The random seed used by Hyperband to propose hyperparameter combinations in the first rung of brackets. Optional.
  • Eta: The reduction factor that controls the proportion of configurations that are discarded in each Hyperband bracket. Default value is 3.
  • ResourceName: The parameter name that is used as a resource in Hyperband, normally training epochs or iterations.
  • ResourceValue: The maximum resources that can be used by an experiment training.
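The interplay of Eta and ResourceValue can be sketched with the bracket arithmetic of successive halving (an illustration of the general Hyperband scheme, not the product's internals):

```python
import math

# Sketch of how Hyperband sizes its brackets via successive halving
# (illustrative only). Each rung is (configs kept, resource per config).
def hyperband_brackets(resource_value, eta=3):
    # Number of brackets: how many times eta divides the resource budget.
    s_max = int(math.log(resource_value, eta) + 1e-9)
    brackets = []
    for s in range(s_max, -1, -1):
        n = math.ceil((s_max + 1) * eta**s / (s + 1))  # initial configs
        rungs = []
        for i in range(s + 1):
            n_i = n // eta**i                    # configs surviving rung i
            r_i = resource_value // eta**(s - i) # resource (e.g. epochs) at rung i
            rungs.append((n_i, r_i))
        brackets.append(rungs)
    return brackets

# With ResourceValue=81 epochs and Eta=3, the most aggressive bracket starts
# 81 configs at 1 epoch each and keeps 1 config for the full 81 epochs.
brackets = hyperband_brackets(81, eta=3)
```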
The Bayesian search algorithm proposes hyperparameter combinations based on a Gaussian process prior, with Expected Improvement as the acquisition function. It supports the following parameters that control the search process:
  • RandomSeed: The random seed used by Bayesian. Optional.
  • InitPoints: The number of random searches before approximating with Bayesian algorithm. Default value is 10.
  • CubeSize: The number of candidate points that the Bayesian algorithm evaluates when proposing the next combination. The value of CubeSize cannot be less than Max experiment number; the default is (Max experiment number)*100. If Max experiment number is -1, the final CubeSize is max(10000, CubeSize).
  • Noiseless: Specifies whether Bayesian sampling assumes noiseless observations. If your model is entirely deterministic (for example, analytic), set this value to true to speed up the optimization. If your model is not deterministic (as expected for most machine learning or deep learning models), set this value to false. Default value is true (noiseless).
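The Expected Improvement acquisition that ranks candidates can be sketched as follows (a minimal stdlib-only illustration, not the product's implementation):

```python
import math

# Sketch of the Expected Improvement (EI) acquisition function used in
# Bayesian optimization (illustrative; the product's internals differ).
def expected_improvement(mu, sigma, best, minimize=True):
    """EI of a candidate with Gaussian-process posterior mean `mu` and
    standard deviation `sigma`, given the best objective seen so far."""
    if sigma == 0.0:  # fully determined point: no expected improvement
        return 0.0
    improvement = (best - mu) if minimize else (mu - best)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)  # standard normal pdf
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))           # standard normal cdf
    return improvement * cdf + sigma * pdf

# The algorithm scores CubeSize candidates and proposes the arg-max.
# Hypothetical (mean, std) posteriors for three candidates:
candidates = [(0.30, 0.05), (0.25, 0.10), (0.40, 0.01)]
best_seen = 0.28
scores = [expected_improvement(mu, s, best_seen) for mu, s in candidates]
proposal = max(range(len(candidates)), key=scores.__getitem__)
```

Note how the second candidate wins: its mean beats the best observation and its high uncertainty adds further expected gain.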
The TPE search algorithm proposes hyperparameter combinations by splitting the observed training experiments into good and bad groups, then sampling hyperparameters from the density of the good group and scoring them by Expected Improvement against the density of the bad group. It supports the following parameters to control the search process:
  • RandomSeed: The random seed used for the initial warm-up hyperparameter combinations and for the random generator of the Gaussian Mixture Model. Optional.
  • WarmUp: The number of initial warm-up hyperparameter combinations. Must be larger than 2. Default is 20.
  • EICandidate: The number of hyperparameter combinations proposed each round as candidates from which Expected Improvement selects the final combination. Must be larger than 1. Default is 24.
  • GoodRatio: The fraction of previously completed training experiments to use as good hyperparameter combinations when building the good Gaussian Mixture Model. Must be larger than 0. Default is 0.25.
  • GoodMax: The maximum number of good hyperparameter combinations from previously completed training experiments used to build the good Gaussian Mixture Model. Must be larger than 1. Default is 25.
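The good/bad split and density-ratio scoring can be sketched roughly as follows (a crude stand-in that uses a simple kernel density in place of the Gaussian Mixture Model; parameter names mirror the options above):

```python
import math
import random

# Crude sketch of TPE's split-and-sample idea for a single hyperparameter
# (illustrative only; the real algorithm fits Gaussian Mixture Models).
def tpe_propose(observations, good_ratio=0.25, good_max=25,
                ei_candidates=24, seed=None):
    rng = random.Random(seed)                            # RandomSeed
    ranked = sorted(observations, key=lambda ob: ob[1])  # best loss first
    n_good = max(1, min(good_max, int(good_ratio * len(ranked))))
    good = [x for x, _ in ranked[:n_good]]               # GoodRatio / GoodMax
    bad = [x for x, _ in ranked[n_good:]] or good

    def density(x, points, bw=0.1):
        # Kernel-density stand-in for the mixture model; the +1 acts as a
        # uniform prior so far-away candidates cannot dominate the ratio.
        kernel = sum(math.exp(-((x - p) / bw) ** 2) for p in points)
        return (kernel + 1.0) / (len(points) + 1)

    # Draw EICandidate samples near the good observations, then keep the one
    # maximizing the good/bad density ratio (proportional to EI under TPE).
    candidates = [rng.gauss(rng.choice(good), 0.1) for _ in range(ei_candidates)]
    return max(candidates, key=lambda x: density(x, good) / density(x, bad))

# Losses are low near 0.2 and high near 0.8, so the proposal lands near
# the good cluster around 0.2.
obs = [(0.20, 0.10), (0.22, 0.11), (0.30, 0.20), (0.50, 0.50),
       (0.75, 0.85), (0.80, 0.90), (0.90, 1.00), (0.25, 0.12)]
proposal = tpe_propose(obs, seed=0)
```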

ExperimentGridSearch takes a list of experiments to train, each with a well-defined hyperparameter combination. It is not a typical search algorithm, but it provides a mechanism to submit training runs with different hyperparameter combinations through one call.
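For example, an explicit experiment list might be built like this (the field names are illustrative, not the product's request format):

```python
from itertools import product

# Build an explicit list of fully specified experiments for a grid-style
# submission (illustrative field names; the real request format may differ).
learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [32, 64]

experiments = [
    {"learning_rate": lr, "batch_size": bs}
    for lr, bs in product(learning_rates, batch_sizes)
]
# Each entry is one training experiment with a well-defined hyperparameter
# combination; all six are submitted together rather than proposed iteratively.
```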