Troubleshooting AutoAI experiments
The following list contains common problems that are known for AutoAI. If your AutoAI experiment fails to run or deploy successfully, review these common problems and resolutions.
Speeding up experiment training with large data sets
If you find that training a model is timing out or taking an unusually long time, consider these guidelines for reducing training time:
From the Experiment settings pages of the AutoAI tool:
- Make sure that the Optimized algorithm selection option is set to Score and run time.
- Disable the XGBRegressor model. This adjustment can help you obtain results more quickly, but the scores might be slightly lower.
For a coded experiment:
- Pass the daub_give_priority_to_runtime parameter as described in the SDK documentation, as shown in the sketch that follows this list. Note: This parameter can increase the indeterminism (unreproducibility) of the experiment.
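For example, the following minimal sketch shows one way to pass the parameter in a coded experiment, assuming it is accepted by experiment.optimizer() as described in the SDK documentation. The credentials, IDs, column name, and parameter value are placeholders.

# Minimal sketch of a coded AutoAI experiment that passes daub_give_priority_to_runtime.
# Credentials, IDs, and column names are placeholders; the parameter value shown is
# illustrative only; see the SDK documentation for the accepted values.
from ibm_watson_machine_learning.experiment import AutoAI

wml_credentials = {
    "url": "<cluster-url>",
    "username": "<username>",
    "apikey": "<api-key>",
    "instance_id": "openshift",
    "version": "4.5",
}

experiment = AutoAI(wml_credentials, project_id="<project-id>")

pipeline_optimizer = experiment.optimizer(
    name="Faster experiment",
    prediction_type=AutoAI.PredictionType.BINARY,
    prediction_column="<label-column>",
    scoring=AutoAI.Metrics.ROC_AUC_SCORE,
    daub_give_priority_to_runtime=1,  # favor run time over score; can reduce reproducibility
)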
Passing incomplete or outlier input value to deployment can lead to outlier prediction
After you deploy your machine learning model, note that providing input data that is markedly different from the data that was used to train the model can produce an outlier prediction. When linear regression algorithms such as Ridge and LinearRegression are passed an out-of-scale input value, the model extrapolates and assigns a relatively large weight to it, producing a score that is not in line with conforming data.
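The following sketch, which uses scikit-learn directly rather than AutoAI, illustrates the behavior: a Ridge model trained on feature values between 0 and 10 returns a reasonable score for an in-range input but an extrapolated, outlier score for an out-of-scale input.

# Illustration of linear-model extrapolation (scikit-learn, not AutoAI-specific).
import numpy as np
from sklearn.linear_model import Ridge

# Train on feature values in the range 0-10.
X_train = np.arange(0, 10, 0.5).reshape(-1, 1)
y_train = 3.0 * X_train.ravel() + np.random.default_rng(0).normal(0, 0.5, X_train.shape[0])

model = Ridge().fit(X_train, y_train)

print(model.predict([[5.0]]))     # in-range input: prediction in line with the training data
print(model.predict([[5000.0]]))  # out-of-scale input: the model extrapolates to an outlier value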
Time Series pipeline with supporting features fails on retrieval
If you train an AutoAI Time Series experiment by using supporting features and you get the error 'Error: name 'tspy_interpolators' is not defined' when the system tries to retrieve the pipeline for predictions, check to make sure your system is running Java 8 or higher.
Change in data ingestion can cause schema difference with older models
Starting in Cloud Pak for Data 4.5, the same technique is used to read data from the project's data assets and from connected data sources, which ensures consistent schema detection across different data sources. Note that this technique might result in some schema differences compared with models that you created with an earlier release of Cloud Pak for Data.
Storage volume for connected data must be in same namespace as AutoAI pod
If you are accessing a connected data asset in a storage volume for training an AutoAI experiment, and the storage volume is in a different namespace from where the AutoAI pod is running, you might get an error like this:
The job runtime failed to start in 360 seconds ... persistentvolumeclaim "<volume-name>" not found.
Ask an administrator to check that the volume is in the same namespace as the AutoAI pod.
Applies to: Cloud Pak for Data 4.5.0 and later
Older time series experiments might require retraining to generate notebook
If you try to generate a notebook for an older time series experiment, you are unable to save the experiment code or save a pipeline as a notebook. To resolve the issue, retrain the experiment and try again.
Applies to: Cloud Pak for Data 4.5.0 and later
Running a pipeline or experiment notebook fails with a software specification error
If supported software specifications for AutoAI experiments change, you might get an error when you run a notebook built with an older software specification, such as an older version of Python. In this case, run the experiment again, then save a new notebook and try again.
Resolving an Out of Memory error
If you get a memory error when you run a cell from an AutoAI generated notebook, create a notebook runtime with more resources for the AutoAI notebook and execute the cell again.
Resolving data access error in a notebook
If you are connecting to data in an experiment or pipeline notebook, an error that includes FlightUnavailableError or FileNotFoundError can indicate a problem with the data connection service. This condition can happen when you are running the notebook outside of the cluster. As a workaround, you can provide the data as a pandas DataFrame object. For example, to read a .CSV file, use pandas.read_csv(), or use the code that is generated for you for the selected data source by clicking the Code snippets icon from the notebook toolbar and then clicking Read data.
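For example, the following short sketch reads a local .CSV file into a pandas DataFrame; the file name is a placeholder.

# Workaround sketch: load the data locally as a pandas DataFrame instead of relying on
# the data connection service. The file name is a placeholder.
import pandas as pd

train_df = pd.read_csv("my_training_data.csv")
train_df.head()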
AutoAI notebook errors require manual upgrade
If you are running an AutoAI notebook, you might encounter one of these issues:
- Failed to GET project can happen when you run an AutoAI notebook job from a deployment space. It indicates that the notebook cannot access the project_libs library that is required to run the notebook. To resolve:
  - Modify the cell that contains the data reference, like this:
    df = training_data_reference[0].read(experiment_metadata=experiment_metadata)
  - Manually upgrade your Python client to the latest version:
    !pip install -U ibm-watson-machine-learning
  - Restart your notebook kernel.
- TypeError: list indices must be integers or slices, not str indicates that there is a problem accessing your data source. To resolve:
  - Manually upgrade your Python client to the latest version:
    !pip install -U ibm-watson-machine-learning
  - Restart your notebook kernel.
Notebook for an experiment with subsampling can fail generating predictions
If you use the pipeline refinery to prepare the model, and the experiment uses subsampling of the data during training, you might encounter an “unknown class” error when you run a notebook that is saved from the experiment.
The problem stems from an unknown class that is not included in the training data set. The workaround is to use the entire data set for training or re-create the subsampling that is used in the experiment.
To subsample the training data (before fit()), provide the sample size by number of rows or by fraction of the sample, as done in the experiment.
- If the number of records was used in the subsampling settings, you can increase the value of n. For example: train_df = train_df.sample(n=1000)
- If subsampling is represented as a fraction of the data set, increase the value of frac. For example: train_df = train_df.sample(frac=0.4, random_state=experiment_metadata['random_state'])
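For example, the following sketch re-creates fraction-based subsampling before fitting a pipeline. It assumes that train_df, pipeline_model, and experiment_metadata (including a prediction_column entry) are defined in earlier cells of the generated notebook.

# Sketch: re-create the experiment's subsampling before calling fit().
# Assumes train_df, pipeline_model, and experiment_metadata come from earlier notebook cells.
train_df = train_df.sample(frac=0.4, random_state=experiment_metadata['random_state'])

train_X = train_df.drop([experiment_metadata['prediction_column']], axis=1)
train_y = train_df[experiment_metadata['prediction_column']]

pipeline_model.fit(train_X.values, train_y.values)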
Pipeline creation fails for binary classification
AutoAI analyzes a subset of the data to determine the best fit for the experiment type. If the sample data in the prediction column contains only two values, AutoAI recommends a binary classification experiment and applies the related algorithms. However, if the full data set contains more than two values in the prediction column, the binary classification fails and you get an error that indicates that AutoAI cannot create the pipelines.
In this case, manually change the experiment type from binary to either multiclass, for a defined set of values, or regression, for an unspecified set of values.
- Click the Reconfigure Experiment icon to edit the experiment settings.
- On the Prediction page of Experiment Settings, change the prediction type to the one that best matches the data in the prediction column.
- Save the changes and run the experiment again.
Updating an AutoAI notebook after an upgrade from an older version of Cloud Pak for Data
If you are trying to run an AutoAI notebook that was created with an earlier version of Cloud Pak for Data, the notebook might fail with an unsupported framework error. You have two options for resolving the issue:
- Retrain the experiment and save a new notebook to get the latest frameworks.
- To use the existing notebook, use a later version of the Python client by removing the following pinned install command:
  !pip install 'ibm-watson-machine-learning==1.0.66' | tail -n 1
Note: If you run the notebook outside Watson Studio, replace the !pip install command with this update command:
!pip install -U ibm-watson-machine-learning | tail -n 1
File schema differs according to upload method
A data file that is loaded from a data connection might return a different model schema than the same data file that is loaded directly as a CSV file. This scenario can result in errors when you score a deployment by using the scoring form in the UI. In that case, use the JSON tab to enter the scoring data.
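For example, a scoring payload entered on the JSON tab generally follows the fields-and-values format shown in this sketch, written here as a Python dictionary. The field names and values are placeholders for your model's schema.

# Hedged example of a scoring payload in the fields-and-values format; field names
# and values are placeholders for your model's schema.
payload = {
    "input_data": [
        {
            "fields": ["AGE", "INCOME", "GENDER"],
            "values": [[35, 42000.0, "F"]],
        }
    ]
}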
Creating a batch deployment for an AutoAI model
You can create a batch deployment for a saved AutoAI model, but the model must be trained by using the current version of Cloud Pak for Data. If it was trained by using an older version, run the experiment again and deploy the resulting saved model.
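The following sketch shows one way to create a batch deployment with the ibm-watson-machine-learning Python client, assuming the saved model is already promoted to a deployment space. The credentials, space ID, model ID, and hardware specification name are placeholders.

# Sketch: create a batch deployment for a saved AutoAI model with the Python client.
# Credentials, IDs, and the hardware specification name are placeholders.
from ibm_watson_machine_learning import APIClient

client = APIClient(wml_credentials)      # wml_credentials as defined for your cluster
client.set.default_space("<space-id>")

meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "AutoAI batch deployment",
    client.deployments.ConfigurationMetaNames.BATCH: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "S"},
}

deployment = client.deployments.create(artifact_uid="<model-id>", meta_props=meta_props)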
Parent topic: AutoAI overview