Training neural networks using the experiment builder in Watson Studio Local

This topic describes how to train a neural network using the experiment builder in Watson Studio Local.

As a data scientist, you need to train thousands of models to identify the combination of data and hyperparameters that optimizes the performance of your neural networks. You want to run more experiments, faster. You want to train deeper neural networks and explore more complicated hyperparameter spaces. IBM Watson Machine Learning accelerates this iterative cycle by simplifying the process of training models in parallel with auto-allocated GPU compute containers.

To try the experiment builder yourself, follow the steps in the Experiment builder tutorial to build and run an experiment that predicts handwritten digits.

Prerequisites for using Experiment Builder

The following are required for creating experiments:

Create a new experiment

  1. Open a project.

  2. Click Add to project and choose Experiments.

Specify your training data

Specify the source files folder where you have stored your training data. The path should point to a local repository on Watson Machine Learning Accelerator that your system administrator has set up for your use.

Associate training definitions

You must associate at least one training definition with this experiment, and you can associate several. The training definitions can be any mix of existing ones and ones that you create as part of this process.

  1. Click Add training definition.

  2. Choose whether to create a new training definition or use an existing training definition.

    • To create a new training definition, click the New training definition tab and specify the options for the training definition.
    • To choose an existing training definition, click the Existing training definition tab and choose the training definition file.

Creating a new training definition

  1. Type a unique name and a description.

  2. Choose a .zip file that contains the Python code that you have set up, including the metrics to use for training runs. For a sample of training definition code, see the Experiment builder tutorial. Note that when you run an experiment on Watson Studio Local, you specify, in the training definition, the data directory where your training data is stored. For example:

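     import os

     # DATA_DIR names the directory where the training data is stored.
     data_dir = os.environ["DATA_DIR"]
     # Prefix each file name (assumed to be set earlier in the script) with that directory.
     train_images_file = os.path.join(data_dir, train_images_file)
     train_labels_file = os.path.join(data_dir, train_labels_file)
     test_images_file = os.path.join(data_dir, test_images_file)
     test_labels_file = os.path.join(data_dir, test_labels_file)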
    
  3. From the Frameworks box, select the appropriate framework. The framework must be compatible with the code in your Python file.

  4. In the Execution command box, enter the command that runs the Python code. The execution command must:

    • reference the Python code
    • pass the names of the training files
    • optionally specify metrics

    For example:

     convolutional_network.py --trainImagesFile train-images-idx3-ubyte.gz --trainLabelsFile train-labels-idx1-ubyte.gz --testImagesFile t10k-images-idx3-ubyte.gz --testLabelsFile t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 6000
    
  5. In the Attributes during this experiment section, select a compute plan, which determines the number and size of GPUs to use for the experiment.

  6. Save the training definition and run the experiment.

  7. Click the name of a training run to view overview information, metrics, or logs from the run.

  8. To save a completed training run as a model, choose Save model from the action menu for that run, then assign a name. The model appears on the Assets page for the project. From there, you can deploy the model to a space and pass it sample data to generate predictions.
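
The examples in steps 2 and 4 can be tied together in one place. The following is a minimal sketch, not the tutorial's actual code, of how a training script such as convolutional_network.py might read the flags from the sample execution command and combine them with the DATA_DIR environment variable. The flag names come from the sample command above; the function names parse_args and resolve_data_paths, the dictionary keys, and the default values are hypothetical and chosen only for illustration.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the flags used in the sample execution command.

    The defaults for --learningRate and --trainingIters simply mirror the
    sample command; they are an assumption, not documented defaults.
    """
    parser = argparse.ArgumentParser(description="Illustrative training script")
    parser.add_argument("--trainImagesFile", required=True)
    parser.add_argument("--trainLabelsFile", required=True)
    parser.add_argument("--testImagesFile", required=True)
    parser.add_argument("--testLabelsFile", required=True)
    parser.add_argument("--learningRate", type=float, default=0.001)
    parser.add_argument("--trainingIters", type=int, default=6000)
    return parser.parse_args(argv)


def resolve_data_paths(args, environ=None):
    """Prefix each file name with the directory named by DATA_DIR."""
    environ = os.environ if environ is None else environ
    data_dir = environ["DATA_DIR"]  # raises KeyError if DATA_DIR is not set
    return {
        "train_images": os.path.join(data_dir, args.trainImagesFile),
        "train_labels": os.path.join(data_dir, args.trainLabelsFile),
        "test_images": os.path.join(data_dir, args.testImagesFile),
        "test_labels": os.path.join(data_dir, args.testLabelsFile),
    }
```

In a real script, calling parse_args() with no arguments reads the flags from the command line, and resolve_data_paths(args) reads DATA_DIR from the process environment. Passing an explicit argument list or a plain dictionary instead makes the logic easy to check outside of an experiment run.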