Edit PyTorch model for training

Before adding a PyTorch model to IBM Spectrum Conductor Deep Learning Impact, edit the model to enable deep learning capabilities such as deep learning insights and elastic distributed training.

When you use PyTorch with IBM Spectrum Conductor Deep Learning Impact, you have the following training engine options:
Single node training

By default, IBM Spectrum Conductor Deep Learning Impact supports single node training for PyTorch models without deep learning insights. To edit your single node training model to support deep learning insights, see Edit a PyTorch training model for deep learning insights.

Elastic distributed training

IBM Spectrum Conductor Deep Learning Impact supports elastic distributed training for PyTorch models that are configured for it. For details, see Edit a PyTorch training model for elastic distributed training. Elastic distributed training models include deep learning insights and do not require additional configuration for deep learning insights.

To run training for a PyTorch model, the following files must be included with the model.
Table 1. PyTorch model files and descriptions

File             Description
main.py          Required. Main entry point of the PyTorch training program.
inference.py     Required. Main entry point of the PyTorch inference program.
elastic-main.py  Required for elastic distributed training. Main entry point of the PyTorch elastic distributed training program.
edi.py           Required for elastic distributed inference. Main entry point of the PyTorch elastic distributed inference program. This file is required to publish training models to the elastic distributed inference engine.
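Before you add a model, it can help to confirm that the model directory contains the files listed in Table 1. The following is a minimal sketch of such a check, not part of IBM Spectrum Conductor Deep Learning Impact. The function name missing_model_files, the mode names, and the grouping of files by mode are assumptions made for illustration; only the file names come from Table 1.

```python
from pathlib import Path

# File names are taken from Table 1. The mode names and the grouping of
# files per mode are assumptions of this sketch, not product settings.
REQUIRED_FILES = {
    "single_node": ["main.py", "inference.py"],
    "elastic_training": ["main.py", "inference.py", "elastic-main.py"],
    "elastic_inference": ["main.py", "inference.py", "edi.py"],
}


def missing_model_files(model_dir, mode="single_node"):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    # Keep the order of Table 1 so the report is easy to compare with it.
    return [name for name in REQUIRED_FILES[mode] if not (root / name).is_file()]
```

An empty list means that every file expected for the chosen mode is present. For example, calling missing_model_files with mode="elastic_training" on a directory that holds only main.py and inference.py reports elastic-main.py as missing.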