Create a dataset
Create a dataset using IBM Spectrum Conductor Deep Learning Impact 1.1.0. IBM Spectrum Conductor Deep Learning Impact supports LMDB, TFRecord and other datasets. Each dataset can include training data, test data and validation data.
- Training data: The sample of data used for learning.
- Test data: The sample of data used to evaluate the model during the training phase.
- Validation data: The sample of data used to evaluate the final model.
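As a rough illustration of these three roles, the following sketch partitions a list of labeled samples into training, test, and validation subsets. The function name `split_samples`, the 80/10/10 ratios, and the fixed seed are assumptions made for this example; they are not part of IBM Spectrum Conductor Deep Learning Impact.

```python
import random


def split_samples(samples, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle samples and split them into (training, test, validation) lists.

    Hypothetical helper: the fractions and seed are illustrative defaults.
    Whatever is left after the training and test slices becomes validation data.
    """
    if train_frac < 0 or test_frac < 0 or train_frac + test_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable

    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)

    training = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return training, test, validation
```

Each sample ends up in exactly one subset, so the data used to evaluate the final model is never seen during training.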
IBM Spectrum Conductor Deep Learning Impact assumes that you have already collected your raw data and either labeled it using a label file or organized it into folders. To create a dataset, put the raw data in a folder on the shared file system that IBM Spectrum Conductor Deep Learning Impact has access to. The raw data must be in one of the formats accepted by IBM Spectrum Conductor Deep Learning Impact. Both the egoadmin user and the execute user must have read and write permissions to the folder.
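Before creating a dataset, it can help to sanity-check the raw data folder. The sketch below assumes the data is organized as one subfolder per label and reports how many files each subfolder holds. The function name `summarize_raw_data` and the folder layout are assumptions for illustration. Note that `os.access` only checks permissions for the user running the script, so it would need to be run as the egoadmin user and as the execute user to confirm access for both.

```python
import os
from pathlib import Path


def summarize_raw_data(folder):
    """Return {label: file_count} for a folder with one subfolder per label.

    Hypothetical helper: raises if the folder is missing or if the current
    user lacks read or write permission on it.
    """
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # Checks the permissions of the user running this script only.
    if not os.access(root, os.R_OK | os.W_OK):
        raise PermissionError(f"read and write access to {root} is required")

    return {
        sub.name: sum(1 for entry in sub.iterdir() if entry.is_file())
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }
```

An empty result, or a label with a count of zero, usually means the folder is not laid out the way this sketch expects.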
The IBM Spectrum Conductor Deep Learning Impact dataset types that you can use depend on your deep learning framework. If you are using Caffe, you can use the following datasets: LMDBs and Images for object classification. If you are using TensorFlow, you can use the following datasets: TensorFlow Records, Images for object classification, Images for object detection, Images for vector output, CSV files, and other generic types.