Job arrays: running jobs with the same executable, but with different input data

You can include a job array in a flow. A job array is a convenient way to specify, with a single definition, a group of jobs that share the same executable and resource requirements but use different input data. All jobs in a job array have the same name and the same job ID, and each job runs the same executable. Any parameters you specify apply to all jobs in the array. All jobs read their input files from the same location and write their output files to the same location. However, each element of a job array is distinguished by its array index.

Before you begin

Before you can use a job array, you need to prepare your input files.

IBM Spectrum LSF Application Center provides methods for coordinating individual input and output files for the multiple jobs created when submitting a job array. These methods require your input files to be prepared uniformly. To accommodate an executable that uses standard input and standard output, IBM Spectrum LSF Application Center provides variables that are resolved at runtime: %I (job array index) and %J (job ID).
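As a rough illustration of how these variables behave, the following Python sketch imitates the substitution for a few array elements. It is not the product's implementation: the function name resolve, the job ID 1234, and the file name patterns input.%I and output.%J.%I are assumptions chosen for the example.

```python
def resolve(template: str, job_id: int, index: int) -> str:
    """Replace %J with the job ID and %I with the job array index.

    Hypothetical helper that only imitates the substitution; LSF
    resolves these variables itself when the job runs.
    """
    return template.replace("%J", str(job_id)).replace("%I", str(index))


# Assumed values: job ID 1234 and array elements 1 through 3.
for i in range(1, 4):
    print(resolve("input.%I", 1234, i), "->", resolve("output.%J.%I", 1234, i))
```

Because every element shares one job ID, only the %I part changes from one element to the next.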

All input files for your job array must be located in the same directory. Specify an absolute path to the directory containing the input files when defining your job array.

Each file name consists of two parts: a consistent name string and a variable integer that corresponds directly to an array index. For example, the following file names are valid input file names for a job array. They are made up of the consistent name string input and integers that correspond to job array indices from 1 to 1000:

input.1, input.2, input.3, ..., input.1000
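Before you submit the array, it can help to confirm that every index has a matching input file. The following Python sketch shows one possible check. The function name missing_inputs and the temporary directory used in the demonstration are assumptions for the example and are not part of the product.

```python
import tempfile
from pathlib import Path


def missing_inputs(directory, prefix, first, last):
    """Return the indices in [first, last] that have no <prefix>.<index> file."""
    base = Path(directory)
    return [
        i for i in range(first, last + 1)
        if not (base / f"{prefix}.{i}").is_file()
    ]


# Demonstration in a throwaway directory (hypothetical data):
# create input.1 and input.3, but leave out input.2.
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 3):
        (Path(tmp) / f"input.{i}").write_text("sample data\n")
    print(missing_inputs(tmp, "input", 1, 3))
```

An empty list means that the directory holds one input file for every array index in the range.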

For additional information, see the “Job Arrays” chapter in Administering IBM Spectrum LSF.

The maximum number of jobs in a job array is defined with the LSF MAX_JOB_ARRAY_SIZE parameter in the LSF configuration file lsb.params. When a job array index is larger than the value defined by MAX_JOB_ARRAY_SIZE in lsb.params, the job array submission is rejected.
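A pre-submission check against this limit might look like the following Python sketch. It assumes that lsb.params sets the parameter on a line of the form MAX_JOB_ARRAY_SIZE = value and that # starts a comment. The function names, the sample text, and the value 2000 are hypothetical, so verify the layout against your own configuration file.

```python
import re


def max_job_array_size(lsb_params_text, default=None):
    """Return MAX_JOB_ARRAY_SIZE from lsb.params text, or default if not set.

    Assumes a 'MAX_JOB_ARRAY_SIZE = <integer>' line and '#' comments.
    """
    for line in lsb_params_text.splitlines():
        line = line.split("#", 1)[0]  # drop any trailing comment
        match = re.match(r"\s*MAX_JOB_ARRAY_SIZE\s*=\s*(\d+)\s*$", line)
        if match:
            return int(match.group(1))
    return default


def index_within_limit(last_index, limit):
    """True if the highest array index does not exceed the limit."""
    return last_index <= limit


# Hypothetical lsb.params fragment, used only for this example.
sample = """
Begin Parameters
MAX_JOB_ARRAY_SIZE = 2000   # hypothetical value
End Parameters
"""

limit = max_job_array_size(sample)
print(limit, index_within_limit(1000, limit), index_within_limit(5000, limit))
```

A check like this only predicts the outcome; LSF itself decides whether to accept or reject the submission.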

About this task

Tip: To view a sample, select Workload > Definitions > Flow Definitions > New > Flow Definition. In the Create Draft Definition dialog, select Importing an example definition, and then select flowarray_eval.xml.

Procedure

  1. To add a job array to your flow, in Flow Editor, select Insert > Job Array.
  2. Right-click the newly created job array and select Open Definition to define the array elements.