4. Customize job submission, control, and query scripts

Three files map job submission, control, and query commands and options to the Other Batch System: submit.sh, control.sh, and query.sh. At runtime, jfd modifies these scripts to set environment variables, then copies them to the temporary directory on the Other Batch System host, where they are executed.

If you are using IBM® LoadLeveler® or Open Grid Scheduler/Grid Engine, ready-to-use scripts are provided in the $JS_TOP/$JS_VERSION/examples/conf/other_batch directory.

If you are using any other batch system, you must modify these files to map the job submission, control, and query commands to your specific batch system.

The scripts must be bash shell scripts.

submit.sh

Description:

Job submission command and options for the specific batch system. The script is executed on the Other Batch System host to submit jobs.

If you require submission options beyond the ones that are provided by default, you can list the additional options in the configuration file submit.conf.

Runs as the user account defined in the Job Definition: either the user account that submitted the job or a user account specified in the Job Definition.

Each submission option in the script has an associated environment variable that is passed to the submission script.

Default job submission options exposed in the Job Definition are:

  • Command to run the job (required)
  • Job name (required)
  • Environment variables to be set for the job (optional)

The user must provide a value for every required option.

Input (environment variables):

$JS_EE_SUBMISSION_JOB_COMMAND (required)

$JS_EE_SUBMISSION_JOB_NAME (required)

$JS_EE_SUBMISSION_JOB_ENV_VARS (optional)

Additional custom environment variables specified in submit.conf, if used.

Output:

On success: the job ID.

On error: error messages, if applicable.

Exit code:

Zero on success.

Non-zero on failure. Error messages must be printed to standard error.
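
As a rough illustration of the contract described above, the following sketch shows one possible shape for the submission logic in a submit.sh written for a batch system that has no provided example. The command mybatch_submit, its --name flag, and the SUBMIT_CMD variable are hypothetical placeholders and not part of the product. The sketch assumes that the submission command prints the new job ID on standard output and that submit.sh reports the job ID the same way. The format of $JS_EE_SUBMISSION_JOB_ENV_VARS is not described in this section, so the sketch leaves that variable unhandled.

```bash
#!/bin/bash
# Sketch of the submission logic for a hypothetical batch system.
# "mybatch_submit" and its --name flag are placeholders, not real commands.
SUBMIT_CMD="${SUBMIT_CMD:-mybatch_submit}"

submit_job() {
    # Required options arrive as environment variables.
    if [ -z "${JS_EE_SUBMISSION_JOB_COMMAND:-}" ]; then
        echo "submit.sh: JS_EE_SUBMISSION_JOB_COMMAND is not set" >&2
        return 1
    fi
    if [ -z "${JS_EE_SUBMISSION_JOB_NAME:-}" ]; then
        echo "submit.sh: JS_EE_SUBMISSION_JOB_NAME is not set" >&2
        return 1
    fi

    local job_id
    # Assumption: the submission command prints the job ID on standard output.
    if ! job_id=$("$SUBMIT_CMD" --name "$JS_EE_SUBMISSION_JOB_NAME" \
                  "$JS_EE_SUBMISSION_JOB_COMMAND"); then
        echo "submit.sh: job submission failed" >&2
        return 1
    fi

    # On success, print the job ID; the return status is zero.
    echo "$job_id"
}
```

In a complete script, the last line would call submit_job so that the return status of the function becomes the exit code of the script.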

submit.conf

Customized job submission options that are displayed in Flow Editor in the Job Definition.

The options listed in this file are displayed in addition to the default job submission options.

Note that after you define the environment variables in submit.conf, you must also modify submit.sh to add the new options and map each environment variable to an actual submission option.

Only labels and text fields (text strings) are supported as customized submission options. No type checking is enforced for the input.

Each line must contain three fields: Label, Environment Variable, and Required (1 indicates required, 0 indicates optional). Lines that start with # are ignored.
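
Before you deploy a customized submit.conf, it can help to check that every line has the three fields described above. The following is a minimal sketch of such a check, not the parser that jfd or Flow Editor uses. It assumes that the label is enclosed in double quotes, as in the example file in this section, that fields are separated by whitespace, and that blank lines can be skipped. The names check_submit_conf and CONF_LINE_RE are illustrative only.

```bash
#!/bin/bash
# Sketch: report submit.conf lines that do not have the form
#   "Label"  ENVIRONMENT_VARIABLE  0|1
# Assumption: labels are double-quoted and fields are whitespace-separated.
CONF_LINE_RE='^[[:space:]]*"[^"]+"[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[[:space:]]+[01][[:space:]]*$'

check_submit_conf() {
    local file=$1 line n=0 status=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # Lines that start with # are ignored.
        case "$line" in
            '#'*) continue ;;
        esac
        # Skipping blank lines is an assumption of this sketch.
        if [ -z "${line//[[:space:]]/}" ]; then
            continue
        fi
        if ! [[ $line =~ $CONF_LINE_RE ]]; then
            echo "$file: line $n: expected \"Label\" VARIABLE 0|1" >&2
            status=1
        fi
    done < "$file"
    # Zero if every line is well formed, non-zero otherwise.
    return "$status"
}
```

For example, check_submit_conf submit.conf prints one message on standard error for each malformed line and returns a non-zero status if it finds any.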

When a user is creating a Job Definition, Flow Editor checks the value of the required fields. If any required field is empty, the Job Definition is not complete and the user will not be able to submit the flow. During job submission, jfd checks the value of the required fields. If a required field value is empty, job submission fails.

Example submit.conf file:

#Label                       #Environment Variable          #Required 
"Submit to queue"            JS_EE_SUBMISSION_QUEUE_NAME         0
"Run on host"                JS_EE_SUBMISSION_HOST_NAME          0
"Resource requirement"       JS_EE_SUBMISSION_RES_REQ            0

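For the options in the example file to take effect, submit.sh has to translate each variable into an option of the submission command of your batch system. A minimal sketch of that mapping follows. The flags --queue, --host, and --resources, the function name build_custom_options, and the array name CUSTOM_OPTS are hypothetical, so substitute the options that your batch system actually accepts.

```bash
#!/bin/bash
# Sketch: turn the variables defined in the example submit.conf into
# command-line options. All three flags are hypothetical placeholders.
build_custom_options() {
    CUSTOM_OPTS=()
    # All three options are optional (Required is 0), so add each one
    # only when the user entered a value.
    if [ -n "${JS_EE_SUBMISSION_QUEUE_NAME:-}" ]; then
        CUSTOM_OPTS+=(--queue "$JS_EE_SUBMISSION_QUEUE_NAME")
    fi
    if [ -n "${JS_EE_SUBMISSION_HOST_NAME:-}" ]; then
        CUSTOM_OPTS+=(--host "$JS_EE_SUBMISSION_HOST_NAME")
    fi
    if [ -n "${JS_EE_SUBMISSION_RES_REQ:-}" ]; then
        CUSTOM_OPTS+=(--resources "$JS_EE_SUBMISSION_RES_REQ")
    fi
}
```

After build_custom_options runs, the array can be added to the submission command line as "${CUSTOM_OPTS[@]}", which keeps values that contain spaces intact. Because no type checking is enforced on the input, quoting every value in this way is a reasonable precaution.
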
control.sh

Description:

Job control options for the specific batch system: kill, suspend, and resume.

Runs as the user account defined in the Job Definition: either the user account that submitted the job or a user account specified in the Job Definition.

The script takes two arguments, the job control action and the job ID:

  1. Job control action. Valid actions: KILL, SUSPEND, RESUME.
  2. ID of the job to control. A valid job ID is a positive integer.

Note: The job control script does not support controlling multiple jobs at the same time.

Input (command-line arguments):

$1: Job control action: KILL, SUSPEND, or RESUME (required)

$2: ID of the job to control (required)

Output:

On success: not required.

On error: error messages, if applicable.

Exit code:

Zero on success.

Non-zero on failure. Error messages must be printed to standard error.
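
The argument handling described above could be sketched as follows. The commands mybatch_kill, mybatch_suspend, and mybatch_resume are hypothetical stand-ins for the job control commands of your batch system, and the function name control_job is illustrative only.

```bash
#!/bin/bash
# Sketch of the control logic for a hypothetical batch system.
# mybatch_kill, mybatch_suspend and mybatch_resume are placeholders.
control_job() {
    local action=${1:-} job_id=${2:-}

    # A valid job ID is a positive integer.
    case "$job_id" in
        ''|*[!0-9]*)
            echo "control.sh: invalid job ID: '$job_id'" >&2
            return 1
            ;;
    esac
    if [ "$job_id" -eq 0 ]; then
        echo "control.sh: invalid job ID: '$job_id'" >&2
        return 1
    fi

    # Dispatch on the action; the status of the batch command is returned.
    case "$action" in
        KILL)    mybatch_kill "$job_id" ;;
        SUSPEND) mybatch_suspend "$job_id" ;;
        RESUME)  mybatch_resume "$job_id" ;;
        *)
            echo "control.sh: unknown action: '$action'" >&2
            return 1
            ;;
    esac
}
```

In a complete script, the last line would be control_job "$1" "$2", so that a failure of the underlying command is passed back as a non-zero exit code.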

query.sh

Description:

Job query script executed on the Other Batch System to retrieve the status of jobs.

Runs as the Other Batch System administrator account to query all jobs submitted to the Other Batch System.

The query.sh script is not required to print the status of every job specified in the command-line argument. If the status of a requested job is not printed, jfd assumes that the status of that job is unchanged.

For example, if query.sh has an input of job IDs 1, 2, and 3, and it only prints the status records for job 1 and job 2, jfd assumes that the status of job 3 has not changed.
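
One way to take advantage of this behavior is to print nothing for a job whose status cannot be determined. The sketch below assumes a hypothetical command, mybatch_state, that prints PENDING, RUNNING, or SUSPENDED for a given job ID and fails for a job that it does not know. Finished jobs, which need additional fields in their records, are left out to keep the sketch short. The function name query_jobs is illustrative only.

```bash
#!/bin/bash
# Sketch of a query loop for a hypothetical batch system.
# mybatch_state is a placeholder: it is assumed to print the state of one
# job and to return non-zero when the job is unknown.
query_jobs() {
    local job_id state
    # $* is left unquoted on purpose so that the IDs are split on spaces,
    # whether they arrive as one string or as separate arguments.
    for job_id in $*; do
        if ! state=$(mybatch_state "$job_id" 2>/dev/null); then
            # Print no record: the status of this job is then treated
            # as unchanged.
            continue
        fi
        printf 'BEGIN\nJOB_ID=%s\nJOB_STATE=%s\nEND\n' "$job_id" "$state"
    done
}
```

With this design, a temporary failure to look up one job does not make the whole query fail.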

Input (command-line arguments):

$1: Space-separated job IDs (required)

Output:

On success: a list of job status records.

On error: error messages, if applicable.

Exit code:

Zero on success.

Non-zero on failure. Error messages must be printed to standard error.

Job status records must be printed in the following format:

  • Each job status record must start with BEGIN, and end with END, with name-value pairs in between.

  • Only JOB_ID and JOB_STATE are required in every record; JOB_EXIT_STATUS is also required when JOB_STATE is DONE or EXIT. Other names can be omitted if they have no value.

  • jfd parses the output of query.sh record by record. If any name-value pair is not valid, it will be ignored. If any record does not contain the required name-value pairs, it will be ignored.

Format:

BEGIN
JOB_ID=
JOB_STATE=
JOB_EXIT_STATUS=
CPU_TIME=
DETAIL=
EXEC_HOST=
END
...

Valid names are:

  • JOB_ID: Required. A valid job ID is a positive integer.

  • JOB_STATE: Required. A valid job state is one of the following: PENDING, RUNNING, SUSPENDED, DONE, or EXIT.

  • JOB_EXIT_STATUS: A valid job exit status is an integer from 0 to 255. Required when JOB_STATE is DONE or EXIT.

  • CPU_TIME: A valid CPU time is a non-negative floating-point number. Only read and checked by jfd when JOB_STATE is DONE or EXIT.

  • DETAIL: Optional. String with additional information to display in the job runtime attributes. You can use the DETAIL string to provide additional information about a job status that is specific to a batch system. For example, a job in the Other Batch System might be in a PENDING state because it is on hold. In this case, the DETAIL string can say “On Hold”.

  • EXEC_HOST: Host on which the job is running.
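
A small helper can keep the record format in one place inside query.sh. The following sketch prints one record from positional arguments and leaves out any name that has no value. The function name print_status_record and the order of its arguments are assumptions made for this example.

```bash
#!/bin/bash
# Sketch: print one job status record.
# Arguments: job ID, job state, then optionally exit status, CPU time,
# detail string and execution host. Empty values are not printed.
print_status_record() {
    local job_id=$1 state=$2
    local exit_status=${3:-} cpu_time=${4:-} detail=${5:-} exec_host=${6:-}

    echo "BEGIN"
    echo "JOB_ID=$job_id"
    echo "JOB_STATE=$state"
    if [ -n "$exit_status" ]; then echo "JOB_EXIT_STATUS=$exit_status"; fi
    if [ -n "$cpu_time" ];    then echo "CPU_TIME=$cpu_time"; fi
    if [ -n "$detail" ];      then echo "DETAIL=$detail"; fi
    if [ -n "$exec_host" ];   then echo "EXEC_HOST=$exec_host"; fi
    echo "END"
}
```

For example, print_status_record 174 SUSPENDED prints a record that contains only JOB_ID and JOB_STATE, and print_status_record 171 DONE 0 0.073 "" hostA prints a full record without a DETAIL line.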

Example output of a query:

username@tt-jj-194: /tmp/query.sh 174 172 171 176
BEGIN
JOB_ID=171
JOB_STATE=DONE
JOB_EXIT_STATUS=0
CPU_TIME=0.073
DETAIL=
EXEC_HOST=hostA
END
BEGIN
JOB_ID=172
JOB_STATE=EXIT
JOB_EXIT_STATUS=137
CPU_TIME=0.088
DETAIL=100 : after job
EXEC_HOST=hostB
END
BEGIN
JOB_ID=174
JOB_STATE=SUSPENDED
END
BEGIN
JOB_ID=176
JOB_STATE=PENDING
END
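
When you develop your own query.sh, it can be useful to check its output against the rules for required names before you connect it to jfd. The following sketch approximates those rules. It is not the parser that jfd uses, it does not check CPU_TIME, DETAIL, or EXEC_HOST, and the function names record_ok and list_valid_records are illustrative only.

```bash
#!/bin/bash
# Sketch: read job status records on standard input and print
# "<job ID> <state>" for each record that has the required name-value pairs.

record_ok() {
    local id=$1 state=$2 status=$3
    # JOB_ID must be a positive integer.
    case "$id" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$id" -gt 0 ] || return 1
    case "$state" in
        PENDING|RUNNING|SUSPENDED) return 0 ;;
        DONE|EXIT)
            # JOB_EXIT_STATUS is required here and must be 0 to 255.
            [[ $status =~ ^[0-9]{1,3}$ ]] && [ "$((10#$status))" -le 255 ]
            ;;
        *) return 1 ;;
    esac
}

list_valid_records() {
    local line in_record=0 id="" state="" status=""
    while IFS= read -r line; do
        case "$line" in
            BEGIN) in_record=1; id=""; state=""; status="" ;;
            JOB_ID=*) id=${line#JOB_ID=} ;;
            JOB_STATE=*) state=${line#JOB_STATE=} ;;
            JOB_EXIT_STATUS=*) status=${line#JOB_EXIT_STATUS=} ;;
            END)
                if [ "$in_record" -eq 1 ] && record_ok "$id" "$state" "$status"; then
                    echo "$id $state"
                fi
                in_record=0
                ;;
        esac
    done
}
```

For example, piping the output of query.sh into list_valid_records prints one line for each record that passes these checks, so a record that is missing from the result points to a problem in that record.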