4. Customize job submission, control, and query scripts
There are three files to map job submission, control, and query commands and options to the Other Batch System: submit.sh, control.sh, and query.sh. jfd modifies these scripts at runtime to set environment variables, then copies them over to the temporary directory on the Other Batch System host, where they are executed.
If you are using IBM® LoadLeveler® or Open Grid Scheduler/Grid Engine, ready-to-use scripts are provided in the $JS_TOP/$JS_VERSION/examples/conf/other_batch directory.
If you are using a different batch system than IBM LoadLeveler or Open Grid Scheduler/Grid Engine, you will need to modify these files and map job submission, control, and query commands for your specific batch system.
The scripts must be bash shell scripts.
submit.sh
Description | Input | Output | Exit Code |
---|---|---|---|
Job submission command and options for the specific batch system. The script is executed on the Other Batch System host to submit jobs. If you require submission options in addition to the ones provided by default, list them in the configuration file submit.conf. Runs as the user account defined in the Job Definition: either the user who submitted the job or the user account specified in the Job Definition. Each submission option in the script has an associated environment variable that is passed to the submission script. Default job submission options are exposed in the Job Definition. Note that the user must provide values for required options. | Environment variables, plus any additional custom environment variables specified in submit.conf, if used. | On success: Job ID. On error: Error messages, if applicable. | Zero on success. Non-zero on failure. Error messages must be printed to standard error. |
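The output contract in the table above (job ID on standard output, error messages on standard error, non-zero exit code on failure) can be sketched as follows. This is a minimal illustration, not the shipped submit.sh: the function name extract_job_id and the sample message format are assumptions, and the text that your batch system prints on submission will differ.

```shell
#!/bin/bash
# Sketch of the submit.sh output contract. extract_job_id is a hypothetical
# helper; the submission message passed to it is an assumed format.

extract_job_id() {
    # Use the first run of digits in the submission message as the job ID.
    if [[ "$1" =~ ([0-9]+) ]]; then
        echo "${BASH_REMATCH[1]}"      # job ID goes to standard output
    else
        echo "could not find a job ID in: $1" >&2   # errors go to standard error
        return 1                       # non-zero signals failure
    fi
}
```

In a real script, the argument would be the captured output of the batch system's submit command, and the script would exit with the function's return status.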
submit.conf
Customized job submission options that are displayed in Flow Editor in the Job Definition.
The options listed in this file are displayed in addition to the default job submission options.
Note that after defining the environment variables in submit.conf, you must also modify submit.sh to map each new environment variable to an actual submission option of your batch system.
Only labels and text fields (text strings) are supported as customized submission options. No type checking is enforced for the input.
Each line must contain three fields: Label, Environment Variable, and Required (1 indicates required, 0 indicates optional). Lines that start with # are ignored.
When a user is creating a Job Definition, Flow Editor checks the value of the required fields. If any required field is empty, the Job Definition is not complete and the user will not be able to submit the flow. During job submission, jfd checks the value of the required fields. If a required field value is empty, job submission fails.
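The mapping from submit.conf environment variables to submission options in submit.sh might look like the sketch below. The variable names are taken from the example submit.conf file in this section; the function name build_submit_args and the -q, -m, and -R flags are hypothetical placeholders, so substitute the options of your own batch system.

```shell
#!/bin/bash
# Sketch only: collect submission options from the environment variables
# defined in submit.conf. The flags -q, -m, and -R are assumed placeholders.

build_submit_args() {
    SUBMIT_ARGS=()
    # Optional fields (Required = 0): add the option only when a value is set.
    if [ -n "${JS_EE_SUBMISSION_QUEUE_NAME:-}" ]; then
        SUBMIT_ARGS+=("-q" "$JS_EE_SUBMISSION_QUEUE_NAME")
    fi
    if [ -n "${JS_EE_SUBMISSION_HOST_NAME:-}" ]; then
        SUBMIT_ARGS+=("-m" "$JS_EE_SUBMISSION_HOST_NAME")
    fi
    if [ -n "${JS_EE_SUBMISSION_RES_REQ:-}" ]; then
        SUBMIT_ARGS+=("-R" "$JS_EE_SUBMISSION_RES_REQ")
    fi
}
```

Keeping the options in an array preserves values that contain spaces when the array is later expanded as "${SUBMIT_ARGS[@]}" on the submit command line.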
Example submit.conf file:
#Label #Environment Variable #Required
"Submit to queue" JS_EE_SUBMISSION_QUEUE_NAME 0
"Run on host" JS_EE_SUBMISSION_HOST_NAME 0
"Resource requirement" JS_EE_SUBMISSION_RES_REQ 0
control.sh
Description | Input | Output | Exit Code |
---|---|---|---|
Job control options for the specific batch system: kill, suspend, and resume. Runs as the user account defined in the Job Definition: either the user who submitted the job or the user account specified in the Job Definition. The script takes two arguments: the job control action and the job ID. Note: The job control script does not support controlling multiple jobs at the same time. | Command-line arguments: $1: Job control action: KILL, SUSPEND, or RESUME (required). $2: ID of the job to control (required). | On success: Not required. On error: Error messages, if applicable. | Zero on success. Non-zero on failure. Error messages must be printed to standard error. |
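The argument handling described for control.sh can be sketched with a case statement. This is an illustration under assumptions: the function name control_command and the command names mybatch_kill, mybatch_hold, and mybatch_release are hypothetical stand-ins for the commands of your batch system, and the sketch prints the command instead of running it.

```shell
#!/bin/bash
# Sketch of control.sh argument handling. The mybatch_* command names are
# hypothetical; this function only prints the command it would run.

control_command() {
    local action="$1" job_id="$2"
    # Both arguments are required.
    if [ -z "$action" ] || [ -z "$job_id" ]; then
        echo "usage: control.sh KILL|SUSPEND|RESUME job_id" >&2
        return 1
    fi
    case "$action" in
        KILL)    echo "mybatch_kill $job_id" ;;
        SUSPEND) echo "mybatch_hold $job_id" ;;
        RESUME)  echo "mybatch_release $job_id" ;;
        *)
            # Unknown actions are reported on standard error.
            echo "unknown job control action: $action" >&2
            return 1
            ;;
    esac
}
```

A real control.sh would run the mapped command rather than print it, and exit with that command's status so that jfd sees a non-zero exit code on failure.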
query.sh
File | Description | Input | Output | Exit Code |
---|---|---|---|---|
query.sh | Job query script executed on the Other Batch System to retrieve the status of jobs. Runs as the Other Batch System administrator account to query all jobs submitted to the Other Batch System. The query.sh script is not required to print the status of every job specified in the command-line argument. If the status of a requested job is not printed, jfd assumes that the status of that job is unchanged. For example, if query.sh receives job IDs 1, 2, and 3 and prints status records only for job 1 and job 2, jfd assumes that the status of job 3 has not changed. | Command-line arguments: $1: Space-separated job IDs (required). | On success: A list of job status records. On error: Error messages, if applicable. | Zero on success. Non-zero on failure. Error messages must be printed to standard error. |
Job status records must be printed in the following format:
Each job status record must start with BEGIN and end with END, with name-value pairs in between. Only JOB_ID and JOB_STATE are required. Other names, if no values exist, do not need to be listed.
jfd parses the output of query.sh record by record. If any name-value pair is not valid, it will be ignored. If any record does not contain the required name-value pairs, it will be ignored.
Format:
BEGIN
JOB_ID=
JOB_STATE=
JOB_EXIT_STATUS=
CPU_TIME=
DETAIL=
EXEC_HOST=
END
...
Valid names are:
JOB_ID: Required. A valid job ID is a positive integer.
JOB_STATE: Required. A valid job state is one of the following: PENDING, RUNNING, SUSPENDED, DONE and EXIT.
JOB_EXIT_STATUS: A valid job exit status is an integer from 0 to 255. Required when JOB_STATE is DONE or EXIT.
CPU_TIME: A valid CPU time is a non-negative floating-point number. Only read and checked by jfd when JOB_STATE is DONE or EXIT.
DETAIL: Optional. String with additional information to display in the job runtime attributes. You can use the DETAIL string to provide additional information about a job status that is specific to a batch system. For example, a job in the Other Batch System might be in a PENDING state because it is on hold. In this case, the DETAIL string can say “On Hold”.
EXEC_HOST: Host on which the job is running.
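A query.sh script could print records in this format with a small helper such as the one sketched below. The function names map_state and print_record are hypothetical, and so are the native state names on the left side of the case statement; replace them with the states that your batch system reports.

```shell
#!/bin/bash
# Sketch: print one job status record in the BEGIN/END format.
# The native state names (queued, held, ...) are assumed examples.

map_state() {
    case "$1" in
        queued|held) echo "PENDING" ;;
        running)     echo "RUNNING" ;;
        stopped)     echo "SUSPENDED" ;;
        completed)   echo "DONE" ;;
        failed)      echo "EXIT" ;;
        *)           return 1 ;;
    esac
}

# print_record JOB_ID NATIVE_STATE [EXIT_STATUS] [CPU_TIME] [DETAIL] [EXEC_HOST]
print_record() {
    local state
    # Print nothing when the state cannot be mapped; a job without a
    # record is treated by jfd as unchanged.
    state=$(map_state "$2") || return 0
    echo "BEGIN"
    echo "JOB_ID=$1"
    echo "JOB_STATE=$state"
    # Names without a value are omitted from the record.
    if [ -n "$3" ]; then echo "JOB_EXIT_STATUS=$3"; fi
    if [ -n "$4" ]; then echo "CPU_TIME=$4"; fi
    if [ -n "$5" ]; then echo "DETAIL=$5"; fi
    if [ -n "$6" ]; then echo "EXEC_HOST=$6"; fi
    echo "END"
}
```

For example, print_record 176 held "" "" "On Hold" prints a PENDING record for job 176 with a DETAIL line and no exit status, CPU time, or execution host.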
Example output of a query:
username@tt-jj-194: /tmp/query.sh 174 172 171 176
BEGIN
JOB_ID=171
JOB_STATE=DONE
JOB_EXIT_STATUS=0
CPU_TIME=0.073
DETAIL=
EXEC_HOST=hostA
END
BEGIN
JOB_ID=172
JOB_STATE=EXIT
JOB_EXIT_STATUS=137
CPU_TIME=0.088
DETAIL=100 : after job
EXEC_HOST=hostB
END
BEGIN
JOB_ID=174
JOB_STATE=SUSPENDED
END
BEGIN
JOB_ID=176
JOB_STATE=PENDING
END