About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Troubleshooting
Problem
How to fix the issue about null value in column "end_time" in postgresql database log?
Symptom
The detail log is as below:
2018-01-26 00:00:00 CET xcatdb xcatadm ERROR: null value in column "end_time" violates not-null constraint
2018-01-26 00:00:00 CET xcatdb xcatadm STATEMENT: INSERT INTO LSF_FINISH_JOB (JOB_ID, JOB_ARRAY_INDEX, JOB_NAME, USER_NAME, JOB_NAME_FULL, QUEUE, SUBMIT_TIME, START_TIME, END_TIME, EXEC_HOSTSTR, PRIORITY, FROM_HOST, COMMAND, IN_FILE, OUT_FILE, PROJECT_NAME, CWD, EXECCWD, ASKHOSTSTR, APPLICATION, DESCRIPTION, CLUSTER_NAME, SUBMIT_TIME_GMT, START_TIME_GMT, USER_GROUP, JOB_GROUP, RES_REQ, EXIT_STATUS, FROM_HOST_CLUSTER, PEND_REASONS, JOBPID, LICENSEPROJECT, RESERVETIME, ESTIMATEDRUNTIME, EXECHOME, EXECUSER, MEM, SWAP, NTHREADS, PGIDSTR, PIDSTR, JOB_STATUS, NUM_PROCESSORS, RU_UTIME, RU_STIME, CPU_TIME, NUM_SLOTS, NUM_EXEC_PROCS, CPU_UTIL, WORKLOAD_TYPE, DURATION, PATH, CREATE_TIME_GMT, GRAPHIC_SESSION_ID, GRAPHIC_HOST, APPLICATION_NAME, PATH_ALIAS, NOTIFICATION, BPOST_MSG, RUNTIME, FLOW_ID, END_TIME_GMT, PATH_REMOVABLE, INSERT_SEQ) SELECT JOB_ID, JOB_ARRAY_INDEX, JOB_NAME, USER_NAME, JOB_NAME_FULL, QUEUE, SUBMIT_TIME, START_TIME, END_TIME, EXEC_HOSTSTR, PRIORITY, FROM_HOST, COMMAND, IN_FILE, OUT_FILE, PROJECT_NAME, CWD, EXECCWD, ASKHOSTSTR, APPLICATION, DESCRIPTION, CLUSTER_NAME, SUBMIT_TIME_GMT, START_TIME_GMT, USER_GROUP, JOB_GROUP, RES_REQ, EXIT_STATUS, FROM_HOST_CLUSTER, PEND_REASONS, JOBPID, LICENSEPROJECT, RESERVETIME, ESTIMATEDRUNTIME, EXECHOME, EXECUSER, MEM, SWAP, NTHREADS, PGIDSTR, PIDSTR, JOB_STATUS, NUM_PROCESSORS, RU_UTIME, RU_STIME, CPU_TIME, NUM_SLOTS, NUM_EXEC_PROCS, CPU_UTIL, WORKLOAD_TYPE, DURATION, PATH, CREATE_TIME_GMT, GRAPHIC_SESSION_ID, GRAPHIC_HOST, APPLICATION_NAME, PATH_ALIAS, NOTIFICATION, BPOST_MSG, RUNTIME, FLOW_ID, END_TIME_GMT, PATH_REMOVABLE, INSERT_SEQ FROM LSF_LIVE_JOB a WHERE a.JOB_ID = $1 AND a.JOB_ARRAY_INDEX = $2 AND a.CLUSTER_NAME = $3
Cause
When this issue happens, there are corresponding errors in commondataloader log as below:
2018-01-26 00:00:00.878 CET WARN com.platform.perf.dataloader.lsf.commonjobloader.RecoverJobs.recoverJobs(RecoverJobs.java:112) RecoverJobs: Exception when moving jobs. org.postgresql.util.PSQLException: ERROR: null value in column "end_time" violates not-null constraint
This issue belongs to the FinishedJobThread of commonjobloader.
Diagnosing The Problem
Because there are some dangling records with null end_time in LSF_LIVE_JOB table, to resolve this issue, the dangling records need be found out and be deleted.
Resolving The Problem
According to the below steps, the issue can be resolved:
1. Open the debug log, you can see the detail job ID information for the dangling records (Dangling job ID are 159017 and 159018).
2018-01-26 00:00:00.876 CET DEBUG com.platform.perf.dataloader.lsf.commonjobloader.RecoverJobs.recoverJobs(RecoverJobs.java:89) Dangling job found: 159017:0
2018-01-26 00:00:00.876 CET DEBUG com.platform.perf.dataloader.lsf.commonjobloader.RecoverJobs.recoverJobs(RecoverJobs.java:89) Dangling job found: 159018:0
2. Enter the database admin command, execute this command: #psql -U xcatadm xcatdb
3. Run the below command to find out the dangling records,
xcatdb=> select job_id,job_name,command,submit_time,start_time,end_time from LSF_LIVE_JOB where JOB_ID=159017;
job_id | job_name | command | submit_time | start_time | end_time
--------+----------+----------+---------------------+---------------------+---------------------
159017 | Wal_opti |#!/bin/bash;... | 2017-06-16 09:25:31 | 2017-06-16 09:25:32 |
xcatdb=> select job_id,job_name,command,submit_time,start_time,end_time from LSF_LIVE_JOB where JOB_ID=159018;
job_id | job_name | command | submit_time | start_time | end_time
--------+----------+----------+---------------------+---------------------+---------------------
159018 | Wal_opti |#!/bin/bash;.. | 2017-06-16 09:25:31 | 2017-06-16 09:25:32 |
4. As you see, <1> The commonjobloader debug log show this two records can't be moved to finished table; <2> According to the database checking, you can find that the submit_time and start_time of two records are very old from now. So you can make a conclusion that the two records are dangling records which need be removed from database. After the below database action, the issue is fixed:
xcatdb=> DELETE FROM LSF_LIVE_JOB WHERE JOB_ID = 159017;
xcatdb=> DELETE FROM LSF_LIVE_JOB WHERE JOB_ID = 159018;
[{"Product":{"code":"SSENRW","label":"Platform HPC for System x"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Database-Postgres","Platform":[{"code":"PF016","label":"Linux"}],"Version":"4.2","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]
Was this topic helpful?
Document Information
More support for:
Platform HPC for System x
Software version:
4.2
Operating system(s):
Linux
Document number:
634031
Modified date:
17 June 2018
UID
isg3T1026936