Yesterday one of my colleagues pinged me about a problem she was facing with DB2. As she described it, the application was trying to reset the database (which I interpreted as clearing out data, so a lot of deletes and updates), and during the operation she was getting SQLCODE -964, which means the transaction log is full. According to her, she had run this operation many times before and had never hit this problem. So what is new now? What changed so that more log records than usual are being generated? The db2diag.log entry pointed directly at the same cause: "transaction log full".
I suggested that she enlarge the transaction log by increasing LOGFILSIZ. Even with a size 10 times the normal setting, the problem persisted. We also tried increasing the number of primary log files (LOGPRIMARY), with the same result. This made me suspect that something was wrong with the application. As she did not have the code for the application, I suggested that she try the infinite logging option (LOGSECOND set to -1). Unfortunately, that failed too, as it started giving a "disk full" error. There is surely something wrong with the application. When I looked at db2diag.log, these were the entries:
2008-07-22-02.17.25.857213+360 I4618169C459 LEVEL: Error
PID : 24342 TID : 1 PROC : db2logmgr (TRADEDB) 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3160
MESSAGE : Failed to archive log file S0000000.LOG to /jdk/db2inst1/TRADEDB/NODE0000/C0000000/ from /home/db2inst1/db2inst1/NODE0000/SQL00004/SQLOGDIR/ with rc = -2029060079.

2008-07-22-02.17.25.891161+360 I4618629C323 LEVEL: Error
PID : 24342 TID : 1 PROC : db2logmgr (TRADEDB) 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchivePendingLogs, probe:1500
MESSAGE : Log archive failed with rc -2029060079 for LOGARCHMETH1.

2008-07-22-02.17.46.893751+360 I4618953C377 LEVEL: Error
PID : 24342 TID : 1 PROC : db2logmgr (TRADEDB) 0
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogDisk, probe:2620
RETCODE : ZRC=0x870F0011=-2029060079=SQLO_PATH "an invalid path"
          DIA8514C An invalid file path, "", was specified.
These entries keep repeating and keep increasing the size of the log; once the disk is full, DB2 produces a dump and shuts down the database. The path set in LOGARCHMETH1 is a valid path and is accessible by the user, so I am not sure which path is being reported as invalid here.
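To put the resizing attempts and the "disk full" outcome in perspective, here is a rough sketch of how much disk the configured active logs can claim. It assumes the commonly documented estimate of (LOGPRIMARY + LOGSECOND) * (LOGFILSIZ + 2) * 4096 bytes, with LOGFILSIZ counted in 4 KB pages plus 2 header pages per file. The function name and the sample values are illustrative only; they are not the settings from my colleague's system.

```python
PAGE_BYTES = 4096    # LOGFILSIZ is expressed in 4 KB pages
HEADER_PAGES = 2     # assumed per-file overhead from the documented estimate


def active_log_bytes(logfilsiz, logprimary, logsecond):
    """Estimate the disk space the active log files can occupy.

    Returns None when logsecond is -1 (infinite logging), because the
    configuration then puts no upper bound on log space.
    """
    if logsecond == -1:
        return None
    return (logprimary + logsecond) * (logfilsiz + HEADER_PAGES) * PAGE_BYTES


if __name__ == "__main__":
    # Hypothetical settings: 1000-page files, 3 primary, 2 secondary logs.
    print(active_log_bytes(1000, 3, 2))    # baseline
    print(active_log_bytes(10000, 3, 2))   # LOGFILSIZ raised tenfold
    print(active_log_bytes(1000, 3, -1))   # infinite logging: unbounded
```

With infinite logging the estimate has no ceiling, so if log files cannot be archived out of the active log directory, running out of disk is the expected end state rather than a separate fault.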
Anyway, reinstalling the application (which also creates a new database) solved her problem, but that is not always possible, especially on a live system. Is there any tool that can recognize a repeating pattern in the log file (it might be caused by a recursive pattern in the application code) and take some corrective action, instead of just filling up the log until the disk is full and giving the impression that the system has hung? Your suggestions are welcome.
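Until someone points me to a real tool, a minimal sketch of the detection half might look like the following. It is not an existing DB2 utility: the function name `repeated_signatures` and the `threshold` parameter are my own inventions, and it only counts how often the same FUNCTION/probe pair shows up in the text of a db2diag.log. Any corrective action would have to be layered on top by whatever script calls it.

```python
import re
from collections import Counter

# Matches e.g. "FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3160"
SIGNATURE = re.compile(r"FUNCTION:\s*(.+?),\s*probe:(\d+)")


def repeated_signatures(log_text, threshold=3):
    """Return {(function, probe): count} for signatures seen >= threshold times."""
    counts = Counter(
        (match.group(1).strip(), int(match.group(2)))
        for match in SIGNATURE.finditer(log_text)
    )
    return {sig: n for sig, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    # Tiny made-up sample; in practice you would read the real db2diag.log.
    sample = "\n".join(
        ["FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3160"] * 4
        + ["FUNCTION: DB2 UDB, data protection, sqlpgArchivePendingLogs, probe:1500"]
    )
    for (function, probe), count in repeated_signatures(sample).items():
        print(f"{count}x {function} (probe {probe})")
```

Run periodically against the file that DIAGPATH points to, something like this could raise an alert as soon as one error signature starts looping, well before the disk fills up.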