Topic
  • 28 replies
  • Latest Post - ‏2008-12-24T11:27:28Z by SystemAdmin
SystemAdmin
SystemAdmin
5837 Posts

Pinned topic DB2 Crashing

‏2006-06-06T17:28:17Z |
Hi,

I am running DB2 on windows XP. My applications loads lots of data, there are concurrent inserts to the database. It ran fine for 7-8 hours but then crashed. I am using ODBC driver to connect to DB2 and using PHP to make odbc calls.

Following are the errors which i got.

IBMCLI Driver SQL1032N No start database manager command was issued. SQLSTATE=57019

As per my guess, this suggests that database must have been crashed. Because it ran fine for 7 - 8 hrs. In the event log (event viewer) there were many warnings related to locks. I have default settings for isolation levels i.e. cursor stability. But nothing related to crash.

I had my one CLP window open, using that window, i can still issue database command i.e. select etc.

When i do net start i get the following db2 services

DB2 - DB2
DB2 JDBC Applet Server
DB2 Security Server
DB2DAS - DB2DAS00

When i issued "db2 get dbm cfg" command i got the following

/*
Database Manager Configuration

Node type = Database Server with local and remote clients

Database manager configuration release level = 0x0a00

Maximum total of files open (MAXTOTFILOP) = 16000
CPU speed (millisec/instruction) (CPUSPEED) = 2.991513e-007

Max number of concurrently active databases (NUMDB) = 8
Data Links support (DATALINKS) = NO
Federated Database System Support (FEDERATED) = NO
Transaction processor monitor name (TP_MON_NAME) =

Default charge-back account (DFT_ACCOUNT_STR) =

Java Development Kit installation path (JDK_PATH) = C:\Program Files\IBM\SQLLIB\java

Diagnostic error capture level (DIAGLEVEL) = 3
Notify Level (NOTIFYLEVEL) = 3
Diagnostic data directory path (DIAGPATH) =

Default database monitor switches
Buffer pool (DFT_MON_BUFPOOL) = OFF
Lock (DFT_MON_LOCK) = OFF
Sort (DFT_MON_SORT) = OFF
Statement (DFT_MON_STMT) = OFF
Table (DFT_MON_TABLE) = OFF
Timestamp (DFT_MON_TIMESTAMP) = ON
Unit of work (DFT_MON_UOW) = OFF
Monitor health of instance and databases (HEALTH_MON) = ON

SYSADM group name (SYSADM_GROUP) =
SYSCTRL group name (SYSCTRL_GROUP) =
SYSMAINT group name (SYSMAINT_GROUP) =
SYSMON group name (SYSMON_GROUP) =

Client Userid-Password Plugin (CLNT_PW_PLUGIN) =
Client Kerberos Plugin (CLNT_KRB_PLUGIN) = IBMkrb5
Group Plugin (GROUP_PLUGIN) =
GSS Plugin for Local Authorization (LOCAL_GSSPLUGIN) =
Server Plugin Mode (SRV_PLUGIN_MODE) = UNFENCED
Server List of GSS Plugins (SRVCON_GSSPLUGIN_LIST) =
Server Userid-Password Plugin (SRVCON_PW_PLUGIN) =
Server Connection Authentication (SRVCON_AUTH) = NOT_SPECIFIED
Database manager authentication (AUTHENTICATION) = SERVER
Cataloging allowed without authority (CATALOG_NOAUTH) = NO
Trust all clients (TRUST_ALLCLNTS) = YES
Trusted client authentication (TRUST_CLNTAUTH) = CLIENT
Bypass federated authentication (FED_NOAUTH) = NO

Default database path (DFTDBPATH) = C:

Database monitor heap size (4KB) (MON_HEAP_SZ) = 66
Java Virtual Machine heap size (4KB) (JAVA_HEAP_SZ) = 512
Audit buffer size (4KB) (AUDIT_BUF_SZ) = 0
Size of instance shared memory (4KB) (INSTANCE_MEMORY) = AUTOMATIC
Backup buffer default size (4KB) (BACKBUFSZ) = 1024
Restore buffer default size (4KB) (RESTBUFSZ) = 1024

Agent stack size (AGENT_STACK_SZ) = 16
Minimum committed private memory (4KB) (MIN_PRIV_MEM) = 32
Private memory threshold (4KB) (PRIV_MEM_THRESH) = 20000

Sort heap threshold (4KB) (SHEAPTHRES) = 10000

Directory cache support (DIR_CACHE) = YES

Application support layer heap size (4KB) (ASLHEAPSZ) = 15
Max requester I/O block size (bytes) (RQRIOBLK) = 32767
DOS requester I/O block size (bytes) (DOS_RQRIOBLK) = 4096
Query heap size (4KB) (QUERY_HEAP_SZ) = 1000

Workload impact by throttled utilities(UTIL_IMPACT_LIM) = 10

Priority of agents (AGENTPRI) = SYSTEM
Max number of existing agents (MAXAGENTS) = 200
Agent pool size (NUM_POOLAGENTS) = 100(calculated)
Initial number of agents in pool (NUM_INITAGENTS) = 0
Max number of coordinating agents (MAX_COORDAGENTS) = MAXAGENTS
Max no. of concurrent coordinating agents (MAXCAGENTS) = MAX_COORDAGENTS
Max number of client connections (MAX_CONNECTIONS) = MAX_COORDAGENTS

Keep fenced process (KEEPFENCED) = YES
Number of pooled fenced processes (FENCED_POOL) = MAX_COORDAGENTS
Initial number of fenced processes (NUM_INITFENCED) = 0

Index re-creation time and redo index build (INDEXREC) = RESTART

Transaction manager database name (TM_DATABASE) = 1ST_CONN
Transaction resync interval (sec) (RESYNC_INTERVAL) = 180

SPM name (SPM_NAME) = PDD_DIPE
SPM log size (SPM_LOG_FILE_SZ) = 256
SPM resync agent limit (SPM_MAX_RESYNC) = 20
SPM log path (SPM_LOG_PATH) =

NetBIOS Workstation name (NNAME) =

TCP/IP Service name (SVCENAME) = db2c_DB2
Discovery mode (DISCOVER) = SEARCH
Discover server instance (DISCOVER_INST) = ENABLE

Maximum query degree of parallelism (MAX_QUERYDEGREE) = ANY
Enable intra-partition parallelism (INTRA_PARALLEL) = NO

No. of int. communication buffers(4KB)(FCM_NUM_BUFFERS) = 1024
Number of FCM request blocks (FCM_NUM_RQB) = AUTOMATIC
Number of FCM connection entries (FCM_NUM_CONNECT) = AUTOMATIC
Number of FCM message anchors (FCM_NUM_ANCHORS) = AUTOMATIC

*/

In the file db2diag.log file there is information like this

/*

2006-06-06-02.18.01.140000-420 I39433348H327 LEVEL: Event
PID : 2112 TID : 1984 PROC : db2fmp.exe
INSTANCE: DB2 NODE : 000
FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:100
START : Automatic Runstats: evaluation has started on database TEST

2006-06-06-02.18.01.562000-420 I39433677H328 LEVEL: Event
PID : 2112 TID : 1984 PROC : db2fmp.exe
INSTANCE: DB2 NODE : 000
FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:900
STOP : Automatic Runstats: evaluation has finished on database TEST

*/

Am i missing something in my settings above. I have kept the default settings and started working on it.

From where can i get proper logs which can give me more information about db2 manager?

Any help to figure this out will be highly appreciated.

Thanks!
Updated on 2008-12-24T11:27:28Z at 2008-12-24T11:27:28Z by SystemAdmin
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T17:34:12Z  
    Further, when i have issued command db2Start i get the message saying

    SQL1026N The database manager is already active

    I guess, my database is in really bad state now.
  • ocgstyles
    ocgstyles
    472 Posts

    Re: DB2 Crashing

    ‏2006-06-06T17:54:56Z  
    Further, when i have issued command db2Start i get the message saying

    SQL1026N The database manager is already active

    I guess, my database is in really bad state now.
    If anything crashed, you should see it mentioned in the db2diag.log. The messages should contain text like:
    code
    2006-05-29-21.50.00.107569-240 I6250891G359 LEVEL: Warning
    PID : 16333 TID : 3023025856 PROC : db2agent (LNX)
    INSTANCE: db2admin NODE : 000 DB : SAMPLE
    APPHDL : 0-7 APPID: *LOCAL.db2admin.060530014959
    FUNCTION: DB2 UDB, base sys utilities, sqledint, probe:30
    MESSAGE : Crash Recovery is needed.
    [/code]

    Did you see any messages like that?

    Since you mentioned locks, how about we check some of the DBM stats. Run this command:
    codedb2 get snapshot for all on <dbname>[/code]

    On the first page you should see something like this:
    code
    Locks held currently = 0
    [b]Lock waits = 18[/b]
    Time database waited on locks (ms) = Not Collected
    Lock list memory in use (Bytes) = 40
    [b]Deadlocks detected = 0[/b]
    Lock escalations = 0
    Exclusive lock escalations = 0
    Agents currently waiting on locks = 0
    Lock Timeouts = 0
    Number of indoubt transactions = 0
    etc...
    [/code]

    Anything stand out in the sections I bolded?

    • Keith
  • ocgstyles
    ocgstyles
    472 Posts

    Re: DB2 Crashing

    ‏2006-06-06T18:06:01Z  
    • ocgstyles
    • ‏2006-06-06T17:54:56Z
    If anything crashed, you should see it mentioned in the db2diag.log. The messages should contain text like:
    code
    2006-05-29-21.50.00.107569-240 I6250891G359 LEVEL: Warning
    PID : 16333 TID : 3023025856 PROC : db2agent (LNX)
    INSTANCE: db2admin NODE : 000 DB : SAMPLE
    APPHDL : 0-7 APPID: *LOCAL.db2admin.060530014959
    FUNCTION: DB2 UDB, base sys utilities, sqledint, probe:30
    MESSAGE : Crash Recovery is needed.
    [/code]

    Did you see any messages like that?

    Since you mentioned locks, how about we check some of the DBM stats. Run this command:
    codedb2 get snapshot for all on <dbname>[/code]

    On the first page you should see something like this:
    code
    Locks held currently = 0
    [b]Lock waits = 18[/b]
    Time database waited on locks (ms) = Not Collected
    Lock list memory in use (Bytes) = 40
    [b]Deadlocks detected = 0[/b]
    Lock escalations = 0
    Exclusive lock escalations = 0
    Agents currently waiting on locks = 0
    Lock Timeouts = 0
    Number of indoubt transactions = 0
    etc...
    [/code]

    Anything stand out in the sections I bolded?

    • Keith
    Actually, since the DBM was restarted. I don't think you are going to have that detail on the locks...

    • Keith
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T18:41:45Z  
    • ocgstyles
    • ‏2006-06-06T18:06:01Z
    Actually, since the DBM was restarted. I don't think you are going to have that detail on the locks...

    • Keith
    Hi,

    Even after restarting i m still getting error in accessing it from PHP odbc API.

    I am able to connect to DB2 using DSN directly from ODBC Data Sources (through control panel).

    Do i have to restart my machine? Any other way to investigate why it is behaving like this.

    Thanks!
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T18:46:54Z  
    Hi,

    Even after restarting i m still getting error in accessing it from PHP odbc API.

    I am able to connect to DB2 using DSN directly from ODBC Data Sources (through control panel).

    Do i have to restart my machine? Any other way to investigate why it is behaving like this.

    Thanks!
    Hi,

    I tried twice to send output of snapshot, even though there was nothing much in it. It is getting deleted, am i crossing word limit on each post or was it too lengthy thats why deleted by moderator.

    Thanks!
  • ocgstyles
    ocgstyles
    472 Posts

    Re: DB2 Crashing

    ‏2006-06-06T19:00:31Z  
    Hi,

    I tried twice to send output of snapshot, even though there was nothing much in it. It is getting deleted, am i crossing word limit on each post or was it too lengthy thats why deleted by moderator.

    Thanks!
    If I recall correctly I had the same problem on AIX once, where DB2 didn't seem to know if it was started or not. Rebooting the box helped. I think its worth a shot...

    (Oddly enough I received your lengthy post via email since I was watching the topic and couldn't find it anywhere on here! )

    • Keith
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T19:08:49Z  
    Have you looked at the db2diag.log some more? The two entries you posted do not seem to be errors; they are merely informational (see the "LEVEL: Event" tag).

    I would take a closer look at the admin log and the db2diag log to figure out what sort of events were occurring when the problems occurred.
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T19:09:34Z  
    • ocgstyles
    • ‏2006-06-06T19:00:31Z
    If I recall correctly I had the same problem on AIX once, where DB2 didn't seem to know if it was started or not. Rebooting the box helped. I think its worth a shot...

    (Oddly enough I received your lengthy post via email since I was watching the topic and couldn't find it anywhere on here! )

    • Keith
    Hi,

    Thanks for replying and sovling mystery of lengthy email....

    Well when i restarted my php and apache it started working.....

    I don't know where exactly my db2 crashed and for what specific reasons did it crash.....

    Are there any other logs which i am not able to find....

    I was reading about autorestart option for database manager, and by default it is on so when my database must have crashed why didn't it restarted...i mean it restarted but i was not able to access it through PHP using ODBC driver....

    To replicate this error i guess, i have to again start my script which might product this kind of behavior at the end of the day.....

    Thanks!
  • ocgstyles
    ocgstyles
    472 Posts

    Re: DB2 Crashing

    ‏2006-06-06T19:40:35Z  
    Hi,

    Thanks for replying and sovling mystery of lengthy email....

    Well when i restarted my php and apache it started working.....

    I don't know where exactly my db2 crashed and for what specific reasons did it crash.....

    Are there any other logs which i am not able to find....

    I was reading about autorestart option for database manager, and by default it is on so when my database must have crashed why didn't it restarted...i mean it restarted but i was not able to access it through PHP using ODBC driver....

    To replicate this error i guess, i have to again start my script which might product this kind of behavior at the end of the day.....

    Thanks!
    well... glad that worked!

    i don't know if what i am going to say is possible or not, or makes sense, but i'll offer it anyway. maybe something happened with a connection (apache) to the database and wasn't cleaned up properly at instance level. if DB2 was incorrectly reporting it, maybe it would spit out weird messages like we saw. and maybe stopping apache cleaned up that connection, then when apache restarted and you used your PHP app, which may have been the first connection to the database, caused crash recovery...

    The logs you'll be looking at are db2diag.log and in Windows Event Viewer. Was anything about crash recovery added to the log after you noticed it was fixed? An entry should be there, even when AUTORESTART is enabled or disabled.

    • Keith
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T20:37:16Z  
    • ocgstyles
    • ‏2006-06-06T19:40:35Z
    well... glad that worked!

    i don't know if what i am going to say is possible or not, or makes sense, but i'll offer it anyway. maybe something happened with a connection (apache) to the database and wasn't cleaned up properly at instance level. if DB2 was incorrectly reporting it, maybe it would spit out weird messages like we saw. and maybe stopping apache cleaned up that connection, then when apache restarted and you used your PHP app, which may have been the first connection to the database, caused crash recovery...

    The logs you'll be looking at are db2diag.log and in Windows Event Viewer. Was anything about crash recovery added to the log after you noticed it was fixed? An entry should be there, even when AUTORESTART is enabled or disabled.

    • Keith
    I got the following messages in db2 log

    /**

    2006-06-06-11.43.01.250000-420 I39443697H323 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:10
    START : Automatic reorg evaluation has started on database TEST

    2006-06-06-11.43.02.046000-420 I39444022H337 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:20
    START : Automatic reorg evaluation has finished successfully on database TEST

    2006-06-06-11.43.02.656000-420 I39444361H328 LEVEL: Event
    PID : 2112 TID : 880 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Health Monitor, db2HmonEvalBackup, probe:260
    STOP : Automatic Backup: evaluation has finished successfully on database TEST

    2006-06-06-12.43.01.906000-420 I39444691H327 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:100
    START : Automatic Runstats: evaluation has started on database TEST

    2006-06-06-12.43.02.578000-420 I39445020H328 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:900
    STOP : Automatic Runstats: evaluation has finished on database TEST
    */

    I am not sure whether it recovered or not because there are not statement saying "crash recovery" etc.

    I will start my application again to see whether it repeats the behavior or not.

    By default autorestart is on right? How do i check this attribute?

    Thanks!
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T20:40:07Z  
    Have you looked at the db2diag.log some more? The two entries you posted do not seem to be errors; they are merely informational (see the "LEVEL: Event" tag).

    I would take a closer look at the admin log and the db2diag log to figure out what sort of events were occurring when the problems occurred.
    After i restarted apache and php, it started working and i got the following messages in my db2 log

    /**
    2006-06-06-11.43.01.250000-420 I39443697H323 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:10
    START : Automatic reorg evaluation has started on database TEST

    2006-06-06-11.43.02.046000-420 I39444022H337 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:20
    START : Automatic reorg evaluation has finished successfully on database TEST

    2006-06-06-11.43.02.656000-420 I39444361H328 LEVEL: Event
    PID : 2112 TID : 880 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Health Monitor, db2HmonEvalBackup, probe:260
    STOP : Automatic Backup: evaluation has finished successfully on database TEST

    2006-06-06-12.43.01.906000-420 I39444691H327 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:100
    START : Automatic Runstats: evaluation has started on database TEST

    2006-06-06-12.43.02.578000-420 I39445020H328 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:900
    STOP : Automatic Runstats: evaluation has finished on database TEST

    */

    I am not sure whether it actually recovered or not because there is no mention of recovery here. By admin log you mean event log? If you mean event log, there is nothing in event log.

    I am trying to replicate the problem my starting my test script which will upload data for n number of hours continuously.

    Does it have to do anything with backup processing which is goin on by default?

    Thanks!
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T20:42:46Z  
    I got the following messages in db2 log

    /**

    2006-06-06-11.43.01.250000-420 I39443697H323 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:10
    START : Automatic reorg evaluation has started on database TEST

    2006-06-06-11.43.02.046000-420 I39444022H337 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:20
    START : Automatic reorg evaluation has finished successfully on database TEST

    2006-06-06-11.43.02.656000-420 I39444361H328 LEVEL: Event
    PID : 2112 TID : 880 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Health Monitor, db2HmonEvalBackup, probe:260
    STOP : Automatic Backup: evaluation has finished successfully on database TEST

    2006-06-06-12.43.01.906000-420 I39444691H327 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:100
    START : Automatic Runstats: evaluation has started on database TEST

    2006-06-06-12.43.02.578000-420 I39445020H328 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:900
    STOP : Automatic Runstats: evaluation has finished on database TEST
    */

    I am not sure whether it recovered or not because there are not statement saying "crash recovery" etc.

    I will start my application again to see whether it repeats the behavior or not.

    By default autorestart is on right? How do i check this attribute?

    Thanks!
    Hmm I am sure that your db2diag.log would contain far more information than this. And these are just, as before, informational messages.
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T20:48:35Z  
    After i restarted apache and php, it started working and i got the following messages in my db2 log

    /**
    2006-06-06-11.43.01.250000-420 I39443697H323 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:10
    START : Automatic reorg evaluation has started on database TEST

    2006-06-06-11.43.02.046000-420 I39444022H337 LEVEL: Event
    PID : 2112 TID : 2368 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:20
    START : Automatic reorg evaluation has finished successfully on database TEST

    2006-06-06-11.43.02.656000-420 I39444361H328 LEVEL: Event
    PID : 2112 TID : 880 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Health Monitor, db2HmonEvalBackup, probe:260
    STOP : Automatic Backup: evaluation has finished successfully on database TEST

    2006-06-06-12.43.01.906000-420 I39444691H327 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:100
    START : Automatic Runstats: evaluation has started on database TEST

    2006-06-06-12.43.02.578000-420 I39445020H328 LEVEL: Event
    PID : 2112 TID : 4868 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalStats, probe:900
    STOP : Automatic Runstats: evaluation has finished on database TEST

    */

    I am not sure whether it actually recovered or not because there is no mention of recovery here. By admin log you mean event log? If you mean event log, there is nothing in event log.

    I am trying to replicate the problem my starting my test script which will upload data for n number of hours continuously.

    Does it have to do anything with backup processing which is goin on by default?

    Thanks!
    > After i restarted apache and php, it started working
    > and i got the following messages in my db2 log
    >
    > /**
    > 2006-06-06-11.43.01.250000-420 I39443697H323
    > LEVEL: Event

    snip

    > 2006-06-06-11.43.02.046000-420 I39444022H337
    > LEVEL: Event

    snip

    > 2006-06-06-11.43.02.656000-420 I39444361H328
    > LEVEL: Event

    snip

    > 2006-06-06-12.43.01.906000-420 I39444691H327
    > LEVEL: Event

    snip

    > 2006-06-06-12.43.02.578000-420 I39445020H328
    > LEVEL: Event

    snip

    > */

    Note that all of the messages are the same. None of these indicate any error whatsoever.
    > I am not sure whether it actually recovered or not
    > because there is no mention of recovery here. By
    > admin log you mean event log? If you mean event log,
    > there is nothing in event log.
    >
    > I am trying to replicate the problem my starting my
    > test script which will upload data for n number of
    > hours continuously.
    >
    In another message, you asked about autorestart. It's a DBM config. You can find information here:

    http://publib.boulder.ibm.com/infocenter/db2luw/v8//topic/com.ibm.db2.udb.doc/admin/r0000257.htm
    > Does it have to do anything with backup processing
    > which is goin on by default?

    Well, without the db2diag.log and the admin log, it's hard to say.
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-06T20:59:07Z  
    > After i restarted apache and php, it started working
    > and i got the following messages in my db2 log
    >
    > /**
    > 2006-06-06-11.43.01.250000-420 I39443697H323
    > LEVEL: Event

    snip

    > 2006-06-06-11.43.02.046000-420 I39444022H337
    > LEVEL: Event

    snip

    > 2006-06-06-11.43.02.656000-420 I39444361H328
    > LEVEL: Event

    snip

    > 2006-06-06-12.43.01.906000-420 I39444691H327
    > LEVEL: Event

    snip

    > 2006-06-06-12.43.02.578000-420 I39445020H328
    > LEVEL: Event

    snip

    > */

    Note that all of the messages are the same. None of these indicate any error whatsoever.
    > I am not sure whether it actually recovered or not
    > because there is no mention of recovery here. By
    > admin log you mean event log? If you mean event log,
    > there is nothing in event log.
    >
    > I am trying to replicate the problem my starting my
    > test script which will upload data for n number of
    > hours continuously.
    >
    In another message, you asked about autorestart. It's a DBM config. You can find information here:

    http://publib.boulder.ibm.com/infocenter/db2luw/v8//topic/com.ibm.db2.udb.doc/admin/r0000257.htm
    > Does it have to do anything with backup processing
    > which is goin on by default?

    Well, without the db2diag.log and the admin log, it's hard to say.
    I am getting only these messages. If i have missed before there are the backup messages...
    /*
    2006-06-06-13.43.02.359000-420 I39445675H337 LEVEL: Event
    PID : 2112 TID : 3456 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:20
    START : Automatic reorg evaluation has finished successfully on database TEST

    2006-06-06-13.43.03.218000-420 I39446014H328 LEVEL: Event
    PID : 2112 TID : 1312 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Health Monitor, db2HmonEvalBackup, probe:260
    STOP : Automatic Backup: evaluation has finished successfully on database TEST
    */

    Thanks!
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-07T15:11:23Z  
    I am getting only these messages. If i have missed before there are the backup messages...
    /*
    2006-06-06-13.43.02.359000-420 I39445675H337 LEVEL: Event
    PID : 2112 TID : 3456 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Automatic Table Maintenance, db2HmonEvalReorg, probe:20
    START : Automatic reorg evaluation has finished successfully on database TEST

    2006-06-06-13.43.03.218000-420 I39446014H328 LEVEL: Event
    PID : 2112 TID : 1312 PROC : db2fmp.exe
    INSTANCE: DB2 NODE : 000
    FUNCTION: DB2 UDB, Health Monitor, db2HmonEvalBackup, probe:260
    STOP : Automatic Backup: evaluation has finished successfully on database TEST
    */

    Thanks!
    ok, without, at a minimum, the db2diag.log entries around the time (before and during) the error, it would be impossible to say what the error is. These are recent entries and just indicate normal operation from what I can tell.
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-07T18:37:32Z  
    ok, without, at a minimum, the db2diag.log entries around the time (before and during) the error, it would be impossible to say what the error is. These are recent entries and just indicate normal operation from what I can tell.
    Hi,

    I ran my script again last night and apache was throwing error (memory error and i hope with high end system that will go away). But this disconnected my clients somehow and it stopped uploading data.

    One strange thing which i have observed today is that my database is in inconsistent state. But i am able to connect, upload and retrieve data. I went to control center, and clicked on configure database logging and there it shows the database state as inconsistent. I am not sure why it is showing like this, any clue?
    I have got the snapshot of the database which is around 1.6 MB

    I did get the following warnings in my event logs
    /**
    2006-06-06-22.59.39.312001 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:2060 Appid:*LOCAL.DB2.060607072302
    data management sqldEscalateLocks Probe:3 Database:TEST

    ADM5502W The escalation of "730" locks on table "DIPESH .TEST_SCRIPT" to lock
    intent "X" was successful.
    2006-06-06-22.59.39.312000 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:2060 Appid:*LOCAL.DB2.060607072302
    data management sqldEscalateLocks Probe:2 Database:TEST

    ADM5500W DB2 is performing lock escalation. The total number of locks
    currently held is "1128", and the target number of locks to hold is "564".

    2006-06-06-22.07.23.468000 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:1364 Appid:*LOCAL.DB2.060607062326
    access plan manager sqlra_cache_mem_please Probe:20 Database:TEST

    ADM4500W A package cache overflow condition has occurred. There is no error
    but this indicates that the package cache has exceeded the configured maximum
    size. If this condition persists, you may want to adjust the PCKCACHESZ DB
    configuration parameter.

    2006-06-06-17.10.00.218000 Instance:DB2 Node:000
    PID:468(db2fmp.exe) TID:6024 Appid:none
    Health Monitor HealthIndicator::update Probe:500

    ADM10512W Health indicator "Statistics Collection Required"
    ("db.tb_runstats_req") is in state "Update statistics required" on "database"
    "DB2.TEST ". Collection (Object Name, Timestamp, Value, Detail): "
    (SYSIBM.SYSCOLUMNS, , 2, RUNSTATS), (SYSIBM.SYSCOLPROPERTIES, , 2, RUNSTATS),
    (SYSIBM.SYSPLAN, , 2, RUNSTATS), (SYSIBM.SYSSECTION, , 2, RUNSTATS),
    (SYSIBM.SYSSTMT, , 2, RUNSTATS), (SYSIBM.SYSDBAUTH, , 2, RUNSTATS),
    (SYSIBM.SYSROUTINES, , 2, RUNSTATS), (SYSIBM.SYSCHECKS, , 2, RUNSTATS),
    (DIPESH.SYSTEM_GROUP, , 2, RUNSTATS), (SYSIBM.SYSCOLCHECKS, , 2, RUNSTATS),
    (SYSIBM.SYSEVENTMONITORS, , 2, RUNSTATS), (SYSIBM.SYSTABLESPACES, , 2,
    RUNSTATS), (SYSIBM.SYSTRIGGERS, , 2, RUNSTATS), (SYSIBM.SYSBUFFERPOOLS, , 2,
    RUNSTATS), (SYSIBM.SYSBUFFERPOOLNODES, , 2, RUNSTATS), (SYSIBM.SYSPROCOPTIONS, ,
    2, RUNSTATS), (SYSIBM.SYSSEQUENCES, , 2, RUNSTATS), (SYSIBM.SYSVERSIONS, , 2,
    RUNSTATS), (SYSIBM.SYSROUTINEPROPERTIESJAVA, , 2, RUNSTATS),
    (SYSIBM.SYSCODEPROPERTIES, , 2, RUNSTATS), (DIPESH.MANUFACTURER, , 2, RUNSTATS),
    (SYSTOOLS.HMON_ATM_INFO, , 2, RUNSTATS), (SYSTOOLS.POLICY, , 2, RUNSTATS),
    (DIPESH.SYSTEM_CONFIGURATION, , 2, RUNSTATS), (DIPESH.SYSTEM_PHASE, , 2,
    RUNSTATS), (DIPESH.SYSTEM, , 2, RUNSTATS), (DIPESH.STRING_252, , 2, RUNSTATS),
    (DIPESH.COMPONENT_IMP, , 2, RUNSTATS), (DIPESH.SYSTEM_HIDDEN_STATE, , 2,
    RUNSTATS), (DIPESH.VERSIONS, , 2, RUNSTATS), (DIPESH.ICON, , 2, RUNSTATS),
    (DIPESH.CAPABILITY, , 2, RUNSTATS), (DIPESH.PROPERTY_CATEGORY, , 2, RUNSTATS),
    (DIPESH.PROPERTY_IMP, , 2, RUNSTATS), (DIPESH.PROPERTY_VALUE, , 2, RUNSTATS),
    (DIPESH.SYSTEM_ALIAS, , 2, RUNSTATS), (DIPESH.USERS, , 2, RUNSTATS),
    (DIPESH.SCRIPT_FILE, , 2, RUNSTATS), (DIPESH.TEST_RESULT, , 2, RUNSTATS),
    (DIPESH.MISSING_MSG, , 2, RUNSTATS), (DIPESH.TEST_SCRIPT, , 2, RUNSTATS),
    (DIPESH.PCD_SYSTEM_SCRIPTS, , 2, RUNSTATS), (DIPESH.CAPABILITY_LIST, , 2,
    RUNSTATS), (DIPESH.COMPONENT_LIST, , 2, RUNSTATS), (DIPESH.PROPERTY_VALUE_LIST,
    , 2, RUNSTATS), (DIPESH.MODULE, , 2, RUNSTATS), (DIPESH.TEST_DESCRIPTION, , 2,
    RUNSTATS), (DIPESH.TEST, , 2, RUNSTATS), (DIPESH.CAPABILITY_TEST_LIST, , 2,
    RUNSTATS), (DIPESH.SYSTEM_TEST_LIST, , 2, RUNSTATS), (DIPESH.TEST_RUN, , 2,
    RUNSTATS), (DIPESH.PARAMETERS, , 2, RUNSTATS), (DIPESH.PCD_SYSTEM_KEYS, , 2,
    RUNSTATS), (DIPESH.PCD_SYSTEM_TIMES, , 2, RUNSTATS), (DIPESH.DEVICE_STACK, , 2,
    RUNSTATS), (DIPESH.PROPERTY_STACK, , 2, RUNSTATS), (DIPESH.MESSAGEBOX, , 2,
    RUNSTATS), (DIPESH.PCD_LOG, , 2, RUNSTATS), (DIPESH.LOGGER, , 2, RUNSTATS)".
    2006-06-06-17.04.59.125000 Instance:DB2 Node:000
    PID:468(db2fmp.exe) TID:4656 Appid:none
    Health Monitor HealthIndicator::update Probe:500

    ADM10502W Health indicator "Database Backup Required" ("db.db_backup_req") is
    in state "Manual backup required" on "database" "DB2.TEST ".

    */

    Do i need to tweak some parameters like package cache size, what should i set it with? What are these health monitor warnings, do i need to do some settings?

    I have enabled maintainence to fully automated from the the control center.

    Any help to figure out above issues will be highly appreicated.

    Thanks!
    Regards,

    Dipesh
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-07T18:49:12Z  
    Hi,

    I ran my script again last night and apache was throwing error (memory error and i hope with high end system that will go away). But this disconnected my clients somehow and it stopped uploading data.

    One strange thing which i have observed today is that my database is in inconsistent state. But i am able to connect, upload and retrieve data. I went to control center, and clicked on configure database logging and there it shows the database state as inconsistent. I am not sure why it is showing like this, any clue?
    I have got the snapshot of the database which is around 1.6 MB

    I did get the following warnings in my event logs
    /**
    2006-06-06-22.59.39.312001 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:2060 Appid:*LOCAL.DB2.060607072302
    data management sqldEscalateLocks Probe:3 Database:TEST

    ADM5502W The escalation of "730" locks on table "DIPESH .TEST_SCRIPT" to lock
    intent "X" was successful.
    2006-06-06-22.59.39.312000 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:2060 Appid:*LOCAL.DB2.060607072302
    data management sqldEscalateLocks Probe:2 Database:TEST

    ADM5500W DB2 is performing lock escalation. The total number of locks
    currently held is "1128", and the target number of locks to hold is "564".

    2006-06-06-22.07.23.468000 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:1364 Appid:*LOCAL.DB2.060607062326
    access plan manager sqlra_cache_mem_please Probe:20 Database:TEST

    ADM4500W A package cache overflow condition has occurred. There is no error
    but this indicates that the package cache has exceeded the configured maximum
    size. If this condition persists, you may want to adjust the PCKCACHESZ DB
    configuration parameter.

    2006-06-06-17.10.00.218000 Instance:DB2 Node:000
    PID:468(db2fmp.exe) TID:6024 Appid:none
    Health Monitor HealthIndicator::update Probe:500

    ADM10512W Health indicator "Statistics Collection Required"
    ("db.tb_runstats_req") is in state "Update statistics required" on "database"
    "DB2.TEST ". Collection (Object Name, Timestamp, Value, Detail): "
    (SYSIBM.SYSCOLUMNS, , 2, RUNSTATS), (SYSIBM.SYSCOLPROPERTIES, , 2, RUNSTATS),
    (SYSIBM.SYSPLAN, , 2, RUNSTATS), (SYSIBM.SYSSECTION, , 2, RUNSTATS),
    (SYSIBM.SYSSTMT, , 2, RUNSTATS), (SYSIBM.SYSDBAUTH, , 2, RUNSTATS),
    (SYSIBM.SYSROUTINES, , 2, RUNSTATS), (SYSIBM.SYSCHECKS, , 2, RUNSTATS),
    (DIPESH.SYSTEM_GROUP, , 2, RUNSTATS), (SYSIBM.SYSCOLCHECKS, , 2, RUNSTATS),
    (SYSIBM.SYSEVENTMONITORS, , 2, RUNSTATS), (SYSIBM.SYSTABLESPACES, , 2,
    RUNSTATS), (SYSIBM.SYSTRIGGERS, , 2, RUNSTATS), (SYSIBM.SYSBUFFERPOOLS, , 2,
    RUNSTATS), (SYSIBM.SYSBUFFERPOOLNODES, , 2, RUNSTATS), (SYSIBM.SYSPROCOPTIONS, ,
    2, RUNSTATS), (SYSIBM.SYSSEQUENCES, , 2, RUNSTATS), (SYSIBM.SYSVERSIONS, , 2,
    RUNSTATS), (SYSIBM.SYSROUTINEPROPERTIESJAVA, , 2, RUNSTATS),
    (SYSIBM.SYSCODEPROPERTIES, , 2, RUNSTATS), (DIPESH.MANUFACTURER, , 2, RUNSTATS),
    (SYSTOOLS.HMON_ATM_INFO, , 2, RUNSTATS), (SYSTOOLS.POLICY, , 2, RUNSTATS),
    (DIPESH.SYSTEM_CONFIGURATION, , 2, RUNSTATS), (DIPESH.SYSTEM_PHASE, , 2,
    RUNSTATS), (DIPESH.SYSTEM, , 2, RUNSTATS), (DIPESH.STRING_252, , 2, RUNSTATS),
    (DIPESH.COMPONENT_IMP, , 2, RUNSTATS), (DIPESH.SYSTEM_HIDDEN_STATE, , 2,
    RUNSTATS), (DIPESH.VERSIONS, , 2, RUNSTATS), (DIPESH.ICON, , 2, RUNSTATS),
    (DIPESH.CAPABILITY, , 2, RUNSTATS), (DIPESH.PROPERTY_CATEGORY, , 2, RUNSTATS),
    (DIPESH.PROPERTY_IMP, , 2, RUNSTATS), (DIPESH.PROPERTY_VALUE, , 2, RUNSTATS),
    (DIPESH.SYSTEM_ALIAS, , 2, RUNSTATS), (DIPESH.USERS, , 2, RUNSTATS),
    (DIPESH.SCRIPT_FILE, , 2, RUNSTATS), (DIPESH.TEST_RESULT, , 2, RUNSTATS),
    (DIPESH.MISSING_MSG, , 2, RUNSTATS), (DIPESH.TEST_SCRIPT, , 2, RUNSTATS),
    (DIPESH.PCD_SYSTEM_SCRIPTS, , 2, RUNSTATS), (DIPESH.CAPABILITY_LIST, , 2,
    RUNSTATS), (DIPESH.COMPONENT_LIST, , 2, RUNSTATS), (DIPESH.PROPERTY_VALUE_LIST,
    , 2, RUNSTATS), (DIPESH.MODULE, , 2, RUNSTATS), (DIPESH.TEST_DESCRIPTION, , 2,
    RUNSTATS), (DIPESH.TEST, , 2, RUNSTATS), (DIPESH.CAPABILITY_TEST_LIST, , 2,
    RUNSTATS), (DIPESH.SYSTEM_TEST_LIST, , 2, RUNSTATS), (DIPESH.TEST_RUN, , 2,
    RUNSTATS), (DIPESH.PARAMETERS, , 2, RUNSTATS), (DIPESH.PCD_SYSTEM_KEYS, , 2,
    RUNSTATS), (DIPESH.PCD_SYSTEM_TIMES, , 2, RUNSTATS), (DIPESH.DEVICE_STACK, , 2,
    RUNSTATS), (DIPESH.PROPERTY_STACK, , 2, RUNSTATS), (DIPESH.MESSAGEBOX, , 2,
    RUNSTATS), (DIPESH.PCD_LOG, , 2, RUNSTATS), (DIPESH.LOGGER, , 2, RUNSTATS)".
    2006-06-06-17.04.59.125000 Instance:DB2 Node:000
    PID:468(db2fmp.exe) TID:4656 Appid:none
    Health Monitor HealthIndicator::update Probe:500

    ADM10502W Health indicator "Database Backup Required" ("db.db_backup_req") is
    in state "Manual backup required" on "database" "DB2.TEST ".

    */

    Do i need to tweak some parameters like package cache size, what should i set it with? What are these health monitor warnings, do i need to do some settings?

    I have enabled maintainence to fully automated from the the control center.

    Any help to figure out above issues will be highly appreicated.

    Thanks!
    Regards,

    Dipesh
    Hi,

    My current pckcachesz = 8 * maxAppls
    and maxAppls = automatic (169)

    Since i was getting some warnings in the event viewer, do i need to change it?

    Thanks!
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-07T18:59:26Z  
    Hi,

    I ran my script again last night and apache was throwing error (memory error and i hope with high end system that will go away). But this disconnected my clients somehow and it stopped uploading data.

    One strange thing which i have observed today is that my database is in inconsistent state. But i am able to connect, upload and retrieve data. I went to control center, and clicked on configure database logging and there it shows the database state as inconsistent. I am not sure why it is showing like this, any clue?
    I have got the snapshot of the database which is around 1.6 MB

    I did get the following warnings in my event logs
    /**
    2006-06-06-22.59.39.312001 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:2060 Appid:*LOCAL.DB2.060607072302
    data management sqldEscalateLocks Probe:3 Database:TEST

    ADM5502W The escalation of "730" locks on table "DIPESH .TEST_SCRIPT" to lock
    intent "X" was successful.
    2006-06-06-22.59.39.312000 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:2060 Appid:*LOCAL.DB2.060607072302
    data management sqldEscalateLocks Probe:2 Database:TEST

    ADM5500W DB2 is performing lock escalation. The total number of locks
    currently held is "1128", and the target number of locks to hold is "564".

    2006-06-06-22.07.23.468000 Instance:DB2 Node:000
    PID:408(db2syscs.exe) TID:1364 Appid:*LOCAL.DB2.060607062326
    access plan manager sqlra_cache_mem_please Probe:20 Database:TEST

    ADM4500W A package cache overflow condition has occurred. There is no error
    but this indicates that the package cache has exceeded the configured maximum
    size. If this condition persists, you may want to adjust the PCKCACHESZ DB
    configuration parameter.

    2006-06-06-17.10.00.218000 Instance:DB2 Node:000
    PID:468(db2fmp.exe) TID:6024 Appid:none
    Health Monitor HealthIndicator::update Probe:500

    ADM10512W Health indicator "Statistics Collection Required"
    ("db.tb_runstats_req") is in state "Update statistics required" on "database"
    "DB2.TEST ". Collection (Object Name, Timestamp, Value, Detail): "
    (SYSIBM.SYSCOLUMNS, , 2, RUNSTATS), (SYSIBM.SYSCOLPROPERTIES, , 2, RUNSTATS),
    (SYSIBM.SYSPLAN, , 2, RUNSTATS), (SYSIBM.SYSSECTION, , 2, RUNSTATS),
    (SYSIBM.SYSSTMT, , 2, RUNSTATS), (SYSIBM.SYSDBAUTH, , 2, RUNSTATS),
    (SYSIBM.SYSROUTINES, , 2, RUNSTATS), (SYSIBM.SYSCHECKS, , 2, RUNSTATS),
    (DIPESH.SYSTEM_GROUP, , 2, RUNSTATS), (SYSIBM.SYSCOLCHECKS, , 2, RUNSTATS),
    (SYSIBM.SYSEVENTMONITORS, , 2, RUNSTATS), (SYSIBM.SYSTABLESPACES, , 2,
    RUNSTATS), (SYSIBM.SYSTRIGGERS, , 2, RUNSTATS), (SYSIBM.SYSBUFFERPOOLS, , 2,
    RUNSTATS), (SYSIBM.SYSBUFFERPOOLNODES, , 2, RUNSTATS), (SYSIBM.SYSPROCOPTIONS, ,
    2, RUNSTATS), (SYSIBM.SYSSEQUENCES, , 2, RUNSTATS), (SYSIBM.SYSVERSIONS, , 2,
    RUNSTATS), (SYSIBM.SYSROUTINEPROPERTIESJAVA, , 2, RUNSTATS),
    (SYSIBM.SYSCODEPROPERTIES, , 2, RUNSTATS), (DIPESH.MANUFACTURER, , 2, RUNSTATS),
    (SYSTOOLS.HMON_ATM_INFO, , 2, RUNSTATS), (SYSTOOLS.POLICY, , 2, RUNSTATS),
    (DIPESH.SYSTEM_CONFIGURATION, , 2, RUNSTATS), (DIPESH.SYSTEM_PHASE, , 2,
    RUNSTATS), (DIPESH.SYSTEM, , 2, RUNSTATS), (DIPESH.STRING_252, , 2, RUNSTATS),
    (DIPESH.COMPONENT_IMP, , 2, RUNSTATS), (DIPESH.SYSTEM_HIDDEN_STATE, , 2,
    RUNSTATS), (DIPESH.VERSIONS, , 2, RUNSTATS), (DIPESH.ICON, , 2, RUNSTATS),
    (DIPESH.CAPABILITY, , 2, RUNSTATS), (DIPESH.PROPERTY_CATEGORY, , 2, RUNSTATS),
    (DIPESH.PROPERTY_IMP, , 2, RUNSTATS), (DIPESH.PROPERTY_VALUE, , 2, RUNSTATS),
    (DIPESH.SYSTEM_ALIAS, , 2, RUNSTATS), (DIPESH.USERS, , 2, RUNSTATS),
    (DIPESH.SCRIPT_FILE, , 2, RUNSTATS), (DIPESH.TEST_RESULT, , 2, RUNSTATS),
    (DIPESH.MISSING_MSG, , 2, RUNSTATS), (DIPESH.TEST_SCRIPT, , 2, RUNSTATS),
    (DIPESH.PCD_SYSTEM_SCRIPTS, , 2, RUNSTATS), (DIPESH.CAPABILITY_LIST, , 2,
    RUNSTATS), (DIPESH.COMPONENT_LIST, , 2, RUNSTATS), (DIPESH.PROPERTY_VALUE_LIST,
    , 2, RUNSTATS), (DIPESH.MODULE, , 2, RUNSTATS), (DIPESH.TEST_DESCRIPTION, , 2,
    RUNSTATS), (DIPESH.TEST, , 2, RUNSTATS), (DIPESH.CAPABILITY_TEST_LIST, , 2,
    RUNSTATS), (DIPESH.SYSTEM_TEST_LIST, , 2, RUNSTATS), (DIPESH.TEST_RUN, , 2,
    RUNSTATS), (DIPESH.PARAMETERS, , 2, RUNSTATS), (DIPESH.PCD_SYSTEM_KEYS, , 2,
    RUNSTATS), (DIPESH.PCD_SYSTEM_TIMES, , 2, RUNSTATS), (DIPESH.DEVICE_STACK, , 2,
    RUNSTATS), (DIPESH.PROPERTY_STACK, , 2, RUNSTATS), (DIPESH.MESSAGEBOX, , 2,
    RUNSTATS), (DIPESH.PCD_LOG, , 2, RUNSTATS), (DIPESH.LOGGER, , 2, RUNSTATS)".
    2006-06-06-17.04.59.125000 Instance:DB2 Node:000
    PID:468(db2fmp.exe) TID:4656 Appid:none
    Health Monitor HealthIndicator::update Probe:500

    ADM10502W Health indicator "Database Backup Required" ("db.db_backup_req") is
    in state "Manual backup required" on "database" "DB2.TEST ".

    */

    Do i need to tweak some parameters like package cache size, what should i set it with? What are these health monitor warnings, do i need to do some settings?

    I have enabled maintainence to fully automated from the the control center.

    Any help to figure out above issues will be highly appreicated.

    Thanks!
    Regards,

    Dipesh
    > Hi,
    >
    > I ran my script again last night and apache was
    > throwing error (memory error and i hope with high end
    > system that will go away). But this disconnected my
    > clients somehow and it stopped uploading data.
    >
    > One strange thing which i have observed today is that
    > my database is in inconsistent state. But i am able
    > to connect, upload and retrieve data. I went to
    > control center, and clicked on configure database
    > logging and there it shows the database state as
    > inconsistent. I am not sure why it is showing like
    > this, any clue?

    Hmm I don't know how you managed to connect and use an inconsistent database. Don't you need to replay the appropriate logs (e.g. via crash recovery) before you can use an inconsistent database?

    You should probably try to make the full db2diag.log and admin log available. The snippets you are providing don't really provide any useful information about what happened during the time of the crash.
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-07T19:58:04Z  
    > Hi,
    >
    > I ran my script again last night and apache was
    > throwing error (memory error and i hope with high end
    > system that will go away). But this disconnected my
    > clients somehow and it stopped uploading data.
    >
    > One strange thing which i have observed today is that
    > my database is in inconsistent state. But i am able
    > to connect, upload and retrieve data. I went to
    > control center, and clicked on configure database
    > logging and there it shows the database state as
    > inconsistent. I am not sure why it is showing like
    > this, any clue?

    Hmm I don't know how you managed to connect and use an inconsistent database. Don't you need to replay the appropriate logs (e.g. via crash recovery) before you can use an inconsistent database?

    You should probably try to make the full db2diag.log and admin log available. The snippets you are providing don't really provide any useful information about what happened during the time of the crash.
    Hi,

    Do i need to change my logging scheme to archive logging. By default it is circular and right now the file is approximately 39 MB.

    Are you referring to event viewer or some other admin log? Where do i find it?

    Thanks!
  • ocgstyles
    ocgstyles
    472 Posts

    Re: DB2 Crashing

    ‏2006-06-07T23:23:35Z  
    Hi,

    Do i need to change my logging scheme to archive logging. By default it is circular and right now the file is approximately 39 MB.

    Are you referring to event viewer or some other admin log? Where do i find it?

    Thanks!
    Hi all...

    Lots of thoughts comin'....

    To comment on consistency problem, database consistency is most often "NO" while there are connections to the database. For example, starting with no connections to the database, make a new connection to the database. If you immediately execute the command "db2 get db cfg for <dbname>", you will see database is consistent = YES. As soon as you modify some data (INSERT/DELETE/UPDATE), you will see that value listed as NO, and will probably be like that until all connections disconnect from the database. I say probably because I've never been mad enough to prove otherwise... :)

    Now, the lock escalations. I would recommend keeping an eye on that, especially if this is happening during peak usage times. You should look at the LOCKLIST and MAXLOCKS database config parameters. If this only happens in the middle of the night when no one is connected, you may try explicitly and exclusively (X) locking the table to stop lock escalations, or temporarily update the MAXLOCKS and/or LOCKLIST parameters before you start processing.

    Here's a writeup on the MAXLOCKS parameter, http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000268.htm.

    As for the logging, if point-in-time recovery is not needed, then there's no need to enable archival logging. With your current setup, you should be concerned if you start seeing messages to the nature of "LOG FULL". If your processing is transaction based, and you are committing often, this probably won't be a problem.

    If we've gone over what we think is the entire db2diag.log, then maybe the problem isn't with DB2. But first, I think we should start with a fresh log, so we can be sure we are not overlooking anything. Just rename the db2diag.log to db2diag.log.old. A new one will be created when DB2 attempts to write to it. If that turns up clean after you re-run that process, we can try increasing the database manager NOTIFYLEVEL to 4 (default is 3) to increase the amount of logging to see if that gives any more clues.

    Does anything else start up on the box around the time of failure? Maybe a virus scan? Backups or shadow-copying?

    How many inserts statements are you using? How many complete? Can all these inserts be accomplished using LOAD?

    Maybe it has nothing to do with DB2 at all...

    Hoping we'll figure this out...

    • Keith
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-08T20:18:36Z  
    • ocgstyles
    • ‏2006-06-07T23:23:35Z
    Hi all...

    Lots of thoughts comin'....

    To comment on consistency problem, database consistency is most often "NO" while there are connections to the database. For example, starting with no connections to the database, make a new connection to the database. If you immediately execute the command "db2 get db cfg for <dbname>", you will see database is consistent = YES. As soon as you modify some data (INSERT/DELETE/UPDATE), you will see that value listed as NO, and will probably be like that until all connections disconnect from the database. I say probably because I've never been mad enough to prove otherwise... :)

    Now, the lock escalations. I would recommend keeping an eye on that, especially if this is happening during peak usage times. You should look at the LOCKLIST and MAXLOCKS database config parameters. If this only happens in the middle of the night when no one is connected, you may try explicitly and exclusively (X) locking the table to stop lock escalations, or temporarily update the MAXLOCKS and/or LOCKLIST parameters before you start processing.

    Here's a writeup on the MAXLOCKS parameter, http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000268.htm.

    As for the logging, if point-in-time recovery is not needed, then there's no need to enable archival logging. With your current setup, you should be concerned if you start seeing messages to the nature of "LOG FULL". If your processing is transaction based, and you are committing often, this probably won't be a problem.

    If we've gone over what we think is the entire db2diag.log, then maybe the problem isn't with DB2. But first, I think we should start with a fresh log, so we can be sure we are not overlooking anything. Just rename the db2diag.log to db2diag.log.old. A new one will be created when DB2 attempts to write to it. If that turns up clean after you re-run that process, we can try increasing the database manager NOTIFYLEVEL to 4 (default is 3) to increase the amount of logging to see if that gives any more clues.

    Does anything else start up on the box around the time of failure? Maybe a virus scan? Backups or shadow-copying?

    How many inserts statements are you using? How many complete? Can all these inserts be accomplished using LOAD?

    Maybe it has nothing to do with DB2 at all...

    Hoping we'll figure this out...

    • Keith
    Hi Keith,

    Thanks for digging into this problem.

    Thanks for informing about consistency status. I guess, there are some connections which are still open after things stop working and because of that it shows it in a inconsistent state.

    Last night, when i ran my process there were lots of lock escalation and i did get messages like this

    /**

    2006-06-07-19.42.21.484000-420 I39841570H621 LEVEL: Warning
    PID : 408 TID : 816 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : TEST
    APPHDL : 0-126 APPID: *LOCAL.DB2.060608022510
    FUNCTION: DB2 UDB, database monitor, sqm___log_event, probe:41
    DATA #1 : String, 263 bytes
    Dlconn for event monitor DB2DETAILDEADLOCK truncated by 154095 bytes!
    Deadlock ID: 147 Participant:2 Original locks in list: 950.
    Redefine the event monitor, increasing the BUFFERSIZE by 38 pages.
    Increase DIAGLEVEL to 4 to see the truncated Locks in this log.

    */

    Current value of my lock related parameters are
    • MAXLOCKS 22
    • LOCKLIST 50

    Before i start testing today i will rename my log file so that I will be able to get fresh locks.

    As of now I am running this on workstation and checking whether DB2 works properly or not and right now my apache is crashing because of insufficient memory I am not able test it properly. That is why i m packaging db2 installation and will give a shot on server with higher RAM and better CPU speed. But right now there is not shadow copying or back ups in action.

    My application is like this -
    I have created a simulation by which i can spawn 'n' number of clients which are uploading data (same data as it is simulation and in real time it will be different) to the database. Well i am not sure whether load will accomplish that or not because right now I have xml data which i am using to create sql statement (call procedure insert_system etc.) by transforming it using xsl and generating string of sql statement which I am firing using ODBC api of PHP. So for example if i am calling 15 procedures i do start transaction before the begining of each batch and commit it after the batch is done. i.e. it commits after 15 procedures are done inserting the data. Right now since many threads are doing the same thing (as per my simulation, i am getting deadlocks). Few things are coming to mind regarding handling deadlocks (but i guess this is not what is the cause of the problem)

    • using Read Stability than Cursor Stability (this will reduce number of deadlocks)
    • Adjusting lock parameters
    • Commit it for each statement (each procedure call but this will make it very very slow)

    You might be right that the current problem has nothing to do DB2 but since my other database (Firebird) which i was using till now was crashing in Windows during huge load, I want to make sure that with the current setting by changing database things will work fine for me.

    Thanks for the inforamtion.

    Regards,

    Dipesh
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-14T18:55:02Z  
    Hi,

    Even after moving to a system with 2 GB of RAM, something went wrong and database was in hung state.

    I took the snapshot of it and it is more than 1 MB, there are lots of locks and rollbacks on few tables. But i am not sure why DB2 can't handle this.

    Further I got the memory visualiser file and it shows the plenty of information and few of which caught my attention are

    Lock Manager Heap => 100 % under DBM Shared memory.

    There are lots of errors in DB2 log which says it failed to escalte the lock.
    /**

    2006-06-13-12.35.01.812000-420 E1141650H486 LEVEL: Error
    PID : 976 TID : 6392 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-218 APPID: *LOCAL.DB2.060613193445
    FUNCTION: DB2 UDB, data management, sqldEscalateLocks, probe:4
    MESSAGE : ADM5503E The escalation of "185" locks on table
    "ADMINISTRATOR.SYSTEM" to lock intent "X" has failed. The SQLCODE is
    "-911".

    2006-06-13-17.26.35.578000-420 E8054347H486 LEVEL: Error
    PID : 2580 TID : 3324 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-1107 APPID: *LOCAL.DB2.060614103328
    FUNCTION: DB2 UDB, data management, sqldEscalateLocks, probe:4
    MESSAGE : ADM5503E The escalation of "282" locks on table
    "ADMINISTRATOR.SYSTEM" to lock intent "X" has failed. The SQLCODE is
    "-911".

    2006-06-13-17.26.55.593000-420 I8079857H428 LEVEL: Error
    PID : 2580 TID : 3872 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-439 APPID: *LOCAL.DB2.060614103330
    FUNCTION: DB2 UDB, catalog services, sqlrlScanOpen, probe:800
    RETCODE : ZRC=0x8510000A=-2062548982=SQLP_LFUL "Lock list full"
    DIA8310C Lock list was full.

    2006-06-14-09.42.05.015000-420 I8125284H464 LEVEL: Error
    PID : 2580 TID : 3288 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-825 APPID: *LOCAL.DB2.060617042354
    FUNCTION: DB2 UDB, catalog services, sqlrlScanOpen, probe:800
    RETCODE : ZRC=0x80100002=-2146435070=SQLP_LDED "Dead lock detected"
    DIA8002C A deadlock has occurred, recovering transaction "".
    */

    If anyone wants to have a look at these files i.e. snapshot(1 MB), memory visi file (52 KB) and db2diag.log (8 MB) I can send you by email.

    I am not sure why DB2 is not able to handle lot of data. Right now my concurrency level is cursor stability. I am not sure which all parameters I need to tweak to make it fully functional and working.

    Thanks
  • rfchong
    rfchong
    484 Posts

    Re: DB2 Crashing

    ‏2006-06-14T22:05:51Z  
    Hi,

    Even after moving to a system with 2 GB of RAM, something went wrong and database was in hung state.

    I took the snapshot of it and it is more than 1 MB, there are lots of locks and rollbacks on few tables. But i am not sure why DB2 can't handle this.

    Further I got the memory visualiser file and it shows the plenty of information and few of which caught my attention are

    Lock Manager Heap => 100 % under DBM Shared memory.

    There are lots of errors in DB2 log which says it failed to escalte the lock.
    /**

    2006-06-13-12.35.01.812000-420 E1141650H486 LEVEL: Error
    PID : 976 TID : 6392 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-218 APPID: *LOCAL.DB2.060613193445
    FUNCTION: DB2 UDB, data management, sqldEscalateLocks, probe:4
    MESSAGE : ADM5503E The escalation of "185" locks on table
    "ADMINISTRATOR.SYSTEM" to lock intent "X" has failed. The SQLCODE is
    "-911".

    2006-06-13-17.26.35.578000-420 E8054347H486 LEVEL: Error
    PID : 2580 TID : 3324 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-1107 APPID: *LOCAL.DB2.060614103328
    FUNCTION: DB2 UDB, data management, sqldEscalateLocks, probe:4
    MESSAGE : ADM5503E The escalation of "282" locks on table
    "ADMINISTRATOR.SYSTEM" to lock intent "X" has failed. The SQLCODE is
    "-911".

    2006-06-13-17.26.55.593000-420 I8079857H428 LEVEL: Error
    PID : 2580 TID : 3872 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-439 APPID: *LOCAL.DB2.060614103330
    FUNCTION: DB2 UDB, catalog services, sqlrlScanOpen, probe:800
    RETCODE : ZRC=0x8510000A=-2062548982=SQLP_LFUL "Lock list full"
    DIA8310C Lock list was full.

    2006-06-14-09.42.05.015000-420 I8125284H464 LEVEL: Error
    PID : 2580 TID : 3288 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NF5
    APPHDL : 0-825 APPID: *LOCAL.DB2.060617042354
    FUNCTION: DB2 UDB, catalog services, sqlrlScanOpen, probe:800
    RETCODE : ZRC=0x80100002=-2146435070=SQLP_LDED "Dead lock detected"
    DIA8002C A deadlock has occurred, recovering transaction "".
    */

    If anyone wants to have a look at these files i.e. snapshot(1 MB), memory visi file (52 KB) and db2diag.log (8 MB) I can send you by email.

    I am not sure why DB2 is not able to handle lot of data. Right now my concurrency level is cursor stability. I am not sure which all parameters I need to tweak to make it fully functional and working.

    Thanks
    Hello,

    First of all, thanks Keith for the amazing job of helping out with this problem!

    "Deepdata", in one of your updates I saw the following:

    > Current value of my lock related parameters are
    > - MAXLOCKS 22
    > - LOCKLIST 50
    > Before i start testing today i will rename my log file so that I will be
    > able to get fresh locks.

    I think you misunderstood Keith. Renaming the db2diag.log will just allow a new db2diag.log to be generated when an error happens so that you can start with a fresh db2diag.log. This will not give you "fresh locks". (unless this was a typo from your part). Anyway, I'd suggest you increase the above parameters as follows:

    db2 update db cfg for <dbname> using LOCKLIST 10000
    db2 update db cfg for <dbname> using MAXLOCKS 50

    Ensure to have all connections terminated before the above changes take effect on the next first connection to the database.

    This means that you will now have 10000 x 4K = 40Mb or memory for locks (you can raise this value higher if needed, and if you have the memory... the max value is 524,288).
    If you use MAXLOCKS of 50, this means that when one single application uses 50% of the locklist memory, a lock escalation will happen. A lock escalation is not recommended as it will cause concurrency issues.

    As Keith mentioned in other updates, issue a:
    db2 get snapshot for all on <dbname> every hour as your application runs. Then look for the locking section (deadlocks detected, lock escalations, lock timeouts). If you see lock escalations increasing from hour to hour you can either increase LOCKLIST or MAXLOCKS.

    Also as Keith mentioned, you may want to issue COMMITs more often to avoid deadlocks. Deadlocks is normally an application issue. COMMITs are sort of cheap comparing to rollbacks in DB2.

    With respect to an earlier question indicating that the database is "inconsistent", my understanding is that this reflects that changes made to the database have not yet been "externalized", that is, you may have committed transactions which have been savely stored in the DB2 transaction logs (active logs) which are on disk, but they have not yet been externalized (written to the database files on disk) yet.

    Hope this helps,

    Raul F. Chong
    IBM DB2 UDB Express Community Facilitator
    Email: db2x@ca.ibm.com
  • SystemAdmin
    SystemAdmin
    5837 Posts

    Re: DB2 Crashing

    ‏2006-06-14T23:04:58Z  
    • rfchong
    • ‏2006-06-14T22:05:51Z
    Hello,

    First of all, thanks Keith for the amazing job of helping out with this problem!

    "Deepdata", in one of your updates I saw the following:

    > Current value of my lock related parameters are
    > - MAXLOCKS 22
    > - LOCKLIST 50
    > Before i start testing today i will rename my log file so that I will be
    > able to get fresh locks.

    I think you misunderstood Keith. Renaming the db2diag.log will just allow a new db2diag.log to be generated when an error happens so that you can start with a fresh db2diag.log. This will not give you "fresh locks". (unless this was a typo from your part). Anyway, I'd suggest you increase the above parameters as follows:

    db2 update db cfg for <dbname> using LOCKLIST 10000
    db2 update db cfg for <dbname> using MAXLOCKS 50

    Ensure to have all connections terminated before the above changes take effect on the next first connection to the database.

    This means that you will now have 10000 x 4K = 40Mb or memory for locks (you can raise this value higher if needed, and if you have the memory... the max value is 524,288).
    If you use MAXLOCKS of 50, this means that when one single application uses 50% of the locklist memory, a lock escalation will happen. A lock escalation is not recommended as it will cause concurrency issues.

    As Keith mentioned in other updates, issue a:
    db2 get snapshot for all on <dbname> every hour as your application runs. Then look for the locking section (deadlocks detected, lock escalations, lock timeouts). If you see lock escalations increasing from hour to hour you can either increase LOCKLIST or MAXLOCKS.

    Also as Keith mentioned, you may want to issue COMMITs more often to avoid deadlocks. Deadlocks is normally an application issue. COMMITs are sort of cheap comparing to rollbacks in DB2.

    With respect to an earlier question indicating that the database is "inconsistent", my understanding is that this reflects that changes made to the database have not yet been "externalized", that is, you may have committed transactions which have been savely stored in the DB2 transaction logs (active logs) which are on disk, but they have not yet been externalized (written to the database files on disk) yet.

    Hope this helps,

    Raul F. Chong
    IBM DB2 UDB Express Community Facilitator
    Email: db2x@ca.ibm.com
    Hi Raul,

    Thanks a ton for replying and informing me about parameter settings.

    Regarding first part, it was a typo on my part. Sorry about that.

    I will change the configuration as soon as i create my database.

    >If you use MAXLOCKS of 50, this means that when one single application uses 50% of >the locklist memory, a lock escalation will happen. A lock escalation is not >recommended as it will cause concurrency issues.

    By one single application you mean apache (as my php processes are emebedded in apache as a thread i.e. using PHP as a module) or you mean one connection?

    To get snapshot can i automate it to get fired hourly and drop a file for me because i m planning to run it overnight on two server(2 GB and 4GB RAM)? Or can i use any other inbuilt DB2 features i.e. activity monitor, health indicator to get this thing done?

    I am using ODBC to connect to DB2 and using PHP API and using following to commit my transaction

    odbc_autocommit($this->database, FALSE);

    execute stament i.e. odbc_exec($this->database, $sql_element );
    or
    query statement i.e. @odbc_do($this->database, $sql);

    $comRetVal = @odbc_commit($this->database);

    This ensures that I am committing after every transaction.

    In one of the post I have mentioned how my application behaves -
    /*
    I have simulated client which connect to php process and ask it to upload data to the database. This php request goes to apache and apache handles each php request as a thread. Each of this thread then connects to database and uploads data.

    In my database, i have arragned data in such a way that i keep unique strings in the table (there are few tables like this). I have written my inserts like this,

    -> get me the id of the string in this table. If it is there it will give the id of the string already in the database.
    -> If the string is not inserted before, then insert this string.

    With the above scheme I restrict the size of the database and use normalization.

    The problem begins when 100 or more clients start doing the same thing. I have set up a simulator which does this job for n number of clients.

    */

    I would like to know will this type of arrangment work properly in DB2? Am I doing something wrong?

    One more query regarding "heap size" which i have mentioned in my previous email that was getting used 100% (and it was showing as 267 KB). Do i need to change this parameter also?

    I agree with you regarding "inconsistent" nature of my database.

    Thanks a lot for throwing light on many issues.

    -Dipesh