Troubleshooting
Problem
This technote identifies a defect where IBM® Rational® ClearCase MultiSite® imports fail if certain database files (such as the .k02 key file) grow beyond 32 gigabytes on Microsoft® Windows®, Linux® and UNIX®.
Symptom
This issue produces the following symptoms and errors:
- syncreplica -import operations fail with RPC errors such as those listed below.
Note: Repeated attempts to re-run the import may each import a few more oplogs, but the import may eventually stop altogether.
%>multitool syncreplica -import packet1
db_replica_get_epoch_row_V3: RPC: Unable to receive; errno = Connection reset by peer
db_abort_trans_V3: RPC: Unable to receive; errno = Connection reset by peer
multitool: Error: multitool: Error: Trouble communicating with VOB database: "/vobs/my_tools".
Check database log on VOB host "myhost02".
multitool: Error: multitool: Error: INTERNAL ERROR detected and logged in "/var/adm/rational/clearcase/log/error_log".
multitool: Error: multitool: Error: Trouble communicating with VOB database: "/vobs/my_tools".
Check database log on VOB host "myhost02".
multitool: Error: multitool: Error: Unable to replay oplog entry 345359402: error detected by ClearCase subsystem.
345359402:
op= mklabel
replica_oid= 37ed600a.a21e11d3.a38c.00:01:80:9e:3a:5d (xx20_my_tools)
oplog_id= 12824657 op_time= 31-Aug-06.21:33:39UTC create_time= 31-Aug-06.23:11:57UTC
event comment= ""
data size= 36 data= 0xce3e8
------------
obj_oid= f77fcabd.da4011d7.b8ef.00:01:83:2b:c3:96 (version: *no view*)
lbtype_oid= 25ed4ea3.393811db.896f.00:01:83:f2:a1:92 (MAKEFILE_TEST)
The following errors are also reported in the logs for the same VOB experiencing the import errors listed above:
2006-09-01T01:00:28-04 db_server(12826): Ok:
2006-09-01T01:00:28-04 db_server(12826): Ok: *** db_VISTA database error -901 - system error
2006-09-01T01:00:28-04 db_server(12826): Ok: DBMS error in "../db__replica.c" line 485
2006-09-01T01:00:28-04 db_server(12826): Error: DBMS error in /export/vbs02/my_tools.vbs/db.
2006-09-01T01:00:28-04 db_server(12826): Error: db_VISTA error -901 (errno == "Resource temporarily unavailable")
2006-09-01T01:01:14-04 db_server(14042): Ok:
A listing of the db directory in the VOB's storage directory shows that one of the database files (vob_db.dXX and/or vob_db.kXX) has grown larger than 32 GB:
...
-rw-r--r-- 1 vobadm ae598 30668988416 Nov 27 04:30 vob_db.d01
-rw-r--r-- 1 vobadm ae598 23491280896 Nov 27 03:28 vob_db.d02
-rw-r--r-- 1 vobadm ae598 8192 Nov 27 03:28 vob_db.d03
-rw-r----- 1 vobadm ae598 25204 Mar 24 2003 vob_db.dbd
-rw-r--r-- 1 vobadm ae598 8328757248 Nov 27 03:28 vob_db.k01
-rw-r--r-- 1 vobadm ae598 34360016896 Nov 27 03:28 vob_db.k02
-rw-r--r-- 1 vobadm ae598 8192 Mar 17 2006 vob_db.k03
-rw-r--r-- 1 vobadm ae598 2473984 Aug 17 12:32 vob_db.k04
...
If you are experiencing all of the symptoms listed above, the problem may be related to defect APAR PK32214.
- A syncreplica -import command may also fail as described below:
> multitool syncreplica -import sync_INTERACTIVES_sha_from_2u_0703002111
db_label_apply_V3: RPC: Unable to receive; errno = Connection reset by peer
db_abort_trans_V3: RPC: Unable to receive; errno = Connection reset by peer
multitool: Error: multitool: Error: Trouble communicating with VOB database: "/fw/test/INTERACTIVES".
Check database log on VOB host "shanvob1".
multitool: Error: multitool: Error: INTERNAL ERROR detected and logged in "/var/adm/rational/clearcase/log/error_log".
multitool: Error: multitool: Error: Trouble communicating with VOB database: "/fw/test/INTERACTIVES".
Check database log on VOB host "shanvob1".
multitool: Error: multitool: Error: Unable to replay oplog entry 396723863: error detected by ClearCase subsystem.
396723863: op= mklabel
replica_oid= 2c4ae334.3cb811d4.b93e.00:60:b0:c4:22:c3 (INTERACTIVES_1u)
oplog_id= 175263310
op_time= 02-Jul-07.22:32:36UTC create_time= 02-Jul-07.22:40:21UTC
event comment= ""
data size= 36 data= 0x40016590
Sat Jul 7 04:04:47 2007. host "shanvob1", pid 3848, user "rootboi"
Internal Error detected in "../map_db.c" line 969
multitool: Error: Abort transaction failed: "/fw/test/INTERACTIVES".
The entry in the albd log file is:
07/07/07 04:04:47 albd_server(1917): Server db_server(3849) exited due to signal 11
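The log signatures quoted in this section can be searched for with a short script. The following is a minimal sketch, not an IBM-supplied tool: the name find_signatures and the SIGNATURES table are hypothetical, and the patterns are taken only from the excerpts in this technote, so other releases may word these messages differently.

```python
import re

# Signatures copied from the log excerpts in this technote (assumption:
# your logs use the same wording).
SIGNATURES = {
    "db_VISTA -901": re.compile(r"db_VISTA (?:database )?error -901"),
    "db_server signal 11": re.compile(r"db_server\(\d+\) exited due\s+to signal 11"),
}


def find_signatures(lines):
    """Yield (label, line) for every line that matches a known signature."""
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                yield label, line.rstrip("\n")
```

To use it, open whichever log file you are examining and pass the file object to find_signatures, for example `for label, line in find_signatures(open(path)): print(label, line)`, where path is the log file on the VOB host. A match only suggests this defect; the file sizes in the VOB database directory still need to be checked.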
Cause
A file in the VOB database has exceeded 32GB in size.
Note: In this example, the file in question is the vob_db.k02 key file:
-rw-r--r-- 1 vobadm users 34359762944 Jul 11 14:39 vob_db.k02
Defect APAR PK32214 has been opened to investigate this issue.
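The file sizes quoted in this technote can be compared against the limit with simple arithmetic. The sketch below assumes that "32GB" means 32 binary gigabytes (32 × 2^30 = 34359738368 bytes); the technote does not state this explicitly. The helper name bytes_over_limit is hypothetical.

```python
# Assumption: "32GB" is read as 32 binary gigabytes (32 * 2**30 bytes).
LIMIT_BYTES = 32 * 2**30

# File sizes in bytes, copied from the directory listings in this technote.
sizes = {
    "vob_db.k02 (Symptom listing)": 34360016896,
    "vob_db.k02 (Cause and Diagnosing listings)": 34359762944,
    "vob_db.d01 (Symptom listing)": 30668988416,
}


def bytes_over_limit(size, limit=LIMIT_BYTES):
    """Return size - limit; a positive result means the file is over the limit."""
    return size - limit


for name, size in sizes.items():
    print(name, bytes_over_limit(size))
```

Under that assumption, both vob_db.k02 files are only slightly over the limit (by 278528 and 24576 bytes respectively), while vob_db.d01 is still below it.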
Diagnosing The Problem
Check the sizes of the files in the VOB's database directory to see whether any of them exceeds 32 GB.
#ls -al /vobs/test.vbs/db
total 195627664
drwxr-xr-x 2 vobadm users 16384 Jul 11 13:37 logs/
-rw-r--r-- 1 vobadm users 5146 Jul 11 14:39 vista.taf
-rw-r--r-- 1 vobadm users 8192 Jul 11 14:34 vista.tcf
-rw-r--r-- 1 vobadm users 58941440 Jul 11 14:39 vista.tjf
-rw-r--r-- 1 vobadm users 29325287424 Jul 11 14:39 vob_db.d01
-rw-r--r-- 1 vobadm users 26699145216 Jul 11 14:39 vob_db.d02
-rw-r--r-- 1 vobadm users 8192 Mar 2 18:17 vob_db.d03
-rw-r----- 1 vobadm users 25204 Dec 7 2005 vob_db.dbd
-rw-r--r-- 1 vobadm users 9450242048 Jul 11 14:39 vob_db.k01
-rw-r--r-- 1 vobadm users 34359762944 Jul 11 14:39 vob_db.k02
-rw-r--r-- 1 vobadm users 8192 Feb 26 19:56 vob_db.k03
-rw-r--r-- 1 vobadm users 32768 Jul 11 14:25 vob_db.k04
-rw-r--r-- 1 vobadm users 123164386 Jul 11 14:17 vob_db.str_file
-rw-r--r-- 1 vobadm users 3 Feb 26 19:56 vob_db_schema_version
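The size check shown in the listing above can also be scripted. This is a minimal sketch under stated assumptions, not an IBM-supplied tool: the function name oversized_db_files is hypothetical, and the default limit assumes that "32GB" means 32 × 2^30 bytes.

```python
import os

# Assumption: "32GB" in this technote means 32 * 2**30 bytes.
LIMIT_BYTES = 32 * 2**30


def oversized_db_files(db_dir, limit=LIMIT_BYTES):
    """Return sorted (name, size) pairs for regular files in db_dir larger than limit."""
    hits = []
    for entry in os.scandir(db_dir):
        # Skip subdirectories such as logs/; only regular files are measured.
        if entry.is_file():
            size = entry.stat().st_size
            if size > limit:
                hits.append((entry.name, size))
    return sorted(hits)
```

For example, `oversized_db_files("/vobs/test.vbs/db")` would scan the directory used in the listing above; substitute the db directory of your own VOB storage. An empty result means no file is over the assumed limit.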
Resolving The Problem
This defect has been resolved in the following updates:
7.0
ClearCase 7.0.0.1 (FixPack 1)
2003.06.00
ClearCase | Windows
ClearCase | UNIX and Linux
ClearCase LT® | Windows
ClearCase LT | UNIX and Linux
Note: Install the patches at all replica locations, because this database issue affects every location.
Document Information
Modified date: 16 June 2018
UID: swg21250896