Troubleshooting
Problem
LVM log file too large causes varyonvg failure of a concurrent volume group.
Symptom
This issue may result when attempting a varyonvg of a concurrent volume group in a cluster environment or when bringing a resource group online.
For example:
Performing a "Start Cluster Services" from the smit hacmp C-SPOC menus may leave the resource group in an ERROR state, as shown by clRGinfo
or
The issue may be seen when running a command from the command line, such as:
varyonvg -c -P -O datavg
0516-1751 varyonvg: The command /usr/sbin/tellclvmd returned the following error:
GSCHILD:ha_gs_init failed
/usr/sbin/tellclvmd: request failed rc = 18 [UNKNOWN rc]
Cause
The issue may occur when the size of an LVM log file, for example /tmp/lvmt.log, is set to larger than 256MB. For example, the issue would be seen if an LVM log file was increased to a size of 1GB (1073741824 bytes) to capture data for problem diagnosis, using a command such as:
alog -C -t lvmt -s 1073741824
The LVM log files are mapped into shared memory when the group services gsclvmd daemon starts in a cluster environment, and log sizes over 256MB will not leave enough space for ODM classes and other files to be mapped into shared memory.
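The size comparison described above can be sketched as a small shell check. This is an illustration only: it assumes the 256MB limit is 256 * 1024 * 1024 bytes, and lvm_log_size_ok is a hypothetical helper name, not an AIX command.

```shell
#!/bin/sh
# Assumption: the 256MB limit is taken as 256 * 1024 * 1024 bytes.
LIMIT=$((256 * 1024 * 1024))

# Hypothetical helper: succeeds (returns 0) when the size in bytes
# given as $1 is below the limit.
lvm_log_size_ok() {
    [ "$1" -lt "$LIMIT" ]
}

# Compare the 1GB size from the example above with a 20MB size.
for size in 1073741824 20971520; do
    if lvm_log_size_ok "$size"; then
        echo "$size: ok"
    else
        echo "$size: too large"
    fi
done
```

The check uses a strict "less than" comparison, so a size of exactly 256MB is also reported as too large.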
Diagnosing The Problem
When the issue occurs, the failing varyonvg command produces the following log and trace data:
- a varyonvg failure (rc=1) and gsclvmd failure (rc=18) in /var/adm/ras/lvmcfg.log
- the HA_GS_SOCK_INIT_FAILED error is logged to /tmp/lvmgs.log
- the EMFILE error is logged in kernel trace data
The alog -t lvmcfg -o command output contains:
[S 30802054 21692640 05/19/16-16:45:19:934 varyonvg.c 1059] varyonvg -c -P -O datavg
[E 23265358 0:005 gschild.c 517] /usr/sbin/gsclvmd: exited with rc=18
[E 30802054 3:003 varyonvg.c 1440] varyonvg: exited with rc=1
The alog -t lvmgs -o command output contains:
[3 23265358 1 05/19/16-16:45:22:164 gschild.c 459]
failure_init: rc=18[HA_GS_SOCK_INIT_FAILED], msg=GSCHILD:ha_gs_init failed
The kernel trace contains:
104 varyonvg 4 46792706 36045151 shmat 0.018430808
0.000377 return from shmat. error EMFILE [2 usec]
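As a rough aid, the two log signatures quoted above can be searched for with grep. The sketch below is not an IBM-provided tool: has_gs_init_failure is a hypothetical helper name, and the sample input is copied from the lvmcfg excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and succeeds if it contains
# either failure signature quoted in the excerpts above.
has_gs_init_failure() {
    grep -E -q 'gsclvmd: exited with rc=18|rc=18\[HA_GS_SOCK_INIT_FAILED\]'
}

# Sample lines copied from the lvmcfg log excerpt above.
if has_gs_init_failure <<'EOF'
[E 23265358 0:005 gschild.c 517] /usr/sbin/gsclvmd: exited with rc=18
[E 30802054 3:003 varyonvg.c 1440] varyonvg: exited with rc=1
EOF
then
    echo "gsclvmd initialization failure found"
else
    echo "no matching failure found"
fi
```

On a live system the same helper could presumably be fed from the logs directly, for example alog -t lvmcfg -o | has_gs_init_failure.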
Resolving The Problem
To avoid this issue, the LVM log sizes should be less than the maximum shared memory segment size of 256MB. Note that after the configured size of an LVM log is changed, the existing log file must be removed so that it is recreated at the new size. For example, these commands would set each LVM log to 20MB:
alog -C -t lvmt -s 20971520
alog -C -t lvmcfg -s 20971520
alog -C -t lvmgs -s 20971520
rm /tmp/lvmt.log
rm /var/adm/ras/lvmcfg.log
rm /tmp/lvmgs.log
When logging starts, fixed-size log files will be created, for example:
# ls -l /tmp/lvmt.log
-rw-r--r-- 1 root system 20971520 Aug 22 10:41 /tmp/lvmt.log
The size of a log file on a system can be checked using the following command:
alog -L -t lvmt
#file:size:verbosity
/tmp/lvmt.log:204800:3
This output shows the /tmp/lvmt.log file at its default size of 204800 bytes.
The default sizes of the lvmcfg and lvmgs logs are 51200 and 204800 bytes respectively.
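The size check can be scripted against the file:size:verbosity format shown above. The following is a minimal sketch under stated assumptions: check_alog_sizes is a hypothetical helper name, the limit is taken as 268435456 bytes (256MB), and the sample input is the alog -L output quoted above.

```shell
#!/bin/sh
LIMIT=268435456   # assumed: 256MB expressed in bytes

# Hypothetical helper: reads "file:size:verbosity" lines on stdin, skips the
# "#" header and blank lines, and flags sizes at or over LIMIT.
check_alog_sizes() {
    while IFS=: read -r file size verbosity; do
        case "$file" in
            ''|'#'*) continue ;;
        esac
        if [ "$size" -lt "$LIMIT" ]; then
            echo "$file: $size bytes, ok"
        else
            echo "$file: $size bytes, too large"
        fi
    done
}

# Sample input copied from the alog -L output shown above.
check_alog_sizes <<'EOF'
#file:size:verbosity
/tmp/lvmt.log:204800:3
EOF
```

On a cluster node, the output of alog -L -t for each of the lvmt, lvmcfg, and lvmgs log types could be piped into the helper in place of the sample input.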
Document Information
Modified date:
17 June 2018
UID
isg3T1024055