Flashes (Alerts)
Abstract
This document provides several items to check to get clstat working.
Content
In order to troubleshoot clstat, you need to narrow down the problem.
First:
With the cluster up and clinfo running:
/usr/es/sbin/cluster/utilities/cldump
If cldump works and clstat does not, the issue is with clstat itself.
That would most likely be a clstat defect, but clstat is considered stable at 5.4.1.1, so this is not expected to be the cause.
If cldump is not working, run the following commands one at a time, completing EACH command on both nodes before running the next one:
stopsrc -s clinfoES
stopsrc -s snmpd
refresh -s clstrmgrES
startsrc -s snmpd
startsrc -s clinfoES
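Because each command has to complete on both nodes before the next one is run, it can help to address the steps by number. The following is a minimal sketch under that assumption; the function name clinfo_step and the DRY_RUN variable are hypothetical helpers, not part of HACMP.

```shell
#!/bin/sh
# Sketch: map a step number to one command of the sequence above.
# clinfo_step and DRY_RUN are illustrative names, not HACMP tools.
clinfo_step() {
    case "$1" in
        1) cmd="stopsrc -s clinfoES" ;;
        2) cmd="stopsrc -s snmpd" ;;
        3) cmd="refresh -s clstrmgrES" ;;
        4) cmd="startsrc -s snmpd" ;;
        5) cmd="startsrc -s clinfoES" ;;
        *) echo "usage: clinfo_step 1..5" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be run.
        echo "$cmd"
    else
        # Word splitting of $cmd is intentional here.
        $cmd
    fi
}
```

Run "clinfo_step 1" on the first node, then on the second node, and only then move on to "clinfo_step 2", and so on through step 5.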
If clstat is still not working, you have to troubleshoot SNMP.
Often one of the other daemons, such as dpid2, is not running, or snmpd.conf
or snmpdv3.conf (depending on the version of the SNMP agent) is not configured correctly.
The following entries are required in the snmpdv3.conf
file for clstat to work correctly.
Substitute the actual community name for "hacmp" in the entries below:
VACM_GROUP group1 SNMPv1 hacmp -
TARGET_PARAMETERS trapparms1 SNMPv1 SNMPv1 hacmp noAuthNoPriv -
COMMUNITY hacmp hacmp noAuthNoPriv 0.0.0.0 0.0.0.0 -
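A quick way to confirm that the three entries are present is to scan the configuration file for them. The sketch below assumes the field positions shown in the entries above; the function name check_clstat_entries is hypothetical, and the check only looks at the keyword and community fields, not at the rest of each line.

```shell
#!/bin/sh
# Sketch: report which of the three required entries are missing.
# $1 = configuration file (for example /etc/snmpdv3.conf), $2 = community name.
check_clstat_entries() {
    awk -v c="$2" '
        $1 == "VACM_GROUP" && $3 == "SNMPv1" && $4 == c { g = 1 }
        $1 == "TARGET_PARAMETERS" && $5 == c            { t = 1 }
        $1 == "COMMUNITY" && $2 == c                    { m = 1 }
        END {
            if (!g) print "missing: VACM_GROUP entry for " c
            if (!t) print "missing: TARGET_PARAMETERS entry for " c
            if (!m) print "missing: COMMUNITY entry for " c
            # Exit status 0 only when all three entries were found.
            exit !(g && t && m)
        }' "$1"
}
```

For example, "check_clstat_entries /etc/snmpdv3.conf hacmp" prints nothing and returns 0 when all three entries exist for the community hacmp. Commented-out entries are not counted, because their first field does not match the keyword.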
Introduction and Documentation for
Simple Network Management Protocol (SNMP)
on AIX
The following information is provided to give you a quick overview
of SNMP on AIX. For a more in-depth understanding of SNMP,
please see the following two resources:
(1) IBM Redbook Managing AIX Server Farms
This is a great book for managing AIX servers. It is normally
recommended to users looking for OpenSSH information, but it
also contains useful information on SNMP.
http://w3.itso.ibm.com/abstracts/sg246606.html?Open
(2) IBM AIX InfoCenter
A quick search for "snmp" will bring up a great deal of information on
the topic. Hopefully, reading the quick docs below will help with
refining the searches in InfoCenter.
http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp
Gather Important System Information
The following information is important should you have the need
to contact IBM regarding your SNMP configuration.
AIX Operating System level
# oslevel -r
AIX Fileset Information for SNMP
The following fileset contains SNMP. The version number of the
fileset will help IBM determine if there are any problems and/or
fixes available to you.
# lslpp -l bos.net.tcp.client
(e.g.) 5.3.0.61
Check the Current Version of SNMP daemon Running on AIX
# ls -l /usr/sbin/snmpd
Examples:
linked to snmpd -> snmpdv3ne <------ for snmpd v3
linked to snmpd -> snmpdv1 <----- for snmpd v1
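The three checks above can be collected into one file before contacting IBM. This is a minimal sketch; the function name gather_snmp_info and the default output path /tmp/snmp_info.out are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: capture the OS level, fileset level and snmpd link in one file.
# $1 = output file (the default path is an arbitrary choice).
gather_snmp_info() {
    out=${1:-/tmp/snmp_info.out}
    : > "$out"
    for cmd in "oslevel -r" \
               "lslpp -l bos.net.tcp.client" \
               "ls -l /usr/sbin/snmpd"
    do
        echo "== $cmd" >> "$out"
        # Word splitting of $cmd is intentional; failures are noted, not fatal.
        $cmd >> "$out" 2>&1 || echo "(command failed or not available)" >> "$out"
    done
}
```

Running "gather_snmp_info" with no argument writes the results to /tmp/snmp_info.out, one "==" header per command.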
Note
Use the /usr/sbin/snmpv3_ssw command to switch between
the two versions of snmpd. V1 uses the configuration file
/etc/snmpd.conf for its settings, and V3 uses the configuration
file /etc/snmpdv3.conf.
snmpv3_ssw Examples:
Use the following command to change from v3 to v1:
# snmpv3_ssw -1
Use the following to switch to the encrypted version of the snmpdv3 agent:
# snmpv3_ssw -e
Use the following to switch to the non-encrypted version of the snmpdv3 agent:
# snmpv3_ssw -n
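Since the active agent determines which configuration file matters, the link target can be mapped to a version and file in one step. The sketch below is an illustration only: the function name snmpd_flavor is hypothetical, and the binary name snmpdv3e for the encrypted agent is an assumption (the text above only shows snmpdv3ne and snmpdv1).

```shell
#!/bin/sh
# Sketch: report the snmpd version and its configuration file from the link target.
# $1 = path of the snmpd link (defaults to /usr/sbin/snmpd).
snmpd_flavor() {
    link=${1:-/usr/sbin/snmpd}
    # The last field of "ls -l" on a symbolic link is the link target.
    target=$(ls -l "$link" 2>/dev/null | awk '{ print $NF }')
    case "${target##*/}" in
        snmpdv1)   echo "v1 /etc/snmpd.conf" ;;
        snmpdv3ne) echo "v3-nonencrypted /etc/snmpdv3.conf" ;;
        snmpdv3e)  echo "v3-encrypted /etc/snmpdv3.conf" ;;  # name assumed
        *)         echo "unknown"; return 1 ;;
    esac
}
```

Calling "snmpd_flavor" with no argument inspects /usr/sbin/snmpd and prints the version followed by the configuration file that version reads.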
ABOUT SNMPD DAEMONS
Management Information Base Daemons (MIBD) contain
information pertinent to network management. Information
contained in the MIBs depends upon the RFC which defines
them. The following RFCs define SNMP:
RFC 1155 Structure and Identification of Management
Information for TCP/IP-based Internets
RFC 1157 A Simple Network Management Protocol (SNMP)
RFC 1213 MIB for Network Management of TCP/IP-based
internets: MIB-II
RFC 1227 SNMP single multiplexer (SMUX) protocol and MIB
RFC 1229 Extensions to the Generic-Interface MIB
RFC 1231 IEEE 802.5 Token Ring MIB
RFC 1398 Definitions of managed objects for Ethernet-like
interfaces
RFC 1512 FDDI Management Information Base (MIB)
RFC 1514 Host Resources Management Information Base
RFC 1592 SNMP Distributed Protocol Interface (DPI) Version 2.0
RFC 1907 MIB for SNMP v2
RFC 2572 Message processing and dispatching for SNMP
RFC 2573 SNMP applications
RFC 2574 User-based Security Model (USM) for SNMP v3
RFC 2575 View-based Access Control Model (VACM)
snmpd
Receives and authenticates SNMP requests from network monitors.
Processes requests and returns results to the originating monitor.
Sends trap notifications to all hosts listed in the configuration file.
dpid2
Acts as a DPI 2.0 to SMUX converter. It allows DPI sub-agents, such as /usr/sbin/hostmibd, to talk with the AIX SNMP version 1 agent. The converter changes DPI2 messages into SMUX protocol messages and vice versa. dpid2 itself is implemented as a SMUX peer. It connects to TCP port 199 of the SMUX server, which is part of the snmpd agent.
To a DPI2 sub-agent (e.g. /usr/sbin/hostmibd), dpid2 behaves as a DPI2 agent. It listens on an arbitrary TCP port for a connection request from a DPI2 sub-agent. The dpid2 daemon registers this port number with the snmpd agent through the MIB variable dpiPortForTCP (1.3.6.1.4.1.2.2.1.1.1). The DPI2 sub-agent learns the port number by sending a get-request query for the dpiPortForTCP.0 (1.3.6.1.4.1.2.2.1.1.1.0) instance to the snmpd agent. Once the DPI2 sub-agent knows the TCP port number on which the DPI2 agent is listening, it tries to connect to it.
hostmibd
Acts as a DPI2 sub-agent that communicates with the DPI2 agent through
dpiPortForTCP.0. Third-party SMUX agents (for example, Oracle) may
need to talk to hostmibd. If dpid2 is not running, hostmibd cannot
do its job, because dpid2 acts as a "translator" that relays
messages between hostmibd and snmpd.
aixmibd
Provides the AIX Enterprise Management Information Base (MIB) extension subagent for use with the Simple Network Management Protocol (SNMP) version 3 agent. It collects data from the system for variables defined in the AIX Enterprise Specific MIB. aixmibd
does not require hostmibd or dpid2; it talks directly with snmpd.
To use aixmibd, you need to modify the /etc/snmpdv3.conf file,
because aixmibd managed MIBs are excluded from the default view.
The entry looks like this before the change:
# exclude aixmibd managed MIBs from the default view
VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - excluded -
Change the "- excluded -" entry to
"- included -"
The new line should look like the following:
VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - included -
Save the file, then stop and restart the daemons.
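The change from "excluded" to "included" can also be previewed with sed before the file is touched. This is a sketch only; the function name include_aixmibd_view is hypothetical, and it writes the edited text to standard output rather than modifying the file, so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch: print the configuration with the aixmibd subtree included
# in the default view. $1 = configuration file; the file itself is not changed.
include_aixmibd_view() {
    sed '/^VACM_VIEW[[:space:]]\{1,\}defaultView[[:space:]]\{1,\}1\.3\.6\.1\.4\.1\.2\.6\.191[[:space:]]/s/excluded/included/' "$1"
}
```

A cautious way to use it is to redirect the output to a new file, compare it with the original using diff, and copy it over /etc/snmpdv3.conf only after keeping a backup.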
Configuration Files and Parameters
- /etc/hostmibd.conf:
Defines the configuration parameters for hostmibd command.
- /etc/mib.defs:
Defines the Management Information Base (MIB) variables
the SNMP agent and manager should recognize and handle
- /etc/aixmibd.conf:
Contains the configuration file for the aixmibd subagent.
- /usr/samples/snmpd/aixmibd_security_readme:
Contains the example configurations for different views and
information about related security issues. Also contains
information describing how to set the variables in
/etc/aixmibd.conf.
- /usr/samples/snmpd/aixmibd.my:
Contains the MIB definitions for the aixmibd subagent.
- /usr/samples/snmpd/hr.my:
Contains more MIB definitions.
- /etc/snmpd.conf:
Configuration file for the snmpd v1 agent. SMUX peer entries are specified here.
- /etc/snmpd.peers:
Specifies the configuration for SMUX peers.
- /etc/clsnmp.conf:
Configuration file for the clsnmp command.
- /etc/snmpdv3.conf:
Defines a sample configuration file for the snmpdv3 agent.
Flags for snmpinfo
Example
# snmpinfo -m dump -c <community_name>
The following flags are commonly used with snmpinfo:
-c community
Specifies the community name. The default
community is public.
-d 1 through 4
Debug level
-h hostname
Host name of the SNMP server. Not required
if the client is the same machine as the server.
-m option
Specifies the mode of operation: get, next,
set and dump.
-o ObjectsFile
Objects definition file. The default is /etc/mib.defs.
-v
Specifies verbose mode. Without this you may
just see the OID, but with it you will see the
description. OID stands for Object ID, and is
assigned by a standards body.
General SNMP Information
This first example uses the next option to retrieve the next MIB variable after system:
# snmpinfo -m next -v system
sysDescr.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 00CEE6FE4C00
Base Operating System Runtime AIX version: 05.03.0000.0040
TCP/IP Client Support version: 05.03.0000.0040"
Without -v, only the numeric OID is shown:
# snmpinfo -m next system
1.3.6.1.2.1.1.1.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 00CEE6FE4C00
Base Operating System Runtime AIX version: 05.03.0000.0040
TCP/IP Client Support version: 05.03.0000.0040"
The next example shows how to use the get option to retrieve both
system and interface information.
# snmpinfo -m get -v sysDescr.0 ifDescr.1
sysDescr.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 00CEE6FE4C00
Base Operating System Runtime AIX version: 05.03.0000.0040
TCP/IP Client Support version: 05.03.0000.0040"
ifDescr.1 = "ent1"
# snmpinfo -c public -m get sysDescr.0
1.3.6.1.2.1.1.1.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 00CEE6FE4C00
Base Operating System Runtime AIX version: 05.03.0000.0040
TCP/IP Client Support version: 05.03.0000.0040"
# snmpinfo -c public -m get ifDescr.1
1.3.6.1.2.1.2.2.1.2.1 = "ent1"
# snmpinfo -v -c public -m get ifDescr.1
ifDescr.1 = "ent1"
The aixProcStatus shows the process list.
# snmpinfo -md -v aixProcStatus
aixProcStatus.1 = 2
aixProcStatus.90132 = 2
aixProcStatus.94318 = 2
aixProcStatus.98504 = 2
aixProcStatus.102508 = 2
The aixProcCPU shows process utilization by PID.
# snmpinfo -md -v aixProcCPU
aixProcCPU.1 = 0
aixProcCPU.90132 = 0
aixProcCPU.94318 = 0
aixProcCPU.639030 = 7
aixProcCPU.643134 = 0
The third example shows how you would use the
dump option to retrieve a series of storage values:
# snmpinfo -v -m dump -c public -h localhost hrStorage
hrMemorySize.0 = 786432
hrStorageIndex.1 = 1
hrStorageIndex.2 = 2
hrStorageIndex.3 = 3
...
hrStorageDescr.1 = "/dev/hd4"
hrStorageDescr.2 = "/dev/hd2"
hrStorageDescr.3 = "/dev/hd9var"
hrStorageDescr.4 = "/dev/hd3"
hrStorageDescr.5 = "/dev/hd1"
hrStorageDescr.6 = "/dev/hd10opt"
hrStorageDescr.7 = "/dev/lv00"
hrStorageDescr.8 = "/dev/lv01"
hrStorageDescr.9 = "/dev/lv02"
hrStorageDescr.10 = "/dev/lv03"
hrStorageDescr.11 = "/dev/lv04"
hrStorageDescr.12 = "/dev/fslv00"
hrStorageDescr.13 = "/dev/fslv01"
hrStorageDescr.14 = "/dev/hd6"
hrStorageDescr.15 = "System RAM"
hrStorageAllocationUnits.1 = 4096
hrStorageAllocationUnits.2 = 4096
...
hrStorageSize.1 = 16384
hrStorageSize.2 = 720896
hrStorageSize.3 = 16384
hrStorageSize.4 = 16384
hrStorageSize.5 = 65536
hrStorageSize.6 = 81920
hrStorageSize.7 = 1294336
...
hrStorageUsed.1 = 4427
hrStorageUsed.2 = 681560
hrStorageUsed.3 = 4052
...
hrStorageAllocationFailures.1 = 0
hrStorageAllocationFailures.2 = 0
hrStorageAllocationFailures.3 = 0
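The dump above lists sizes and usage separately, so a percentage has to be derived by pairing hrStorageUsed with hrStorageSize for the same index. Both are counted in allocation units, so their ratio needs no unit conversion. The following is a minimal sketch that assumes the verbose (-v) output format shown above; the function name storage_pct is hypothetical.

```shell
#!/bin/sh
# Sketch: read "snmpinfo -v -m dump ... hrStorage" output on stdin and print
# "index description percent-used" for every entry with a non-zero size.
storage_pct() {
    awk -F' = ' '
        /^hrStorageDescr\./ { i = $1; sub(/^[^.]*[.]/, "", i)
                              d = $2; gsub(/"/, "", d); descr[i] = d }
        /^hrStorageSize\./  { i = $1; sub(/^[^.]*[.]/, "", i); size[i] = $2 }
        /^hrStorageUsed\./  { i = $1; sub(/^[^.]*[.]/, "", i); used[i] = $2 }
        END {
            for (i in used)
                if (size[i] > 0)
                    printf "%s %s %.1f\n", i, descr[i], 100 * used[i] / size[i]
        }' | sort -n
}
```

It would typically be fed directly from the command shown above, for example "snmpinfo -v -m dump -c public -h localhost hrStorage | storage_pct".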
Performance Information
In this section, we get to the issue of how to
get performance information back from SNMP.
The first example illustrates how you can use the
hostmibd style data to look at the percentage
utilization of the individual processors. This example
was run on a shared processor LPAR with two virtual
processors and a total of four logical (SMT) processors.
The load was balanced across multiple threads. The example uses
the dump option so that all of the processors are returned.
Note that you can use -md instead of -m dump.
# snmpinfo -v -m dump -c public hrProcessorLoad
hrProcessorLoad.1 = 20
hrProcessorLoad.2 = 20
hrProcessorLoad.3 = 19
hrProcessorLoad.4 = 19
# snmpinfo -md hrProcessorLoad
1.3.6.1.2.1.25.3.3.1.2.1 = 1
1.3.6.1.2.1.25.3.3.1.2.2 = 1
1.3.6.1.2.1.25.3.3.1.2.3 = 2
1.3.6.1.2.1.25.3.3.1.2.4 = 0
One at a time:
# snmpinfo -v -m get hrProcessorLoad.1
hrProcessorLoad.1 = 25
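When a single figure is more useful than one value per logical processor, the dump output can be averaged. This is a sketch under the assumption that every input line has the "name = value" form shown above; the function name avg_processor_load is hypothetical.

```shell
#!/bin/sh
# Sketch: average the hrProcessorLoad values read from stdin.
# Works with or without -v, since only the value after " = " is used.
avg_processor_load() {
    awk -F' = ' '
        NF == 2 { sum += $2; n++ }
        END {
            if (n) { printf "%.1f\n", sum / n }
            else   { exit 1 }   # no values seen
        }'
}
```

For example, "snmpinfo -md -c public hrProcessorLoad | avg_processor_load" prints the mean load across all processors returned by the dump.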
How to Stop the SNMP Daemons on AIX (snmpd v1)
Once you have finished configuring your snmpd.conf file, stop the
daemons in the following order. Once the daemons are stopped,
register the new community name with hostmibd using the
"chssys" command and restart them in the reverse order.
# stopsrc -s dpid2
# stopsrc -s hostmibd
# stopsrc -s snmpd
REGISTER the COMMUNITY Name to hostmibd
Register the community name used in the snmpd.conf file with the
ODM. Doing so eliminates the need to supply the
community name every time in the startsrc command.
Note that there is no need to register with dpid2, since it does not
allow mapping to any community other than "public".
# chssys -s hostmibd -a "-c <community_name>"
How to Start the SNMP Daemons
The daemons can be started in any order as long as we start
"snmpd" FIRST.
# startsrc -s snmpd
# startsrc -s hostmibd
# startsrc -s dpid2
How to Stop the SNMP Daemons on AIX (snmpd v3)
Once you have finished configuring your snmpdv3.conf file, stop the daemons
in the following order. Once the daemons are stopped, register the
new community name with them using the "chssys" command and restart
them with snmpd first.
# stopsrc -s hostmibd
# stopsrc -s snmpmibd
# stopsrc -s aixmibd
# stopsrc -s snmpd
REGISTER the COMMUNITY Name to the Daemons
Register the community name used in the snmpdv3.conf file with the ODM.
This is done so that the community name does not need to be supplied
every time with startsrc.
# chssys -s hostmibd -a "-c <community_name>"
# chssys -s snmpmibd -a "-c <community_name>"
# chssys -s aixmibd -a "-c <community_name>"
How to Start the SNMP Daemons
We found that they can be started in any order as long as we start
"snmpd" FIRST.
# startsrc -s snmpd
# startsrc -s hostmibd
# startsrc -s aixmibd
# startsrc -s snmpmibd
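The stop, register and start steps for the v3 agent can be strung together so that snmpd is always stopped last and started first. The sketch below is illustrative only: the names snmp_run and cycle_snmpd_v3 and the DRY_RUN variable are hypothetical, and the subagents are started in the same order they were stopped, which relies on the observation above that any order works once snmpd is up.

```shell
#!/bin/sh
# Sketch: print (DRY_RUN=1) or execute one command.
snmp_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Sketch: stop the v3 subagents and snmpd, register the community name,
# then restart with snmpd first. $1 = community name.
cycle_snmpd_v3() {
    community=$1
    subagents="hostmibd snmpmibd aixmibd"
    for d in $subagents; do snmp_run stopsrc -s "$d"; done
    snmp_run stopsrc -s snmpd
    for d in $subagents; do snmp_run chssys -s "$d" -a "-c $community"; done
    snmp_run startsrc -s snmpd
    for d in $subagents; do snmp_run startsrc -s "$d"; done
}
```

Setting DRY_RUN=1 and running "cycle_snmpd_v3 <community_name>" prints the eleven commands without executing them, which is a safe way to review the sequence first.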
Check process table
# ps -ef | grep snmpmibd
root 426052 139418 0 15:36:54 - 0:00 /usr/sbin/snmpmibd -c <community_name>
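The same process-table check can be extended to all of the daemons at once. This is a minimal sketch that assumes each daemon appears in the ps output with a full path ending in its name, as in the snmpmibd line above; the function name missing_daemons is hypothetical.

```shell
#!/bin/sh
# Sketch: read "ps -ef" output on stdin and print every daemon named
# in the arguments that does not appear as a path ending in "/<name>".
missing_daemons() {
    awk -v want="$*" '
        BEGIN { n = split(want, w, " ") }
        {
            for (i = 1; i <= n; i++)
                for (f = 1; f <= NF; f++)
                    if ($f ~ ("/" w[i] "$")) seen[w[i]] = 1
        }
        END { for (i = 1; i <= n; i++) if (!(w[i] in seen)) print w[i] }'
}
```

For the v3 agent, "ps -ef | missing_daemons snmpd hostmibd snmpmibd aixmibd" prints nothing when all four daemons are running.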
VIEW RESULTS
Use the "snmpinfo" command to view results of the snmpd queries.
# snmpinfo -m dump -v -c comname > /tmp/snmp.out
The output will be in the /tmp/snmp.out file.
If SNMP is working, you may want to run a trace on clinfo.
The following is from Appendix C of the PowerHA 5.4.1 Troubleshooting Guide.
Enabling Tracing on SRC-controlled Daemons
To enable tracing on the following SRC-controlled daemons (clstrmgrES or clinfoES):
1. Enter: smit hacmp
2. Select Problem Determination Tools > HACMP Trace Facility and press Enter.
3. Select Enable/Disable Tracing of HACMP for AIX Daemons and press Enter.
4. Select Start Trace and press Enter. SMIT displays the Start Trace panel. Note that you only
use this panel to enable tracing, not to actually start a trace session. It indicates that you
want events related to this particular daemon captured the next time you start a trace
session. See Starting a Trace Session for more information.
5. Enter the PID of the daemon whose trace data you want to capture in the Subsystem
PROCESS ID field. Press F4 to see a list of all processes and their PIDs. Select the daemon
and press Enter. Note that you can select only one daemon at a time. Repeat these steps for
each additional daemon that you want to trace.
Disabling Tracing on SRC-controlled Daemons
To disable tracing on the clstrmgrES or clinfoES daemons:
1. Enter: smit hacmp
2. Select Problem Determination Tools > HACMP Trace Facility > Enable/Disable
Tracing of HACMP for AIX Daemons > Stop Trace. SMIT displays the Stop Trace
panel. Note that you only use this panel to disable tracing, not to actually stop a trace
session. It indicates that you do not want events related to this particular daemon captured
the next time you run a trace session.
3. Enter the PID of the process for which you want to disable tracing in the Subsystem
PROCESS ID field. Press F4 to see a list of all processes and their PIDs. Select the process
for which you want to disable tracing and press Enter. Note that you can disable only one
daemon at a time. To disable more than one daemon, repeat these steps.
4. Press Enter to disable the trace. SMIT displays a panel that indicates that tracing for the
specified daemon has been disabled.
Starting a Trace Session
Starting a trace session triggers the actual recording of data on system events into the system
trace log from which you can later generate a report.
Remember, you can start a trace on the clstrmgrES and clinfoES daemons only if you have
previously enabled tracing for them.
To start a trace session:
1. Enter: smit hacmp
2. Select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report
Tracing of HACMP for AIX Services > Start Trace. SMIT displays the Start Trace
panel.
3. Enter the trace IDs of the daemons that you want to trace in the ADDITIONAL event IDs
to trace field.
Press F4 to see a list of the trace IDs. (Press Ctrl-v to scroll through the list.) Move the
cursor to the first daemon whose events you want to trace and press F7 to select it. Repeat
this process for each event that you want to trace. When you are done, press Enter. The
values that you selected are displayed in the ADDITIONAL event IDs to trace field. The
HACMP daemons have the following trace IDs:
clstrmgrES 910
clinfoES 911
4. Enter values as necessary into the remaining fields and press Enter. SMIT displays a panel
that indicates that the trace session has started.
Stopping a Trace Session
You need to stop a trace session before you can generate a trace report. A trace session ends
when you actively stop it or when the log file is full.
To stop a trace session:
1. Enter: smit hacmp
2. In SMIT, select Problem Determination Tools > HACMP Trace Facility >
Start/Stop/Report Tracing of HACMP for AIX Services > Stop Trace. SMIT displays
the Command Status panel, indicating that the trace session has stopped.
Generating a Trace Report
A trace report formats the information stored in the trace log file and displays it in a readable
form. The report displays text and data for each event according to the rules provided in the
trace format file.
When you generate a report, you can specify:
• Events to include (or omit)
• The format of the report.
To generate a trace report:
1. Enter: smit hacmp
2. In SMIT, select Problem Determination Tools > HACMP Trace Facility >
Start/Stop/Report Tracing of HACMP for AIX Services > Generate a Trace Report.
A dialog box prompts you for a destination, either a filename or a printer.
3. Indicate the destination and press Enter. SMIT displays the Generate a Trace Report panel.
4. Enter the trace IDs of the daemons whose events you want to include in the report in the
IDs of events to INCLUDE in Report field.
5. Press F4 to see a list of the trace IDs. (Press Ctrl-v to scroll through the list.) Move the
cursor to the first daemon whose events you want to include in the report and press F7 to
select it. Repeat this procedure for each event that you want to include in the report. When
you are done, press Enter. The values that you selected are displayed in the IDs of events
to INCLUDE in Report field. The HACMP daemons have the following trace IDs:
clstrmgrES 910
clinfoES 911
6. Enter values as necessary in the remaining fields and press Enter.
7. When the information is complete, press Enter to generate the report. The output is sent to
the specified destination. For an example of a trace report, see the following Sample Trace
Report section.
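The SMIT panels for starting, stopping and reporting a trace session are front ends to the AIX trace facility, so the session can probably also be driven from the command line. The sketch below is an assumption, not taken from the Troubleshooting Guide: it uses the standard AIX trace, trcstop and trcrpt commands with the two trace IDs listed earlier, the function name hacmp_trace and the default report path are hypothetical, and tracing must still be enabled for the daemon through SMIT beforehand.

```shell
#!/bin/sh
# Sketch: start, stop or report a trace limited to the HACMP trace IDs.
# Set DRY_RUN=1 to print the command instead of running it.
hacmp_trace() {
    ids="910,911"   # clstrmgrES and clinfoES, per the list in the text
    case "$1" in
        start)  set -- trace -a -j "$ids" ;;
        stop)   set -- trcstop ;;
        report) set -- trcrpt -d "$ids" -o "${2:-/tmp/hacmp_trace.rpt}" ;;
        *) echo "usage: hacmp_trace start|stop|report [file]" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}
```

A typical session would be "hacmp_trace start", then reproducing the clstat problem, then "hacmp_trace stop" followed by "hacmp_trace report".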
Document Information
Modified date:
25 September 2022
UID
isg3T1011362