Retrieving platform state information with SNMP
This task describes how to retrieve Cloud Pak for Data System hardware and software registries, open and closed issues, and events by using SNMP. The Net-SNMP commands snmpget, snmptable, and snmpwalk are used to retrieve the Cloud Pak for Data System state.
Before you begin
- snmpd.service must be up and running on the system.
- You need credentials that allow you to communicate with snmpd.service.
- You need to start the SNMP sub-agent, which is responsible for responding to requests.
Procedure
- Verify that snmpd.service is up and running on Cloud Pak for Data System:
      [apuser@e1n1 root]$ service snmpd status
      Redirecting to /bin/systemctl status snmpd.service
      ● snmpd.service - Simple Network Management Protocol (SNMP) Daemon.
         Loaded: loaded (/usr/lib/systemd/system/snmpd.service; enabled; vendor preset: disabled)
         Active: active (running) since Fri 2019-07-12 11:29:22 UTC; 3 weeks 4 days ago
       Main PID: 172815 (snmpd)
          Tasks: 1
         Memory: 11.3M
         CGroup: /system.slice/snmpd.service
                 └─172815 /usr/sbin/snmpd -LS0-6d -f
- Get the credentials that allow you to communicate with snmpd.service:
  Note: Superuser access is required for this step.
  - From the output of the service snmpd status command in the previous step, you can see that snmpd was started with the default configuration file, /etc/snmp/snmpd.conf (172815 /usr/sbin/snmpd -LS0-6d -f). For details, see the CONFIGURATION FILES section of the snmpd man page.
    If a custom configuration file were used for the service, the line /usr/sbin/snmpd -LS0-6d -f would also contain the options -C -c <some-path>. The -C option means: do not read any configuration files except the ones optionally specified by the -c option. For example, -C -c /root/.snmp/snmpd.conf would mean that the service reads its configuration from the file located in /root/.snmp/ and ignores the one located in /etc/snmp/.
  - Find the configuration file that is used on your system and search it for lines that begin with rocommunity. Such a line defines a community string that allows you to send requests to snmpd.service:
        [root@e1n1 ~]# grep -m 1 rocommunity /etc/snmp/snmpd.conf
        rocommunity ****** <some_ip> default
    The community string is intentionally masked in this snippet.
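The credential lookup in this step can also be scripted. The following is a minimal sketch and not part of the product: the function names snmpd_conf_path and read_community are hypothetical, and it assumes that the community string is the second whitespace-separated field of the first rocommunity line, as in the grep output above.

```shell
#!/bin/sh
# Hypothetical helpers, shown for illustration only.

# Print the configuration file named on an snmpd command line.
# $1: the command line as shown by "service snmpd status",
#     for example "/usr/sbin/snmpd -LS0-6d -f".
# Assumption: at most one file is passed with -c. Without -C, snmpd also
# reads its default files; this sketch only reports the file named by -c.
snmpd_conf_path() {
    conf=/etc/snmp/snmpd.conf   # default location named in this task
    prev=
    for word in $1; do
        if [ "$prev" = "-c" ]; then
            conf=$word
        fi
        prev=$word
    done
    printf '%s\n' "$conf"
}

# Print the community string of the first rocommunity line in a file.
# $1: path to the snmpd configuration file.
# Matching on the first field skips commented-out lines such as
# "#rocommunity ...", which a plain grep would also match.
read_community() {
    awk '$1 == "rocommunity" { print $2; exit }' "$1"
}

# Example (address:port is a placeholder, as elsewhere in this task):
#   conf=$(snmpd_conf_path "/usr/sbin/snmpd -LS0-6d -f")
#   snmpwalk -v2c -c "$(read_community "$conf")" address:port IBM-GTv2-MIB::openIssuesTable
```

Reading the configuration file still requires superuser access, so the helpers are meant to be run as root, like the grep command in the step above.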
- Start the SNMP sub-agent, which is responsible for responding to requests:
  Inventory retrieval is provided by the SNMP sub-agent called magneto-snmp-agent.service, which must be up and running on Cloud Pak for Data System. The main daemon, snmpd.service, delegates SNMP requests for the OIDs defined in the IBM-GTv2-MIB module to the sub-agent. Use the apsnmpagent utility to manage the state of the sub-agent. Note that the utility can switch on and off not only the sub-agent but also the main snmpd.service. Use it carefully to avoid stopping snmpd.service by mistake.
  To enable and start snmpd.service after using the apsnmpagent command, run the command with the optional argument --snmpd_only or -s as follows:
      apsnmpagent off && apsnmpagent on --snmpd_only
  In the above example, the first command stops and disables the services magneto-snmp-agent and snmpd, and the second command enables and starts snmpd.
  Called with the argument state, apsnmpagent checks whether magneto-snmp-agent is enabled on all nodes of hadomain1. If the state is inconsistent, a corresponding message is printed. apsnmpagent acts on all active nodes of hadomain1.
- Use the Net-SNMP snmptable application to retrieve the tables defined in IBM-GTv2-MIB.txt, which is located in /usr/share/snmp/mibs. You can use snmptranslate -Tp -IR IBM-GTv2-MIB::iias to see its structure as a tree. A node called applianceTables contains the defined tables, which can be used to retrieve the system state that the ap command provides with different arguments. The following table shows the mapping:
      |  +--moduleTables(2)
      |     +--hardwareTable(11)     -> ap hw -d
      |     +--softwareTable(21)     -> ap sw -d
      |     +--openIssuesTable(31)   -> ap issues -i
      |     +--closedIssuesTable(41) -> ap issues -c
      |     +--eventsTable(51)       -> ap issues -e
      |     +--nodesTable(61)        -> ap node -d
      |     +--sharedFSTable(71)     -> ap df (Shared filesystem utilization section)
      |     +--localFSTable(81)      -> ap df (Node local files systems utilization section)
      |     +--gpfsTable(91)         -> ap fs (GPFS filesystems section)
      |     +--mountsTable(101)      -> ap fs (Mounts section)
  Examples:
  Using snmpget or snmpwalk with hwInventoryTable, you can get the details column, where useful information is gathered in a comma-separated format. For example, you can find the average power consumption of a node:
      [root@node0101 ~]# snmpwalk -v2c -c $(grep rocommunity -m 1 /etc/snmp/snmpd.conf | cut -d' ' -f2) address:port IBM-GTv2-MIB::hwUnitDetails.\"hadomain1\".\"node1\".\"\"
      IBM-GTv2-MIB::hwUnitDetails."hadomain1"."node1"."" = STRING: cpu_clock_exp:3325.0MHz,inlet_temp_celsius:21,power:on,unrecoverable_events:0,cpu_clock_avg:3325.0MHz,
      led:on,base_temp1_celsius:28,cpu_clock_tuned:active,cpu_smt_config:SMT=4,memsize:512GB,cpu_cores_enab:24,
      position:P1,avgpwr:650 Watts,base_temp3_celsius:33,base_temp2_celsius:35
  In the following example, information from ap issues is retrieved:
      [root@e1n1 ~]# ap issues
      Open alerts (issues) and unacknowledged events
      +------+---------------------+--------------------+-----------------------------------------------------+----------------------+----------+--------------+
      | ID   | Date (CEST)         | Type               | Reason Code and Title                               | Target               | Severity | Acknowledged |
      +------+---------------------+--------------------+-----------------------------------------------------+----------------------+----------+--------------+
      | 1002 | 2021-10-08 04:42:42 | SW_NEEDS_ATTENTION | 451: Webconsole service is not ready                | sw://webconsole      | WARNING  | N/A          |
      | 1018 | 2021-10-08 13:12:29 | SW_NEEDS_ATTENTION | 436: Failed to collect status from resource manager | node@hw://e2n4.fbond | MAJOR    | N/A          |
      +------+---------------------+--------------------+-----------------------------------------------------+----------------------+----------+--------------+
      Generated: 2021-10-13 13:05:37
  You can use snmpget, snmpwalk, and snmptable to get the same information as above:
      [root@e1n1 ~]# snmpget -c$(grep -m1 rocommunity /etc/snmp/snmpd.conf | awk '{ print $2 }') -v2c address:port IBM-GTv2-MIB::issueDate.1018
      IBM-GTv2-MIB::issueDate.1018 = STRING: 2021-10-08 13:12:29
      [root@e1n1 ~]# snmpget -c$(grep -m1 rocommunity /etc/snmp/snmpd.conf | awk '{ print $2 }') -v2c address:port IBM-GTv2-MIB::issueTarget.1018
      IBM-GTv2-MIB::issueTarget.1018 = STRING: node@hw://e2n4.fbond
      [root@e1n1 ~]# snmpget -c$(grep -m1 rocommunity /etc/snmp/snmpd.conf | awk '{ print $2 }') -v2c address:port IBM-GTv2-MIB::issueSeverity.1018
      IBM-GTv2-MIB::issueSeverity.1018 = STRING: MAJOR

      [root@e1n1 ~]# snmpwalk -c$(grep -m1 rocommunity /etc/snmp/snmpd.conf | awk '{ print $2 }') -v2c address:port IBM-GTv2-MIB::openIssuesTable
      IBM-GTv2-MIB::issueDate.1002 = STRING: 2021-10-08 04:42:42
      IBM-GTv2-MIB::issueDate.1018 = STRING: 2021-10-08 13:12:29
      IBM-GTv2-MIB::issueType.1002 = STRING: SW_NEEDS_ATTENTION
      IBM-GTv2-MIB::issueType.1018 = STRING: SW_NEEDS_ATTENTION
      IBM-GTv2-MIB::issueReasonCode.1002 = Gauge32: 451
      IBM-GTv2-MIB::issueReasonCode.1018 = Gauge32: 436
      IBM-GTv2-MIB::issueTitle.1002 = STRING: Webconsole service is not ready
      IBM-GTv2-MIB::issueTitle.1018 = STRING: Failed to collect status from resource manager
      IBM-GTv2-MIB::issueTarget.1002 = STRING: sw://webconsole
      IBM-GTv2-MIB::issueTarget.1018 = STRING: node@hw://e2n4.fbond
      IBM-GTv2-MIB::issueSeverity.1002 = STRING: WARNING
      IBM-GTv2-MIB::issueSeverity.1018 = STRING: MAJOR
      IBM-GTv2-MIB::issueAcknowledged.1002 = STRING: N/A
      IBM-GTv2-MIB::issueAcknowledged.1018 = STRING: N/A

      [root@e1n1 ~]# snmptable -Ci -c$(grep -m1 rocommunity /etc/snmp/snmpd.conf | awk '{ print $2 }') -v2c address:port IBM-GTv2-MIB::openIssuesTable
      SNMP table: IBM-GTv2-MIB::openIssuesTable

      index           issueDate          issueType issueReasonCode                                     issueTitle          issueTarget issueSeverity issueAcknowledged
       1002 2021-10-08 04:42:42 SW_NEEDS_ATTENTION             451                Webconsole service is not ready      sw://webconsole       WARNING               N/A
       1018 2021-10-08 13:12:29 SW_NEEDS_ATTENTION             436 Failed to collect status from resource manager node@hw://e2n4.fbond         MAJOR               N/A
  Tip: It might be convenient to use the optional -Cl argument of the snmptable command to left-justify the output.
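
Because the details column packs many key:value pairs into one string, a small filter can make it easier to pick out a single value such as avgpwr. The sketch below is illustrative only: detail_value is a hypothetical helper name, and it assumes that the value arrives on a single line, that pairs are separated by commas, and that keys contain no colons.

```shell
#!/bin/sh
# Hypothetical helper, shown for illustration only.

# Print the value stored under one key of a details string.
# $1: either the bare details string or a full snmpget/snmpwalk output
#     line of the form "<OID> = STRING: key:value,key:value,..."
# $2: the key to look up, for example avgpwr
# Prints nothing if the key is not present.
detail_value() {
    printf '%s\n' "$1" |
        sed 's/^.*STRING: //' |   # drop the "<OID> = STRING: " prefix, if any
        tr ',' '\n' |             # one key:value pair per line
        sed 's/^ *//' |           # trim spaces left over from wrapped output
        awk -F: -v key="$2" '$1 == key { sub(/^[^:]*:/, ""); print; exit }'
}

# Example with a shortened version of the output shown above:
#   line='IBM-GTv2-MIB::hwUnitDetails."hadomain1"."node1"."" = STRING: power:on,avgpwr:650 Watts'
#   detail_value "$line" avgpwr
```

Only the text up to the first colon of each pair is treated as the key, so values that themselves contain punctuation, such as SMT=4, are returned unchanged.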