Determining the health of the integrated SMB server

IBM Storage Scale provides the following commands for determining the health of the integrated SMB server and its services:
  • To check the overall CES cluster state, issue the following command:
    mmlscluster --ces
    The system displays output similar to this:
    GPFS cluster information
    ========================
      GPFS cluster name:         boris.nsd001st001
      GPFS cluster id:           3992680047366063927
    
    Cluster Export Services global parameters
    -----------------------------------------
      Shared root directory:                /gpfs/fs0
      Enabled Services:                     NFS SMB
      Log level:                            2
      Address distribution policy:          even-coverage
    
     Node  Daemon node name            IP address       CES IP address list
    -----------------------------------------------------------------------
       4   prt001st001                 172.31.132.1     10.18.24.25 10.18.24.32 10.18.24.34 10.18.24.36 9.11.102.89
       5   prt002st001                 172.31.132.2     9.11.102.90 10.18.24.19 10.18.24.21 10.18.24.23 10.18.24.30
       6   prt003st001                 172.31.132.3     10.18.24.38 10.18.24.39 10.18.24.41 10.18.24.42 9.11.102.43
       7   prt004st001                 172.31.132.4     9.11.102.37 10.18.24.26 10.18.24.28 10.18.24.18 10.18.24.44
       8   prt005st001                 172.31.132.5     9.11.102.36 10.18.24.17 10.18.24.33 10.18.24.35 10.18.24.37
       9   prt006st001                 172.31.132.6     9.11.102.41 10.18.24.24 10.18.24.20 10.18.24.22 10.18.24.40
      10   prt007st001                 172.31.132.7     9.11.102.42 10.18.24.31 10.18.24.27 10.18.24.29 10.18.24.43
    

    This output shows at a glance whether any nodes have failed and whether they host public IP addresses. For successful SMB operation, at least one CES node must be HEALTHY and must host at least one IP address.
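
The IP address condition above can be checked mechanically. The following is a minimal sketch, not an official tool: the function name count_nodes_with_ces_ip is hypothetical, and the parsing assumes the column layout of the sample output (node number, daemon node name, IP address, then the CES IP address list). Other releases or configurations might print additional columns, in which case the field count would need adjusting.

```shell
# Count CES nodes that host at least one CES IP address.
# Reads `mmlscluster --ces` output on stdin.
# Assumption: node rows start with a numeric node number and list
# CES IP addresses from the fourth field onward, as in the sample.
count_nodes_with_ces_ip() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 4 { n++ } END { print n + 0 }'
}
```

For example, `mmlscluster --ces | count_nodes_with_ces_ip` prints the number of nodes that host an address. A result of 0 means that no CES node currently hosts a public IP address.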

  • To show which services are enabled, issue the following command:
    mmces service list
    The system displays output similar to this:
    Enabled services: NFS SMB
    NFS is running, SMB is running

    For successful SMB operation, SMB needs to be enabled and running.
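
This condition can be turned into a pass or fail check. The sketch below is illustrative only: the function name smb_enabled_and_running is hypothetical, and the patterns assume the exact wording of the sample output ("Enabled services: ..." and "SMB is running").

```shell
# Succeed (exit status 0) only if SMB is both enabled and running.
# Reads `mmces service list` output on stdin.
# Assumption: the output uses the wording shown in the sample above.
smb_enabled_and_running() (
    out=$(cat)
    printf '%s\n' "$out" | grep -q '^Enabled services:.* SMB' &&
    printf '%s\n' "$out" | grep -q 'SMB is running'
)
```

For example, `mmces service list | smb_enabled_and_running || echo "SMB is not enabled and running"`.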

  • To determine the overall health state of SMB on all CES nodes, issue the following command:
    mmces state show SMB -a
    The system displays output similar to this:
    NODE        SMB
    prt001st001 HEALTHY
    prt002st001 HEALTHY
    prt003st001 HEALTHY
    prt004st001 HEALTHY
    prt005st001 HEALTHY
    prt006st001 HEALTHY
    prt007st001 HEALTHY
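
On clusters with many CES nodes, it can help to list only the nodes that need attention. The following sketch filters the output above. The function name unhealthy_smb_nodes is hypothetical, and the parsing assumes a header line that starts with NODE, followed by one "node state" row per CES node, as in the sample.

```shell
# Print the names of CES nodes whose SMB state is not HEALTHY.
# Reads `mmces state show SMB -a` output on stdin.
# Assumption: the first field is the node name and the second field
# is the SMB state, as in the sample above.
unhealthy_smb_nodes() {
    awk '$1 != "NODE" && NF >= 2 && $2 != "HEALTHY" { print $1 }'
}
```

For example, `mmces state show SMB -a | unhealthy_smb_nodes` prints nothing when every node reports HEALTHY.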
    
  • To show the reason for a currently active (failed) state on all nodes, issue the following command:
    mmces events active SMB -a
    The system displays output similar to this:
    NODE COMPONENT  EVENT NAME SEVERITY   DETAILS

    In this case, nothing is listed because all nodes are healthy and there are no active events. If a node were unhealthy, the output would look similar to this:

    NODE        COMPONENT EVENT NAME SEVERITY   DETAILS
    prt001st001 SMB       ctdb_down  ERROR      CTDB process not running
    prt001st001 SMB       smbd_down  ERROR      SMBD process not running
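
To reduce the active event list to a short summary of errors, the output can be filtered by severity. This is a minimal sketch under stated assumptions: the function name active_smb_errors is hypothetical, and the parsing assumes that the node, component, event name, and severity are the first four whitespace-separated fields, as in the sample, and that event names contain no spaces.

```shell
# Print "<node>: <event name>" for each active event with severity ERROR.
# Reads `mmces events active SMB -a` output on stdin.
# Assumption: fields are NODE, COMPONENT, EVENT NAME, SEVERITY, DETAILS,
# with the event name written as a single word.
active_smb_errors() {
    awk '$1 != "NODE" && $4 == "ERROR" { print $1 ": " $3 }'
}
```

For example, `mmces events active SMB -a | active_smb_errors`.
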
  • To show the history of events generated by the monitoring framework, issue the following command:
    mmces events list SMB
    The system displays output similar to this:
    NODE        TIMESTAMP                           EVENT NAME     SEVERITY  DETAILS
    prt001st001 2015-05-27 14:15:48.540577+07:07MST smbd_up        INFO      SMBD process now running
    prt001st001 2015-05-27 14:16:03.572012+07:07MST smbport_up     INFO      SMB port 445 is now active
    prt001st001 2015-05-27 14:28:19.306654+07:07MST ctdb_recovery  WARNING   CTDB Recovery detected
    prt001st001 2015-05-27 14:28:34.329090+07:07MST ctdb_recovered INFO      CTDB Recovery finished
    prt001st001 2015-05-27 14:33:06.002599+07:07MST ctdb_recovery  WARNING   CTDB Recovery detected
    prt001st001 2015-05-27 14:33:19.619583+07:07MST ctdb_recovered INFO      CTDB Recovery finished
    prt001st001 2015-05-27 14:43:50.331985+07:07MST ctdb_recovery  WARNING   CTDB Recovery detected
    prt001st001 2015-05-27 14:44:20.285768+07:07MST ctdb_recovered INFO      CTDB Recovery finished
    prt001st001 2015-05-27 15:06:07.302641+07:07MST ctdb_recovery  WARNING   CTDB Recovery detected
    prt001st001 2015-05-27 15:06:21.609064+07:07MST ctdb_recovered INFO      CTDB Recovery finished
    prt001st001 2015-05-27 22:19:31.773404+07:07MST ctdb_recovery  WARNING   CTDB Recovery detected
    prt001st001 2015-05-27 22:19:46.839876+07:07MST ctdb_recovered INFO      CTDB Recovery finished
    prt001st001 2015-05-27 22:22:47.346001+07:07MST ctdb_recovery  WARNING   CTDB Recovery detected
    prt001st001 2015-05-27 22:23:02.050512+07:07MST ctdb_recovered INFO      CTDB Recovery finished
  • To retrieve the monitoring state from the health monitoring component, issue the following command:
    mmces state show
    The system displays output similar to this:
    NODE        AUTH     NETWORK NFS     OBJECT   SMB      CES
    prt001st001 DISABLED HEALTHY HEALTHY DISABLED DISABLED HEALTHY
  • To check the monitor log, issue the following command:
    grep smb /var/adm/ras/mmsysmonitor.log | head -n 10
    The system displays output similar to this:
    2016-04-27T03:37:12.2 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:37:27.2 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:37:42.3 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:37:57.2 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:38:12.4 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:38:27.2 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:38:42.5 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:38:57.2 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:39:12.2 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
    2016-04-27T03:39:27.6 prt2st1 I Monitor smb service LocalState:HEALTHY Events:0 Entities:0 - Service.monitor:596
  • The following logs can also be checked:
    /var/adm/ras/*
    /var/log/messages
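
Because the monitor log repeats a HEALTHY line at every monitoring interval, it is often more useful to show only the entries that report a different state. The following sketch does that. The function name smb_monitor_problems is hypothetical, and it assumes that non-healthy entries use the same "Monitor smb service LocalState:..." line format as the sample with a different LocalState value, which the sample output does not confirm.

```shell
# Print SMB monitor log entries whose LocalState is not HEALTHY.
# Reads the content of the monitor log on stdin.
# Assumption: every SMB monitor entry contains "Monitor smb service"
# followed by "LocalState:<state>", as in the sample above.
smb_monitor_problems() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep 'Monitor smb service' | grep -v 'LocalState:HEALTHY' || true
}
```

For example, `smb_monitor_problems < /var/adm/ras/mmsysmonitor.log`. Empty output means that every recorded SMB monitor entry reported HEALTHY.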