SMI-S provider connectivity issues

Follow these instructions if you encounter performance issues or failures when performing actions that involve EMC VMAX storage arrays, including the SMI-S provider that proxies requests to the array.

Issue 1: Connections are exhausted

This problem can occur when active workloads reach the total or per-host external connection limits that are configured for the SMI-S provider (ECOM).

Investigation

Verify the environment's demands on the EMC VMAX storage array.

Example
  1. Check the ECOM server by using TestSmiProvider. In the following example, the output indicates that ECOM is not responding to queries: /opt/emc/ECIM/ECOM/bin/TestSmiProvider
    ########################################################################
    Built with EMC SMI-S Provider: V4.6.2
    Namespace: root/emc
    repeat count: 1
    (localhost:5988) ? ein
    Class: Symm_StorageVolume
    ++++ Testing EnumerationInstanceNames: Symm_StorageVolume ++++
    Error: Connection closed by CIM Server.
    
    Retrieve and Display data - 1 Iteration(s) In 0.004082 Seconds
  2. Check the cimom log for errors that indicate that the wbemserver has reached the socket limit. For example: grep "limit, closing connection" /opt/emc/ECIM/ECOM/log/cimomlog.txt | tail
    13-Nov-2014 00:25:04.018 -3292804864-E- WebServer: The webserver hit its connection limit, closing connection.
    13-Nov-2014 00:25:04.051 -3292804864-E- WebServer: The webserver hit its connection limit, closing connection.
    13-Nov-2014 00:25:08.902 -3292804864-E- WebServer: The webserver hit its connection limit, closing connection.
    13-Nov-2014 00:25:09.049 -3292804864-E- WebServer: The webserver hit its connection limit, closing connection.
  3. Check the number of ECOM process sockets.
    [root@oc4777208076 fd]# ls -l /proc/21861/fd | grep socket
    lrwx------. 1 root root 64 Nov 12 16:51 10 -> socket:[40320434]
    lrwx------. 1 root root 64 Nov 12 23:35 102 -> socket:[40459268]
    lrwx------. 1 root root 64 Nov 12 23:35 105 -> socket:[40459224]
    lrwx------. 1 root root 64 Nov 12 23:35 106 -> socket:[40459214]
    lrwx------. 1 root root 64 Nov 12 16:51 11 -> socket:[40320437]
    lrwx------. 1 root root 64 Nov 12 23:35 111 -> socket:[40459334]
    lrwx------. 1 root root 64 Nov 12 23:35 114 -> socket:[40459350]
    lrwx------. 1 root root 64 Nov 12 23:35 117 -> socket:[40459429]
    lrwx------. 1 root root 64 Nov 12 16:51 12 -> socket:[40320440]
    lrwx------. 1 root root 64 Nov 12 23:35 120 -> socket:[40459630]
    lr-x------. 1 root root 64 Nov 12 23:35 121 -> socket:[40459649]
    lrwx------. 1 root root 64 Nov 12 23:35 126 -> socket:[40459662]
    lrwx------. 1 root root 64 Nov 12 16:51 13 -> socket:[40320443]
    lr-x------. 1 root root 64 Nov 12 23:36 130 -> socket:[40460954]
    lrwx------. 1 root root 64 Nov 12 23:36 132 -> socket:[40459962]
    lrwx------. 1 root root 64 Nov 12 23:36 135 -> socket:[40459968]
    lrwx------. 1 root root 64 Nov 12 23:36 138 -> socket:[40460104]
    lrwx------. 1 root root 64 Nov 12 23:37 141 -> socket:[40460619]
    lr-x------. 1 root root 64 Nov 12 23:37 144 -> socket:[40460731]
    lrwx------. 1 root root 64 Nov 12 23:37 147 -> socket:[40461572]
    lr-x------. 1 root root 64 Nov 12 23:37 148 -> socket:[40461464]
    lr-x------. 1 root root 64 Nov 12 16:51 15 -> socket:[40320854]
    lrwx------. 1 root root 64 Nov 12 23:37 150 -> socket:[40460988]
    
    [root@oc4777208076 fd]# ls -l /proc/21861/fd | grep socket | wc -l
    112
  4. Compare the socket count with the configured connection limits. Use any editor to view the limits in the settings file. For example: /opt/emc/ECIM/ECOM/conf/Security_settings.xml
    <!--
    *******************************************************************************
    *ExternalConnectionLimit:
    * The maximum number of active client connections allowed by ECOM.
    * Once this limit is reached, subsequent connections will be rejected.
    * The default value is 100.
    * A Value of zero implies no limit (not recommended).
    *******************************************************************************
    -->
    <ECOMSetting Name="ExternalConnectionLimit" Type="uint32" Value="100"/>
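
The last two steps can be combined into a small script. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes a Linux system with /proc and a settings line formatted like the one shown above. The socket count includes every socket that the process holds (listening and internal sockets as well as client connections), so treat the comparison as an approximation.

```shell
#!/bin/sh
# Hypothetical helpers: compare a process's open sockets with ExternalConnectionLimit.

# Count the socket file descriptors held by a process (Linux /proc only).
count_sockets() {
    # $1: process ID of ECOM
    ls -l "/proc/$1/fd" 2>/dev/null | grep -c 'socket:'
}

# Print the numeric Value of the ExternalConnectionLimit setting.
read_limit() {
    # $1: path to the settings file
    sed -n 's/.*<ECOMSetting Name="ExternalConnectionLimit"[^>]*Value="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1
}

# Report whether the count has reached the limit (a limit of 0 means no limit).
check_limit() {
    # $1: socket count, $2: configured limit
    if [ "$2" -ne 0 ] && [ "$1" -ge "$2" ]; then
        echo "at or over limit: $1 sockets, limit $2"
    else
        echo "within limit: $1 sockets, limit $2"
    fi
}
```

With the process ID and path from the example above, the check could be run as: check_limit "$(count_sockets 21861)" "$(read_limit /opt/emc/ECIM/ECOM/conf/Security_settings.xml)"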

Resolution

Resolve the external connection issues or increase the connection limits as necessary. To increase the total number of external connections that are allowed, use ExternalConnectionLimit. To increase the number of external connections that are allowed per host, use ExternalConnectionLimitPerHost.
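
If you prefer to script the change rather than edit the file by hand, the following sketch shows one way to do it. The function name and the value 200 are illustrative assumptions, not recommendations, and the sketch assumes that the setting is written on a single line like the ExternalConnectionLimit entry shown earlier. It prints the modified file to standard output and leaves the original untouched so that the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print a settings file with one ECOMSetting value replaced.
# The input file itself is not modified.
set_ecom_limit() {
    # $1: settings file, $2: setting name, $3: new numeric value
    sed "s/\(<ECOMSetting Name=\"$2\"[^>]*Value=\"\)[0-9]*\"/\1$3\"/" "$1"
}

# Example (illustrative value only):
#   set_ecom_limit /opt/emc/ECIM/ECOM/conf/Security_settings.xml ExternalConnectionLimit 200 > /tmp/Security_settings.xml.new
```

Compare the output with the original file before you replace it. This sketch assumes that ECOM reads the settings file at startup, so a restart of the service would be needed for the new value to apply.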

Issue 2: ECOM Services are down or CPU is pegged

On the VMAX SMI-S provider system, the ECOM or slp service might not be running. This will result in registration or operation failures from the PowerVC management server.

Alternatively, either of these two processes might consume a large amount of CPU resources for a period of time. This might cause PowerVC storage requests to take a long time or time out.

Investigation

To determine whether the services are running on a Linux system, run these commands:
ps -ef | grep "ECOM -d"
ps -ef | grep slpd
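
Because the grep command itself contains the search string, it can appear in its own results and make a stopped service look as if it is running. The following sketch filters that line out. The function name is hypothetical, and the sketch assumes that no legitimate service command line contains the word grep.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and report whether a
# command pattern appears, ignoring lines that belong to grep itself.
service_status() {
    # $1: label to print, $2: pattern to look for
    if grep -v 'grep' | grep -q -- "$2"; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}

# Usage:
#   ps -ef | service_status ECOM 'ECOM -d'
#   ps -ef | service_status slpd 'slpd'
```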

To monitor the resources that the ECOM and slpd processes use, you can run a tool such as top.
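
For a quick, scriptable snapshot, a ps listing can be filtered for the two processes. This is a sketch under stated assumptions: the function name is hypothetical, the process names are assumed to appear as ECOM and slpd, and the threshold is up to you. On Linux, ps reports CPU use averaged over the lifetime of the process, so an interactive monitor remains the better choice for spotting short spikes.

```shell
#!/bin/sh
# Hypothetical helper: read "PID %CPU COMMAND" rows on stdin and print the rows
# for ECOM or slpd whose CPU percentage is at or above a threshold.
busy_services() {
    # $1: CPU threshold in percent
    awk -v limit="$1" '
        ($3 == "ECOM" || $3 == "slpd") && $2 + 0 >= limit + 0 { print $1, $2, $3 }
    '
}

# Usage:
#   ps -eo pid,pcpu,comm | busy_services 90
```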

Resolution

In either situation, start or restart the services.

If the service is running, stop the service process first by running these commands, then restart the services.
kill -s TERM <slpd-pid>
kill -s TERM <ecom-pid>
To restart the services, run these commands from the appropriate directory. For example, if the default installation directory was used on a Linux system with Solutions Enabler, you would run these commands:
/opt/emc/ECIM/ECOM/bin/ECOM -d 
/opt/emc/ECIM/slp/lib/slpd

Issue 3: HTTPS port is blocked by the firewall

This issue causes registration or storage requests from PowerVC to time out. Verify that the SMI-S provider port (typically 5989) is open on the SMI-S provider system. On the PowerVC server, ensure that outbound traffic is also allowed for that port.

Resolution

Add a firewall rule for the SMI-S provider (ECOM) port. For example, on Linux, the following command displays the iptables firewall rules:
service iptables status
You should see a rule like the following before any rule that would DROP traffic to these ports:
78  ACCEPT  tcp -- 0.0.0.0/0     0.0.0.0/0     tcp dpts:5988:5989
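
The ordering check can be sketched as a small filter over the rule listing. The function name is hypothetical, and the sketch is deliberately simplified: it assumes a leading rule-number column as in the line above, expects the rules of a single chain on standard input, ignores source and destination addresses, and treats every DROP or REJECT rule as if it applied to the port.

```shell
#!/bin/sh
# Hypothetical helper: read numbered iptables rules on stdin and print which
# kind of rule comes first for a TCP port: ACCEPT, DROP, REJECT, or NONE.
first_match() {
    # $1: TCP port number, for example 5989
    awk -v port="$1" '
        $2 == "ACCEPT" && $3 == "tcp" {
            for (i = 4; i <= NF; i++) {
                # Single-port rule, for example dpt:5989
                if ($i == ("dpt:" port)) { print "ACCEPT"; found = 1; exit }
                # Port-range rule, for example dpts:5988:5989
                if ($i ~ /^dpts:[0-9]+:[0-9]+$/) {
                    split($i, r, ":")
                    if (port + 0 >= r[2] + 0 && port + 0 <= r[3] + 0) {
                        print "ACCEPT"; found = 1; exit
                    }
                }
            }
        }
        $2 == "DROP" || $2 == "REJECT" { print $2; found = 1; exit }
        END { if (!found) print "NONE" }
    '
}

# Usage:
#   service iptables status | first_match 5989
```

If the result is DROP, REJECT, or NONE, a rule could be inserted at the top of the chain with a command along the lines of iptables -I INPUT -p tcp --dport 5988:5989 -j ACCEPT, assuming that the INPUT chain is the one that filters this traffic.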