
Best practices: using the IBM Spectrum Symphony recursive workload feature

Technical Blog Post


Abstract

Best practices: using the IBM Spectrum Symphony recursive workload feature

Body


The IBM Spectrum Symphony recursive workload feature allows continuous workflow on a single application, so that resources are fully utilized. A parent task can temporarily yield its slot to allow child tasks to use that slot to complete work. Resources are then fully utilized while the parent task waits for the results from its child sessions, and the parent task can later regain those resources. The recursive workload feature is supported on all platforms supported by IBM Spectrum Symphony Advanced Edition.

 

The recursive workload feature is a powerful way to reuse cluster resources; this blog article highlights seven best practices for using it. Use these tips to optimize cluster stability and application performance.

 

Let's start with a typical recursive workload parent-child hierarchy:

[Diagram: a typical recursive workload parent-child session hierarchy]

Here, the root session creates 4000 tasks. Each of those 4000 tasks creates a second-layer child session with two tasks, so the second layer has 8000 tasks in total. Within each second-layer session, one task creates a third-layer child session with about 110 tasks, while the other task does not; the third layer has 458224 tasks in total. Altogether, this recursive job contains 8001 sessions and 470224 tasks, as summarized in the table below. A hierarchy like this can become complex, so this blog article explores useful tips to stabilize a cluster and optimize performance.
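
The per-layer breakdown works out as follows (the per-layer session counts follow from the description above):

Layer           Sessions     Tasks
Root                   1      4000
Second layer        4000      8000
Third layer         4000    458224
Total               8001    470224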

 

Best practice 1: Keep the recursiveBackfillThreshold value below 5

The recursiveBackfillThreshold parameter, configured in the Consumer section of the application profile, controls the number of times a slot can be backfilled to run tasks. The default threshold is five. With the recursive workload feature, the parent task can yield its slot to run child tasks when it invokes the yield() function. Child tasks can further yield slots to run other tasks. Once a new task is backfilled to run in a slot, a new service instance is started; the old service instance of the yielded task remains running. This means the number of running service instances in a cluster could be as high as recursiveBackfillThreshold x numOfCores. Unlike with non-recursive workload, the number of running service instances is not limited by the number of cores.

 

For example, in an 1800-core cluster with recursiveBackfillThreshold=5 set, there could be as many as 9000 (that is, 5x1800) service instances running at the same time. This is not a small number, considering that the maximum number of service instances supported for one application is 40000. With a recursive application pattern, each service instance is likely to become a new client submitting a new child session to SSM, and the maximum number of supported live sessions for an application is 10000. Therefore, as a best practice, keep the recursiveBackfillThreshold value below five so that the number of live sessions does not exceed the system limit. Depending on the recursive workload pattern, the actual number of live sessions that can be submitted to the cluster without performance issues could be smaller; this is explained further in Best practice 3.

 

The recursiveBackfillThreshold parameter is set in the Consumer section of the application profile. To determine the value currently defined for this parameter, run the soamview app appname -p | findstr recursiveBackfillThreshold command (or use grep recursiveBackfillThreshold on Linux). If nothing is shown, the default value (5) is in use.
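
For reference, the following is a minimal sketch of how this could look in the application profile, assuming the parameter is set as an attribute on the Consumer element; the consumer ID and resource group name are illustrative placeholders, and any other attributes from your existing Consumer section would remain unchanged:

<!-- Sketch only: recursiveBackfillThreshold set below the default of 5.
     The consumerId and resourceGroupName values are placeholders. -->
<Consumer consumerId="/SampleApplications/MyRecursiveApp"
          resourceGroupName="ComputeHosts"
          recursiveBackfillThreshold="3"/>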

 

To determine the number of service instances running in the cluster, run the soamview app appname -l command, and add up the number of running tasks, yielded tasks, and resuming tasks. For example, in the following screenshot of the soamview command output, there are only 16 slots in the cluster, but 64 service instances are actually running:

[Screenshot: soamview app output showing 64 service instances running on 16 slots]

 

Best practice 2: Set the PLATCOMMDRV_BACKLOG_SIZE_ON_WINDOWS value to 5000 and avoid using a large PLATCOMMDRV_POOL_SIZE value

In recursive workload, each service instance can become a new client submitting a new session to SSM. As explained previously, in a cluster with 1800 cores and recursiveBackfillThreshold=5 configured, there could be as many as 9000 (5x1800) service instances running in the cluster at the same time. This means that the number of clients concurrently trying to connect to SSM could be as high as 9000. At this scale, SSM can be hit with the equivalent of a SYN flood denial-of-service attack: SSM listens on one TCP port for incoming connections, and the default Windows TCP/IP backlog queue is only 200. If too many TCP connections arrive at the same time, the backlog queue can be overwhelmed, which causes some client connections to be rejected with error code 10061.

 

To prevent a denial of service situation for SSM, two settings can be configured in the SSM section of the application profile: PLATCOMMDRV_BACKLOG_SIZE_ON_WINDOWS and PLATCOMMDRV_POOL_SIZE, as follows:

<SSM resReq="" resourceGroupName="ManagementHostsSSM" workDir="${SOAM_HOME}/work">
    <osTypes>
        <osType name="all">
            <env name="PLATCOMMDRV_BACKLOG_SIZE_ON_WINDOWS">5000</env>
            <env name="PLATCOMMDRV_POOL_SIZE">4</env>
        </osType>
    </osTypes>
</SSM>

 

PLATCOMMDRV_BACKLOG_SIZE_ON_WINDOWS can be used to override the system's default backlog size with a larger value. PLATCOMMDRV_POOL_SIZE controls how many front-end and back-end communication threads are started in SSM: the front-end communication threads handle requests from clients, while the back-end communication threads handle messages between SSM and the compute hosts. To prevent an SSM denial of service situation, always set PLATCOMMDRV_BACKLOG_SIZE_ON_WINDOWS=5000.

 

Additionally, avoid setting the PLATCOMMDRV_POOL_SIZE environment variable to a large value, as that introduces excessive CPU usage for SSM when the recursive workload feature is used. For example, with PLATCOMMDRV_POOL_SIZE=15, SSM could consume half of the CPU cores at up to 100 percent usage in an extreme situation. Removing the PLATCOMMDRV_POOL_SIZE setting, so that the default value of 4 is used, can reduce SSM CPU usage to under 10 to 20 percent with the same workload.

 

Best practice 3: Keep the maximum number of concurrently open sessions under 5000

The maximum number of supported live sessions per application is 10000. During a stress test with a four-layered parent-child job, we observed that when the number of concurrently open sessions approaches 5000, the soamview command becomes very slow to respond. In our four-layered parent-child job, the structure was as follows:
Root Session ID = 369485
Level   Pending Running Yielded Resuming        Done    Error   Canceled
0       0       0       0       0               50      0       0
1       0       0       0       0               850     0       0
2       0       0       0       0               14450   0       0
3       0       0       0       0               245650  0       0

 

The job contained 50 first-layer child sessions, 850 (that is, 17x50) second-layer child sessions, and 14450 (that is, 17x17x50) third-layer child sessions, with each child session containing 17 tasks. In total, 15351 sessions (including the root session) were created over the lifetime of the job, although not all of them were open at the same time.

 

With such a recursive workload pattern, SSM is stressed by the large number of sessions opening and closing dynamically, which slows its responses to command queries. Therefore, keep the number of concurrently open sessions under 5000 for one recursive application.

 

To determine the number of open sessions with SSM, run the soamview app appname -l command.

 

Best practice 4: Only call the yield function when the parent task is idle

Once the yield function is called, the parent task yields its slot and allows other tasks to run in that slot, but the service instance running the parent task does not exit. The parent task should avoid consuming CPU cycles after calling yield; otherwise, the compute host may become overloaded. Recall that there could be as many as recursiveBackfillThreshold x numOfCores service instances running on a compute host.

 

Best practice 5: Enable the abortSessionIfClientDisconnect parameter for all child session types

In a recursive workload pattern, the parent task creates new connections to SSM and submits new child sessions from within the service instance. If the service instance exits unexpectedly, its child sessions stay open unless the abortSessionIfClientDisconnect value is set to true. Unless you have logic to reconnect to the same session from another service instance, set abortSessionIfClientDisconnect="true" in the SessionTypes section of the application profile so that orphaned child sessions do not keep occupying resources in SSM, as sketched below.
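
The following is a minimal sketch of a session type definition with this flag enabled; the session type name and priority are illustrative placeholders, and any other attributes from your existing session type definition would be kept alongside the flag:

<!-- Sketch only: abort a child session if its client (the parent service
     instance) disconnects. The type name and priority are placeholders. -->
<SessionTypes>
    <Type name="RecursiveChildSession" priority="1"
          abortSessionIfClientDisconnect="true"/>
</SessionTypes>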

 

Best practice 6: Disable resource reclaiming between consumers to avoid sessions aborting

Resource reclaim happens between different consumers in a purely shared model, or in an ownership model with borrowing and lending enabled. When a slot is reclaimed, the service instance terminates and the task is scheduled to rerun later. With recursive workload, the service instances can be clients that have submitted new sessions, and those child sessions are aborted if the service instances are terminated. To avoid unnecessarily aborted sessions, disable reclaiming between consumers: for an ownership model, disable borrowing and lending; for a purely shared model, configure a very large resource reclaim timeout value to effectively prevent reclaiming.

 

This example shows the consumer-level reclaim grace period configured as 36000 seconds:

[Screenshot: consumer-level reclaim grace period set to 36000 seconds]

 

Additionally, in the SessionTypes section of the application profile, the taskGracePeriod value is set to 86400000:

[Screenshot: SessionTypes section with taskGracePeriod set to 86400000]
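
As a textual sketch of what this configuration could look like, assuming taskGracePeriod is set as an attribute on the session type (the type name and priority are placeholders, and the abortSessionIfClientDisconnect flag from Best practice 5 is shown alongside it):

<!-- Sketch only: taskGracePeriod set to 86400000 as in the example above.
     The type name and priority are placeholders. -->
<SessionTypes>
    <Type name="RecursiveChildSession" priority="1"
          abortSessionIfClientDisconnect="true"
          taskGracePeriod="86400000"/>
</SessionTypes>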

 

The effective reclamation timeout will be MIN(taskGracePeriod, reclaim grace period)=36000 seconds.

 

Best practice 7: Use the allsessions.vbs script tool to display recursive session hierarchy

IBM Spectrum Symphony support has developed a script tool, called allsessions.vbs, to display the recursive session hierarchy. The tool shows the root session ID and the number of tasks in each state for each recursive layer.

 

For example, the following output from the tool shows two root sessions: root session 1 and root session 4914. Root session 1 has finished all of its tasks, as they are all in the done state. Root session 4914 still has work in progress, as indicated by tasks in the pending, running, yielded, resuming, and done states:

[Screenshot: allsessions.vbs output showing root sessions 1 and 4914]

 

Contact IBM Spectrum Symphony support, or this blog article's author, to obtain the allsessions.vbs tool.

 

Final thoughts

The IBM Spectrum Symphony recursive workload feature allows continuous workflow on a single application; it’s a powerful feature for fully utilizing cluster resources. By applying the best practices in this blog article, you can maximize cluster stability and application performance when using recursive workload.

 

For more details about the recursive workload feature, refer to IBM Knowledge Center: https://www.ibm.com/support/knowledgecenter/SSZUMP_7.2.0/development_sym/chap_recursive_workload_apps.html

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSZUMP","label":"IBM Spectrum Symphony"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]
