Colocation, colocation, colocation! Does colocating your application workloads on the same z Systems physical machine (CPC) really matter? In some cases colocation really can make a big difference. When you have application workloads that have communication patterns that are network intensive, meaning they either frequently communicate (i.e. exchanging many messages) in order to complete a single transaction (e.g. multi-tiered application workloads) or they exchange large amounts of data (bulk, streaming or other big data type solutions such as analytics related workloads), then the physical location or proximity of the applications can make a difference. The differences can impact your cost and your overall results.
The IBM System z13™ and z13s™ introduced new technology that offers an opportunity for clients to take a closer look at this aspect of colocation of IBM z/OS application workloads. IBM introduced z Systems technology called Internal Shared Memory (ISM). The ISM technology allows one z/OS instance to directly access (share) virtual memory within another z/OS instance (e.g. LPAR or guest virtual machine) within the same physical machine. The ISM architecture enables direct memory access (DMA) capability for software exploitation.
With ISM, IBM also announced Shared Memory Communications – Direct Memory Access (SMC-D). SMC-D exploits ISM which enables applications to directly and transparently communicate with other applications executing in other z/OS instances running in other Logical Partitions on the same physical z13 System. The direct communications is provided transparently for applications using TCP sockets.
Some history will help give perspective. Prior to ISM, z Systems provided a very efficient technology called HiperSockets. HiperSockets provides a logical internal (logical) LAN within z Systems allowing the operating system to communicate using numerous protocols such as TCP/IP, UDP, SNA etc. Communications with HiperSockets is accomplished by creating, exchanging, and processing standard IEEE 802.3 packets (frames) in software. HiperSockets provides a very efficient memory to memory transfer (of standard packets) without requiring physical networking hardware.
SMC-D with ISM goes beyond HiperSockets by eliminating all packets along with all of the TCP/IP protocol and packet related processing. SMC-D provides a direct socket to socket transfer of data. This model provides significant savings in host network processing which translates to significant savings in CPU, latency, and throughput.
In addition to HiperSockets, z/OS instances on the same CPC can use other network technology to communicate to same-CPC z/OS instances, such as Ethernet using the IBM OSA-Express family of adapters. While there are several options, typically HiperSockets would provide the most optimal option. While HiperSockets will continue to be an important technology (i.e. due to its versatility), the benefits of SMC-D are compelling.
Shared Memory Communications architecture now has two variations:
- Shared Memory Communications – RDMA (SMC-R for cross platform using RoCE)
- Shared Memory Communications – DMA (SMC-D for same platform using ISM)
Both forms of SMC can be used concurrently. The protocol dynamically selects the appropriate variation based on proximity of the peer hosts (i.e. same CPC instances use SMC-D).
So what are the benefits or differences of SMC-D? Benchmarks results comparing the technologies have shown that SMC-D using ISM provides significant savings in CPU, latency, and throughput. Here is a quick performance summary of Request / Response patterns (transactional) and streaming (bulk) workloads highlighting the differences in performance when comparing SMC-D to HiperSockets:
- Request/Response Summary for Workloads with 1k/1k – 4k/4k Payloads:
- Latency: Up to 48% reduction in latency
- Throughput: Up to 91% increase in throughput
- CPU cost: Up to 47% reduction in network-related CPU cost
- Request/Response Summary for Workloads with 8k/8k – 32k/32k Payloads:
- Latency: Up to 82% reduction in latency
- Throughput: Up to 475% (~6x) increase in throughput
- CPU cost: Up to 82% reduction in network-related CPU cost
- Streaming Workload:
- Latency: Up to 89% reduction in latency
- Throughput: Up to 800% (~9x) increase in throughput
- CPU cost: Up to 89% reduction in network-related CPU cost
As you can see the benefits of SMC-D with ISM are compelling. If you currently exploit HiperSockets, then the applicability of SMC-D is easy for you to evaluate. If you are not sure if you have or could have the z/OS network traffic patterns that apply to your environment, then you can evaluate your workload network patterns using the SMC-Applicability Tool (SMC-AT).
With the potential for this type of savings it is easy to see how colocation of network intensive workloads on the IBM z13 or IBM z13s using SMC-D with ISM can make a difference.
 Benchmarks results shown here are from a controlled IBM internal lab using standard tools. Your actual results may vary. Performance information is provided “AS IS” and no warranties or guarantees are expressed or implied by IBM.
 Additional information about SMC-R, SMC-D, and SMC-AT can be found at