Be smart with virtualization: Part 2. Best practices with IBM Rational Software

If you're currently using virtualization methods with IBM Rational software, is everything working as smoothly as you expected? Three IBM experts explain the Rational perspective on virtualization and the key requirements for virtualized environments to get optimal performance from Rational applications. In part two, the experts present two more case studies and troubleshooting tips.


Mike Donati (mjdonati@us.ibm.com), ClearCase Performance Team Lead, IBM

Mike Donati lives outside of Boston, where he works on IBM Rational ClearCase performance and customer deployments, including virtualization strategies. When not working, he divides his time among traveling with his family, cooking, photography, and attending his daughters' sporting events.



Grant Covell (gcovell@us.ibm.com), Senior Development Manager, Rational Performance Engineering, IBM

Grant Chu Covell has been working for IBM Rational software on performance-related things for nearly 10 years. He's now the Senior Performance Obsessor on the Jazz Jumpstart team. Before that, he managed the Rational Performance Engineering team. Years ago, he did software development work on typefaces, music notation software, and automatic language translation. He lives outside of Boston. You can follow his Jumpstart team blog, called Ratl Perf Land.



Ryan Smith (smithr1@us.ibm.com), Software Performance Analyst, IBM

Ryan Smith has been working in performance engineering for the past eight years. He lives in a small town in the farmlands of western Tennessee, where he collaborates remotely with colleagues around the world on the performance and reliability of the Rational solution for Collaborative Application Lifecycle Management (CLM). Professionally, his interests are in agile and lean software development, performance testing, Java and web technologies, data analysis, and data visualization. In his free time, he hunts, fishes, enjoys reading about sustainability, leadership, and organizing, and spends time with his wife and church activities.



08 October 2013

Also available in Chinese

The four dimensions of virtualization

CPU
Memory
Disk Input/Output (I/O) and Storage
Network

This two-part article discusses virtualization's pros and cons with concrete examples. In Part 1, we discussed virtualization at a high level, especially as it relates to IBM Rational products. We covered how the four dimensions of virtualization (CPU, memory, disk I/O and storage, and network) must be properly managed with affinity (dedicated resources) and without overcommitment. We gave examples of how poorly managed virtualization can drastically affect the performance of IBM® Rational® products. Specifically, we showed two case studies in which IBM® Rational Team Concert™ and IBM® Rational® ClearCase® performance suffered when they were hosted in poorly configured virtualized environments whose virtual machines (VMs) were configured without affinity.

In Part 2, we look more deeply at the tradeoffs of overcommitment. Drawing upon our experience testing Rational products and advising customers, we offer suggestions, tips, troubleshooting strategies, and vendor-specific examples to help you better manage your virtualized infrastructure. Troubleshooting situations and suggestions appear in the jazz.net Deployment wiki.

Case Study No. 3. Exploring overcommitment with ClearCase

Case Study 2 demonstrated how IBM Rational ClearCase performance suffered when hosted on VMs without dedicated resources. We recommend affinity and dedicated resources for IBM Rational products and advise avoiding overcommitment wherever possible. However, we recognize that managed overcommitment can be an essential value proposition of virtualization. Case Study 3 looks at different degrees of overcommitment.

For one of our tests, we took an Intel Westmere-EX server with four eight-core CPUs and 64 GB of memory. This server had hyperthreading enabled, so the 32-core server appeared to the hypervisor as 64 logical processors (64 vCPUs). The ClearCase CM server was installed on a VM allocated 4 vCPUs and 8 GB RAM, but without dedicated resources or affinity.

On the same hypervisor we also created 96 VMs (64-bit RHEL 5.5), each with 4 vCPUs and 4 GB RAM. These 96 images were used to generate background load and were organized into six groups of 16 VMs. Each 16-VM group accounts for 64 vCPUs and 64 GB of RAM, which corresponds to the hardware dimensions of the Westmere-EX server itself. Therefore, each 16-VM group represents 100% of the Westmere-EX's hardware allocation.
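The arithmetic behind these percentages is easy to check. The following short script is ours (not part of the original test harness); it only restates the figures given above:

```python
# Back-of-envelope check of the overcommitment levels in this test setup:
# 96 background VMs at 4 vCPUs each against a host exposing 64 logical CPUs.
host_vcpus = 64                  # 32 physical cores with hyperthreading enabled
vms, vcpus_per_vm = 96, 4

group_size = host_vcpus // vcpus_per_vm        # VMs needed to cover the host once
groups = vms // group_size                     # number of 100% "load groups"
overcommit_pct = 100 * vms * vcpus_per_vm // host_vcpus

print(group_size, groups, overcommit_pct)      # 16 VMs per group, 6 groups, 600%
```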

To capture baseline average response time data (shown in Table 1, column a), we simulated a 100-user UCM load and delivered it to the ClearCase CM server. All six groups of VMs were idle.

For the next tests (columns b through g in Table 1), we generated background load using the six groups of 16 VMs. Each VM hosted a home-grown program that ran multi-threaded square root calculations and allocated memory. This "hog" program ensured that each VM client would consume 100% of its allocated processor and RAM. Each test increased the background load by 16 VMs, the equivalent of 100% of the Westmere-EX physical hardware.
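The authors' "hog" program is home-grown and not published. As a hypothetical sketch of the same idea in Python, the workload below starts one worker process per logical CPU (processes rather than threads, so the interpreter's GIL doesn't serialize the CPU load); each worker holds a block of allocated memory while spinning on square-root calculations. The iteration count here is bounded for illustration; the real program ran for the duration of each test.

```python
import math
import multiprocessing as mp

def hog(mem_bytes, iterations):
    """Hold mem_bytes of RAM while spinning on square-root calculations."""
    ballast = bytearray(mem_bytes)       # keeps the memory allocated while we spin
    x = 0.0
    for _ in range(iterations):
        x = math.sqrt((x + 1.0) % 1e9)
    return x                             # returned only so callers can see it ran

if __name__ == "__main__":
    # One worker process per logical CPU; 16 MB of ballast each (tune the
    # ballast size to the VM's RAM allocation to create memory pressure).
    procs = [mp.Process(target=hog, args=(16 * 1024 * 1024, 1_000_000))
             for _ in range(mp.cpu_count())]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```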

Column b shows the average response time data of the 100-user CC CM Server test with the equivalent of 100% Westmere-EX load (one group of 16 VMs running the "hog" program). Columns c through g show the average response time data of the 100-user CC CM server test with increasing groups of 16 VMs (increasing the equivalent Westmere-EX load by 100% at each step).

Column g shows the average response time for our 100-user CC CM Server test with all 96 VMs running the "hog" program, equivalent to 600% of the physical Westmere-EX capacity. Response times are terrible. Our overcommitted server can't service the 100-user CC load at reasonable response times.

The only way to guarantee reasonable performance for our server is to host it on a VM with dedicated resources. Column h shows the same test as column g, except that the CC CM server now has affinity and dedicated resources. At 600% load, the CC CM server responds with acceptable performance.

Table 1: Performance tests using ClearCase showing the effects of affinity (average response times)

Column key:
  a: physical machine (baseline)
  b: 100% load, no affinity
  c: 200% load, no affinity
  d: 300% load, no affinity
  e: 400% load, no affinity
  f: 500% load, no affinity
  g: 600% load, no affinity
  h: 600% load, with affinity

Operation             a      b      c      d       e       f        g      h
Make stream         1.03   1.57   1.87   3.48   29.81   39.76   153.29   1.88
Make activity       0.33   0.50   0.55   1.64   24.84   44.35   176.21   0.61
Set activity        0.59   0.92   1.04   3.59   68.55   69.45   159.14   0.99
Make folder         1.98   2.78   3.04   4.93   64.72   70.43   161.33   3.35
Check-out folder    0.72   0.91   0.97   2.08   30.82   38.17   125.30   1.12
Check-in folder     0.61   0.78   0.82   1.35   11.86   13.73    68.03   0.91
Make file           1.40   1.78   1.93   3.23   33.10   37.54    95.58   2.28
Check-out file      0.91   1.25   1.45   1.51    3.81    5.42    68.12   1.47

This example shows several things. Without affinity, product performance will degrade drastically to the point of being unusable. Furthermore, a ClearCase administrator with access only to the CC CM server is helpless to understand, or even guess at, what is going on. This background load may be extreme, but it clearly demonstrates the effects of overcommitment.

However, virtualization is not a hopeless proposition. Compare columns c and h in Table 1: 200% load on the hypervisor when the CC CM server has no affinity versus 600% load on the hypervisor when the CC CM server has affinity. The response times are similar enough to suggest that if dedicated resources are not possible, and hypervisor capacity doesn't exceed 200%, response times could still be acceptable compared to a configuration where resources do have affinity. This suggests that overcommitment can be a viable option as long as it is properly managed.


Case Study No. 4. Overcommit or undercommit: Performance vs. Capacity

Case Study 4 compares the ClearCase response times between two different VM configurations on the same ESX server, an Intel SandyBridge server (E5-2680 @ 2.70GHz) with 32 vCPUs and 32 GB RAM. Configuration A uses 100% of the VMware ESX server's capacity, and Configuration B uses 150%.

In Configuration A, the ESX server hosts a VOB server, a ClearCase Remote Client (CCRC) server, and two VMs running the "hog" programs described in Case Study 3. Each VM on the ESX server runs RHEL 5.6 and is allocated 8 vCPUs and 8 GB RAM. Configuration A therefore dedicates 100% of the ESX hardware resources.

Configuration B uses the same four VMs as Configuration A; however, two additional VMs are added to the ESX server. These two additional VMs are sized with 8 vCPUs and 8 GB RAM to match the other four. When all six images are in use (48 vCPUs and 48 GB RAM), the ESX server is allocated at 150% capacity. The two additional VMs form a secondary CC region and perform activities between each other to create load on the ESX server. The activities include importing, mklabel, and build operations, and they run continuously during the test.

The background load in this case study consists of two other ClearCase virtual machines: one VOB server and one ClearCase client acting as a view and build server while also performing mklabel and import operations. A dedicated 1 Gb network connects these servers and their images.

This ClearCase test environment is a replica of the actual ClearCase development VOBs. 100 VOBs are spread across two servers. The 10 highest volume VOBs are hosted on the VM image (VOB Server). The remaining 90 VOBs are hosted on a separate physical server with the license server and registry server.

This comparison's workload simulates approximately 250 simultaneous users over a twelve-hour period. The workload consists of:

  • 200 CCRC users performing 15 transactions per hour
  • 50 dynamic view users performing 15 transactions per hour
  • 38 continuous clearmake builds running on 12 additional build hosts (Unix and Windows)
  • 1 independent Unix client running integration tasks

Table 2. Two ESX server configurations

ESX server (32 vCPUs, 32 GB RAM):

  Configuration A (ESX server at 100% capacity) hosts 4 images:
    • 1 VOB server
    • 1 CCRC server
    • 2 "hog" servers

  Configuration B (ESX server at 150% capacity) hosts 6 images:
    • 1 VOB server
    • 1 CCRC server
    • 2 "hog" servers
    • 2 CC servers in a separate region
Figure 1 compares the two configurations' average response times over twelve hours. Compared with Configuration A, Configuration B was 35% slower for base ClearCase operations and 25% slower for UCM operations. Build times were also 22% slower.

Figure 1: Comparing two ClearCase environments
ClearCase is slower when ESX server capacity is 150%

More about affinity and reservations

In Part 1, we defined affinity as the ability to dedicate one or more resources on a virtual machine to the corresponding resources on the hypervisor. Some hypervisor systems also have the concept of a reservation, which is similar in spirit to what we mean by affinity. In these systems, affinity signifies something more specific: a VM's CPUs can be precisely assigned to physical cores. If you assign your dedicated VM to specific CPUs, you should also assign the rest of your VMs to different CPUs. VMs with dedicated CPU affinity may perform worse because they may be unable to schedule multi-threaded tasks.

It is our recommendation that VMs have access to dedicated resources on the hypervisor. In situations where the hypervisor is hyperthreaded or if the hypervisor is designed to perform automatic load balancing of resources, there are some additional concerns to keep in mind.

CPU affinity considerations

If using CPU affinity, consider these issues:

  • If your hypervisor is using automatic load balancing, CPU affinity may prevent the hypervisor from working efficiently.
  • CPU affinity on one VM can prevent the other VMs on the same hypervisor from working efficiently.
  • Be careful when moving a VM with CPU affinity from one hypervisor to another as the hypervisors may have different processor configurations.
  • CPU affinity on multicore or hyperthreaded machines may actually prevent VMs from scheduling multi-threaded tasks because their requests are limited to specific cores.
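The pinning caveat in the last bullet can be illustrated at the operating-system level (hypervisor-level vCPU pinning itself is configured in the hypervisor's own management tools, not in code). This hedged sketch uses Python's Linux-specific `os.sched_getaffinity` and `os.sched_setaffinity` calls: after pinning, all of the process's threads compete for only the pinned cores, which is the same scheduling limitation described above.

```python
import os

# Inspect which CPUs the current process may run on (Linux-specific call).
eligible = os.sched_getaffinity(0)        # pid 0 means the calling process
print("eligible CPUs:", sorted(eligible))

# Pin the process to (at most) the first two eligible CPUs. From now on,
# all of this process's threads can be scheduled only on these cores,
# no matter how many other cores sit idle.
pinned = set(sorted(eligible)[:2])
os.sched_setaffinity(0, pinned)
print("pinned to:", sorted(os.sched_getaffinity(0)))
```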

Summary and Conclusions

This two-part article discussed virtualization's pros and cons and used concrete examples specific to IBM Rational products.

In Part 1, we covered four important dimensions whose parameters must be precisely determined when using virtualization: CPU, memory, disk I/O and storage, and network. We emphasized the importance of affinity (dedicated resources) and demonstrated what can happen when resources are overcommitted.

We provided examples of how poorly managed virtualization can drastically affect the performance of IBM Rational products. We showed two case studies in which IBM Rational Team Concert and IBM Rational ClearCase performance suffered when they were hosted in poorly configured virtualized environments whose VMs were configured without affinity.

In Part 2, we looked deeper at the tradeoffs around overcommitment.

Drawing upon our experience testing IBM Rational products and advising our customers, we offered suggestions, tips, troubleshooting strategies, and vendor-specific examples to help you manage your virtualized infrastructure. Troubleshooting situations and suggestions appear in the jazz.net Deployment wiki.

Virtualization’s key advantages

  • Current hardware offerings lend themselves well to being divided up and used as hypervisors that host multiple VMs. These new machines save space, reduce power consumption, and are generally very resource efficient.
  • Virtualized infrastructure can increase the speed at which fresh VMs (copies of existing VMs or new ready-to-use VMs) can be deployed.
  • High-availability (HA) and disaster recovery (DR) solutions can be integrated with virtualization for a more complete and cost-effective enterprise configuration. However, note that a single hypervisor hosting multiple VMs can become a single point of failure. You can work around this specific area of concern by using SAN or NAS storage for the VM images and/or readying standby VMs on an alternate hypervisor.
  • VMs and their hypervisors can be managed through consoles from anywhere (not just from within a lab) which can lead to optimization and reduced administration costs.

Given the possible pitfalls of poorly managed virtualization, you may wonder whether it's worth the investment and trouble. The answer is a resounding yes! Virtualization is a worthwhile investment, but as we have emphasized, it must be properly managed. In some organizations, virtualization is inevitable and permanent. Gone are the days of dedicated physical hardware, with a single server hosting a single application. Given that hardware vendors are trending toward platforms with more processors and more RAM, slicing new hardware into VMs is one of the best ways to ensure resource efficiency.

Key principles discussed in this series

  • Whenever possible, use dedicated resources for CPU, memory, and network. Make sure there is ample access to co-located storage via dedicated I/O.
  • Wherever possible, consider CPU and memory affinity. In some cases, this may cause other VMs served by the same host to perform worse. Where the hypervisor is part of a cluster, pinning resources might prevent the entire group of VMs from performing optimally. There are performance tradeoffs for all VMs when CPU and memory resources can't be dedicated or reserved.
  • Whenever possible, manage your virtualization resources by monitoring resource consumption. Understand what other products are hosted on the same VM and what other VMs hosted by the same hypervisor are doing.
  • Whenever possible, avoid resource overcommitment. In other words, the combined resources of any VM or group of VMs should never exceed the hypervisor's physical resources.
  • If you suspect virtualization-related problems, collect specific data about the VM configuration, the hypervisor and the other VMs being hosted by the hypervisor. Avoid anecdotal information and collect specific, perhaps even periodic measurements using scripts.
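The last two principles call for specific, periodic measurements rather than anecdotes. As a hypothetical illustration (ours, assuming a Linux guest with a /proc filesystem), the sampler below records overall CPU utilization at fixed intervals by diffing the counters in /proc/stat; inside a VM, sustained high utilization here combined with poor application response times is a cue to examine the hypervisor and its other VMs.

```python
import time

def read_cpu_jiffies():
    # First line of /proc/stat: "cpu  user nice system idle iowait irq ..."
    with open("/proc/stat") as f:
        values = list(map(int, f.readline().split()[1:]))
    idle = values[3] + values[4]          # idle + iowait jiffies
    return sum(values), idle

def sample_cpu(interval=1.0):
    """Return the percentage of CPU time spent busy over the interval."""
    total0, idle0 = read_cpu_jiffies()
    time.sleep(interval)
    total1, idle1 = read_cpu_jiffies()
    busy = (total1 - total0) - (idle1 - idle0)
    return 100.0 * busy / max(total1 - total0, 1)

if __name__ == "__main__":
    for _ in range(3):                    # three timestamped one-second samples
        print(f"{time.strftime('%H:%M:%S')}  cpu {sample_cpu():5.1f}%")
```

Logging such samples to a file over a full workday gives the kind of concrete data that the troubleshooting advice above asks for.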

Special IBM Rational product considerations

Different software products behave differently. Virtualization parameters that work well for one product on a VM may not work well for another. In this series of articles, we examined Rational Team Concert and Rational ClearCase. Other Rational products may perform similarly or differently, which is why we emphasize understanding virtualization's key dimensions.

Complex multi-tier applications such as the Rational Collaborative Lifecycle Management products or Rational ClearCase require near-constant access to dedicated resources. We have worked with customers who experienced poor performance when using Rational products in virtualized environments, only to discover that the key principles listed above were not enforced.


Credits and Acknowledgements

The authors would like to thank our colleagues Tim Lee, Chetna Warade, David Schlegel, Paul Weiss, Matthias Lee, Samir Shah, Harry Abadi and Poornima Seetharamaiah, our colleagues in Rational Support and Development, and our business partners at Intel, NetApp, and VMware.
