IBM Support

Performance Hints and Tips (tuning) when using virtual (VMWare / ESX / Hyper-V) platforms to host Controller servers

Troubleshooting


Problem

The customer has decided to deploy Controller on a server architecture that includes at least one virtual server. The customer would like suggestions for maximising the performance of a Controller system running inside a virtual (for example VMWare) environment.

Symptom

Slow performance when using Controller, specifically when it is hosted on a virtual hardware platform.

Cause

There are a large number of potential causes for slow performance in a Controller system. Many of these potential causes have nothing specifically to do with virtual platforms. For example, one potential cause is that your SQL database indexes/statistics are out of date, causing slowdowns in queries. These sorts of causes are dealt with elsewhere in the IBM Technotes knowledgebase.

Instead, this IBM Technote shall only list the hints and tips that are specifically relevant to using a virtual (for example ESX) host.

Such potential causes are:

  • Scenario #1 - Virtual environment host server is overloaded
    • For example, host server is underpowered, or is configured to host too many images, for the resources (CPU/memory) that it physically owns
    • For more details, see separate IBM Technote #1413456.
  • Scenario #2 - Physical hard drive (containing the Virtual image) is fragmented
  • Scenario #3 - Memory settings on host device not optimised
  • Scenario #4 - Third party (Microsoft / virtual system vendor) incompatibility causing very slow network speed
    • For example, VMWare acknowledges that there is a problem using 'bridged networking' when 'TCP Offload Engine' is enabled
  • Scenario #5 - VMWare host system is not configured to be dedicated to Cognos/Controller-only use via a 'Resource Pool'
    • Cognos software is a time-sensitive application that cannot wait for the Virtual Center to broker resources. Other third-party applications may be able to live in a shared environment, but Cognos cannot. Resources must be dedicated to the Cognos application permanently. Once the Cognos application begins to page, it will not stop, and the host then becomes tied up in managing the pages rather than allocating virtual memory.
    • A Resource Pool creates the following opportunities:
      • Allocates processor and memory resources to virtual machines running on the same host or cluster
      • Establishes minimum, maximum, and proportional resource shares for CPU and memory
      • Modifies allocations while virtual machines are running
      • Sets all Cognos servers in the resource pool to High priority for resource consumption
  • Scenario #6 - Third party miscellaneous settings
  • Scenario #7 - CPU cores of host system (e.g. host ESX server) configured to be shared between multiple virtual machines (not dedicated to Controller-only use)
  • Scenario #8 - Servers not using all of the configured CPU cores due to host (e.g. ESX) software incorrectly configured

Environment

There are many different third-party products (for example VMWare, ESX, Virtual PC, Hyper V etc.) which could be used to host the Controller servers.

Resolving The Problem

    NOTE: Some of these tips may not be valid for all environments. For example:
    • Some may relate directly only to one particular third-party platform (for example VMWare)
    • Other tips may only be applicable when your virtual system has been configured in a particular way (for example using 'bridged' networking)

    Therefore, please ensure that you review the following tips and tricks with an experienced third-party (non-IBM) virtual platform software expert (for example VMWare expert) before/after implementing the advice.


Scenario #1 - Virtual environment host server is overloaded
    • Ensure that the host device has sufficient CPU/RAM resources to cope with the number of images that it is hosting.
For example, one customer upgraded their VMWare host hardware from HP DL580 G5 servers (based on Intel Xeon E7330 @ 2.40GHz) to newer HP DL380 G7 servers (Intel Xeon X5670 @ 2.93GHz), and saw performance improve by a factor of roughly 1.5 to 2:
    • Standard (non AFC) consolidation times decreased from 5 minutes to only 2.5 minutes
    • Consolidation including AFC decreased from 8.5 minutes to only 5 minutes
    • Opening PL decreased from 11-12 seconds to 7-8 seconds
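
The improvement in each measurement can be expressed as a speed-up factor (time before divided by time after). The following is a minimal sketch that uses only the timings quoted above; representing the "Opening PL" ranges by their midpoints is an assumption made for illustration.

```python
# Speed-up factor = time before the upgrade / time after the upgrade.
# Timings are those quoted in the example above. For "Opening PL" the
# midpoints of the quoted ranges (11-12 s and 7-8 s) are an assumption.
timings_seconds = {
    "Standard consolidation": (5 * 60, 2.5 * 60),
    "Consolidation including AFC": (8.5 * 60, 5 * 60),
    "Opening PL": (11.5, 7.5),
}


def speedup(before, after):
    """Return how many times faster the 'after' measurement is."""
    return before / after


speedups = {name: speedup(b, a) for name, (b, a) in timings_seconds.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x faster")
```

On these figures the factors range from about 1.5x to 2x, depending on the operation measured.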

Scenario #2 - Physical hard drive (containing the Virtual image) is fragmented
    • Defragment your virtual OS hard disk using the native Disk Defragmenter on your host operating system.

Scenario #3 - Memory settings on host device not optimised
WARNING: Not all of the following changes will be suitable for all environments. The following changes are examples of some things that have helped some environments (for example consultants running VMWare workstation images on their local laptop).
    Example A - In some VMWare environments, top performance is gained by disabling virtual machine memory swapping:
    1. Ensure that the host device has sufficient 'real' memory
    2. In VMware workstation, from the menu, select Edit -> Preferences.
    3. Open the "Memory" tab
    4. Under "Additional Memory" ensure that "Fit all virtual machine memory into reserved host RAM" is enabled.
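
Step 1 above (sufficient 'real' memory) can be sanity-checked with simple arithmetic before enabling the setting. The sketch below is illustrative only: the function name, the amount of RAM held back for the host operating system, and the example sizes are all assumptions, and it does not reproduce how VMWare itself calculates reserved memory.

```python
def fits_in_host_ram(host_ram_mb, host_reserved_mb, vm_ram_mb):
    """Return (fits, spare_mb) for the guest images meant to run together.

    host_reserved_mb is a hypothetical allowance for the host OS itself.
    """
    available_mb = host_ram_mb - host_reserved_mb
    spare_mb = available_mb - sum(vm_ram_mb)
    return spare_mb >= 0, spare_mb


# Hypothetical laptop: 16 GB RAM, 4 GB held back for the host OS,
# running two guest images of 6 GB and 4 GB at the same time.
fits, spare = fits_in_host_ram(16384, 4096, [6144, 4096])
print(fits, spare)
```

If the check fails, either reduce the memory allocated to the guest images or add physical memory before disabling swapping.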

    Example B - The following have been known to help in some situations:
    • Turn off Windows pagefile entirely in the guest machine (virtual image).
      • NOTE: To cope with not having a swap-file, the guest machine must be allocated sufficient 'real' memory!
    • Depending on how much memory the host machine has, memory page trimming can be either on or off
      • Enabling memory page trimming (on) can cause some performance slowdown but the upside is that it can potentially free up more memory for other guest machines.

Scenario #4 - Third party (Microsoft / virtual system vendor) incompatibility causing very slow network speed


Disable 'TCP Offload Engine' by using the "DisableTaskOffload" registry key on the host device

    Steps:
    1. Obtain some downtime
    2. Logon to the host device
    3. Click Start > Run.
    4. Type regedit and press Enter.
    5. Browse to the following location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
    6. Create a new DWORD value named DisableTaskOffload.
    7. Set the value data to 1
    8. Close the Registry Editor and restart the computer.
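
Where the same change must be made on several hosts, steps 5 to 7 could instead be captured in a .reg file. The sketch below only builds the text of such a file; the function name is hypothetical, and saving the file, importing it (for example by double-clicking it or using `reg import`) and restarting the computer are still left to the administrator.

```python
# Registry location and value name are taken from the steps above.
REG_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def build_reg_file(path=REG_PATH, name="DisableTaskOffload", value=1):
    """Return the text of a .reg file that sets a single DWORD value."""
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{path}]\n"
        f'"{name}"=dword:{value:08x}\n'
    )


reg_text = build_reg_file()
print(reg_text)
```

Review the generated text before importing it, since registry changes take effect for the whole host device.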

Scenario #5 - VMWare host system is not configured to be dedicated to Cognos/Controller-only use via a 'resource pool'
Create a Resource Pool to dedicate CPU/cores, RAM, and Disk to the Cognos servers
  • For more details, see separate IBM Technote #1425494.
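
The idea of dedicating resources can be illustrated by checking that the reservations planned for the Cognos servers are fully backed by physical capacity. The following is a simplified model, not the VMWare API: the class, field names, server names and figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoolMember:
    """A virtual machine with resources reserved for it (hypothetical model)."""
    name: str
    cpu_reservation_mhz: int
    mem_reservation_mb: int


def pool_is_fully_backed(members, host_cpu_mhz, host_mem_mb):
    """True if the summed reservations fit within the host's physical capacity."""
    cpu_needed = sum(m.cpu_reservation_mhz for m in members)
    mem_needed = sum(m.mem_reservation_mb for m in members)
    return cpu_needed <= host_cpu_mhz and mem_needed <= host_mem_mb


# Hypothetical pool: one Controller application server and one database server.
cognos_pool = [
    PoolMember("controller-app", cpu_reservation_mhz=8000, mem_reservation_mb=16384),
    PoolMember("controller-db", cpu_reservation_mhz=12000, mem_reservation_mb=32768),
]

backed = pool_is_fully_backed(cognos_pool, host_cpu_mhz=24000, host_mem_mb=65536)
print(backed)
```

If the result is false, the host cannot guarantee the reservations, and the servers may still end up competing for resources.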


Scenario #6 - Third party miscellaneous settings
For example, there is a known issue in VMWare/ESX when hosting Windows 2008 servers, caused by the default graphics (display) driver that VMWare Tools installs on its images. This can cause a severe performance slowdown.
  • The solution is to change the graphics display driver on the VMWare/ESX image
    • For more details, see third party links below (such as VMWare article KB 1011709).

Scenario #7 - Configure the CPU cores to be dedicated to the Controller-related virtual machines
In one real-life case, the customer had two separate environments ("test", shown as DEV below, and "production", shown as PROD below), both based on identical hardware running ESXi v4.1:
    • DEV (CPU cores not dedicated to the virtual machines) - Excel link report (XLS) ran in 26 seconds
    • PROD (CPU cores dedicated to the virtual machines) - same Excel link report (XLS) ran in 14 seconds

Scenario #8 - Servers not using all of the configured CPU cores due to host (e.g. ESX) software incorrectly configured
Consider the scenario where:
  • Customer's ESX host server is based on quad-core CPUs (4 cores per CPU socket)
  • The virtual server is configured to have 8 virtual CPU cores
  • The server software (e.g. on the SQL server) is a 'standard' edition (e.g. Windows 2008 Standard Edition), which only supports up to 4 CPU sockets (but unlimited cores per CPU)
By default, the Windows server might think that it has been given 8 CPU sockets (unsupported by the licensing), rather than 2 sockets (4 cores in each socket) which *is* supported by the licensing model. This will cause the server (e.g. see Task Manager graphs) to only utilise 4 of the 8 virtual CPUs.
  • The solution is to modify the "cpuid.coresPerSocket" setting (in ESX) as appropriate (for example change this value to 4)
  • In other words, tell the VMWare images that you have multiple (e.g. 4) cores per CPU socket
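
The effect of the setting can be worked through with simple arithmetic. The sketch below assumes that, by default, each virtual CPU is presented as its own single-core socket (consistent with the 8-socket symptom described above); the function name is hypothetical.

```python
def usable_vcpus(total_vcpus, cores_per_socket, max_sockets):
    """Return how many vCPUs the guest OS can use under a socket limit."""
    if total_vcpus % cores_per_socket != 0:
        raise ValueError("total_vcpus must be a multiple of cores_per_socket")
    sockets_presented = total_vcpus // cores_per_socket
    sockets_used = min(sockets_presented, max_sockets)
    return sockets_used * cores_per_socket


# Scenario from the text: 8 virtual CPUs, edition limited to 4 sockets.
default_case = usable_vcpus(8, cores_per_socket=1, max_sockets=4)  # 8 sockets presented
tuned_case = usable_vcpus(8, cores_per_socket=4, max_sockets=4)    # 2 sockets presented
print(default_case, tuned_case)  # prints "4 8"
```

With one core per socket only 4 of the 8 virtual CPUs are usable; with 4 cores per socket all 8 are.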

If using a different virtual platform, then the setting will be different (but the concept is the same).
  • For example, if using Fedora's KVM system, then the solution is to modify the guest's XML configuration, using the "topology sockets" parameter, such as:
    <cpu>
      <topology sockets='4' cores='4' threads='1'/>
    </cpu>
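
A topology fragment like the one above can be checked before it is applied. The following minimal sketch parses the fragment with Python's standard library and reports how many logical CPUs it describes (sockets x cores x threads); the function name is hypothetical, and whether the result matches the guest's configured vCPU count should be verified against your own configuration.

```python
import xml.etree.ElementTree as ET

# The fragment quoted in the example above.
cpu_xml = "<cpu><topology sockets='4' cores='4' threads='1'/></cpu>"


def topology_summary(xml_text):
    """Parse a <cpu> fragment and return its topology and logical CPU count."""
    topo = ET.fromstring(xml_text).find("topology")
    sockets = int(topo.get("sockets"))
    cores = int(topo.get("cores"))
    threads = int(topo.get("threads"))
    return {
        "sockets": sockets,
        "cores": cores,
        "threads": threads,
        "logical_cpus": sockets * cores * threads,
    }


summary = topology_summary(cpu_xml)
print(summary)
```

For the fragment shown, this reports 4 sockets and 16 logical CPUs, which stays within a 4-socket licensing limit.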


Historical Number

1040008

Document Information

Modified date:
15 June 2018

UID

swg21365257