This section contains processor-allocation guidelines for both dedicated processor partitions and shared processor partitions.
Because Ethernet running with an MTU of 1500 bytes consumes more processor cycles than Ethernet running with jumbo frames (MTU 9000), the guidelines differ for each case. In general, processor utilization for large packet workloads on jumbo frames is approximately half that required for MTU 1500.
If MTU is set to 1500, provide one processor (1.65 GHz) per Gigabit Ethernet adapter to help reach maximum bandwidth. This is equivalent to ten 100-Mb Ethernet adapters if you are using smaller networks. For smaller transaction workloads, plan to use one full processor to drive the Gigabit Ethernet workload to maximum throughput. For example, if two Gigabit Ethernet adapters will be used, allocate up to two processors to the partition.
If MTU is set to 9000 (jumbo frames), provide 50% of one processor (1.65 GHz) per Gigabit Ethernet adapter to reach maximum bandwidth. For small packet workloads, plan to use one full processor to drive the Gigabit Ethernet workload; jumbo frames have no effect on the small packet workload case.
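As a rough illustration of these per-adapter guidelines, the following sketch turns an adapter count and MTU into a suggested starting allocation. The function name and inputs are illustrative, and the values assume 1.65 GHz processors and streaming-type workloads:

```python
# Rule-of-thumb processor allocation per Gigabit Ethernet adapter,
# based on the guidelines above (1.65 GHz processors assumed).
def processors_for_gigabit_adapters(adapter_count: int, mtu: int) -> float:
    """Suggested processor entitlement to reach maximum streaming bandwidth."""
    if mtu == 9000:          # jumbo frames: about half a processor per adapter
        per_adapter = 0.5
    else:                    # MTU 1500: one full processor per adapter
        per_adapter = 1.0
    return adapter_count * per_adapter

# Example from the text: two Gigabit Ethernet adapters at MTU 1500.
print(processors_for_gigabit_adapters(2, 1500))   # 2.0 processors
```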
The sizing provided is divided into two workload types: TCP streaming and TCP request and response. Both MTU 1500 and MTU 9000 networks were used in the sizing, which is provided in terms of machine cycles per byte of throughput for streaming or per transaction for request/response workloads.
The data in the following tables was derived using the following formula:
cycles per byte (or per transaction) = (number of processors × processor utilization × processor clock frequency) / throughput rate in bytes per second (or transactions per second)
For the purposes of this test, the numbers were measured on a logical partition with one 1.65 GHz processor with simultaneous multithreading (SMT) enabled.
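As a check on this formula, the following sketch reproduces one table entry from the measured data: the MTU 1500 simplex streaming case with the threading option enabled (112.8 MB/s at 80.6% of one 1.65 GHz processor). The assumption that MB means 2^20 bytes follows the worked example later in this section:

```python
# Reproducing a table entry from the formula above (MTU 1500 simplex streaming,
# threading enabled: 112.8 MB/s at 80.6% of one 1.65 GHz processor).
processors = 1
utilization = 0.806
clock_hz = 1.65e9
throughput_bytes_per_sec = 112.8 * 1024 * 1024   # MB/s interpreted as 2^20 bytes/s

cycles_per_byte = (processors * utilization * clock_hz) / throughput_bytes_per_sec
print(round(cycles_per_byte, 1))   # ~11.2, matching the table value
```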
For other processor frequencies, the values in these tables can be scaled by the ratio of the processor frequencies to obtain approximate values for sizing. For example, for a 1.5 GHz processor, use 1.65/1.5 × the cycles per byte value from the table. This results in a value 1.1 times the table value, that is, about 10% more cycles to compensate for the slower clock rate of the 1.5 GHz processor.
To use these values, multiply your required throughput rate (in bytes or transactions per second) by the cycles per byte or cycles per transaction value in the following tables. The result is the number of machine cycles per second that the workload requires at a 1.65 GHz processor speed. Then adjust this value by the ratio of the actual machine speed to the 1.65 GHz speed. To find the number of processors, divide the result by 1,650,000,000 cycles (or by the cycle rate if you adjusted to a different machine speed). The resulting number of processors is what is needed to drive the workload.
For example, if the Virtual I/O Server must deliver 200 MB per second of streaming throughput (using the MTU 1500 simplex value of 11.2 cycles per byte), the calculation is as follows:
200 × 1024 × 1024 bytes per second × 11.2 cycles per byte = 2,348,810,240 cycles per second; 2,348,810,240 ÷ 1,650,000,000 cycles per processor = 1.42 processors.
In round numbers, it would require 1.5 processors in the Virtual I/O Server to handle this workload. Such a workload could then be handled with either a 2-processor dedicated partition or a 1.5-processor shared-processor partition.
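A minimal sketch of this sizing calculation, assuming 1.65 GHz processors and the MTU 1500 simplex streaming value of 11.2 cycles per byte from the tables below. The function and variable names are illustrative; for other clock speeds, first scale the cycles-per-byte value as described above:

```python
# Sizing sketch: processors needed to drive a given streaming throughput,
# using a cycles-per-byte value from the tables (measured at 1.65 GHz).
CYCLES_PER_PROCESSOR = 1_650_000_000   # one 1.65 GHz processor

def processors_needed(throughput_mb_per_sec: float, cycles_per_byte: float) -> float:
    throughput_bytes = throughput_mb_per_sec * 1024 * 1024
    required_cycles = throughput_bytes * cycles_per_byte
    return required_cycles / CYCLES_PER_PROCESSOR

# Worked example from the text: 200 MB/s of MTU 1500 simplex streaming.
print(round(processors_needed(200, 11.2), 2))   # ~1.42 processors
```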
The following tables show the machine cycles per byte for a TCP streaming workload. The first table shows the values with the Shared Ethernet Adapter threading option enabled, and the second shows the values with the threading option disabled (the threading option is discussed after the tables).

| Type of Streaming | MTU 1500 rate and processor utilization | MTU 1500, cycles per byte | MTU 9000 rate and processor utilization | MTU 9000, cycles per byte |
|---|---|---|---|---|
| Simplex | 112.8 MB/s at 80.6% processor | 11.2 | 117.8 MB/s at 37.7% processor | 5 |
| Duplex | 162.2 MB/s at 88.8% processor | 8.6 | 217 MB/s at 52.5% processor | 3.8 |

The following table shows the same streaming workloads with the threading option disabled:

| Type of Streaming | MTU 1500 rate and processor utilization | MTU 1500, cycles per byte | MTU 9000 rate and processor utilization | MTU 9000, cycles per byte |
|---|---|---|---|---|
| Simplex | 112.8 MB/s at 66.4% processor | 9.3 | 117.8 MB/s at 26.7% processor | 3.6 |
| Duplex | 161.6 MB/s at 76.4% processor | 7.4 | 216.8 MB/s at 39.6% processor | 2.9 |
The following tables show the machine cycles per transaction for a request and response workload. A transaction is defined as a round-trip request and reply of the size listed in the first column. The first table shows the values with the threading option enabled, and the second shows the values with the threading option disabled.

| Size of transaction | Transactions per second and Virtual I/O Server utilization | MTU 1500 or 9000, cycles per transaction |
|---|---|---|
| Small packets (64 bytes) | 59,722 TPS at 83.4% processor | 23,022 |
| Large packets (1024 bytes) | 51,956 TPS at 80% processor | 25,406 |

The following table shows the same request and response workloads with the threading option disabled:

| Size of transaction | Transactions per second and Virtual I/O Server utilization | MTU 1500 or 9000, cycles per transaction |
|---|---|---|
| Small packets (64 bytes) | 60,249 TPS at 65.6% processor | 17,956 |
| Large packets (1024 bytes) | 53,104 TPS at 65% processor | 20,196 |
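For request and response workloads, the same arithmetic applies per transaction rather than per byte. As a hypothetical illustration (the 10,000 TPS figure is invented for the example), 10,000 small-packet transactions per second with the threading option enabled would size as follows:

```python
# Hypothetical request/response sizing: 10,000 small-packet transactions per
# second with threading enabled (23,022 cycles per transaction, from the first
# transaction table above), on 1.65 GHz processors.
tps = 10_000
cycles_per_transaction = 23_022
print(round(tps * cycles_per_transaction / 1_650_000_000, 2))   # ~0.14 processors
```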
The preceding tables demonstrate that the threading option of the Shared Ethernet Adapter adds overhead: approximately 16% to 20% more cycles for MTU 1500 streaming and 31% to 38% more for MTU 9000. The threading option has proportionally more overhead at lower workload rates because a thread is dispatched for each packet; at higher workload rates, such as full duplex or the request and response workloads, the threads can run longer without waiting and being redispatched. The threading option is a per-Shared Ethernet Adapter option that can be configured with Virtual I/O Server commands. Disable the threading option if the Shared Ethernet Adapter is running in a Virtual I/O Server partition by itself (that is, without virtual SCSI in the same partition).
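These overhead percentages can be recomputed directly from the cycles-per-byte values in the streaming tables above; the results differ slightly from the figures quoted because the table values are rounded:

```python
# Threading overhead derived from the streaming tables above
# (threaded cycles-per-byte divided by non-threaded cycles-per-byte).
mtu_1500 = {"simplex": (11.2, 9.3), "duplex": (8.6, 7.4)}
mtu_9000 = {"simplex": (5.0, 3.6), "duplex": (3.8, 2.9)}

for label, table in (("MTU 1500", mtu_1500), ("MTU 9000", mtu_9000)):
    for mode, (threaded, non_threaded) in table.items():
        overhead = (threaded / non_threaded - 1) * 100
        print(f"{label} {mode}: ~{overhead:.0f}% more cycles with threading")
```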
You can enable or disable threading using the -attr thread option of the mkvdev command. To enable threading, use the -attr thread=1 option. To disable threading, use the -attr thread=0 option. For example, the following command creates a Shared Ethernet Adapter over physical adapter ent1 and virtual adapter ent5 with threading disabled:
mkvdev -sea ent1 -vadapter ent5 -default ent5 -defaultid 1 -attr thread=0
You can create the Virtual I/O Server in a shared-processor partition if it serves only slower-speed networks (for example, 10/100 Mb) and a full processor partition is not needed. It is recommended that this be done only if the Virtual I/O Server workload is less than half a processor or if the workload is inconsistent. Configuring the Virtual I/O Server partition as uncapped might also allow it to use more processor cycles as needed to handle inconsistent throughput. For example, if the network is used only when other processors are idle, the partition could be created with a minimal processor entitlement to handle the light workload during the day, and as an uncapped partition it could use additional machine cycles at night.
If you are creating a Virtual I/O Server in a shared-processor partition, add additional processor entitlement as a sizing contingency.