XIV and QoS - Using new Performance Classes
anthonyv
I have been exploring some of the new features added in XIV firmware version 10.2.4.
Today I look at QoS (Quality of Service).
This new feature allows you to restrict how much IO (in IOs per second or IOPS) or throughput (in megabytes per second or MBps) an individual host can generate.
We do this by creating a new construct called a Performance Class. We can create up to four different Performance Classes and assign different hosts to each of those classes.
You can spot the new menu item under Hosts and Clusters:
The new panel contains a list of all your hosts and a new button at the top of the GUI panel called Add Performance Class:
When you create the Performance Class, you can set either an IOPS or a Bandwidth Limit.
It is also possible in a single Performance Class to set both an IOPS limit AND a Bandwidth Limit.
In these examples we set just a bandwidth limit.
One small quirk is that the limits will be rounded to a multiple of the number of active interface modules.
So don't be surprised if the numbers you enter are not the numbers that then appear. In my example I entered 100 (for 100 MBps).
However, when the class was created, the value was rounded down to 96 MBps (since I have 6 active interface modules).
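A minimal sketch of that rounding in Python, based only on the single 100 to 96 example above. The function name is hypothetical, and it assumes the value is always rounded down to the nearest multiple of the active interface module count; the XIV may behave differently for other inputs.

```python
def round_limit(requested: int, active_modules: int) -> int:
    """Round a requested limit down to a multiple of the module count.

    Assumption: rounding is always downward, as in the 100 -> 96 example.
    """
    if active_modules <= 0:
        raise ValueError("active_modules must be positive")
    return (requested // active_modules) * active_modules


# 100 MBps requested with 6 active interface modules
print(round_limit(100, 6))  # 96
```

Under this assumption, entering a value that is already a multiple of the module count (such as 96 with 6 modules) would be left unchanged.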
To prove a simple point, in this example we have created three Performance Classes, all of which limit bandwidth.
You can see by their names the limit they will impose on any host moved to that Performance Class.
The exercise I performed used an AIX LPAR with an Oracle workload generator that produces a constant workload of 150 MBps.
The first step was to add the host to the 96 MBps Performance Class.
Then the fun began. We monitored the performance of the LPAR with XIV Top and moved the LPAR between Performance Classes to see the effect of each class on throughput.
All of this was done concurrently with no host interruption.
You can see from the output of XIV Top that as the Performance Class was changed, the throughput was gradually throttled back (or allowed up) to that class's limit.
At the end of the process we then removed the LPAR from its Performance Class, returning it to an unrestricted state.
This effectively allowed it to move back up to 150 MBps.
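The steady-state outcome of this test can be modelled very simply: the host gets the smaller of what it asks for and what its Performance Class allows. The sketch below is illustrative only. The function and variable names are hypothetical, `None` stands for "not in any Performance Class", and it ignores the gradual ramp that XIV Top showed while the limit took effect.

```python
from typing import Optional


def observed_throughput(demand_mbps: float, limit_mbps: Optional[float]) -> float:
    """Steady-state throughput of a host under an optional bandwidth limit.

    limit_mbps=None models a host that is not in any Performance Class.
    """
    if limit_mbps is None:
        return demand_mbps
    return min(demand_mbps, limit_mbps)


WORKLOAD_MBPS = 150  # constant workload generated by the LPAR in this test

print(observed_throughput(WORKLOAD_MBPS, 96))    # 96  (in the 96 MBps class)
print(observed_throughput(WORKLOAD_MBPS, None))  # 150 (removed from its class)
```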
So why is this important?
Some clients had a concern that non-production hosts (such as test and development servers) got an equal share of the XIV performance pie.
In general this is not an issue, as the grid architecture of the XIV works very well with competing IO from multiple sources.
However, with the advent of very high performance machines, it is not outside the realms of possibility for an individual server to generate over 80,000 IOPS or over 1,000 MBps. I have certainly achieved this during benchmarking.
If you spin up several of these runaway hosts simultaneously, you could saturate the grid and impact more deserving hosts.
So adding this feature makes sense.
What is even more sensible? It is added at no extra cost via a non-disruptive code update.