Hi All,
This was an interesting discussion I had with one of my advocate customers recently, and I thought it would be worth posting for you all.
The customer wanted to use compression to get more out of their existing capacity, and was keen to try out the compression capability of XIV in this case. The same considerations apply if you use compression on a V7000 behind an SVC.
As they thought about how to best deploy this, their initial thinking was that they would perform compression on the XIV - because there would be lots more machines running the compression software, hence it would be much less likely that they would run out of resources to handle the compression for their workload.
This is a very valid thought process, and if you are really worried about overloading the compression capabilities of the SVC then it may be the right solution for you. However, let's consider the downsides of doing compression (or in fact any form of overallocation, e.g. thin provisioning, de-duplication, ...) on backend storage controllers. I focus on compression here, but each of these points can apply to thin provisioning or de-duplication as well.
- Volume level control:
- We find that not all volumes are good candidates for compression. Running compression behind the SVC makes it a lot harder to choose which volumes are compressed and which aren't. On the other hand, the problem with incompressible volumes is that they waste compression resources, and the system can run out of resources to manage that workload. Having lots of compression resources in the backend will mitigate this.
- EasyTier
- EasyTier is constantly migrating data around between managed disks to keep the workload balanced. This data migration will be forced to decompress and recompress for every migration if you run compression on the backend controller.
- Because EasyTier is moving data around, there is a good chance that every single byte of the managed disk behind SVC may end up containing old copies of data which have since been migrated elsewhere. This will reduce the compression effectiveness at the backend controller, because you will not get as much zero reclamation there.
- As a simple example (this wouldn't work this way in real life, but it helps to demonstrate the point):
- There are 10 x 1GiB extents in a managed disk group, and you create a volume which is 1GiB in size. This volume uses one extent from the managed disk group.
- You then write 50% compressible data onto that 1GiB volume. At this point, your volume is storing 1GiB of data and your managed disks are storing only 512MiB of data (1GiB * 50%).
- Over the course of the next month, the EasyTier software migrates this single extent around for performance balancing, and that 1GiB of data ends up being copied onto each of the 10 different extents. EasyTier doesn't delete the old copy of the data, it just leaves the old data there but stops using it.
- So now the managed disk is storing 10 different copies of the 1GiB of user data, which means that whilst your volume is still only storing 1GiB of data, the managed disks are storing 5GiB of data (10 * 1GiB * 50%).
- Capacity Planning and operational concerns
- I think that this is the least obvious problem: what happens when my overallocation runs out of space?
- If you run out of space on SVC, each volume has its own emergency capacity. This means that when you run out of space, the volumes will fail one at a time, and it might be hours or days between each volume running out of space and going offline. That gives your operations teams time to move things around to make more space available.
- I discuss this emergency capacity concept in more detail here: Thin Provisioning - what should you think about from an operational point of view
- In contrast, if your managed disk goes offline because the backend controller has run out of space, the entire SVC storage pool and all of the volumes (potentially 1000s of them) inside that pool will go offline at the same time. You will then have to find a way to bring the managed disk back online, and you won't be able to use anything like SVC data migration or deleting a volume to make more space in the pool.
- So you need to be very closely monitoring the backend storage to make sure that you never ever run out of space.
- Capability of the SVC compression hardware
- Whilst the compression capability on the 2145-CG8 and the V7000 Gen1 was not too hard to overload with incompressible workloads, it's actually a lot harder to overload the Compression Accelerator cards in the 2145-DH8 nodes or the V7000 Gen2, so overloading becomes less likely. I'd always recommend having both of the Compression Accelerator cards in these platforms if you are going to turn on compression.
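To put numbers on the simplified EasyTier example above, here is a rough Python sketch of the same arithmetic. It is a toy model only, not how EasyTier or any backend controller actually behaves: the function name, the round-robin migration pattern, and the assumption that stale extents are never zeroed or reclaimed are all illustrative assumptions of mine.

```python
# Toy model of the 10-extent example. All names here are hypothetical.
EXTENT_GIB = 1.0
COMPRESSED_FRACTION = 0.5  # 50% compressible data occupies half its size


def simulate(num_extents, migrations):
    """Return (volume_gib, backend_gib) after a number of extent migrations.

    Assumes a single-extent volume that is moved to the next extent on each
    migration, and that the old copy is left in place (never zeroed).
    """
    written = [False] * num_extents  # has this extent ever held data?
    live = 0                         # extent currently backing the volume
    written[live] = True

    for _ in range(migrations):
        live = (live + 1) % num_extents
        written[live] = True         # new copy lands here, old copy stays

    volume_gib = EXTENT_GIB          # the host still sees one extent of data
    # The backend must store the compressed size of every extent ever written.
    backend_gib = sum(written) * EXTENT_GIB * COMPRESSED_FRACTION
    return volume_gib, backend_gib


if __name__ == "__main__":
    print(simulate(10, 0))  # freshly written volume
    print(simulate(10, 9))  # data has visited all 10 extents
```

With no migrations the backend holds 0.5GiB for the 1GiB volume; once the data has visited all 10 extents it holds 5GiB, even though the volume itself has not grown.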
It has not been my intention to scare or discourage the use of overallocation in controllers behind SVC. I just want to make sure you all understand the factors involved, and take appropriate measures to protect your environment from space concerns.
Hope this is helpful,
Andrew