After successfully implementing the Real-time Compression feature in the Storwize V7000, IBM has gone a step further and brought this patented technology to the IBM XIV storage system. With the recently announced XIV 11.6.0 release, Real-time Compression is integrated directly into the XIV storage system. The IBM Random Access Compression Engine (RACE) technology is now part of the XIV software stack, so no extra hardware is needed: data is compressed before it is written to disk (above the cache), resulting in up to 80% storage capacity savings.
It is designed with transparency in mind, so it can be implemented without changes to applications, hosts, networks, fabrics, or external storage systems. The solution is not visible to hosts, so users and applications continue to work as before. To estimate the compression savings on existing uncompressed XIV volumes, the Comprestimator utility is now integrated with the XIV software.
What does Compression have in store for me?
On the XIV system, the compression ratio for all uncompressed volumes is continuously estimated, even before compression is enabled. The figure shows the various states of volumes on the system, ranging from uncompressed, to potential savings, to the total amount of compression achieved on a volume.
What are the Compression benefits for XIV?
With the inline implementation of Real-time Compression, the IBM XIV now delivers dramatic cost savings without the need for extra hardware and provides the following benefits:
- Increases usable capacity per rack, typically to one petabyte or more with Real-time Compression, greatly reducing the effective cost per unit of capacity
- Replicates compressed data faster and using less bandwidth, freeing up bandwidth for other uses
- Continuously displays predicted or actual compression ratios for all volumes
- Converts non-compressed volumes to compressed non-disruptively
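To make the capacity benefit concrete, here is a minimal sketch of how a savings percentage translates into effective capacity. The function name and the 100 TB figure are illustrative assumptions and not part of any XIV tooling; only the "up to 80%" savings figure comes from the announcement.

```python
def effective_capacity_tb(usable_tb: float, savings_pct: float) -> float:
    """Return the amount of uncompressed data that fits in usable_tb
    when compression saves savings_pct percent of the space."""
    if not 0 <= savings_pct < 100:
        raise ValueError("savings_pct must be in the range [0, 100)")
    # If s percent is saved, each stored TB holds 100 / (100 - s) TB of data.
    return usable_tb * 100 / (100 - savings_pct)


if __name__ == "__main__":
    # Hypothetical 100 TB of usable capacity at different savings levels.
    for pct in (25, 50, 80):
        print(f"{pct}% savings -> {effective_capacity_tb(100, pct):.1f} TB effective")
```

At the best-case 80% savings, the same physical space holds five times as much data, which is where the reduction in effective cost per capacity comes from.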
So how does it work?
The Real-time Compression implementation in XIV uses an above-cache architecture, where data is compressed or decompressed between the I/O interface and the cache. A compression node runs on every XIV module, taking advantage of the parallel XIV architecture. Each node compresses only the portion of a volume that belongs to its module, which distributes the compression workload across all XIV modules. As a result, Real-time Compression has minimal impact on the performance delivered by XIV.
On a write operation, data is compressed before it enters the cache, and an acknowledgment is sent back to the host. On a read operation, data is held compressed in the cache and is decompressed by RACE when it is read from the cache, before being passed to the host. During XIV mirroring, data is compressed only once and the compressed data is sent across the network, reducing the network bandwidth required.
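The write, read, and mirroring paths described above can be illustrated with a toy model. This is only a sketch: RACE is proprietary, so Python's zlib is used here as a stand-in compressor, and the class and method names are hypothetical rather than anything in the XIV software.

```python
import zlib


class AboveCacheSketch:
    """Toy model of an above-cache design: the cache only ever holds
    compressed data. zlib is a stand-in for RACE (an assumption)."""

    def __init__(self) -> None:
        self._cache: dict[int, bytes] = {}  # block address -> compressed bytes

    def write(self, address: int, data: bytes) -> str:
        # Compress before the data enters the cache, then acknowledge the host.
        self._cache[address] = zlib.compress(data)
        return "ack"

    def read(self, address: int) -> bytes:
        # Data sits compressed in the cache and is decompressed on the way out.
        return zlib.decompress(self._cache[address])

    def mirror_payload(self, address: int) -> bytes:
        # Mirroring sends the already-compressed bytes; no second compression.
        return self._cache[address]


if __name__ == "__main__":
    volume = AboveCacheSketch()
    block = b"customer_record;" * 256  # highly repetitive sample data
    volume.write(0, block)
    print(len(block), "bytes written,", len(volume.mirror_payload(0)), "bytes mirrored")
```

The point of the sketch is the ordering: compression happens once, on the way into the cache, and both the cache and the mirror link carry the smaller compressed form.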
What will benefit most from Compression?
- Database environments – DB2, Oracle, MS-SQL, and so on
- Database Applications – SAP, Oracle applications, and so on
- Server/Desktop Virtualization – KVM, VMware, Hyper-V, and so on
- Other compressible workloads – seismic, engineering, and so on
- Email – Microsoft Exchange, and so on
Are there any guidelines for Compression?
IBM Real-time Compression is appropriate for data that has the following characteristics:
- Any data for which the Comprestimator tool estimates 25% or higher savings
- Volumes that contain data that is not already compressed (for example, uncompressed image and video files)
- Data for which application-based encryption is not used, or data that is not sent encrypted to the XIV
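These guidelines can be expressed as a simple screening check. The sketch below is illustrative only: the `VolumeProfile` fields and the function name are hypothetical, and the savings estimate is assumed to come from whatever the Comprestimator reports for the volume.

```python
from dataclasses import dataclass

MIN_ESTIMATED_SAVINGS_PCT = 25.0  # threshold from the guideline above


@dataclass
class VolumeProfile:
    """Hypothetical description of a volume being considered for compression."""
    name: str
    estimated_savings_pct: float   # assumed to be the Comprestimator estimate
    already_compressed: bool       # e.g. data stored in a compressed format
    encrypted_before_array: bool   # application-level or in-flight encryption


def is_compression_candidate(volume: VolumeProfile) -> bool:
    """Apply the three guidelines: enough estimated savings,
    data not already compressed, and data not encrypted upstream."""
    return (
        volume.estimated_savings_pct >= MIN_ESTIMATED_SAVINGS_PCT
        and not volume.already_compressed
        and not volume.encrypted_before_array
    )


if __name__ == "__main__":
    volumes = [
        VolumeProfile("oltp_db", 57.0, False, False),
        VolumeProfile("video_archive", 3.0, True, False),
        VolumeProfile("encrypted_mail", 1.0, False, True),
    ]
    for v in volumes:
        print(v.name, "->", "compress" if is_compression_candidate(v) else "leave as is")
```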
Anything I can refer to?
Real-time Compression works best with randomly accessed data, such as databases like IBM DB2, Oracle, and MS-SQL Server, and it also provides good results with server virtualization solutions like VMware, KVM, and Hyper-V. With Oracle databases, compressed volumes take advantage of the above-cache architecture, compressing writes seamlessly. Compression of 57% was observed during the creation of a terabyte of data, with minimal performance penalty. (Publication: WP102551)
VMware vSphere virtual machines can be seamlessly deployed on compressed volumes, often with compression savings of 50% to 75%, allowing customers to reduce the storage capacity required for virtualized environments. (Publication: WP102552)
Microsoft Hyper-V virtualization helps customers maximize the use of System x servers and other resources. Included in Windows Server, Hyper-V helps reduce costs by allowing a greater number of application workloads to be hosted on fewer physical servers. With Microsoft SQL Server 2012 SP1 OLTP data files and Windows Server 2012 R2 VM system files stored in a Hyper-V virtual disk on an XIV compressed volume, compression savings of 73% were achieved. (Publication: WP102553)
What about the performance?
While one team tested the compression benefits and compiled the papers, another team from the IBM Tel Aviv lab was busy with performance testing of an Oracle database hosted on IBM XIV compressed volumes.
In the test setup, the team used both compressed and uncompressed volumes configured on XIV for better parallelism. These volumes were mapped to the ESX system hosting the database server to create multiple VMFS file systems. A 5 TB database was created on the VMFS volumes using the Benchmark Factory tool. During the 12-hour test run, the load was ramped from 1,000 to a maximum of 30,000 users to put the system under a realistic production load. The I/O per second (IOPS) and response time reported by the Benchmark Factory tool are shown in the figure below. Each point on the graph indicates an addition of 2,500 users. The graph clearly indicates that using compressed volumes has minimal impact on application response time.
Blog Authors: Mandar Vaidya, Shashank Shingornikar