In today's highly competitive marketplace, it is important to deploy a data processing architecture that not only meets your immediate tactical needs but that also provides the flexibility to grow and change to adapt to your future strategic requirements. In December 2009, IBM introduced the DB2 pureScale Feature for Enterprise Server Edition. The DB2 pureScale Feature builds on familiar and proven design features from the IBM DB2 for z/OS® database software. IBM has brought the proven industry-leading technology and reliability of DB2 for z/OS to open systems. DB2 pureScale is intended to meet the needs of many customers by providing:
- Practically unlimited capacity to scale out your system by adding machines to your cluster with ease.
- Application transparency to leverage your existing applications without changes.
- Continuous availability by providing active-active architecture with inherent redundancy.
- A reduced Total Cost of Ownership (TCO) by allowing for a simplified deployment and management of advanced technology.
A key aspect of the DB2 pureScale solution is the capability of the underlying storage. The storage must provide not only best-of-breed performance but also best-of-breed availability characteristics. In addition, DB2 pureScale takes advantage of specific storage capabilities to enhance the solution. This article takes you through the value proposition of DB2 pureScale on IBM Storwize V7000 storage, demonstrating a solution that meets the needs of even the most demanding businesses.
The DB2 pureScale Feature leverages an active-active shared-disk database implementation based on the DB2 for z/OS data sharing architecture. It leverages proven technology from DB2 on the mainframe to bring the best-of-breed technology to open systems. Using DB2 pureScale offers the following key benefits:
- Unlimited capacity: DB2 pureScale provides practically unlimited capacity by allowing for the addition and removal of members on demand. DB2 pureScale can scale to 128 members and has a highly efficient centralized management facility that allows for superior scale-out compared to peer-to-peer models. DB2 pureScale also leverages Remote Direct Memory Access (RDMA), which provides a highly efficient inter-node communication mechanism that further contributes to its scaling ability.
- Application transparency: An application that runs in a DB2 pureScale environment does not need any knowledge of the members in the cluster and does not need to be concerned with partitioning data. DB2 pureScale will automatically route applications to the members deemed most appropriate. DB2 pureScale also provides native support for a great deal of syntax used by other database vendors, allowing those applications to run in a DB2 pureScale environment with minimal or no changes. Thus, the benefits of DB2 pureScale can be leveraged in many cases without having to modify the application.
- Continuous availability: DB2 pureScale provides a fully active-active configuration such that if one member goes down, processing can continue at the remaining active members. During a failure, only data being modified on the failing member is temporarily unavailable until database recovery completes for that set of data, which typically takes tens of seconds. This is in direct contrast to other competing solutions where an entire system freeze may occur as part of the database recovery stage.
- Reduced TCO: DB2 pureScale can be deployed with ease, as is demonstrated here. The DB2 pureScale interfaces handle the deployment and maintenance of components integrated within DB2 pureScale. This avoids the steep learning curves that might be associated with some of the competing technologies.
The Storwize V7000 provides a best-of-breed storage solution for redundancy, resilience, performance, high availability and multi-site disaster recovery. The V7000 provides many options for back-end storage drive sizes and speeds, as well as varying RAID technologies, for companies to choose from depending on performance vs. redundancy needs.
Figure 1. IBM Storwize V7000
- Dual controllers and up to 12 (3.5") or 24 (2.5") drives in a 2U form factor
- Up to nine expansion enclosures attach to one control enclosure
- Up to four control enclosures clustered to form a single disk subsystem
- Ability to mix drive sizes and hard disk drives/solid-state drives in an enclosure
- Eight 8-Gbps Fibre Channel ports plus four 1-Gbps iSCSI ports per controller pair
For customers that need an easy, cost-effective way to automatically manage their performance needs, the use of Easy Tier is suggested. To use Easy Tier, simply create your pools of storage using arrays of any two, or all three, disk classes (Nearline, Enterprise, and Solid-State (SSD)). Run your normal workloads against these pools, and Easy Tier will continually 'learn' the characteristics of the workload and migrate extents (segments of data from 16 MB to 8 GB in size) to the appropriate disk class. 'Hot' (highly accessed) extents will migrate to the high-performance disks, while cooler or cold extents will migrate down to the lower-performance tier(s) of the pool. In this way, a high-performance pool is created without the cost of a full set of high-performance drives. This feature is transparent to host applications, but should create a noticeable performance benefit. At present, Storwize V7000 supports only two Easy Tier tiers ("hot" and "colder").
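The hot/cold placement idea described above can be illustrated with a toy model. This is only a sketch of the concept, not the Easy Tier algorithm itself: the function name `place_extents`, the access-count input, and the notion of a fixed number of fast-tier slots are all assumptions made for illustration.

```python
def place_extents(access_counts, fast_tier_slots):
    """Toy tiering model (hypothetical, not the Easy Tier implementation).

    access_counts: dict mapping an extent id to how often it was accessed.
    fast_tier_slots: how many extents fit on the high-performance tier.
    Returns a dict mapping each extent id to "fast" or "slow".
    """
    # Rank extents from most to least accessed; ties break on extent id
    # so the result is deterministic.
    ranked = sorted(access_counts, key=lambda e: (-access_counts[e], e))
    hot = set(ranked[:fast_tier_slots])
    return {e: ("fast" if e in hot else "slow") for e in access_counts}


if __name__ == "__main__":
    counts = {"e1": 5, "e2": 900, "e3": 40, "e4": 870}
    print(place_extents(counts, fast_tier_slots=2))
```

With only two fast-tier slots, the two most heavily accessed extents land on the fast tier and the rest stay on the slower tier, which is how a small amount of high-performance capacity can serve most of the I/O.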
The IBM Storwize V7000 provides a wide variety of Copy Services functions. Multiple forms of point-in-time copies (called FlashCopy®) are designed for simple redundant copies of data on the same storage machine, or data mining capability that has no effect on application data. Multiple forms of Peer-to-Peer Remote Copy (PPRC) are designed for two- or three-site replication and disaster recovery solutions. FlashCopy and PPRC in their many forms can also be used in conjunction with each other to create incredibly resilient, highly available, high-performance storage solutions.
Compression can reduce the disk capacity required to store data by up to 80 percent, helping increase efficiency and reduce storage costs. These compression results may also translate to performance improvements in systems that are bound by disk input/output operations.
An easy-to-use graphical administration interface allows for simpler deployment and management.
DB2 pureScale leverages capabilities provided by the storage to enhance the availability characteristics of the solution. The two key capabilities being exploited are the ability to create a DB2 Cluster Services tiebreaker disk and the ability to provide ultra-fast fail-over and recovery by virtually instantaneously fencing any host that is deemed unhealthy.
DB2 pureScale leverages a disk tie-breaker technology automatically configured by the DB2 installer if it is supported within the given environment. pureScale is designed such that if sets of machines get partitioned from others due to events such as network errors, the grouping of machines that is greater than half of the cluster is deemed the healthy side and will stay online, while the grouping of machines that is less than or equal to half the cluster will go to an offline state. DB2 pureScale implements this technique to ensure data integrity to protect against the case where two disjoint portions of the cluster proceed while thinking the other side is offline.
Consider a case where we have four machines in the pureScale configuration and the machines have a network partition such that the first two machines can only talk to each other and the last two machines can only talk to each other. In this case, neither side of the cluster constitutes greater than half the machines, and both sides would have to go offline unless we had a disk tie-breaker. The disk tie-breaker is a disk resource that basically acts as another voting party in the cluster. Each disjoint side of the cluster will try to acquire the disk tie-breaker. By definition, only one entity can acquire the disk tie-breaker at any point in time. The side that acquires the disk tie-breaker will be considered to have a majority portion of the cluster and will be deemed to be the portion of the cluster that can stay online. Thus with a disk tie-breaker resource, half of the machines in the pureScale cluster can fail or be separated, yet the pureScale cluster can still safely stay online to handle database requests.
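The quorum rule described above can be written down as a small decision function. This is a minimal sketch of the rule as stated in this article, not DB2 Cluster Services code; the function name `surviving_partition` and its parameters are hypothetical.

```python
def surviving_partition(partitions, tiebreaker_holder=None):
    """Return the index of the partition that may stay online, or None.

    partitions: list of sets of host names, one set per disjoint group.
    tiebreaker_holder: index of the partition that acquired the disk
    tie-breaker, or None if no tie-breaker disk is configured.
    """
    total = sum(len(group) for group in partitions)

    # A group holding more than half of the machines stays online.
    for index, group in enumerate(partitions):
        if 2 * len(group) > total:
            return index

    # Otherwise, a group with exactly half of the machines stays online
    # only if it also acquired the disk tie-breaker.
    for index, group in enumerate(partitions):
        if 2 * len(group) == total and tiebreaker_holder == index:
            return index

    # No group qualifies, so every group goes offline.
    return None


if __name__ == "__main__":
    split = [{"host1", "host2"}, {"host3", "host4"}]
    print(surviving_partition(split))                       # no tie-breaker disk
    print(surviving_partition(split, tiebreaker_holder=1))  # second half wins
```

For the four-machine example, the even split yields no survivor without a tie-breaker disk, while with one configured the half that acquires it remains online.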
DB2 pureScale leverages a SCSI-3 Persistent Reserve [mode 0x7] technology to allow for fast eviction and fencing of failed nodes in the cluster. DB2 pureScale ensures that it fences failed or rogue nodes from the cluster to provide the highest level of data integrity. By not allowing this failed node to access the storage, it has no ability to cause corruption. A key aspect of a reliable clustering solution is its ability to fence failed nodes and to do it in a timely manner. By leveraging this technology, pureScale can fence failed members in just a few seconds, where other technologies can take upwards of 60 seconds. DB2 will automatically enable this SCSI-3 Persistent Reserve technology if it is supported in the environment.
When shared disks in a DB2 pureScale configuration are backed by the V7000 storage controller, the multipath I/O solutions you can use are AIX® Multi-Path I/O (MPIO) and SDDPCM on AIX, and Device Mapper Multipath I/O (DM-MP) on Linux. You can use these multipath I/O solutions with either Fibre Channel or Fibre Channel over Ethernet (FCoE) protocols.
For a complete list of the validated storage configurations and which of the above features they support, check out the Storage hardware requirements.
V7000 requires firmware level 188.8.131.52 or later for SCSI-3 PR functionality. No additional configuration is required on the V7000 to enable SCSI-3 PR when using DB2 10 Fixpack 1 or later.
When you run the DB2 pureScale installer, it queries the underlying storage subsystem and detects, through the report capabilities SCSI inquiry command, whether the specified shared disk meets the pureScale SCSI-3 PR requirements. If the storage subsystem satisfies pureScale SCSI-3 PR requirements, the pureScale installer automatically enables the cluster configuration to support SCSI-3 PR.
If a pureScale cluster file system has the usePersistentReserve attribute enabled, that pureScale cluster is configured to use SCSI-3 PR. For example, in our test cluster, we queried the pureScale cluster file system configuration by issuing the db2cluster -cfs command.
# ./db2cluster -cfs -list -configuration
OPTION                 VALUE
---------------------- ----------------
...
tiebreakerDisks        gpfs2nsd
...
usePersistentReserve   yes
...
verifyGpfsReady        yes
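If you want to check this setting from a script, the listing above can be parsed as simple two-column text. The following is a minimal sketch that assumes the output keeps the whitespace-separated OPTION/VALUE layout shown in this article; the helper names `parse_cfs_configuration` and `uses_scsi3_pr` are hypothetical and not part of DB2.

```python
def parse_cfs_configuration(output):
    """Parse two-column OPTION/VALUE text into a dict.

    Assumes one whitespace-separated name/value pair per line, as in the
    listing shown in this article. Other lines are ignored.
    """
    options = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # command echo, "..." elisions, blank lines
        name, value = parts
        if name == "OPTION" or set(name) == {"-"}:
            continue  # header row and dashed separator row
        options[name] = value
    return options


def uses_scsi3_pr(options):
    """True if the parsed configuration reports usePersistentReserve as yes."""
    return options.get("usePersistentReserve") == "yes"


if __name__ == "__main__":
    sample = """# ./db2cluster -cfs -list -configuration
OPTION                 VALUE
---------------------- ----------------
...
tiebreakerDisks        gpfs2nsd
...
usePersistentReserve   yes
...
verifyGpfsReady        yes
"""
    config = parse_cfs_configuration(sample)
    print(config["tiebreakerDisks"], uses_scsi3_pr(config))
```

The sample text is the output from the test cluster above; in practice you would feed the function the captured output of the command on your own system.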
With a DB2 pureScale instance ready for use, it is recommended to have DB2 create the file systems to use for the data and the logs. This can be done using the db2cluster command, as follows:
- As root create one file system for data and one for logs.
# <DB2 Instance Path>/sqllib/bin/db2cluster -cfs -create -filesystem data -disk /dev/hdisk4
# <DB2 Instance Path>/sqllib/bin/db2cluster -cfs -create -filesystem log -disk /dev/hdisk5
Note: The "data" and "log" file systems will by default be created under /db2fs and will be accessible on all machines in the DB2 pureScale instance.
- As root, modify the owner of the file systems to be the DB2 instance owner so the DB2 instance owner will have full access to them.
# chown db2sdin1:db2iadm1 /db2fs/data
# chown db2sdin1:db2iadm1 /db2fs/log
- As the instance owner, configure the DB2 pureScale instance to accept remote client connections. This step along with the subsequent steps will be done by the DB2 instance owner ID. By default, the install of DB2 pureScale will add entries similar to the following into the /etc/services file of each machine in the DB2 pureScale instance:
db2c_db2sdin1 36630/tcp
To allow for remote connections, issue the following commands from the DB2 Instance Owner:
db2set db2comm=TCPIP
db2 update dbm cfg using svcename 36630
This is the value DB2 inserted into the /etc/services file during install, which defines the port used for client server communication.
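To confirm which port the installer registered before setting svcename, the services entry can be looked up programmatically. This is a minimal sketch that assumes the standard /etc/services line format (name, then port/protocol, with optional # comments); the function name `service_port` is hypothetical, and the service name db2c_db2sdin1 is simply the example entry shown above.

```python
def service_port(services_text, service_name, protocol="tcp"):
    """Return the port registered for service_name, or None if absent.

    services_text: the contents of an /etc/services-style file.
    """
    for line in services_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == service_name:
            port, _, proto = fields[1].partition("/")
            if proto == protocol:
                return int(port)
    return None


if __name__ == "__main__":
    sample = """
    # entries added by the DB2 pureScale install (example values)
    db2c_db2sdin1   36630/tcp
    """
    print(service_port(sample, "db2c_db2sdin1"))
```

On the server itself, Python's standard `socket.getservbyname("db2c_db2sdin1", "tcp")` performs the same lookup against the system services database.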
- As the instance owner, start the DB2 instance by issuing the db2start command. You can see the state of the DB2 pureScale instance at any point by using the
> db2start
06/14/2012 15:04:20     0   0   SQL1063N  DB2START processing was successful.
06/14/2012 15:04:20     1   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.
TIP: You can view the state of a DB2 pureScale cluster using the
- As the instance owner, create the database and move the logs to the log file system.
> db2 create db database_alias on /db2fs/data
> db2 update db cfg for database_alias using newlogpath /db2fs/log
- Catalog client connections to any active pureScale members and connect to the pureScale server.
There are many advantages that DB2 pureScale adds, as discussed in the introduction. Below are more details of use cases that demonstrate the added value pureScale provides. The steps previously documented in the DB2 pureScale introduction section already demonstrate a reduced cost of ownership by the simplicity of deployment.
DB2 pureScale allows members to be added to the configuration quickly and without any data redistribution requirements. The DB2 installation binaries are automatically stored on the IIH, thus not requiring access to the original install media when members are being added. A member can be added by simply running the
db2iupdt -d -add -m
Members can be started or quiesced transparently to the application such that the application is unaware a change has occurred.
DB2 pureScale provides the ability to dynamically distribute a workload across all the active members based on the utilization levels of the member machines. Multi-threaded CLI applications will by default have connection-level workload balancing without any changes. This workload balancing can be modified such that it applies at the transaction level as opposed to the connection level. For multi-threaded Java applications, the following can be changed in the connection string to take advantage of transaction-level workload balancing:
As additional members are started, clients will automatically route to the new member without any interruption of service. Also, members can be stopped as per the instructions under stealth maintenance without the application knowing this operation has even occurred.
Clients can also be configured with a preference for which member they connect to. This is referred to as client affinity and can be beneficial if a partitioned workload already exists.
NOTE: To take advantage of DB2 pureScale features such as transaction-level workload balancing or client affinity, the minimum client-level should be 9.7 fixpack 1 or the correlating JCC level. To correlate the JCC levels included in the various fixpack levels, check out DB2 JDBC Driver Versions.
In many cases it is critical to apply maintenance to a system, but you don't want any negative effect on the client applications. Stealth maintenance basically lets all transactions on a member complete and then transparently routes that application to another member. To drain member 1, the following command can be run:
db2stop member 1 quiesce
In some cases you may encounter a user session in which a UOW was started but was not committed or rolled back. The db2stop quiesce command has to wait for that UOW to complete before stopping that member unless a timeout value is specified. For situations like this, a user can specify a timeout value such as 10 minutes, which allows the application 10 minutes to complete the UOW; if the UOW is not completed after 10 minutes, DB2 will automatically force off that application. Any applications that complete their UOW within the 10 minutes will have been automatically rerouted to the active members at that point. To drain member 1 with a 10-minute timeout, the following command can be run:
db2stop member 1 quiesce 10
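The quiesce behavior described above comes down to three possible outcomes per application. The following sketch models only the decision as this article describes it and is not DB2 code; the function name `quiesce_outcome`, its parameters, and the outcome labels are hypothetical.

```python
def quiesce_outcome(minutes_to_finish_uow, timeout_minutes=None):
    """Toy model of what happens to one application during a quiesce.

    minutes_to_finish_uow: minutes until the application commits or rolls
    back its open UOW, or None if it never does.
    timeout_minutes: the timeout given to db2stop quiesce, or None if no
    timeout was specified.
    """
    finishes = minutes_to_finish_uow is not None
    if finishes and (timeout_minutes is None
                     or minutes_to_finish_uow <= timeout_minutes):
        # The UOW completes in time, so the application is transparently
        # rerouted to another active member.
        return "rerouted"
    if timeout_minutes is None:
        # No timeout and the UOW never completes: db2stop keeps waiting.
        return "db2stop waits"
    # The timeout expires first, so the application is forced off.
    return "forced"


if __name__ == "__main__":
    for remaining in (4, 25, None):
        print(remaining, quiesce_outcome(remaining, timeout_minutes=10))
```

With the 10-minute timeout from the example, an application that finishes in 4 minutes is rerouted, while one that needs 25 minutes or never commits is forced off.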
One of the significant value propositions of DB2 pureScale is the high-availability characteristics integrated into the architecture. All necessary resources are automatically monitored by the DB2 pureScale cluster services and restarted as needed. Applications connected to a failing member will automatically be rerouted to an active member where the application can reissue any failed transactions. Applications connected to a non-failing component will not be affected beyond what might be seen as a short performance blip.
One of the distinguishing factors of DB2 pureScale compared to competing technologies is that upon a member failure, no cluster-wide freeze occurs. In fact, only data in the process of being updated on the failing member is temporarily unavailable until recovery is complete. The recovery will be completed in the tens of seconds range so data availability will look similar to the following through a member failure.
Figure 2. High availability
DB2 pureScale inherently brings a local high-availability solution, but many customers will also require a disaster recovery solution to meet business continuity requirements. DB2 pureScale can leverage remote disk mirroring with Storwize V7000 Metro Mirror and is designed to work with database replication products. If the entire primary site running a DB2 pureScale instance fails, the remote site can be leveraged to allow business operations to continue. DB2 pureScale can also leverage the traditional database backup, restore and roll-forward functionality that can also be used to enable disaster recovery solutions.
Figure 3. Disaster recovery
DB2 pureScale provides a database solution that meets the needs of the most demanding customers. It is designed to scale effectively to meet the growing and dynamic needs of different organizations. Additional members can be started without any impact to existing applications to meet the demands of peak processing times. Applications will then be transparently balanced across all members, including the newly started member without any changes at the application side. If a member machine fails, applications will be automatically routed among the other active members. When the failed member machine comes back online, applications will be transparently routed to the restarted member. With all the capability DB2 pureScale provides, it still has a reduced TCO compared to other solutions as DB2 pureScale allows for a simplified deployment and maintenance model. One of the fundamental design points behind DB2 pureScale was to mask any underlying complexities to allow customers to be able to deploy DB2 pureScale very quickly.
All the value propositions of DB2 pureScale can be leveraged with IBM Storwize V7000 with out-of-the-box capability for ultra-fast fail-over and disk tiebreaker capabilities. IBM Storwize V7000 also provides industry-leading resiliency, which further enhances the solution to meet the needs of most businesses.
- Refer to "What is DB2 pureScale?" for more information.
- Check out "Deploying the DB2 pureScale Feature."
- Read the IBM Redbooks® publication titled "Implementing the IBM Storwize V7000 V6.3."
Madhusudan K J has been working in System Verification Testing for DB2 since 2006 at IBM. He is an IBM DB2 Certified Advanced DBA. He has special interest in DB2 high availability, backup, and recovery solutions.
Aslam Nomani has been working with the Database Technology (DBT) team in the IBM Toronto Laboratory for five years. For the past four years, he has worked in the DB2 Universal Database (UDB) System Verification Test department. Aslam has worked extensively in testing DB2 Universal Database in high availability environments. He is currently a team lead within the DBT test organization.
Saroj Kumar Tripathy has been working in system verification testing and functional verification testing for DB2 since 2010 at IBM. He is an IBM DB2 Certified Advanced DBA. His area of expertise is DB2 high availability, HADR and expert security solutions.
Sumair Kayani is a systems engineer at the IBM Toronto lab. As a member of the Systems Optimization Competency Center, his responsibilities include optimization of the software and hardware components that comprise the IBM integrated systems. His role as a systems administrator focused on System P and Storage Area Networks.
Barry Whyte is a Master Inventor working in the Systems & Technology Group in IBM Hursley, United Kingdom. He primarily works on the IBM SAN Volume Controller and Storwize V7000 virtual disk systems. In his 16 years at IBM, he has worked on the successful Serial Storage Architecture (SSA) and the IBM DS8000 range. He joined the SVC development team soon after its inception and has held many positions before taking on his current role as performance architect.