Db2 and Cloud Object Storage: Get Started
For a while, Db2 has built-in support to directly access Cloud Object Storage (COS) like Amazon S3 and IBM COS (Softlayer / IBM Bluemix IaaS). It allows to perform backups (and restores) as well as data load operations to and from the cloud. Phil Nelson wrote an excellent overview and introduction on the IDUG blogwith his examples focused on S3. Thus, in the following I am going to mostly cover Db2 using the IBM Cloud Object Storage and fill in some additional details.
IBM Cloud Object Storage
Overview
Db2 manages Cloud Object Storage location and access to them on the instance level. First, the provider along with the necessary credentials are cataloged and an alias assigned. Thereafter, the alias can be referred to on the database level by using an URI with the DB2REMOTE prefix:
First Steps
To get started with Cloud Object Storage on a recent version of Db2 some requirements need to be met. Because the information about a cloud or remote storage location includes access credentials, Db2 requires the presence of a keystore. The process of creating a local keystore or how to configure a centralized keystore are described in the Db2 Knowledge Center. It is the same process as for native data encryption. If you already have that up and running, you are good to go. Else, the next is to configure the Db2 instance keystore_location and keystore_type settings. In my case it was simply:
The keystore “hlks.p12” of type PKCS12 is located in the directory “ks”. Thereafter, restart Db2 for the changes to become active.
An alias for a storage location is defined on the instance level using the command CATALOG STORAGE ACCESS ALIAS:
The above creates a new alias “hlswift” with a Frankfurt-based storage location. The credentials are encrypted and stored in the keystore configured above. Once I have an alias defined I could use it like shown in the backup example above. The CATALOG command has (security) options to restrict who can use the alias. It is also possible to directly specify which container or storage bucket to use.
Staging Area
As a temporary buffer for uploading or downloading files Db2 uses a staging area. The path for it can be set via the DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH registry variable:
The above sets the staging area to the “/tmp” directory. The default is in the instance directory under “sqllib/tmp/RemoteStorage.xxxx”. The “xxxx” refers to the Db2 partition number. Note that the staging area needs to have enough capacity to hold backup images or load files.
Administration / Maintenance
The defined storage aliases can be printed via the LIST STORAGE ACCESS command. It returns the details on each of the defined storage locations except the passwords. Here is the entry for the alias “hlswift” as defined above:
Next, to remove an existing alias, use the UNCATALOG STORAGE ACCESS ALIAS command. Last but not least, there is a command to rotate the key for the encrypted credentials, ROTATE MASTER KEY FOR STORAGE ACCESS.
Conclusions
The above provides you with a rough introduction into combining Db2 with Cloud Object Storage (or the other way?) with a focus on the IBM COS. If not done, start by reading Phil Nelson’s introduction at the IDUG website. There are many details that I didn’t cover today which warrant a follow-up blog post.
If you want to learn more about that topic or discuss the blog with me, then meet me at the upcoming IDUG EMEA and Db2 Aktuell conferences. Additionally, you can read more about the features and benefits about IBM Cloud Object Storage here. If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.
Note: For an update on how to backup data to cloud storage, load from external storage, and even directly query data somewhere in the Internet, see my blog post "Db2 and IBM Cloud Object Storage with the S3 API."