Preparing to provision a Db2 Big SQL instance
To query data, Db2 Big SQL instances must be connected to an existing remote big data storage system.
The remote big data storage system can be:
- A Hadoop cluster on Cloudera Data Platform Version 7.1.9
- Object storage. Db2 Big SQL supports:
  - IBM Cloud Object Storage
  - Amazon Web Services object storage
  - IBM Storage Scale object storage
  - Red Hat® OpenShift® Data Foundation
Object store requirements
To connect Db2 Big SQL to an object store, you must meet the following requirements.
- The object store must be one of the following Hadoop S3a-compatible object stores:
  - Amazon Web Services S3
  - IBM Cloud Object Storage
  - IBM Storage Scale Object Storage
  - IBM Storage Ceph®/Red Hat Ceph Storage
    Note: If version 7 of Ceph is used, the minimum level that is supported is 7.0z2.
  - Microsoft Azure Data Lake Storage Gen2
- The credentials must permit read and write access on the storage that Db2 Big SQL interacts with.
Remote Hadoop cluster requirements
To connect Db2 Big SQL to a remote Hadoop cluster, you must meet the following requirements.
- The Hadoop cluster runs Cloudera Data Platform (CDP) 7.1.9 on x86-64 hardware.
- Because Db2 Big SQL needs to connect to the individual HDFS DataNodes, they must be accessible, and the associated ports on each DataNode must be open to the Db2 Big SQL service. The following components must also be accessible:
  - Cloudera Manager server
  - HDFS NameNodes
  - ZooKeeper
  - Hive metastore
  - Ranger (if configured)
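
The accessibility requirements above can be spot-checked before provisioning. The following is a minimal sketch, not an IBM-provided tool: it attempts a plain TCP connection to each endpoint and reports whether the port is reachable. The function name `check_endpoints`, the host names, and the port numbers are all assumptions for illustration. The ports shown are common defaults for these services and might differ on your cluster, so confirm them in Cloudera Manager.

```python
import socket


def check_endpoints(endpoints, timeout=3.0):
    """Try a TCP connection to each (host, port) pair.

    Returns a dict that maps (host, port) to True when the connection
    succeeds and to False when it is refused, times out, or the host
    name cannot be resolved.
    """
    results = {}
    for host, port in endpoints:
        try:
            # create_connection resolves the host and opens a TCP socket;
            # the context manager closes the socket immediately afterwards.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[(host, port)] = False
    return results


# Hypothetical hosts and assumed default ports; replace with your own values.
EXAMPLE_ENDPOINTS = [
    ("cm.example.com", 7183),         # Cloudera Manager server (TLS)
    ("namenode1.example.com", 8020),  # HDFS NameNode RPC
    ("datanode1.example.com", 9866),  # HDFS DataNode data transfer
    ("zk1.example.com", 2181),        # ZooKeeper client port
    ("hms.example.com", 9083),        # Hive metastore Thrift
    ("ranger.example.com", 6080),     # Ranger Admin (if configured)
]
```

Calling `check_endpoints(EXAMPLE_ENDPOINTS)` from a host on the same network as the Db2 Big SQL service returns a result for every endpoint, so the unreachable ones can be listed together. A successful TCP connection shows only that the port is open. It does not verify authentication, TLS, or Kerberos configuration.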