Mapping similar source connections in Watson Knowledge Catalog

IBM Spectrum® Discover supports mapping its source connections to Watson™ Knowledge Catalog (WKC) connections through the WKC Connector App.

When metadata is being exported to WKC, IBM Spectrum Discover might use a source connection that WKC can also connect to. In such a scenario, you can configure the connection mapping within the WKC Connector App.

The following source connection types can be mapped:
  • Amazon S3
  • IBM Cloud® Object Storage
  • IBM Storage Scale
Note: Connections with IBM Storage Scale are established through an S3 connection because WKC does not support IBM Storage Scale directly.

The details for configuring the connection map for each of these source connections are described in the following sections.

S3

For an IBM Spectrum Discover S3 connection, the WKC connection must contain the following details:

Bucket
If the bucket name is configured, then you do not need to provide any further configuration details. The WKC Connector App can infer the details from the global namespace of Amazon S3 buckets.
If the bucket name is not provided, then configure the WKC_CONNECTION_MAP environment variable by using the following format: <datasource>;<cluster>:<wkc connection name>. All documents that the WKC Connector App receives in its work messages for the data source and cluster pair defined in the variable are mapped to the WKC connection of that name. A sample variable value is shown:
WKC_CONNECTION_MAP=testbucket1.sd.ibm.com;s3.eu-west-1.amazonaws.com:s3_con_no_bucket
Note: The bucket name is mandatory when you configure the connection in IBM Spectrum Discover, but it is optional in WKC.
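
As an illustration of the entry format, the following Python sketch splits a single mapping entry into its three parts. The function name parse_entry is hypothetical and is not part of the WKC Connector App, whose own parsing might differ. The sketch assumes that the WKC connection name contains no colon, so the entry is split at its last colon.

```python
def parse_entry(entry: str) -> tuple[str, str, str]:
    """Split '<datasource>;<cluster>:<wkc connection name>' into its parts.

    Hypothetical helper for illustration only.
    """
    # Assumption: the connection name contains no ':', so split at the last one.
    key, sep, connection = entry.rpartition(":")
    if not sep or not connection:
        raise ValueError(f"missing ':<wkc connection name>' in {entry!r}")
    # The data source and the cluster are separated by the first ';'.
    datasource, sep, cluster = key.partition(";")
    if not sep or not datasource or not cluster:
        raise ValueError(f"missing '<datasource>;<cluster>' in {entry!r}")
    return datasource, cluster, connection


# Sample value taken from the S3 example above.
print(parse_entry("testbucket1.sd.ibm.com;s3.eu-west-1.amazonaws.com:s3_con_no_bucket"))
```

In this sample, the bucket name is the data source, the S3 endpoint host is the cluster, and s3_con_no_bucket is the WKC connection name.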

IBM Cloud Object Storage Infrastructure

For an IBM Spectrum Discover IBM Cloud Object Storage connection, the WKC connection must contain the following details:

Login URL
The login URL must be that of the accessor defined in IBM Spectrum Discover. The data source for IBM Cloud Object Storage is the vault. However, because a vault cannot be specified in the WKC connection, you must provide the mapping in the WKC_CONNECTION_MAP environment variable in the following format: <datasource>;<cluster>:<wkc connection name>. A sample variable value is shown:
WKC_CONNECTION_MAP=vault1;e09cdac0-80f8-73be-00ed-cb8edeede242:local_cos_con

Multiple IBM Cloud Object Storage connections to the same system can map to the same WKC connection, because the WKC connection is defined at a higher level and can see all vaults.

IBM Storage Scale

Connection mapping with IBM Storage Scale must be done through an S3 connection as WKC cannot connect directly with it.

To establish an S3 connection, configure the following details in the WKC Connector App:

Endpoint URL
The S3 endpoint URL on the IBM Storage Scale node. For example, http://modevvm19.tuc.stglabs.ibm.com:9000.
Access Key
Type the S3 access key.
Secret Key
Type the S3 secret key.
Bucket
Do not enter values in the bucket field.
Define the mapping within the environment variable WKC_CONNECTION_MAP in the following format: <datasource>;<cluster>:<wkc connection name>. A sample variable value is shown:
WKC_CONNECTION_MAP=scale0;modevvm19.tuc.stglabs.ibm.com:s3_scale

Mapping multiple connections

You can map multiple connections within the same environment variable by using commas to separate the values. A sample is shown:
WKC_CONNECTION_MAP=vault1;e09cdac0-80f8-73be-00ed-cb8edeede242:local_cos_con,scale0;modevvm19.tuc.stglabs.ibm.com:s3_scale,testbucket1.sd.ibm.com;s3.eu-west-1.amazonaws.com:s3_con_no_bucket
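
To show how a combined value breaks down, the following Python sketch parses a comma-separated value into a lookup table keyed by the data source and cluster pair. The function name parse_connection_map is hypothetical, and the whitespace and trailing-comma handling are assumptions of this sketch, not documented behavior of the WKC Connector App.

```python
def parse_connection_map(value: str) -> dict[tuple[str, str], str]:
    """Parse a WKC_CONNECTION_MAP value into {(datasource, cluster): connection}.

    Hypothetical helper for illustration only.
    """
    mapping: dict[tuple[str, str], str] = {}
    for entry in value.split(","):
        entry = entry.strip()  # assumption: surrounding whitespace is ignored
        if not entry:
            continue  # assumption: an empty entry (trailing comma) is skipped
        # Connection name follows the last ':'; data source precedes the first ';'.
        key, _, connection = entry.rpartition(":")
        datasource, _, cluster = key.partition(";")
        if not (datasource and cluster and connection):
            raise ValueError(f"malformed entry: {entry!r}")
        mapping[(datasource, cluster)] = connection
    return mapping


value = (
    "vault1;e09cdac0-80f8-73be-00ed-cb8edeede242:local_cos_con,"
    "scale0;modevvm19.tuc.stglabs.ibm.com:s3_scale,"
    "testbucket1.sd.ibm.com;s3.eu-west-1.amazonaws.com:s3_con_no_bucket"
)
connections = parse_connection_map(value)
# Look up the WKC connection for one data source and cluster pair.
print(connections[("scale0", "modevvm19.tuc.stglabs.ibm.com")])  # prints s3_scale
```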

Run the following command to edit the WKC Connector App deployment and add the mapping:

oc -n spectrum-discover edit deploy/spectrum-discover-wkcconnector

The mapping is set in the deployment as an environment variable in the following format:
name: WKC_CONNECTION_MAP
value: vault1;e09cdac0-80f8-73be-00ed-cb8edeede242:local_cos_con,scale0;modevvm19.tuc.stglabs.ibm.com:s3_scale,testbucket1.sd.ibm.com;s3.eu-west-1.amazonaws.com:s3_con_no_bucket
Note: The edit command configures the WKC connection mapping and automatically restarts the WKC pod.
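
Because a misplaced separator is easy to miss in a long value, the value can be assembled from individual fields before it is pasted into the deployment. The following Python sketch is one way to do that. The function name build_connection_map is hypothetical, and the rule that no field may contain a semicolon, colon, or comma is an assumption of this sketch based on the sample values shown, not a documented restriction.

```python
RESERVED = ";:,"  # separators used by the mapping format


def build_connection_map(entries: list[tuple[str, str, str]]) -> str:
    """Join (datasource, cluster, connection) triples into one mapping value.

    Hypothetical helper for illustration only.
    """
    parts = []
    for datasource, cluster, connection in entries:
        for field in (datasource, cluster, connection):
            # Assumption: fields are non-empty and contain no separator characters.
            if not field or any(ch in field for ch in RESERVED):
                raise ValueError(f"invalid field: {field!r}")
        parts.append(f"{datasource};{cluster}:{connection}")
    return ",".join(parts)


value = build_connection_map([
    ("vault1", "e09cdac0-80f8-73be-00ed-cb8edeede242", "local_cos_con"),
    ("scale0", "modevvm19.tuc.stglabs.ibm.com", "s3_scale"),
])
# Print the two lines in the same layout as the deployment fragment above.
print("name: WKC_CONNECTION_MAP")
print(f"value: {value}")
```

The cluster field for IBM Storage Scale is the host name without the port, as in the sample value, so the colon check does not reject it.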