Using object store providers

IBM® Connect:Direct® for UNIX can be configured to extend support to object storage providers, including IBM Cloud Object Storage, Microsoft Azure Blob, Google Storage, Amazon S3, and S3-compatible providers such as MinIO, Dell EMC ECS, and Red Hat Ceph, for both public and on-premises cloud operations. Users retain the benefits of Connect:Direct features such as security, reliability, and point-to-point file transfers optimized for high-volume delivery, along with the versatility that comes with an object storage backend.

The Linux platform supports managed file transfers between the node and the object store. It is strongly recommended that Connect:Direct be installed as close to the object store storage devices as possible for high performance, consistency, and reliability. Remote access to object stores is theoretically possible but strongly discouraged, because performance frequently suffers due to inconsistent access times to storage resources.

To set up accounts, instances, and storage on cloud providers, contact your IT Administrator.

An IBM Connect:Direct for UNIX node running this release can be located on-premises or run on a cloud instance. Both the Pnode and the Snode can also be configured on two cloud instances. An object store can serve as a source or a destination for sending and receiving files.

Setting up Connect:Direct Node on object store providers

By default, cloud support for Connect:Direct is not enabled. To enable cloud support, complete the tasks described in the following sections.

Connect:Direct for UNIX can also be configured to extend support to other S3 object store providers, such as MinIO, Dell EMC ECS, and Red Hat Ceph, or to use dedicated endpoints for public and on-premises cloud operations. For endpoint configuration and overrides, see Endpoints.

Prerequisites to set up Connect:Direct for UNIX on a cloud provider

Before you configure Connect:Direct node definitions necessary for using Connect:Direct for UNIX, you must complete the following tasks:
  1. Set up cloud accounts and credentials
  2. Select and create a compute instance (Red Hat or SUSE)
  3. Create IAM users and roles
  4. Create a security group. Port numbers specific to Connect:Direct should be added to the security group
  5. Create storage
  6. Obtain credentials for cloud storage object access

For more information, see Account (amazon.com), Get started with Google Cloud | Documentation, IBM Cloud Docs, and Azure documentation | Microsoft Docs.

Installing a Connect:Direct for UNIX node on cloud

No specific configuration is required to install a Connect:Direct for UNIX node on a compute instance. For installation information, see Installing Connect:Direct for UNIX.

If you are upgrading from an older release of Connect:Direct for UNIX, note that:
  • CD Unix, Linux platform, and JRE are now included in the base installation
  • Initparm entries to be included during the S3 plugin configuration are updated during the upgrade process

Setting up Connect:Direct Node for object store providers

Store Object naming and initparm.cfg

Objects on object stores can be read and written with Connect:Direct.

A file name identified with a URI-style format refers to an object in an object store:
scheme://bucketOrContainer/objectKey

The scheme in the Connect:Direct process file name is used to find the matching entry in the initparm.cfg file, using the name field of a file.ioexit section as the key.
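For example, with the initparm entries shown later in this section, a process file name like the following (the bucket and object names are illustrative) selects the file.ioexit entry whose name field is S3QA:

FILE=S3QA://mybucket/inbound/payroll.txt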

Connect:Direct for UNIX can read and/or write to the following object stores:
  • Amazon S3
  • IBM Cloud Object Storage
  • Azure Blob
  • Google Storage

Each of these is called an object store provider. Each time a process involves an object from an object store, the correct provider must be selected. This is done through properties set in the process or in the initparm.cfg file.

Selecting the right Provider

Main property

Store provider selection is driven by the value of the store.providerName property (see Stores Properties). The following values are valid:
  • Amazon S3: S3
  • IBM Cloud Object Storage: COS
  • Azure Blob: AZ
  • Google Storage: GS

S3 is the default value and can be omitted if Amazon S3 is the expected provider.

Each provider has its own subset of properties. Use these properties to fine-tune the provider configuration for credentials, endpoints, and object properties.

Using the initparm.cfg file

The initparm.cfg file includes sections dedicated to store providers. A file.ioexit section defines the behavior to adopt through a set of properties. More than one entry can exist, and each can define a different provider or a different behavior for the same provider.

Note: Any colon characters (':') in a parameter value must be escaped with a backslash ('\').
# Azure 
file.ioexit:\
 :name=AZ:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
 -Dstore.providerName=AZ \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

# Amazon S3 Production
file.ioexit:\
 :name=S3Prod:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m -Ds3.profileName=profileProd \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

# Amazon S3 QA
file.ioexit:\
 :name=S3QA:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m -Ds3.profileName=profileQA \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

Using sysopts

Connect:Direct sysopts can also be used to select the store provider. All properties, including store.providerName, can be overridden using sysopts.

# All-purpose entry, default to S3
file.ioexit:\
 :name=ALL:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:


#Write to an Azure container using the all-purpose entry
To (
FILE=ALL://container/object
sysopts=':store.providerName=AZ:az.connectionString='aconnectionstring':'
DISP=RPL
)

#Write to an S3 bucket using the all-purpose entry
To (
FILE=ALL://container/object
sysopts=':s3.accessKey=…:s3.secretKey=…:'
DISP=RPL
)

Setting up CA certificates used for secure connections to an object store

The default place where the Java JRE looks for CA certificates is [CDU_DIR]/jre/ibm-java-x86_64-80/jre/lib/security/cacerts. When CAs must be added to or replaced in this file, there is a risk that they are overwritten when a Connect:Direct update is applied and the cacerts file is replaced.

To avoid this situation, the Connect:Direct Secure Plus keystore can be used as a replacement for, or a complement to, the JRE keystore.

Without any configuration change, the JRE keystore remains the only source for secure connection validation. To activate a different behavior, set a property in the initparm.cfg file or in process sysopts.

The property is store.keyStore and its possible values are:
  • JRE_ONLY (default)
  • SP_ONLY: the Secure Plus keystore is used as the only source for CAs
  • JRE_SP: the JRE keystore is the first source for CAs, then the Secure Plus keystore is used
  • SP_JRE: the Secure Plus keystore is the first source for CAs, then the JRE keystore is used

Migrating from a JRE-only to an SP-only configuration

It is recommended to first set the property to SP_JRE, migrate from the JRE keystore or install the necessary CAs into the Secure Plus keystore, and then set the property to SP_ONLY once all the necessary CAs are in the Secure Plus keystore.
# A scheme using the JRE keystore only (default)
file.ioexit:\
 :name=JRE:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

# A scheme using the JRE keystore and next the Secure Plus keystore
file.ioexit:\
 :name=JRESP:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m -Dstore.keyStore=JRE_SP \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

# A scheme using the Secure Plus keystore
file.ioexit:\
 :name=SP:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m -Dstore.keyStore=SP_ONLY \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

#This process overrides the default behavior for scheme SP
#Only the JRE CAs will be used

To (
FILE=SP://container/object
sysopts=':store.keyStore=JRE_ONLY:'
DISP=RPL
)

Credentials

Credentials are not managed identically; their handling depends on the selected store provider. See Stores Properties for property definitions.

Azure Blob

Credentials are managed in the following order (an illustrative example follows the list):
  1. Connection string (az.connectionString). The connection string includes the endpoint.
  2. StorageSharedKeyCredential using the account name and account key (az.accountName, az.accountKey) with the calculated endpoint. For more information, refer to Stores Properties.
  3. SAS token (az.sasToken)
  4. Managed Identity (az.managedIdClientId)
  5. Workload Identity (az.workloadIdClientId; optional: az.workloadTenantId, az.workloadServiceTokenFilePath). Only available when running inside Azure
  6. Environment variable credentials
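For illustration only (the account name is a placeholder and the key is elided), the shared-key path (order 2 above) can be selected from a process with sysopts. Note that a real account key typically contains '=' characters and would then need to be quoted as described under Stores Properties:

sysopts=':store.providerName=AZ:az.accountName=myaccount:az.accountKey=…:'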

Google Storage

Only the JSON credentials file generated for the Google account can be used. Set the gs.credentialsPath property to locate this file.
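A minimal initparm.cfg sketch for Google Storage, following the same pattern as the entries shown earlier (the credentials path is a placeholder):

# Google Storage
file.ioexit:\
 :name=GS:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
 -Dstore.providerName=GS \
 -Dgs.credentialsPath=/path/to/google_credentials.json \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory: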

IBM Cloud Object Storage

Credentials are managed in the following order:
  1. Json credentials file path (cos.credentialsPath)
  2. BasicIBMOAuthCredentials using API key and service ID (cos.apiKey, cos.serviceId)
  3. BasicAWSCredentials using hmac access key and secret key (cos.hmacAccessKey, cos.hmacSecretKey)
  4. ProfileCredentialsProvider using profile path and profile name (cos.profilePath, cos.profileName)
  5. The default credentials provider chain
    1. Environment Variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    2. Java System Properties aws.accessKeyId and aws.secretKey
    3. JSON credential file at the default location (~/.bluemix/cos_credentials)
    4. Web Identity Token credentials from the environment or container.
    5. Credential profiles file at the default location (~/.aws/credentials)
    6. Credentials delivered through the Amazon EC2 container service, if the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable is set and the security manager has permission to access the variable
    7. Instance profile credentials delivered through the Amazon EC2 metadata service
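For illustration only (keys elided), HMAC credentials (order 3 above) can be supplied from a process with sysopts:

sysopts=':store.providerName=COS:cos.hmacAccessKey=…:cos.hmacSecretKey=…:'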

Amazon S3

Credentials are managed in the following order:
  1. AwsBasicCredentials using hmac access key and secret key (s3.accessKey, s3.secretKey)
  2. ProfileCredentialsProvider using profile path and profile name (s3.profilePath, s3.configPath, s3.profileName)
  3. The default credentials provider chain
    1. Environment Variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    2. Java System Properties aws.accessKeyId and aws.secretKey
    3. Web Identity Token credentials from the environment or container
    4. Credential profiles file at the default location (~/.aws/credentials)
    5. Credentials delivered through the Amazon EC2 container service, if the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable is set and the security manager has permission to access the variable.
    6. Instance profile credentials delivered through the Amazon EC2 metadata service

Using S3 Role

It is possible to use the role ARN mechanism directly inside profiles or to specify it through properties. When the role is provided through properties, a Security Token Service client is created to obtain temporary credentials, in the same way the profile mechanism does.

To provide the role through properties, the following properties can be used (see the example after this list):
  • s3.roleArn
  • s3.roleProfile
  • s3.roleDuration (optional)
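A hedged sysopts sketch follows; the role ARN and profile name are placeholders, and because an ARN contains colons the value is quoted and escaped as described under Stores Properties:

sysopts=':s3.roleArn='arn\:aws\:iam\:\:123456789012\:role/ExampleRole':s3.roleProfile=profileProd:s3.roleDuration=3600:'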

Credentials refresh

When a profile entry is updated in either the credentials or config file, whether those files are in their default location or in a specific location (set with s3.profilePath, s3.configPath), the credentials are validated again.

When refreshed, credentials that have become invalid may abort the current process.

Endpoints

Endpoints often have a default value but can be overridden with the provided properties. It is sometimes necessary to provide more information through properties so that the endpoint is calculated correctly.

Azure Blob

  • Endpoint information provided (az.endpoint*): the endpoint is built from the provided values.
  • Connection string provided: the endpoint is taken from the connection string.
  • No endpoint information provided: if no connection string is provided, the endpoint is derived from the account name using the following pattern: https://{az.accountName}.blob.core.windows.net

Google Storage

Endpoint Information Provided (gs.endpoint*): The endpoint is constructed using the specified values.

IBM Cloud Object Storage

  • Endpoint information provided (cos.endpoint*): the endpoint is constructed using the specified values and may include the location if provided (cos.location).
  • No endpoint information provided: the endpoint is derived from the endpoint type and location name using the following pattern: https://s3.{cos.endpointType}[.{cos.location}].cloud-object-storage.appdomain.cloud.
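For example (values are illustrative), cos.endpointType=PRIVATE with cos.location=eu-de yields https://s3.private.eu-de.cloud-object-storage.appdomain.cloud.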

Amazon S3

  • Endpoint Information Provided (s3.endpoint*): The endpoint is constructed using the specified values.

    If s3.useFipsEndpoint=YES, the s3.endpoint* values are ignored.
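As a sketch only, an on-premises S3-compatible provider such as MinIO can be reached by overriding the endpoint; the host name and port below are placeholders, and the exact value format expected by s3.endpointUrl may vary with your deployment:

sysopts=':s3.endpointUrl=minio.example.internal:s3.endpointPort=9000:s3.endpointSecure=YES:'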

Amazon S3 bucket ARN and access points

A bucket can be located either by its name or by its ARN. In a process, the following forms are equivalent and supported:
  • Classic URI form: S3://bucketname/objectKey
  • Access point: S3://arn:aws:s3:region:****:accesspoint/bucketaccesspointName/objectKey
  • Access point Alias: S3://bucketaccesspoint-****-s3alias/objectKey
  • Multi region access point: S3://arn:aws:s3::****:accesspoint/*********.mrap/objectKey

In the previous examples, S3 was used as the scheme name, but any value can be used as long as the scheme is declared.

Stores Properties

Usage

All properties can be set in initparm and/or in sysopts. Any value set in sysopts overrides the value set in initparm, if present.

# IO Exit parameters for Azure

file.ioexit:\
 :name=AZ1:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
 -Dstore.providerName=AZ -Dstore.tags='key1=AnotherValue\:key2=value2' \
 -Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
Sysopts override in CD process

sysopts=':store.tags='key1=AnotherValue\:key2=value2':'

The store.tags value set in sysopts overrides the value defined in initparm.

Property names are case sensitive when used in initparm but not case sensitive when set in sysopts.

If a property value contains ":" or "=", the value must be enclosed in quotes and the ":" must be escaped with a backslash ("\").

propertyName='tag=abc\:error=true'

General properties

These properties are available for all store providers. The store.providerName property is the most important of them; it selects the provider.

Each property is listed with its description, possible values, default value, and whether it applies to Connect:Direct processes and to the Integrated File Agent. Alternate property names kept for compatibility are shown in parentheses.

store.providerName
  Description: Selects the store provider.
  Possible values: S3 (Amazon S3), AZ (Azure Blob), GS (Google Storage), COS (IBM Cloud Object Storage)
  Default: S3, for compatibility with previous versions
  Connect:Direct: YES. Integrated File Agent: YES.

store.keyStore
  Description: Keystore usage for CA certificates.
  Possible values: JRE_ONLY (the cacerts file is used), SP_ONLY (only the Secure Plus keystore is used), JRE_SP (the cacerts file, then the Secure Plus keystore), SP_JRE (the Secure Plus keystore, then the cacerts file)
  Default: JRE_ONLY
  Connect:Direct: YES. Integrated File Agent: YES.

store.configFromCD
  Description: The Integrated File Agent gets the store configuration from Connect:Direct and does not use the stores.properties content.
  Possible values: YES, NO
  Default: NO
  Connect:Direct: NO, only for the Integrated File Agent. Integrated File Agent: YES.

store.contentType (s3ioexit.contentType)
  Description: Object Content-Type.
  Possible values: Free
  Default: None
  Connect:Direct: YES. Integrated File Agent: NO.

store.contentEncoding
  Description: Object Content-Encoding. For compatibility with previous versions: if *.contentType contains 'charset', the charset value is used for Content-Encoding, but only if store.contentEncoding is empty.
  Possible values: Free
  Default: None
  Connect:Direct: YES. Integrated File Agent: NO.

store.dwldRange (s3ioexit.dwldRange)
  Description: Size of the buffer read from the provider stream.
  Possible values: >= 5MB, <= 50MB
  Default: 5MB
  Connect:Direct: YES. Integrated File Agent: NO.

store.objectSize (s3ioexit.objectSize)
  Description: Object size, if Connect:Direct cannot provide it. This value can be used to calculate the part size for multipart uploads.
  Possible values: S3: up to 5TB; AZ: up to 4.78TB; GS: up to 5TB; COS: up to 5TB
  Default: None
  Connect:Direct: YES. Integrated File Agent: NO.

store.partSize (s3ioexit.partSize)
  Description: Override of the calculated part size.
  Possible values: S3: not less than 5MB, up to 5GB; AZ: not less than 64KB, up to 100MB; GS: not less than 5MB, up to 5TB; COS: not less than 5MB, up to 5GB
  Default: None
  Connect:Direct: YES. Integrated File Agent: NO.

store.tags
  Description: Additional information to store with the object as tags or metadata. Tags must be enclosed in quotes and separated with semicolons, for example store.tags='key=value;otherKey=abc'.
  Default: None
  Connect:Direct: YES. Integrated File Agent: NO.

store.maxConnections
  Description: For multipart uploads, the maximum number of parallel connections the client can use. This value is not fixed or guaranteed and may vary depending on the size of the uploaded parts and the available system resources.
  Possible values: integer
  Default: 30
  Connect:Direct: YES. Integrated File Agent: NO.

store.endpointUrl
  Description: Endpoint override. This value can be overridden by any *.endpointUrl value if specified.
  Connect:Direct: YES. Integrated File Agent: YES.

store.endpointPort
  Description: Endpoint port override. This value can be overridden by any *.endpointPort value if specified.
  Connect:Direct: YES. Integrated File Agent: YES.

store.endpointSecure
  Description: Whether the endpoint uses https or http. This value can be overridden by any *.endpointSecure value if specified.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

Azure Blob properties (az.*)

Each property is listed with its description, possible values, default value, and whether it applies to Connect:Direct processes and to the Integrated File Agent.

az.connectionString
  Description: The connection string includes the full set of information needed to connect to the service. The value must be enclosed in quotes. Example: az.connectionString='DefaultEndpointsProtocol=https;AccountName=cduioexit;AccountKey=abcd1r4BIZQlahie2V3cFqTg==;BlobEndpoint=https\://cduioexit.blob.core.windows.net/;QueueEndpoint=https\://cduioexit.queue.core.windows.net/;TableEndpoint=https\://cduioexit.table.core.windows.net/;FileEndpoint=https\://cduioexit.file.core.windows.net/'
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.applicationId
  Description: Additional information the application can provide.
  Possible values: Free
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.accountName
  Description: Credentials account name.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.accountKey
  Description: Credentials account key.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.sasToken
  Description: Credentials SAS token.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.managedIdClientId
  Description: Credentials Managed Identity client ID.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.workloadIdClientId
  Description: Credentials Workload Identity client ID.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES, only when running on Azure. Integrated File Agent: YES, only when running on Azure.

az.workloadTenantId
  Description: Workload tenant ID.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES, only when running on Azure. Integrated File Agent: YES, only when running on Azure.

az.workloadServiceTokenFilePath
  Description: File path to the service token file for workload identity.
  Possible values: Provided by the Azure account
  Default: None
  Connect:Direct: YES, only when running on Azure. Integrated File Agent: YES, only when running on Azure.

az.endpointUrl
  Description: Endpoint information to override the default endpoint. Mainly used with Azurite (Use Azurite emulator for local Azure Storage development | Microsoft Docs).
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.endpointPort
  Description: Endpoint port.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

az.endpointSecure
  Description: Whether the endpoint uses https or http.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

az.accessTier
  Description: Object storage class.
  Possible values: HOT, COOL, ARCHIVE
  Default: None (inferred from the bucket)
  Connect:Direct: YES. Integrated File Agent: NO.

Google Storage properties (gs.*)

Each property is listed with its description, possible values, default value, and whether it applies to Connect:Direct processes and to the Integrated File Agent.

gs.credentialsPath
  Description: Path to the JSON credentials file.
  Possible values: Provided by the Google account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

gs.projectId
  Description: Additional information the application can provide.
  Possible values: Free
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

gs.storageClass
  Description: Object storage class.
  Possible values: STANDARD, NEARLINE, COLDLINE, ARCHIVE
  Default: None (inferred from the bucket)
  Connect:Direct: YES. Integrated File Agent: NO.

gs.partUploadFolder
  Description: The GCP SDK does not provide an API for multipart uploads as other cloud providers do. Instead, Connect:Direct creates parts with unique names and then composes them into the final object. Unlike with other cloud providers, these parts are not hidden, which means scanning tools may detect them as separate objects. To address this, this property stores the temporary parts in a dedicated folder. When the property is set, parts are stored in the specified folder using the following naming pattern: container/'partUploadFolderValue'/objectKey.uniqueId.Part.n
  Possible values: A valid folder name
  Default: None
  Connect:Direct: YES. Integrated File Agent: NO.

gs.composeDelay
  Description: The object store service uses the compose API provided by the Google SDK to merge uploaded parts into the final object. This API has a rate limit of one call per second per object, and in some cases two consecutive calls may be too fast. To address this, a delay and retry mechanism is applied when this error occurs. This property is the delay applied between two consecutive part uploads.
  Possible values: milliseconds
  Default: 1000
  Connect:Direct: YES. Integrated File Agent: NO.

gs.composeRetries
  Description: See gs.composeDelay. Number of retry attempts when a delay error occurs.
  Possible values: integer
  Default: 10
  Connect:Direct: YES. Integrated File Agent: NO.

gs.endpointUrl
  Description: Endpoint override.
  Connect:Direct: YES. Integrated File Agent: YES.

gs.endpointPort
  Description: Endpoint port override.
  Connect:Direct: YES. Integrated File Agent: YES.

gs.endpointSecure
  Description: Whether the endpoint uses https or http.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

IBM Cloud Object Storage properties (cos.*)

Each property is listed with its description, possible values, default value, and whether it applies to Connect:Direct processes and to the Integrated File Agent.

cos.credentialsPath
  Description: Path to the JSON credentials file.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.serviceInstanceId
  Description: Credentials service instance ID.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.apiKey
  Description: Credentials API key.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.hmacAccessKey
  Description: AWS S3 credentials HMAC access key.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.hmacSecretKey
  Description: AWS S3 credentials HMAC secret key.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.profilePath
  Description: AWS S3 credential file with HMAC keys and profiles.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.profileName
  Description: Profile name to use in the AWS S3 credential file with HMAC keys.
  Possible values: Profile names available in the credentials file
  Default: default
  Connect:Direct: YES. Integrated File Agent: YES.

cos.endpointUrl
  Description: Endpoint override.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.endpointPort
  Description: Endpoint port override.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

cos.endpointSecure
  Description: Whether the endpoint uses https or http.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

cos.location
  Description: Data center location, used to dynamically build the endpoint (when not overridden by cos.endpoint*).
  Possible values: See Locations for resource deployment | IBM Cloud Docs
  Connect:Direct: YES. Integrated File Agent: YES.

cos.endpointType
  Description: Endpoint type, used to dynamically build the endpoint (when not overridden by cos.endpoint*).
  Possible values: DIRECT, PRIVATE, PUBLIC
  Default: PUBLIC
  Connect:Direct: YES. Integrated File Agent: YES.

cos.storageClass
  Description: Object storage class.
  Possible values: Accelerated, DeepArchive, Glacier, IntelligentTiering, OneZoneInFrequentAccess, Standard, StandardInFrequentAccess
  Default: Inferred from the bucket
  Connect:Direct: YES. Integrated File Agent: NO.

cos.sseS3
  Description: Server-side encryption requested.
  Possible values: YES, NO
  Default: NO
  Connect:Direct: YES. Integrated File Agent: NO.

cos.virtualHostedUri
  Description: Endpoint format for bucket access: path-style access (https://xxx.com/bucket-name/key-name) or virtual hosted (https://bucket-name.xxx.com/key-name). Only works without an endpoint override.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

Amazon S3 properties (s3.*)

Each property is listed with its description, possible values, default value, and whether it applies to Connect:Direct processes and to the Integrated File Agent.

s3.accessKey
  Description: AWS S3 credentials HMAC access key.
  Possible values: Provided by the Amazon account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.secretKey
  Description: AWS S3 credentials HMAC secret key.
  Possible values: Provided by the Amazon account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.roleArn
  Description: Role ARN to assume.
  Possible values: Provided by the Amazon account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.roleProfile
  Description: Role profile with credentials.
  Possible values: Provided by the Amazon account
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.roleDuration
  Description: Role duration in seconds.
  Possible values: From 900 to 43200
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.profilePath
  Description: AWS S3 credential file with HMAC keys and profiles.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.configPath
  Description: AWS S3 additional credential config file.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.profileName
  Description: Profile name to use in the AWS S3 credential file with HMAC keys.
  Possible values: Profile names available in the merged credentials file and config file
  Default: default
  Connect:Direct: YES. Integrated File Agent: YES.

s3.region
  Description: AWS region. Retrieved from the profile if not provided.
  Default: None
  Connect:Direct: YES. Integrated File Agent: YES.

s3.endpointUrl
  Description: Endpoint override.
  Connect:Direct: YES. Integrated File Agent: YES.

s3.endpointPort
  Description: Endpoint port override.
  Connect:Direct: YES. Integrated File Agent: YES.

s3.endpointSecure
  Description: Whether the endpoint uses https or http.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

s3.storageClass
  Description: Object storage class.
  Possible values: Deep_Archive, Glacier, Glacier_IR, Intelligent_Tiering, OneZone_IA, Outposts, Reduced_Redundancy, Standard, Standard_IA
  Default: Inferred from the bucket
  Connect:Direct: YES. Integrated File Agent: NO.

s3.sseS3
  Description: Server-side encryption requested with SSE-S3.
  Possible values: YES, NO
  Default: NO
  Connect:Direct: YES. Integrated File Agent: NO.

s3.virtualHostedUri
  Description: Endpoint format for bucket access: path-style access (https://xxx.com/bucket-name/key-name) or virtual hosted (https://bucket-name.xxx.com/key-name). Only works without an endpoint override.
  Possible values: YES, NO
  Default: YES
  Connect:Direct: YES. Integrated File Agent: YES.

s3.useFipsEndpoint
  Description: S3 FIPS endpoints must be used. See FIPS - Amazon Web Services (AWS) for more details.
  Possible values: YES, NO
  Default: NO
  Connect:Direct: YES. Integrated File Agent: YES.

s3.proxyScheme
  Description: The default proxy scheme for the S3 HTTP clients is HTTP. Only the system properties http_proxyHost, http_proxyPort, http_proxyUser, http_proxyPassword or the environment variable HTTP_PROXY can be set to establish a non-secure connection to a proxy. For a secure proxy connection (system properties https_proxyHost, https_proxyPort, https_proxyUser, https_proxyPassword or environment variable HTTPS_PROXY), the proxy scheme must be "HTTPS", but this value cannot be set through a system property. s3.proxyScheme allows this override.
  Possible values: HTTP, HTTPS
  Default: HTTP
  Connect:Direct: YES. Integrated File Agent: YES.

Logging

Logging for the basic object store plugin in the Connect:Direct SMGR is captured in the Connect:Direct SMGR trace when that trace is enabled. See Running System Diagnostics for instructions to enable this trace.

The object store plugin will also create its own log file via the following properties specified in the initparm.cfg file.ioexit record(s):
  • -Ds3ioexit.trace_level=level \
    • The default level is INFO
    • The debug level is activated with level=DEBUG
  • -Dcdioexit_trace_destination=file
    • file is the literal word 'file', not a file path
    • The log file is created in the cduser home directory
    • The log file name is objectstorepluginlogs.log
Example:
file.ioexit:\
 :name=S3:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m -Ds3ioexit.trace_level=DEBUG \
 -Dcdioexit_trace_destination=file \
 -Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

Enhancing logging

If a file named log4j2.properties is stored in /opt/cdunix/ndm/ioexit-plugins/s3, logging uses this properties file. The Connect:Direct installation creates a default log4j2.properties file with the logging level set to INFO. It defines more loggers than the basic configuration; these loggers should be enabled at the DEBUG level when requested by support.
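As an illustration only (the file created by the installation may differ, and the logger id and package below are placeholders based on the plugin class name), raising a logger to DEBUG in a Log4j2 properties file generally looks like this:

# excerpt of a hypothetical log4j2.properties
rootLogger.level = INFO
logger.s3exit.name = com.aricent.ibm.mft.connectdirect
logger.s3exit.level = DEBUG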

Override Mechanism

Irrespective of the store provider, the mechanism for overriding properties is consistent. A property can be defined at the initparms.cfg level and/or in system options (sysopts).

For the Integrated File Agent, properties are specifically defined in the config/stores.properties file and cannot be overridden. Unlike other components, there is no provision for using sysopts options with the Integrated File Agent.

Properties defined in initparms.cfg can be overridden at the process level using sysopts statements, providing enhanced flexibility in both provider and process definitions.

Example 1: Only one entry in initparms.cfg but multiple providers
# A general definition

file.ioexit:\
 :name=ALL:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
 -Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
The process overrides the store.providerName property (default S3). The scheme ALL is a multi-store scheme.
Process, truncated
Copy from S3 to Azure same bucket/container object

FROM (
FILE=ALL://container/object
sysopts=':store.providerName=S3:s3.accessKey=….:s3.secretKey=……:'
DISP=RPL )
TO (
FILE=ALL://container/object
sysopts=':store.providerName=AZ:az.connectionString='aconnectionstring':'
DISP=RPL )
Example 2: Two entries in initparms.cfg for 2 providers, 2 sources, 1 destination
# Provider 1 – Amazon S3 default credentials
file.ioexit:\
 :name=FROMS3:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m \
 -Dstore.providerName=S3 \
 -Ds3.accessKey=… -Ds3.secretKey=… \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:

# Provider 2 – Azure
file.ioexit:\
 :name=TOAZ:\
 :library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
 :home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
 :options=-Xmx640m -Dstore.providerName=AZ \
 -Daz.connectionString='...' \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
Process 1, truncated
Copy from S3 to Azure same bucket/container object
Nothing in sysopts, all properties come from initparms definition

FROM (
FILE=FROMS3://container/object
DISP=RPL )
TO (
FILE=TOAZ://container/object
DISP=RPL )

Process 2, truncated
Copy from another S3 bucket with other credentials to Azure same bucket/container object
anothercontainer needs other credentials, sysopts overrides the default

FROM (
FILE=FROMS3://anothercontainer/object
sysopts=':s3.accessKey=…:s3.secretKey=…:'
DISP=RPL )
TO (
FILE=TOAZ://container/object
DISP=RPL )

Understanding properties origin

When logging is enabled, the log shows the origin of each property.

Connect:Direct example
Properties in the initparm.cfg file are provided through system properties (-D…), so the origin label is SystemProperties.
 
Available property from CDProcessSysopts s3.configpath:/pathto/config-east-1
Available property from SystemProperties s3.accesskey:****
Available property from SystemProperties store.providername:S3
Available property from CDProcessSysopts s3.profilepath:/path/aws_credentials
Available property from SystemProperties s3.secretkey:****
Integrated File Agent example (IFA is the client)
IFA requested to use the Connect:Direct initparm properties, so the origin label is CDInitparms.
 
Available property from CDProcessSysopts s3.configpath:/pathto/config-east-1
Available property from CDInitparms      s3.accesskey:****
Available property from CDInitparms      store.providername:S3
Available property from CDProcessSysopts s3.profilepath:/path/aws_credentials
Available property from CDInitparms      s3.secretkey:****
Available property from Client           store.configfromcd:YES

Integrated File Agent

The Integrated File Agent can utilize the same set of properties, excluding those specific to object storage class or encryption. These properties are usually configured in the config/stores.properties file located in the Integrated File Agent installation directory.

Integrated File Agent executing on X86_64 Linux

When executed on X86_64 Linux, Integrated File Agent can take advantage of the initparms.cfg file to avoid configuration duplication.

In the stores.properties file, the conventional format is cdfa.provider.setName=property=value, where 'setName' serves as an identifier for grouping properties in a specific set.

The following example shows two groups of properties. The first one points to Azure, the second one uses Google Storage as provider.
cdfa.provider.AZURE=store.providerName=AZ
cdfa.provider.AZURE=scheme=MICROSOFT://
cdfa.provider.AZURE=az.connectionString=…

cdfa.provider.GOOGLE=store.providerName=GS
cdfa.provider.GOOGLE=scheme=GOOGLE://
cdfa.provider.GOOGLE=gs.credentialsPath=PathTo/google_credentials.json
cdfa.provider.GOOGLE=gs.projectId=ExampleProject

Because the Integrated File Agent is designed to submit processes to Connect:Direct, it is crucial that Connect:Direct has the MICROSOFT and GOOGLE schemes defined in initparm.cfg with the corresponding credentials properties set. Failure to provide these definitions results in process failure.

To use the initparm.cfg definitions, set the property store.configFromCD=YES in the groups that expect credentials or properties from Connect:Direct.
cdfa.provider.AZURE=scheme=MICROSOFT://
cdfa.provider.AZURE=store.configFromCD=YES

The AZURE group for the MICROSOFT scheme gets its credentials and other properties from Connect:Direct. Any other properties in the group are ignored.

The scheme property remains mandatory; it associates the watched directories with an entry inside the stores.properties file. The outscheme property can also be used to override the scheme of the object name submitted to Connect:Direct. When used, this outscheme must also be defined in the initparm.cfg file, as in the sketch below.
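An illustrative stores.properties sketch (the outscheme value AZ:// assumes an AZ scheme is defined in initparm.cfg, as in the Azure example earlier in this topic):

cdfa.provider.AZURE=scheme=MICROSOFT://
cdfa.provider.AZURE=outscheme=AZ://
cdfa.provider.AZURE=store.configFromCD=YES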

Limitations

When running Connect:Direct for UNIX with an object store, be aware of the following limitations:
  • The instance is static, not elastic.
  • For Run Task/Job operations on cloud objects, the object store CLI for the cloud in question should be invoked. UNIX system commands, such as mv or cp, referencing cloud objects are not supported.
  • Depending on the cloud provider, the maximum object size that can be transferred may vary. Refer to each provider to identify the limit.
  • Consider the cost associated with the use of cloud resources.
  • Object storage does not support the disposition MOD (modify) for files transferred via Connect:Direct. Only NEW or RPL (replace) is supported.
  • Wildcard copy sent from the cloud is not supported.
  • If multiple versions of the same object are present in a bucket/container, only the latest version can be downloaded, not any previous version.
  • Multipart download is not supported. Multipart upload is supported.
  • For multipart uploads, consider creating a lifecycle rule to delete incomplete parts from aborted uploads that are not restarted. Such parts remain in storage and incur additional cost.
  • Checkpoint restart can be explicitly configured within a copy step through the ckpt parameter. If it is not configured in the copy step, it can be configured in the initparms through the ckpt.interval parameter. For more information, refer to the Getting Started Guide. A hedged example follows this list.
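A minimal, hypothetical copy step sketch (file names and the interval are placeholders; consult the process language reference for the exact ckpt syntax supported by your release):

STEP1 COPY
 FROM (FILE=S3://mybucket/object)
 CKPT=2M
 TO (FILE=/data/inbound/object DISP=RPL)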