Using object store providers
IBM® Connect:Direct® for UNIX can be configured to extend support to object storage providers, including IBM Cloud Object Storage, Microsoft Azure Blob, Google Storage, AWS, and S3-compatible providers such as Minio, Dell EMC ECS, and Red Hat Ceph, to execute public and on-premise cloud-based operations. Users can continue to benefit from Connect:Direct features such as security, reliability, and point-to-point file transfers optimized for high-volume delivery, along with the versatility that comes with an object storage backend.
The Linux platform supports managed file transfers between the node and the object store. It is strongly recommended that Connect:Direct be installed as close to the object store storage devices as possible for high performance, consistency, and reliability. It is theoretically possible to access object stores remotely, but this is strongly discouraged because performance frequently suffers due to inconsistent access times to storage resources.
To set up accounts, instances, and storage on cloud providers, contact your IT Administrator.
An IBM Connect:Direct for UNIX node running this release can either be located on-premise or be running on a cloud instance. The user can also configure both the Pnode and the Snode on two cloud instances. An object store can serve as a source or as a destination to send and receive files.
Setting up Connect:Direct Node on object store providers
- Pre-requisites to activate Connect:Direct Unix on a cloud provider. For more information, refer to Account (amazon.com), Get started with Google Cloud | Documentation, IBM Cloud Docs, Azure documentation | Microsoft Docs
- Installing Connect:Direct Unix node on cloud.
- Configuring Connect:Direct node for object storage
Connect:Direct for UNIX can also be configured to extend support to other S3 object store providers, such as Minio, Dell EMC ECS, and Red Hat Ceph, or to use dedicated endpoints to execute public and on-premise cloud-based operations. For endpoint configuration/override, see the dedicated section.
Pre-requisites to set up Connect:Direct Unix on a cloud provider
- Set up cloud accounts and credentials
- Select and create a compute instance, RedHat or SuSE
- Create IAM users/roles
- Create a security group. Port numbers specific to Connect:Direct should be added to the security group
- Create storage
- Obtain credentials for cloud object storage access
For more information, see Account (amazon.com), Get started with Google Cloud | Documentation, IBM Cloud Docs, Azure documentation | Microsoft Docs
Installing Connect:Direct Unix node on Cloud
No specific configuration is required to install a Connect:Direct for UNIX node on a compute instance. For installation instructions, see Installing Connect:Direct for UNIX.
- CD Unix, Linux platform, and JRE are now included in the base installation
- Initparms to be included during the S3 plugin configuration are updated during the upgrade process
Setting up Connect:Direct Node for object store providers
Object store naming and initparm.cfg
Objects on object stores can be read and written with Connect:Direct, using file names of the form:

scheme://bucketOrContainer/objectKey

The scheme in the Connect:Direct process file name is used to search for the right entry in the initparm.cfg file, using the name field in a file.ioexit section as the key. The following providers are supported:
- Amazon S3
- IBM Cloud Object Storage
- Azure Blob
- Google Storage
Each of them is named an object provider. Each time a process involves an object from an object store, the right provider must be triggered. This is performed using properties set in the process or in the initparm.cfg file.
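For example, a copy step can target an object through the scheme of a matching file.ioexit entry. A minimal sketch, assuming an entry named S3 has been defined and using a hypothetical bucket and object key:

#Write to an S3 bucket through the file.ioexit entry named S3
To (
FILE=S3://mybucket/inbound/data.txt
DISP=RPL
)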
Selecting the right Provider
The main property is store.providerName (see Stores Properties). The following names are available and valid:
- Amazon S3: S3
- IBM Cloud Object Storage: COS
- Azure Blob: AZ
- Google Storage: GS
S3 is the default value and can be omitted if Amazon S3 is the expected provider.
Each provider owns its subset of properties. Use these properties to fine-tune the provider's configuration for credentials, endpoints, and object properties.
Using initparm.cfg file
The initparm.cfg file includes a section dedicated to store providers. A file.ioexit section identifies a behavior to adopt through a set of properties. More than one entry can exist, and each of them can define a different provider and/or a different behavior for the same provider.
# Azure
file.ioexit:\
:name=AZ:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Dstore.providerName=AZ \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# Amazon S3 Production
file.ioexit:\
:name=S3Prod:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m -Ds3.profileName=profileProd \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# Amazon S3 QA
file.ioexit:\
:name=S3QA:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m -Ds3.profileName=profileQA \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
Using sysopts
Connect:Direct sysopts can also be used to select the right store provider. All properties can be overridden using sysopts, including store.providerName.
# All-purpose entry, default to S3
file.ioexit:\
:name=ALL:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
#Write to an Azure container using the all-purpose entry
To (
FILE=ALL://container/object
sysopts=':store.providerName=AZ:az.connectionString='aconnectionstring':'
DISP=RPL
)
#Write to an S3 bucket using the all-purpose entry
To (
FILE=ALL://container/object
sysopts=':s3.accessKey=…:s3.secretKey=…:'
DISP=RPL
)
Setting up CA certificates used for secure connections to an object store
The default place where the Java JRE looks for CA certificates is [CDU_DIR]/jre/ibm-java-x86_64-80/jre/lib/security/cacerts. When CAs must be added to or replaced in this file, there is a risk that they are lost when a Connect:Direct update is applied and the cacerts file is replaced.
To avoid this situation, it is possible to use the Connect:Direct Secure Plus key store as a replacement for, or a complement to, the JRE keystore.
Without any configuration change, the JRE keystore remains the only source for validating secure connections. To activate a different behavior, set the store.keyStore property through the initparms.cfg file or process sysopts. The possible values are:
- JRE_ONLY (default): the JRE cacerts file is the only source for CAs
- SP_ONLY: the Secure Plus keystore will be used as the unique source for CAs
- JRE_SP: the JRE keystore is the first source for CAs; next, the Secure Plus keystore will be used
- SP_JRE: the Secure Plus keystore is the first source for CAs; next, the JRE keystore will be used
Migrating from a JRE-only to a Secure Plus-only configuration
# A scheme using the JRE keystore only (default)
file.ioexit:\
:name=JRE:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# A scheme using the JRE keystore and next the Secure Plus keystore
file.ioexit:\
:name=JRESP:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m -Dstore.keyStore=JRE_SP \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# A scheme using the Secure Plus keystore
file.ioexit:\
:name=SP:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m -Dstore.keyStore=SP_ONLY \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
#This process overrides the default behavior for scheme SP
#Only the JRE CAs will be used
To (
FILE=SP://container/object
sysopts=':store.keyStore=JRE_ONLY:'
DISP=RPL
)
Credentials
Credentials are not managed identically; they depend on the selected store provider. See Stores Properties for property definitions.
Azure Blob
- Connection string (az.connectionString); the connection string includes the endpoint
- StorageSharedKeyCredential using account name and account key (az.accountName, az.accountKey) with a calculated endpoint. For more information, refer to Stores Properties.
- SAS token (az.sasToken), as shown in the sketch after this list
- Managed Identity (az.managedIdClientId)
- Workload Identity (az.workloadIdClientId; optional: az.workloadTenantId, az.workloadServiceTokenFilePath). Only available when running inside Azure.
- Environment variable credentials
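As an illustration, SAS token credentials could be selected at the process level. A minimal sketch, assuming the all-purpose ALL entry defined earlier; the account name and token values are hypothetical placeholders:

#Write to an Azure container authenticating with a SAS token (placeholder values)
To (
FILE=ALL://container/object
sysopts=':store.providerName=AZ:az.accountName=myaccount:az.sasToken=mytoken:'
DISP=RPL
)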
Google Storage
Only the JSON credentials file generated for a Google account can be used. Set the property gs.credentialsPath to locate this file, as in the sketch below.
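A minimal file.ioexit entry for Google Storage might look like the following, modeled on the Azure and S3 examples above; the entry name GS and the credentials path are hypothetical:

# Google Storage
file.ioexit:\
:name=GS:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Dstore.providerName=GS \
-Dgs.credentialsPath=/path/to/google_credentials.json \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory: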
IBM Cloud Object Storage
- JSON credentials file path (cos.credentialsPath)
- BasicIBMOAuthCredentials using API key and service instance ID (cos.apiKey, cos.serviceInstanceId), as shown in the sketch after this list
- BasicAWSCredentials using hmac access key and secret key (cos.hmacAccessKey, cos.hmacSecretKey)
- ProfileCredentialsProvider using profile path and profile name (cos.profilePath, cos.profileName)
- The default credentials provider chain:
- Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
- Java system properties aws.accessKeyId and aws.secretKey
- JSON credentials file at the default location (~/.bluemix/cos_credentials)
- Web Identity Token credentials from the environment or container
- Credential profiles file at the default location (~/.aws/credentials)
- Credentials delivered through the Amazon EC2 container service, if the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable is set and the security manager has permission to access the variable
- Instance profile credentials delivered through the Amazon EC2 metadata service
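For instance, COS OAuth credentials could be supplied at the process level. A minimal sketch; the API key and service instance ID values are elided placeholders:

sysopts=':store.providerName=COS:cos.apiKey=…:cos.serviceInstanceId=…:'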
Amazon S3
- AwsBasicCredentials using hmac access key and secret key (s3.accessKey, s3.secretKey)
- ProfileCredentialsProvider using profile path and profile name (s3.profilePath, s3.configPath, s3.profileName)
- The default credentials provider chain:
- Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
- Java system properties aws.accessKeyId and aws.secretKey
- Web Identity Token credentials from the environment or container
- Credential profiles file at the default location (~/.aws/credentials)
- Credentials delivered through the Amazon EC2 container service, if the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable is set and the security manager has permission to access the variable
- Instance profile credentials delivered through the Amazon EC2 metadata service
Using S3 Role
It is possible to use the role arn mechanism directly inside profiles or to specify it through properties. When provided through properties, a Secure Token Service client is created to get temporary credentials the same way the profile mechanism does. The relevant properties are listed below, followed by a sketch.
- s3.roleArn
- s3.roleProfile
- s3.roleDuration (optional)
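A hedged sysopts sketch assuming a role: the account ID and role name in the arn are hypothetical placeholders, and the colons inside the value are escaped per the quoting rule described under Stores Properties:

#Assume a role for this copy step (placeholder arn)
sysopts=':s3.roleArn='arn\:aws\:iam\:\:123456789012\:role/example-role':s3.roleProfile=profileProd:'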
Credentials refresh
When a profile entry is updated in either the credentials or config file, whether they are located in their default location or in a specified location (using s3.profilePath, s3.configPath), the credentials will be validated again.
When refreshed, credentials may abort the current process if they have become invalid.
Endpoints
Endpoints often have a default value but can be overridden with the provided properties. It is sometimes necessary to provide more information through properties so the endpoint can be calculated correctly.
Azure Blob
- Endpoint info provided (az.endpoint*): the endpoint is built with the provided values
- Connection string provided: the endpoint is taken from the connection string
- No endpoint info provided: the endpoint is derived from the account name using the pattern https://{az.accountName}.blob.core.windows.net, if no connection string is provided (an override sketch follows this list)
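For local testing against the Azurite emulator (mentioned under az.endpointUrl in Stores Properties), the endpoint could be overridden at the process level. A sketch assuming az.endpointUrl takes the emulator host name, with Azurite's default blob port:

#Point the Azure provider at a local Azurite emulator (illustrative values)
sysopts=':store.providerName=AZ:az.endpointUrl=localhost:az.endpointPort=10000:az.endpointSecure=NO:'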
Google Storage
Endpoint information provided (gs.endpoint*): the endpoint is constructed using the specified values.
IBM Cloud Object Storage
- Endpoint info provided (cos.endpoint*): the endpoint is constructed using the specified values and may include the location if provided (cos.location)
- No endpoint info provided: the endpoint is derived from the endpoint type and location name using the pattern https://s3.{cos.endpointtype}[.{cos.location}].cloud-object-storage.appdomain.cloud
Amazon S3
- Endpoint information provided (s3.endpoint*): the endpoint is constructed using the specified values. If s3.useFipsEndpoint=YES, the s3.endpoint* values are ignored.
Amazon S3 bucket arn and access points
- Classic URI form: S3://bucketname/objectKey
- Access point: S3://arn:aws:s3:region:****:accesspoint/bucketaccesspointName/objectKey
- Access point Alias: S3://bucketaccesspoint-****-s3alias/objectKey
- Multi region access point: S3://arn:aws:s3::****:accesspoint/*********.mrap/objectKey
In the previous examples, S3 was used as the scheme name, but any value can be used as long as the scheme is declared.
Stores Properties
Usage
All properties can be set in initparm and/or sysopts. Any value set in sysopts overrides the value set in initparm, if present.
# IO Exit parameters for Azure
file.ioexit:\
:name=AZ1:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Dstore.providerName=AZ -Dstore.tags='key1=AnotherValue\:key2=value2' \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
Sysopts override in CD process
sysopts=':store.tags='key1=AnotherValue\:key2=value2':'
The store.tags value overrides the value defined in initparm.
Property names are case sensitive when used in initparm, but not case sensitive when set in sysopts.
If a property value contains ":" or "=", the value must be enclosed in quotes and the ":" must be escaped with a backslash ("\").
propertyName='tag=abc\:error=true'
General properties
These properties are available for all store providers. store.providerName is the most important of them and triggers the right provider.
| Property name (alternate name for compatibility) | Description | Possible values | Default value | Connect:Direct | Integrated File Agent |
|---|---|---|---|---|---|
| store.providerName | Triggers the right store service | S3: Amazon S3; AZ: Azure Blob; GS: Google Storage; COS: IBM Cloud Object Storage | S3, for compatibility with previous versions | YES | YES |
| store.keyStore | Keystore usage | JRE_ONLY: the cacerts file will be used; SP_ONLY: only the Secure Plus keystore will be used; JRE_SP: the cacerts file, then the Secure Plus keystore; SP_JRE: the Secure Plus keystore, then the cacerts file | JRE_ONLY | YES | YES |
| store.configFromCD | The Integrated File Agent gets the store configuration from Connect:Direct and does not use the stores.properties content | YES, NO | NO | NO, only for Integrated File Agent | YES |
| store.contentType (s3ioexit.contentType) | Object Content-Type | Free | None | YES | NO |
| store.contentEncoding | Object Content-Encoding. For compatibility with previous versions: if *.contentType contains 'charset', the charset value will be used for Content-Encoding, but only if store.contentEncoding is empty | Free | None | YES | NO |
| store.dwldRange (s3ioexit.dwldRange) | Size of the buffer to read from the provider stream | >= 5MB, <= 50MB | 5MB | YES | NO |
| store.objectSize (s3ioexit.objectSize) | Object size, if Connect:Direct cannot provide it. This value can be used to calculate part size on multipart uploads. | S3: up to 5TB; AZ: up to 4.78TB; GS: up to 5TB; COS: up to 5TB | None | YES | NO |
| store.partSize (s3ioexit.partSize) | Override to the calculated part size | S3: not less than 5MB, up to 5GB; AZ: not less than 64KB, up to 100MB; GS: not less than 5MB, up to 5TB; COS: not less than 5MB, up to 5GB | None | YES | NO |
| store.tags | Additional info to store with the object as tags or metadata | Tags must be enclosed in quotes and separated with semicolons: store.tags='key=value;otherKey=abc' | None | YES | NO |
| store.maxConnections | For multipart uploads, the maximum number of parallel connections the client can use. This value is not fixed or guaranteed and may vary depending on the size of the uploaded parts and available system resources. | integer | 30 | YES | NO |
| store.endpointUrl | Endpoint override. This value can be overridden by any *.endpointUrl value if specified. | | | YES | YES |
| store.endpointPort | Endpoint port override. This value can be overridden by any *.endpointPort value if specified. | | | YES | YES |
| store.endpointSecure | Whether the endpoint uses https or http. This value can be overridden by any *.endpointSecure value if specified. | YES, NO | YES | YES | YES |
Azure Blob properties (az.*)
| Property name | Description | Possible values | Default value | Connect:Direct | Integrated File Agent |
|---|---|---|---|---|---|
| az.connectionString | The connection string includes the full set of info to connect to the service. Value must be enclosed in quotes. | Example: az.connectionString='DefaultEndpointsProtocol=…QueueEndpoint=…' | None | YES | YES |
| az.applicationId | Additional info the application can provide | Free | None | YES | YES |
| az.accountName | Credentials account name | Provided by Azure account | None | YES | YES |
| az.accountKey | Credentials account key | Provided by Azure account | None | YES | YES |
| az.sasToken | Credentials SAS token | Provided by Azure account | None | YES | YES |
| az.managedIdClientId | Credentials Managed Identity client ID | Provided by Azure account | None | YES | YES |
| az.workloadIdClientId | Credentials Workload Identity client ID | Provided by Azure account | None | YES, only when running on Azure | YES, only when running on Azure |
| az.workloadTenantId | Workload tenant ID | Provided by Azure account | None | YES, only when running on Azure | YES, only when running on Azure |
| az.workloadServiceTokenFilePath | File path to the service token file for workload identity | Provided by Azure account | None | YES, only when running on Azure | YES, only when running on Azure |
| az.endpointUrl | Endpoint info to override the default endpoint. Mainly used with Azurite (see Use Azurite emulator for local Azure Storage development, Microsoft Docs) | | None | YES | YES |
| az.endpointPort | Endpoint port | | None | YES | YES |
| az.endpointSecure | Whether the endpoint uses https or http | YES, NO | YES | YES | YES |
| az.accessTier | Object storage class | HOT, COOL, ARCHIVE | None (inferred from bucket) | YES | NO |
Google Storage properties (gs.*)
| Property name | Description | Possible values | Default value | Connect:Direct | Integrated File Agent |
|---|---|---|---|---|---|
| gs.credentialsPath | Path to the json credentials file | Provided by Google account | None | YES | YES |
| gs.projectId | Additional info the application can provide | Free | None | YES | YES |
| gs.storageClass | Object storage class | STANDARD, NEARLINE, COLDLINE, ARCHIVE | None (inferred from bucket) | YES | NO |
| gs.partUploadFolder | The GCP SDK does not provide an API for multipart uploads as other cloud providers do. Instead, CD creates parts with unique names and then composes them into the final object. Unlike with other cloud providers, these parts are not hidden, which means scanning tools may detect them as separate objects. To address this issue, this property makes it possible to store the temporary parts in a dedicated folder. When this property is set, parts are stored in the specified folder using a unique naming pattern. | A valid folder name | None | YES | NO |
| gs.composeDelay | The Object Store Service uses the compose API provided by the Google SDK to merge uploaded parts into the final object. This API has a rate limit of one call per second per object, and in some cases two consecutive calls may be too fast. To address this issue, a delay and retry mechanism is applied when this error occurs. This is the delay applied between two consecutive part uploads. | milliseconds | 1000 | YES | NO |
| gs.composeRetries | See gs.composeDelay. Number of retry attempts when a rate-limit error occurs. | integer | 10 | YES | NO |
| gs.endpointUrl | Endpoint override | | | YES | YES |
| gs.endpointPort | Endpoint port override | | | YES | YES |
| gs.endpointSecure | Whether the endpoint uses https or http | YES, NO | YES | YES | YES |
IBM Cloud Object Storage properties (cos.*)
| Property name | Description | Possible values | Default value | Connect:Direct | Integrated File Agent |
|---|---|---|---|---|---|
| cos.credentialsPath | Path to the json credentials file | | None | YES | YES |
| cos.serviceInstanceId | Credentials Service Instance ID | | None | YES | YES |
| cos.apiKey | Credentials API key | | None | YES | YES |
| cos.hmacAccessKey | AWS S3 credentials hmac access key | | None | YES | YES |
| cos.hmacSecretKey | AWS S3 credentials hmac secret key | | None | YES | YES |
| cos.profilePath | AWS S3 credentials file with hmac keys and profiles | | None | YES | YES |
| cos.profileName | Profile name to use in the AWS S3 credentials file | Profile names available in the credentials file | default | YES | YES |
| cos.endpointUrl | Endpoint override | | None | YES | YES |
| cos.endpointPort | Endpoint port override | | None | YES | YES |
| cos.endpointSecure | Whether the endpoint uses https or http | YES, NO | YES | YES | YES |
| cos.location | Data center location, used to dynamically build the endpoint (when not overridden by cos.endpoint*) | See Locations for resource deployment, IBM Cloud Docs | | YES | YES |
| cos.endpointType | Endpoint type, used to dynamically build the endpoint (when not overridden by cos.endpoint*) | DIRECT, PRIVATE, PUBLIC | PUBLIC | YES | YES |
| cos.storageClass | Object storage class | Accelerated, DeepArchive, Glacier, IntelligentTiering, OneZoneInFrequentAccess, Standard, StandardInFrequentAccess | Inferred from bucket | YES | NO |
| cos.sseS3 | Server-side encryption requested | YES, NO | NO | YES | NO |
| cos.virtualHostedUri | Endpoint format for bucket access: path style (https://xxx.com/bucket-name/key-name) or virtual hosted (https://bucket-name.xxx.com/key-name). Only works without endpoint override. | YES, NO | YES | YES | YES |
Amazon S3 properties (s3.*)
| Property name | Description | Possible values | Default value | Connect:Direct | Integrated File Agent |
|---|---|---|---|---|---|
| s3.accessKey | AWS S3 credentials hmac access key | Provided by Amazon account | None | YES | YES |
| s3.secretKey | AWS S3 credentials hmac secret key | Provided by Amazon account | None | YES | YES |
| s3.roleArn | Role arn to assume | Provided by Amazon account | None | YES | YES |
| s3.roleProfile | Role profile with credentials | Provided by Amazon account | None | YES | YES |
| s3.roleDuration | Role duration in seconds | From 900 to 43200 | None | YES | YES |
| s3.profilePath | AWS S3 credentials file with hmac keys and profiles | | None | YES | YES |
| s3.configPath | AWS S3 credentials additional config file | | None | YES | YES |
| s3.profileName | Profile name to use in the AWS S3 credentials file | Profile names available in the merged credentials file and config file | default | YES | YES |
| s3.region | AWS region | Retrieved from profile if not provided | None | YES | YES |
| s3.endpointUrl | Endpoint override | | | YES | YES |
| s3.endpointPort | Endpoint port override | | | YES | YES |
| s3.endpointSecure | Whether the endpoint uses https or http | YES, NO | YES | YES | YES |
| s3.storageClass | Object storage class | Deep_Archive, Glacier, Glacier_IR, Intelligent_Tiering, OneZone_IA, Outposts, Reduced_Redundancy, Standard, Standard_IA | Inferred from bucket | YES | NO |
| s3.sseS3 | Server-side encryption requested with SSE-S3 | YES, NO | NO | YES | NO |
| s3.virtualHostedUri | Endpoint format for bucket access: path style (https://xxx.com/bucket-name/key-name) or virtual hosted (https://bucket-name.xxx.com/key-name). Only works without endpoint override. | YES, NO | YES | YES | YES |
| s3.useFipsEndpoint | S3 FIPS endpoints must be used. See FIPS - Amazon Web Services (AWS) for more details. | YES, NO | NO | YES | YES |
| s3.proxyScheme | The default proxy scheme of the S3 http clients is HTTP. Only the system properties http_proxyHost, http_proxyPort, http_proxyUser, http_proxyPassword or the environment variable HTTP_PROXY can be set to establish a non-secure connection to a proxy. For a secure proxy connection (system properties https_proxyHost, https_proxyPort, https_proxyUser, https_proxyPassword or environment variable HTTPS_PROXY), the proxy scheme must be "HTTPS", but this value can't be set through a system property. s3.proxyScheme allows this override. | HTTP, HTTPS | HTTP | YES | YES |
Logging
The logging for the basic object store plugin in Connect:Direct SMGR will be captured in the Connect:Direct SMGR trace when it is enabled. See Running System Diagnostics for instructions to enable this trace.
- -Ds3ioexit.trace_level=level
  - The default level is INFO
  - The debug level is activated with level=DEBUG
- -Dcdioexit_trace_destination=file
  - file is the literal word 'file', not a file path
  - The log file will be created in the cduser home directory
  - The log file name is objectstorepluginlogs.log
file.ioexit:\
:name=S3:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m -Ds3ioexit.trace_level=DEBUG \
-Dcdioexit_trace_destination=file \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
Enhancing logging
If a file named log4j2.properties is stored in /opt/cdunix/ndm/ioexit-plugins/s3, logging will use this properties file. The Connect:Direct installation creates a default log4j2.properties file with the logging level set to INFO. More loggers are available than in the basic configuration; these loggers should be enabled at the DEBUG level on support request, for example as sketched below.
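As an illustration only, the overall level of the shipped file could be raised to DEBUG using standard log4j2 properties syntax; the exact logger names and defaults depend on the log4j2.properties file Connect:Direct installs:

# log4j2 properties syntax: raise overall verbosity to DEBUG
rootLogger.level = DEBUG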
Override Mechanism
Irrespective of the store provider, the mechanism for overriding properties is consistent. A property can be defined at the initparms.cfg level and/or in system options (sysopts).

For the Integrated File Agent, properties are specifically defined in the config/stores.properties file and cannot be overridden. Unlike other components, there is no provision for using sysopts options with the Integrated File Agent.

Properties defined in initparms.cfg can be overridden at the process level using sysopts statements, providing enhanced flexibility in both provider and process definitions.
# A general definition
file.ioexit:\
:name=ALL:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# Process, truncated
# Copy from S3 to Azure, same bucket/container object
FROM (
FILE=ALL://container/object
sysopts=':store.providerName=S3:s3.accessKey=….:s3.secretKey=……:'
DISP=RPL
)
TO (
FILE=ALL://container/object
sysopts=':store.providerName=AZ:az.connectionString='aconnectionstring':'
DISP=RPL
)
# Provider 1 – Amazon S3 default credentials
file.ioexit:\
:name=FROMS3:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m \
-Dstore.providerName=S3 \
-Ds3.accessKey=… -Ds3.secretKey=… \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# Provider 2 – Azure
file.ioexit:\
:name=TOAZ:\
:library=/opt/cdunix/ndm/lib/libcdjnibridge.so:\
:home.dir=/opt/cdunix/ndm/ioexit-plugins/s3:\
:options=-Xmx640m -Dstore.providerName=AZ \
-Daz.connectionString='...' \
-Djava.class.path=/opt/cdunix/ndm/ioexit-plugins/s3/cd-s3-ioexit.jar com.aricent.ibm.mft.connectdirect.s3ioexit.S3IOExitFactory:
# Process 1, truncated
# Copy from S3 to Azure, same bucket/container object
# Nothing in sysopts; all properties come from the initparms definitions
FROM (
FILE=FROMS3://container/object
DISP=RPL
)
TO (
FILE=TOAZ://container/object
DISP=RPL
)
# Process 2, truncated
# Copy from another S3 bucket, with other credentials, to the same Azure bucket/container object
# anothercontainer needs other credentials; sysopts overrides the default
FROM (
FILE=FROMS3://anothercontainer/object
sysopts=':s3.accessKey=…:s3.secretKey=…:'
DISP=RPL
)
TO (
FILE=TOAZ://container/object
DISP=RPL
)
Understanding properties origin
When logging is enabled, the log shows the origin of each property.
Properties in the initparms.cfg file are provided through system properties (-D…), so the label for their origin is SystemProperties.
Available property from CDProcessSysopts s3.configpath:/pathto/config-east-1
Available property from SystemProperties s3.accesskey:****
Available property from SystemProperties store.providername:S3
Available property from CDProcessSysopts s3.profilepath:/path/aws_credentials
Available property from SystemProperties s3.secretkey:****
When the Integrated File Agent is requested to use Connect:Direct initparms properties, the label for the origin is CDInitparms.
Available property from CDProcessSysopts s3.configpath:/pathto/config-east-1
Available property from CDInitparms s3.accesskey:****
Available property from CDInitparms store.providername:S3
Available property from CDProcessSysopts s3.profilepath:/path/aws_credentials
Available property from CDInitparms s3.secretkey:****
Available property from Client store.configfromcd:YES
Integrated File Agent
The Integrated File Agent can utilize the same set of properties, excluding those specific to object storage class or encryption. These properties are usually configured in the config/stores.properties file located in the Integrated File Agent installation directory.
Integrated File Agent executing on X86_64 Linux
When executed on X86_64 Linux, the Integrated File Agent can take advantage of the initparms.cfg file to avoid configuration duplication.
In the stores.properties file, the conventional format is cdfa.provider.setName=property=value, where 'setName' serves as an identifier for grouping properties in a specific set.
cdfa.provider.AZURE=store.providerName=AZ
cdfa.provider.AZURE=scheme=MICROSOFT://
cdfa.provider.AZURE=az.connectionString=…
cdfa.provider.GOOGLE=store.providerName=GS
cdfa.provider.GOOGLE=scheme=GOOGLE://
cdfa.provider.GOOGLE=gs.credentialsPath=PathTo/google_credentials.json
cdfa.provider.GOOGLE=gs.projectId=ExampleProject
As the Integrated File Agent is designed to submit processes to Connect:Direct, it is crucial that Connect:Direct includes the schemes MICROSOFT and GOOGLE defined in initparms.cfg with the corresponding credentials properties set. Failure to provide these definitions will result in process failure.
To reuse the initparms.cfg definitions instead of duplicating them, just set the property store.configFromCD=YES in the groups expecting credentials or properties from Connect:Direct.
cdfa.provider.AZURE=scheme=MICROSOFT://
cdfa.provider.AZURE=store.configFromCD=YES
Group AZURE for scheme MICROSOFT will get credentials or other properties from Connect:Direct. Any other properties in the group will be ignored.
The scheme property remains mandatory. It is used to associate the watched directories with an entry inside the stores.properties file. The outscheme property can also be used to override the name of the object submitted to Connect:Direct. When used, this outscheme must also be defined in the initparms.cfg file.
Limitations
When running Connect:Direct for UNIX with an object store, be aware of the following limitations:
- The instance is static, not elastic.
- For Run Task/Job operations on cloud objects, the object store CLI for the cloud in question should be invoked. UNIX system commands, such as mv or cp, referencing cloud objects are not supported.
- Depending on the cloud provider, the maximum object size that can be transferred may vary. Refer to each provider to identify the object size limit.
- The user should consider the cost associated with the usage of cloud resources.
- Storage objects do not support the disposition MOD (modify) for files transferred via Connect:Direct. Only NEW or RPL (replace) is supported.
- Wildcard Copy sent from the cloud is not supported.
- If multiple versions of the same object are present in a bucket/container, only the latest version can be downloaded, not any previous version.
- Multipart download is not supported. Multipart upload is supported.
- For multipart uploads, users should consider creating a lifecycle rule to delete the incomplete parts of aborted and not-restarted uploads. Such parts remain in storage and incur additional cost.
- Checkpoint restart can be explicitly configured within a copy step through the ckpt parameter, as sketched below. If it is not configured in the copy step, it can be configured in the initparms through the ckpt.interval parameter. For more information, refer to the Getting Started Guide.
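A minimal sketch of a copy step with an explicit checkpoint interval; the source file, scheme, bucket, object key, and the 10M interval are illustrative placeholders:

# Copy a local file to an S3 object with checkpointing every 10MB (placeholder values)
FROM (
FILE=/data/localfile.txt
)
CKPT=10M
TO (
FILE=S3://mybucket/object
DISP=RPL
)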