Credential stores

Data Collector stages communicate with external systems to read and write data. Many of these external systems require sensitive information, such as user names or passwords, to access the data. When you configure stages that connect to these external systems, you must specify the details that the stages need to connect.

If you enter sensitive information directly in stage properties, you expose those details to any user with access to the flow. To access external systems without exposing the sensitive information, add the sensitive values as secrets in a credential store, and then use Data Collector credential functions in the stage properties to retrieve those values.

Defining secrets in a credential store can make it easier to migrate flows to another environment. For example, if you migrate multiple flows from a development to a production environment, you do not need to edit each flow with details for the production environment. You can simply replace the development credential store with the production version.

You can configure Data Collector to use multiple credential stores at the same time. Each credential store is identified by a unique credential store ID.

You can use the following credential stores with Data Collector:
  • AWS Secrets Manager
  • Azure Key Vault
  • CyberArk
  • Google Secret Manager
  • Hashicorp Vault
  • Java keystore
Important: Use the Java keystore credential store system in a development environment only. In a production environment, use a centralized credential store, such as one of the other supported credential stores, to better secure sensitive information.

Enabling credential stores

You can configure Data Collector to use one or more credential stores. Each credential store is identified by a unique credential store ID.

You specify the credential stores that Data Collector can use in a credential-stores.properties file. The file includes the following information:
credentialStores property
This property defines the credential stores that Data Collector can use.
By default, the property is commented out and includes a default credential store ID for each of the supported credential store types, such as aws for AWS Secrets Manager and azure for Azure Key Vault.
To enable using credential stores, you uncomment this property and enter a comma-separated list of the credential store IDs to use.
You can specify multiple credential stores of the same type or of different types, such as two Hashicorp Vaults and one Java keystore. You simply specify a unique ID for each credential store.
Sets of related properties
Each supported credential store type has a set of related properties. The property names include the default credential store IDs originally specified in the credentialStores property.
For example, the CyberArk properties include cyberark, the default CyberArk ID, in each CyberArk property name, such as credentialStore.cyberark.config.ws.url and credentialStore.cyberark.config.ws.appId.
When you use a custom credential store ID, you must update all related property names to match the custom ID. For example, to use cyberarkUS as a custom ID, replace cyberark with cyberarkUS in every CyberArk property name, so that credentialStore.cyberark.config.ws.url becomes credentialStore.cyberarkUS.config.ws.url.
Note: When you want to use multiple credential stores of the same type, you must have a set of related store properties that are renamed and defined appropriately for each credential store.

For example, say you want to use two Azure credential stores, azureDev for development and azureProd for production. To do this, you specify the credential store IDs in the credentialStores property and make a copy of the related Azure credential store properties, so you have one set for each credential store.

Then, you rename and configure the properties for azureDev, and you do the same for azureProd. The resulting properties might look as follows:
################################################
#      Data Collector Credential Stores        #
################################################

credentialStores=azureDev,azureProd


############################################################
# azureDev: Azure Key Vault Credential Store Configuration #
############################################################

credentialStore.azureDev.def=streamsets-datacollector-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore
credentialStore.azureDev.config.credential.refresh.millis=30000
credentialStore.azureDev.config.credential.retry.millis=15000
credentialStore.azureDev.config.vault.url=https://development.vault.azure.net/
credentialStore.azureDev.config.client.id=devClientID
credentialStore.azureDev.config.client.key=devClientKey


#############################################################
# azureProd: Azure Key Vault Credential Store Configuration #
#############################################################

credentialStore.azureProd.def=streamsets-datacollector-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore
credentialStore.azureProd.config.credential.refresh.millis=30000
credentialStore.azureProd.config.credential.retry.millis=15000
credentialStore.azureProd.config.vault.url=https://production.vault.azure.net/
credentialStore.azureProd.config.client.id=prodClientID
credentialStore.azureProd.config.client.key=prodClientKey
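
Renaming every property by hand is error-prone when a section has many entries. The following is a minimal Python sketch, not part of Data Collector, that derives a renamed property set from a default one. The function name rename_store_id is hypothetical, and the sketch assumes that each property name starts at the beginning of a line, optionally after a # for commented-out defaults.

```python
import re


def rename_store_id(properties_text: str, default_id: str, custom_id: str) -> str:
    """Return properties_text with credentialStore.<default_id>.* renamed to custom_id.

    Hypothetical helper for illustration only; it is not a Data Collector tool.
    """
    # Match the prefix only at the start of a line, optionally after '#',
    # so commented-out defaults are renamed but property values are untouched.
    pattern = re.compile(
        r"^([ \t]*#?[ \t]*)credentialStore\." + re.escape(default_id) + r"\.",
        flags=re.MULTILINE,
    )
    return pattern.sub(
        lambda match: f"{match.group(1)}credentialStore.{custom_id}.",
        properties_text,
    )


default_section = (
    "credentialStore.azure.config.credential.refresh.millis=30000\n"
    "#credentialStore.azure.config.client.id=\n"
)

# Produce one renamed copy of the section per credential store ID.
for store_id in ("azureDev", "azureProd"):
    print(rename_store_id(default_section, "azure", store_id))
```

Because the pattern is anchored to the start of the line, the value of the def property, which also contains the word azure, is left unchanged, as the properties file requires.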

Configuring credential stores

Note the additional details for the following credential store types:
AWS Secrets Manager
In Secrets Manager, you must configure an access key and secret key pair with the correct permissions to read the secrets. To follow best practices, make secrets read-only and limit access. See the Secrets Manager documentation on identity and access management (IAM) policies.
Azure Key Vault
Using Azure Key Vault requires performing several tasks in Azure. For more information, see Azure registration and key requirements.
CyberArk
At this time, CyberArk integration is only supported using web services to the CyberArk Central Credential Provider.
Google Secret Manager
Using Google Secret Manager requires Google authentication. For more information, see Google authentication requirement.
As a best practice, make secrets read-only and limit access. For additional suggestions, see the Google Secret Manager best practices documentation.
Java keystore
A Java keystore credential storage system requires the distribution of a keystore file, which complicates security. Before using a Java keystore system, decide how the keystore will be distributed and consult with your IT security team to ensure that the system meets IT policies.
Use the Java keystore credential store system in development environments only. In a production environment, use one of the other supported credential stores.
Use the stagelib-cli jks-credentialstore command to add credentials to the credential store. For more information, see Adding secrets to the Java Keystore. For more information about the jks-credentialstore command, see jks-credentialstore command reference.

Step 1. Install the credential store stage library

In the Data Collector environment, install the stage library for the credential store that you want to use.

For most credential stores, the stage library is named after the credential store. For example, to use the Google Secret Manager, install the Google Secret Manager Credential Store stage library.

Hashicorp Vault requires the Vault Credential Store stage library.

For more information about configuring an environment, see Creating a StreamSets environment.

Step 2. Create the credential store properties file

Before you can configure credential store properties, you must create the credential store properties file:
  1. Create a file named credential-stores.properties.
  2. Copy the credential store properties into the file, then save the file.
  3. Store the credential-stores.properties file on the Data Collector engine workstation in an accessible location. For example, ${HOME}/sdc/credential-stores.properties.

Credential store properties


# IBM Confidential
# PID 5900-BAF
# Copyright IBM Corp., Year 2025
#

# Use this file to enable the use of credential stores with Data Collector.

# IMPORTANT: This file includes a set of properties for each credential store type.
# Property names include the default credential store IDs: jks,aws,azure,cyberark,thycotic,vault,gcp.
# When you use custom IDs, you must update the corresponding property names.

# To use multiple credential stores of the same type, make sure each credential store
# has a set of related properties defined. Make sure the property names include
# the appropriate credential store ID.

################################################
#       Data Collector Credential Stores       #
################################################

# Defines the credential stores for Data Collector to use. Specify a comma-separated list
# of unique credential store IDs.
#credentialStores=jks,aws,azure,cyberark,thycotic,vault,gcp

################################################
# Java Keystore Credential Store Configuration #
################################################

# The following properties are for a Java keystore credential store that uses the 'jks'
# default credential store ID. If you specified a custom ID in the credentialStores property
# above, replace 'jks' in the property names with the custom ID.

# Defines the implementation of the 'jks' credential store
# Update 'jks' in the property name as needed, but do not change the definition of this property.
credentialStore.jks.def=streamsets-datacollector-jks-credentialstore-lib::com_streamsets_datacollector_credential_javakeystore_JavaKeyStoreCredentialStore

# A Java keystore credential store can be of type JCEKS or PKCS12
credentialStore.jks.config.keystore.type=PKCS12

# The location of the Java keystore. Specify an absolute path or a path relative to the
# $SDC_CONF directory.
credentialStore.jks.config.keystore.file=jks-credentialStore.pkcs12

# The password to access the Java keystore
credentialStore.jks.config.keystore.storePassword=changeIt

# The minimum refresh millis used to reload the keystore file
#credentialStore.jks.config.keystore.file.min.refresh.millis=10000

############################################################
#    AWS Secrets Manager Credential Store Configuration    #
############################################################

# The following properties are for an AWS Secrets Manager credential store that uses the 'aws'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'aws' in the property names with the custom ID.

# Defines the implementation of the 'aws' credential store
# Update 'aws' in the property name as needed, but do not change the definition of this property.
credentialStore.aws.def=streamsets-datacollector-aws-secrets-manager-credentialstore-lib::com_streamsets_datacollector_credential_aws_secrets_manager_AWSSecretsManagerCredentialStore

# Default name-key separator for the name parameter in credential functions
credentialStore.aws.config.nameKey.separator=&

# AWS Region
credentialStore.aws.config.region=<MUST BE SET>

# Security method. Must be accessKeys or instanceProfile.
credentialStore.aws.config.security.method=accessKeys

# AWS access key
credentialStore.aws.config.access.key=<MUST BE SET IF ACCESS KEYS IS USED AS A SECURITY METHOD>

# AWS secret key
credentialStore.aws.config.secret.key=<MUST BE SET IF ACCESS KEYS IS USED AS A SECURITY METHOD>

# Secrets cache max size
# Maximum number of secrets to cache locally
credentialStore.aws.config.cache.max.size=1024

# Secrets cache TTL
# The number of milliseconds that a cached secret is considered valid before requiring a refresh
# The default is equivalent to 1 hour
credentialStore.aws.config.cache.ttl.millis=3600000


########################################################
#    Azure Key Vault Credential Store Configuration    #
########################################################

# The following properties are for an Azure Key Vault credential store that uses the 'azure'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'azure' in the property names with the custom ID.

# Defines the implementation of the 'azure' credential store
# Update 'azure' in the property name as needed, but do not change the definition of this property.
credentialStore.azure.def=streamsets-datacollector-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore

# Credential refresh interval
# How long a credential can be cached locally before fetching it again from Azure Key Vault.
credentialStore.azure.config.credential.refresh.millis=30000

# Credential retry interval
# How long to wait before retrying to fetch a credential from Azure Key Vault in case of errors.
# This retry delay is not blocking. Locally, it will fail immediately.
credentialStore.azure.config.credential.retry.millis=15000

# Credential method
# Defines which method should be used to access Azure Key Vault. Valid options are either clientKeys (requires
# client.id and client.key values to be configured) or managedIdentity (permission needs to be set for the application
# from the Azure portal)
credentialStore.azure.config.credential.method=clientKeys

# Azure Key Vault credential provider URL
# This property must be set.
# credentialStore.azure.config.vault.url=https://<YOUR_KEY_VAULT>.vault.azure.net/

# Azure Key Vault client ID for this Data Collector
#credentialStore.azure.config.client.id=<MUST BE SET IF CREDENTIAL METHOD IS SET TO CLIENTKEYS>

# Azure Key Vault client key for this Data Collector
#credentialStore.azure.config.client.key=<MUST BE SET IF CREDENTIAL METHOD IS SET TO CLIENTKEYS>


#################################################
#    CyberArk Credential Store Configuration    #
#################################################

# The following properties are for a CyberArk credential store that uses the 'cyberark'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'cyberark' in the property names with the custom ID.

# Defines the implementation of the 'cyberark' credential store
# Update 'cyberark' in the property name as needed, but do not change the definition of this property.
credentialStore.cyberark.def=streamsets-datacollector-cyberark-credentialstore-lib::com_streamsets_datacollector_credential_cyberark_CyberArkCredentialStore

# Credential refresh interval
# How long a credential can be cached locally before fetching it again from CyberArk.
#credentialStore.cyberark.config.credential.refresh.millis=30000

# Credential retry interval
# How long to wait before retrying to fetch a credential from CyberArk in case of errors.
# This retry delay is not blocking. Locally, it will fail immediately.
#credentialStore.cyberark.config.credential.retry.millis=15000

# Connector type to CyberArk
# Currently 'webservices' is the only supported connector
#credentialStore.cyberark.config.connector=webservices

##############################################################
#     CyberArk Credential Store Web Service Configuration    #
##############################################################

# CyberArk Central Credential Provider credential retrieval web service URL
credentialStore.cyberark.config.ws.url=https://<HOST>:<PORT>/AIMWebService/api/Accounts

# CyberArk application ID for this Data Collector
credentialStore.cyberark.config.ws.appId=<MUST BE SET>

# Maximum number of concurrent web service calls to CyberArk
#credentialStore.cyberark.config.ws.maxConcurrentConnections=10

# HTTP connection inactivity check
#credentialStore.cyberark.config.ws.validateAfterInactivity.millis=60000

# TCP and HTTP connection timeout
#credentialStore.cyberark.config.ws.connectionTimeout.millis=10000

# Default separator for CyberArk safe, folder, object name, and object element used in the
# name parameter in credential functions.
#credentialStore.cyberark.config.ws.nameSeparator=&

# HTTP authentication mechanism used by CyberArk Central Credential Provider web services
# Possible values: none, basic, digest
#credentialStore.cyberark.config.ws.http.authentication=none

# User name when using basic or digest authentication
#credentialStore.cyberark.config.ws.http.authentication.user=

# Password when using basic or digest authentication
#credentialStore.cyberark.config.ws.http.authentication.password=

# When using HTTPS and the server certificate is not signed by a public CA, a truststore
# with the public certificate must be available in this truststore file, or in the JDK default truststore.
# Specify an absolute path or a path relative to the $SDC_CONF directory.
#credentialStore.cyberark.config.ws.truststoreFile=

# The password to access the truststore file
#credentialStore.cyberark.config.ws.truststorePassword=

# HTTPS supported protocols
#credentialStore.cyberark.config.ws.supportedProtocols=TLSv1.2

# Determines if the hostname of the CyberArk Central Credential Provider web service should be
# verified against the domain defined in the HTTPS certificate.
#credentialStore.cyberark.config.ws.hostnameVerifier.skip=false

# When using HTTPS and the CyberArk Central Credential Provider web service is configured to require client side
# certificates, the client certificate must be available in this keystore file, or in the JDK default truststore.
# Specify an absolute path or a path relative to the $SDC_CONF directory.
#credentialStore.cyberark.config.ws.keystoreFile=

# The password to access the keystore file
#credentialStore.cyberark.config.ws.keystorePassword=

# The password to access the certificate within the keystore file
#credentialStore.cyberark.config.ws.keyPassword=

# The proxy URI used to access CyberArk
#credentialStore.cyberark.config.ws.proxyURI=


############################################################
#    Google Secret Manager Credential Store Configuration  #
############################################################

# The following properties are for a Google Secret Manager credential store that uses the 'gcp'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'gcp' in the property names with the custom ID.

# Defines the implementation of the 'gcp' credential store
# Update 'gcp' in the property name as needed, but do not change the definition of this property.
credentialStore.gcp.def=streamsets-datacollector-google-secret-manager-credentialstore-lib::com_streamsets_datacollector_google_secret_manager_credentialstore_GoogleSecretManagerCredentialStore

#Expiration time of the cache, default 30 minutes
credentialStore.gcp.config.cache.inactivityExpiration.millis=1800000

credentialStore.gcp.config.delimiter=?

credentialStore.gcp.config.project.id=

# Defines how the configuration for the access is supplied.
# Possible values: default,json or jsonPath
# The mode default uses the standard way to provide authentication details
# 1) Passing credentials via environment variable:
#    Provide authentication credentials to your application code by setting the
#    environment variable GOOGLE_APPLICATION_CREDENTIALS
#    e.g. export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"
# 2) Passing credentials via Well Known Credentials File
#    Default will try to locate the configuration file application_default_credentials.json using the following paths
#   a) Path defined in Environment variable CLOUDSDK_CONFIG
#   b) %APPDATA%/gcloud/ (Windows) or ${user.home}/.config/gcloud/ (macOS, Linux)
# 3) Google App Engine Credentials
# 4) Google Cloud Shell
# 5) Compute Engine Credentials using Metadata Server
credentialStore.gcp.config.credentialsMode=default

# If you set 'credentialStore.gcp.config.credentialsMode=json', please supply the following property:
# The content of the json file generated by Google for the account used to access the credential store.
# (A property value can have multiple lines. To indicate that the next line still belongs to the property
# add a backslash (\) at the end of the line)
credentialStore.gcp.config.credentialsJson=

# If you set 'credentialStore.gcp.config.credentialsMode=jsonPath', please supply the following property:
# The path to the json file generated by Google for the account used to access the credential store.
credentialStore.gcp.config.credentialsJsonPath=


########################################################
#    Hashicorp Vault Credential Store Configuration    #
########################################################

# The following properties are for a Hashicorp Vault credential store that uses the 'vault'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'vault' in the property names with the custom ID.

# Defines the implementation of the 'vault' credential store
# Update 'vault' in the property name as needed, but do not change the definition of this property.
credentialStore.vault.def=streamsets-datacollector-vault-credentialstore-lib::com_streamsets_datacollector_credential_vault_VaultCredentialStore

# Default path-key separator for the name parameter in credential functions
credentialStore.vault.config.pathKey.separator=&

# URL of the Vault server to connect to
credentialStore.vault.config.addr=http://localhost:8200

# Vault authentication method. Valid options are azure, google, appRole and appId
credentialStore.vault.config.authMethod=appRole

# AppRole mode
credentialStore.vault.config.role.id=
credentialStore.vault.config.secret.id=${file("vault-secret-id")}

# Azure authentication
credentialStore.vault.config.azure.role=
credentialStore.vault.config.azure.resource=https://management.azure.com/

# These parameters are necessary for Azure authentication when using bound service principal IDs.
credentialStore.vault.config.azure.subscriptionId=
credentialStore.vault.config.azure.resourceGroupName=
credentialStore.vault.config.azure.vmName=

# Google Cloud authentication
credentialStore.vault.config.google.role=
credentialStore.vault.config.google.audience=

#
# The Vault User ID is generated by hashing the MAC address belonging to the network interface assigned
# the IP address of hostname -f. It can also be retrieved by the show-vault-id command of the
# StreamSets executable.
#

# Data Collector authenticates with Vault using the AppId authentication backend. The app-id must be specified below.
# credentialStore.vault.config.app.id=

# Optional Settings

# KV Secrets Engine version. Default is 1. Possible values: 1 or 2.
credentialStore.vault.config.version=1

# Define namespaces for Vault Enterprise
#credentialStore.vault.config.namespace=

# Define mount point for login. If not specified, the default mount point for the specified login method will be used.
#credentialStore.vault.config.mountPoint=

# The renewal interval must be shorter than the shortest lease issued by Vault including auth tokens.
credentialStore.vault.config.lease.renewal.interval.sec=60
credentialStore.vault.config.lease.expiration.buffer.sec=120
credentialStore.vault.config.open.timeout=0
credentialStore.vault.config.proxy.address=
credentialStore.vault.config.proxy.port=8080
credentialStore.vault.config.proxy.username=
credentialStore.vault.config.proxy.password=
credentialStore.vault.config.read.timeout=0
credentialStore.vault.config.ssl.enabled.protocols=TLSv1.2,TLSv1.3
credentialStore.vault.config.ssl.truststore.file=
credentialStore.vault.config.ssl.truststore.password=
credentialStore.vault.config.ssl.verify=true
credentialStore.vault.config.ssl.timeout=0
credentialStore.vault.config.timeout=0


#####################################################################
#    Thycotic Secret Server Credential Store Configuration          #
#####################################################################

# The following properties are for a Thycotic Secret Server credential store that uses the 'thycotic'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'thycotic' in the property names with the custom ID.

# Defines the implementation of the 'thycotic' credential store.
# Update 'thycotic' in the property name as needed, but do not change the definition of this property.
credentialStore.thycotic.def=streamsets-datacollector-thycotic-credentialstore-lib::com_streamsets_datacollector_credential_thycotic_ThycoticCredentialStore

# Thycotic Secret Server URL. Use the following format: https://<host name>:<port number>.
# Use HTTPS to avoid unencrypted communication.
credentialStore.thycotic.config.url=<MUST BE SET>

# User name to connect to Thycotic Secret Server
credentialStore.thycotic.config.username=<MUST BE SET>

# Password to connect to Thycotic Secret Server
credentialStore.thycotic.config.password=${file("thycotic-secret-password")}

# Cache expiration time
credentialStore.thycotic.config.credential.cache.inactivityExpiration.seconds=1800

# Credential refresh interval
# How long a credential can be cached locally before fetching it again from Thycotic Secret Server.
credentialStore.thycotic.config.credential.refresh.seconds=30000

# Credential retry interval
# How long to wait before retrying to fetch a credential from Thycotic Secret Server in the case of an error.
credentialStore.thycotic.config.credential.retry.seconds=15000

# Buffer for expiring auth tokens. Data Collector renews tokens that expire in less than
# the specified number of seconds. Default is 1201.
credentialStore.thycotic.config.token.expiration.buffer.seconds=1201

# SSL/TLS-enabled protocols. Versions TLSv1.2 or later are recommended. Default is TLSv1.2,TLSv1.3
credentialStore.thycotic.config.ssl.enabled.protocols=TLSv1.2,TLSv1.3

# Path to a Java truststore file. Required when using a private CA or certificates not trusted
# by the Java default truststore.
credentialStore.thycotic.config.ssl.truststore.file=

# Password for the truststore file
credentialStore.thycotic.config.ssl.truststore.password=

# Whether to verify that the Thycotic server hostname matches its certificate.
# Default is true. False is not recommended.
credentialStore.thycotic.config.ssl.verify=true

# Timeout for the SSL/TLS handshake in milliseconds. Default is 0 for no limit.
credentialStore.thycotic.config.ssl.timeout=0

# Separator to use for the Thycotic Secret Server secret ID and field name values in the
# credential name argument used in credential functions.
credentialStore.thycotic.config.nameSeparator=-

# Milliseconds to wait for data before timing out.
# Default is 0 for no limit.
credentialStore.thycotic.config.read.timeout=0

# Timeout to establish an HTTP connection to Thycotic Secret Server in milliseconds.
# Default is 0 for no limit.
credentialStore.thycotic.config.open.timeout=0

# Timeout in milliseconds to read from Thycotic Secret Server after a connection has been established.
# Default is 0 for no limit.
credentialStore.thycotic.config.timeout=0

Step 3. Configure credential store properties

To enable Data Collector to connect to a credential store, configure the appropriate properties in the credential store properties file that you created.
  1. In the credential-stores.properties file, uncomment the credentialStores property and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Secrets Manager credential store, set the value to jks,aws. To use multiple Secrets Manager credential stores, simply specify separate IDs for each, such as awsDev,awsProd.

  2. Configure related properties in the file. For example, to use Azure Key Vault, configure the Azure Key Vault configuration properties in the file.

    When using multiple credential stores of the same type, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling credential stores.
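
Before you mount the file, you might want to confirm that the IDs and property names agree. The following is a minimal Python sketch of such a check, not a Data Collector feature. The function names are hypothetical. The parser handles only simple key=value lines, not the full Java properties syntax such as line continuations, and the sketch assumes that "alphabetic" means ASCII letters.

```python
import re
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def check_credential_stores(text: str) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    props = parse_properties(text)
    if "credentialStores" not in props:
        return ["credentialStores is missing or commented out"]

    problems: List[str] = []
    seen = set()
    for part in props["credentialStores"].split(","):
        store_id = part.strip()
        if not store_id:
            problems.append("credentialStores contains an empty ID")
            continue
        # Assumption: "alphabetic" means ASCII letters only.
        if not re.fullmatch(r"[A-Za-z]+", store_id):
            problems.append(f"{store_id}: ID must use only alphabetic characters")
        if store_id in seen:
            problems.append(f"{store_id}: ID is listed more than once")
        seen.add(store_id)
        # Each enabled store needs its own renamed set of properties.
        if f"credentialStore.{store_id}.def" not in props:
            problems.append(f"{store_id}: no credentialStore.{store_id}.def property")
    return problems


sample = (
    "credentialStores=awsDev,awsProd\n"
    "credentialStore.awsDev.def=<implementation>\n"
)
for problem in check_credential_stores(sample):
    print(problem)
```

In the sample, awsProd is enabled but has no renamed property set, so the check reports the missing credentialStore.awsProd.def property.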

Protecting sensitive data in credential store properties

When you define credential store properties, you must enter some sensitive data such as passwords to authenticate with the credential store system. For example, to use the AWS Secrets Manager credential store system, you enter the AWS access key ID and secret access key.

You can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

Mounting local files and directories

Some credential store properties require specifying the location of a file or directory, such as a credential file.

To include local files and directories in the Data Collector engine container, edit the StreamSets environment to customize the engine run command. Add the following mount option to the command:

--mount type=bind,source=<path to file or directory>,target=/etc/sdc/<path>,readonly
Define the following paths in the option:
  • <path to file or directory> is the location of the local file or directory.
  • <path> is the location of the file or directory in the Data Collector engine container, relative to /etc/sdc.
For example, suppose that you set the credentialStore.<cstore ID>.config.credentialsJsonPath Google Secret Manager property to /etc/sdc/gcp-creds.json in the credential store properties file. To make the file available in the Data Collector engine container, you edit the StreamSets environment to add the following command option:
--mount type=bind,source=<local path to gcp-creds.json>,target=/etc/sdc/gcp-creds.json,readonly
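
If you mount several files, generating the options can help keep the target paths consistent. The following is a minimal Python sketch that builds the option string in the format shown above. The function name bind_mount_option and the local path in the usage line are hypothetical placeholders.

```python
from pathlib import PurePosixPath


def bind_mount_option(source: str, container_path: str) -> str:
    """Build a read-only bind mount option that targets a path under /etc/sdc."""
    if not source:
        raise ValueError("source must be a non-empty local path")
    # Strip a leading slash so the path is always joined under /etc/sdc.
    target = PurePosixPath("/etc/sdc") / container_path.lstrip("/")
    return f"--mount type=bind,source={source},target={target},readonly"


# The local path below is a placeholder, not a real location.
print(bind_mount_option("/home/user/keys/gcp-creds.json", "gcp-creds.json"))
```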

You can include these additional mount options when you enable Data Collector to mount the credential store properties file. For more information, see Step 4. Mount the credential store properties file.

AWS Secrets Manager properties

The following properties are grouped in the AWS Secrets Manager section of the file. Configure the properties as needed:

Secrets Manager Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the AWS Secrets Manager credential store.

Do not change the default value.

credentialStore.<cstore ID>.config.nameKey.separator Optional. Separator to use in the name argument that credential functions use. Use the following format for the name argument:

<name><separator><key>

For example, if you keep the default ampersand (&), the format for the name argument is: <name>&<key>

Note: In Secrets Manager, names can contain alphanumeric characters and the following special characters: / _ + = . @ -
Therefore, avoid using those characters as separators.
credentialStore.<cstore ID>.config.region Required. AWS region that hosts Secrets Manager. For a list of available regions, see the AWS Region Table.
credentialStore.<cstore ID>.config.security.method Required. Authentication method used to connect to AWS. Set to one of the following values:
  • instanceProfile - Authenticates using an instance profile associated with Data Collector.
    Use when Data Collector runs on an Amazon EC2 instance that has an associated instance profile. Data Collector uses the instance profile credentials to automatically authenticate with AWS.
    Note: If Data Collector is running in a container environment, ensure that the Instance Metadata Service hop limit is set to 2. For more information, see the Amazon EC2 documentation.
  • accessKeys - Authenticates using an AWS access key pair.

    Use when Data Collector does not run on an Amazon EC2 instance or when the EC2 instance doesn’t have an instance profile.

credentialStore.<cstore ID>.config.access.key Required when using access keys to authenticate with AWS. AWS access key ID.
credentialStore.<cstore ID>.config.secret.key Required when using access keys to authenticate with AWS. AWS secret access key.
credentialStore.<cstore ID>.config.cache.max.size Optional. Maximum number of secrets Data Collector can cache locally. Default is 1024.
credentialStore.<cstore ID>.config.cache.ttl.millis Optional. Number of milliseconds that Data Collector considers a cached secret valid before requiring a refresh. Default is 1 hour.
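The name argument format and the separator restriction described for the nameKey.separator property can be sketched in Python. This is an illustration only: the helpers is_safe_separator and build_name_argument, and the sample secret name and key, are hypothetical and are not part of Data Collector. The set of reserved characters is taken from the note on Secrets Manager names.

```python
import string

# Characters that Secrets Manager allows in secret names, per the note above.
NAME_CHARS = set(string.ascii_letters + string.digits + "/_+=.@-")


def is_safe_separator(separator: str) -> bool:
    """Return True if no separator character can also appear in a secret name."""
    return bool(separator) and not any(ch in NAME_CHARS for ch in separator)


def build_name_argument(name: str, key: str, separator: str = "&") -> str:
    """Return <name><separator><key>. Hypothetical helper, not a Data Collector API."""
    if not is_safe_separator(separator):
        raise ValueError(f"unsafe separator: {separator!r}")
    return f"{name}{separator}{key}"


# Made-up secret name and key, joined with the default ampersand.
print(build_name_argument("prod/oracle", "password"))
```

A separator such as a slash is rejected here because it could not be told apart from a slash inside the secret name.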

Azure Key Vault properties

The following properties are grouped in the Azure Key Vault section of the file. Configure the properties as needed:

Azure Key Vault Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the Azure Key Vault credential store.

Do not change the default value.

credentialStore.<cstore ID>.config.credential.refresh.millis Optional. Number of milliseconds that Data Collector locally caches a credential. When the time expires, Data Collector retrieves the credential from Azure Key Vault.
credentialStore.<cstore ID>.config.credential.retry.millis Optional. Number of milliseconds that Data Collector waits before attempting to retry a retrieval of a credential from Azure Key Vault, in the case of an error.
credentialStore.<cstore ID>.config.vault.url Required. URL to the key vault created in Azure Key Vault.

Use the following format:

https://<key vault name>.vault.azure.net/
credentialStore.<cstore ID>.config.credential.method Required. Authentication method that Data Collector uses to connect to Azure Key Vault. Set to one of the following values:
  • clientKeys - Use client key authentication.
  • managedIdentity - Use managed identity authentication. To use managed identity authentication in Data Collector, you must set up a managed identity in Azure. For information on setting up a managed identity in Azure, see the Microsoft documentation.
credentialStore.<cstore ID>.config.client.id Required to use client key authentication. Application ID assigned to this Data Collector when you registered Data Collector as an application in Azure Active Directory, as described in prerequisites.
credentialStore.<cstore ID>.config.client.key Required to use client key authentication. Authentication key assigned to this Data Collector when you registered Data Collector as an application in Azure Active Directory, as described in prerequisites.
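The effect of the credential.refresh.millis property can be pictured as a small time-based cache. The following Python sketch is an assumption-laden illustration, not Data Collector's implementation: the class RefreshingCredential, the fetch callback that stands in for a call to the key vault, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Optional


class RefreshingCredential:
    """Caches one credential value and fetches it again after refresh_millis.

    Illustrative only; this is not how Data Collector is implemented.
    """

    def __init__(self, fetch: Callable[[], str], refresh_millis: int,
                 now_millis: Callable[[], float] = lambda: time.monotonic() * 1000):
        self._fetch = fetch                    # stand-in for a key vault lookup
        self._refresh_millis = refresh_millis
        self._now_millis = now_millis          # injectable clock, useful for testing
        self._value: Optional[str] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> str:
        now = self._now_millis()
        expired = (self._fetched_at is None
                   or now - self._fetched_at >= self._refresh_millis)
        if expired:
            # The cached copy is missing or too old: retrieve a fresh value.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

A larger refresh value reduces calls to the vault but delays the moment when a rotated secret is picked up.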

CyberArk properties

The following properties are grouped in the CyberArk section of the file. Configure the properties as needed:

CyberArk Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the CyberArk credential store.

Do not change the default value.

credentialStore.<cstore ID>.config.credential.refresh.millis Optional. Number of milliseconds that Data Collector locally caches a credential. When the time expires, Data Collector retrieves the credential from CyberArk.
credentialStore.<cstore ID>.config.credential.retry.millis Optional. Number of milliseconds that Data Collector waits before attempting to retry a retrieval of a credential from CyberArk, in the case of an error.
credentialStore.<cstore ID>.config.connector Optional. Connector type used to connect to CyberArk. Keep the default, webservices, because web services is the only connector type currently supported.
credentialStore.<cstore ID>.config.ws.url Required. CyberArk Central Credential Provider web service URL.

Use the following format:

https://<host name>:<port>/AIMWebService/api/Accounts
credentialStore.<cstore ID>.config.ws.appId Required. CyberArk application ID for this Data Collector. You must create the application ID in CyberArk.
credentialStore.<cstore ID>.config.ws.maxConcurrentConnections Optional. Maximum number of concurrent web service calls that Data Collector can make to CyberArk.
credentialStore.<cstore ID>.config.ws.validateAfterInactivity.millis Optional. Number of milliseconds of inactivity before Data Collector validates the HTTP connection to CyberArk.
credentialStore.<cstore ID>.config.ws.connectionTimeout.millis Optional. Number of milliseconds to wait for a connection to CyberArk.
credentialStore.<cstore ID>.config.ws.nameSeparator Optional. Separator to use in the name argument that credential functions use.
Use the following format for the name argument:
<safe><separator><folder><separator><object name><separator><element name>
For example, if you keep the default ampersand (&), the format for the name argument is:
<safe>&<folder>&<object name>&<element name>
credentialStore.<cstore ID>.config.ws.http.authentication Optional. Authentication type used by the CyberArk Central Credential Provider web services: none, basic, or digest.

Default is none.

credentialStore.<cstore ID>.config.ws.http.authentication.user Optional. Username if using basic or digest authentication.
credentialStore.<cstore ID>.config.ws.http.authentication.password Optional. Password if using basic or digest authentication.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.ws.truststoreFile Optional. Path to the truststore file if using HTTPS and the server certificate is using a private CA or is not trusted by the Java default truststore file.

Enter a path relative to the Data Collector configuration directory, or enter an absolute path.

credentialStore.<cstore ID>.config.ws.truststorePassword Optional. Password for the truststore file.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.ws.supportedProtocols Optional. SSL/TLS-enabled protocols. Versions TLSv1.2 or later are recommended.
credentialStore.<cstore ID>.config.ws.hostnameVerifier.skip Optional. Determines whether the host name of the CyberArk Central Credential Provider web services should be verified against the domain defined in the HTTPS certificate.

By default, the host name is verified.

credentialStore.<cstore ID>.config.ws.keystoreFile Optional. Path to the keystore file that contains the client certificate. Configure when using HTTPS and the CyberArk Central Credential Provider web services require client-side certificates.

Enter a path relative to the Data Collector configuration directory or enter an absolute path.

When using this property, be sure to mount the specified file or directory.

credentialStore.<cstore ID>.config.ws.keystorePassword Optional. Password for the keystore file.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.ws.keyPassword Optional. Password to access the certificate within the keystore file.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.ws.proxyURI Optional. URI for the proxy that should be used to reach the CyberArk services.
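The four-part name argument described for the ws.nameSeparator property can be sketched in Python. The functions cyberark_name and split_cyberark_name and the sample safe, folder, object, and element names are hypothetical; the sketch only illustrates the documented format and makes no claim about how Data Collector parses the argument.

```python
from typing import Tuple


def cyberark_name(safe: str, folder: str, object_name: str,
                  element_name: str, separator: str = "&") -> str:
    """Return <safe><separator><folder><separator><object name><separator><element name>."""
    parts = (safe, folder, object_name, element_name)
    # A component that contains the separator would make the result ambiguous.
    if any(separator in part for part in parts):
        raise ValueError("a component contains the separator")
    return separator.join(parts)


def split_cyberark_name(name: str, separator: str = "&") -> Tuple[str, ...]:
    """Split a name argument back into its four components."""
    parts = name.split(separator)
    if len(parts) != 4:
        raise ValueError("expected safe, folder, object name, and element name")
    return tuple(parts)


# Made-up values joined with the default ampersand.
print(cyberark_name("DevSafe", "Root", "OracleDB", "password"))
```

If any component must contain an ampersand, choosing a different separator keeps the four parts distinguishable.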

Google Secret Manager properties

The following properties are grouped in the Google Secret Manager section of the file. Configure the properties as needed:

Secret Manager Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the Google Secret Manager credential store.

Do not change the default value.

credentialStore.<cstore ID>.config.cache.inactivityExpiration.millis Expiration time for the cache in milliseconds.

Default is 1800000.

credentialStore.<cstore ID>.config.delimiter Delimiter to use in the credential function name argument to separate the secret name and the version ID. Use a single character that is not included in credential names.

Use the following format for the name argument:

<name><delimiter><version id>

For example, if you use a slash, the format for the name argument is:

<name>/<version id>

Default is question mark (?).

credentialStore.<cstore ID>.config.project.id ID of the project associated with the Secret Manager.
credentialStore.<cstore ID>.config.credentialsMode Credentials to use for authentication with Secret Manager:
  • default - Uses Google Cloud default credentials.
  • json - Uses JSON-formatted credentials information specified in the credential store configuration properties.
  • jsonPath - Uses a JSON service account credentials file stored on the Data Collector machine.

For more information, see Google authentication requirement.

credentialStore.<cstore ID>.config.credentialsJson Contents of a Google Cloud service account credentials file.

Enter JSON-formatted credential information in plain text. If the content includes multiple lines of text, add a backslash (\) at the end of each line.

Required when using the json credentials mode.

credentialStore.<cstore ID>.config.credentialsJsonPath Path to a Google Cloud service account credentials file stored on the Data Collector machine. The credentials file must be a JSON file.

Enter a path relative to the Data Collector resources directory or enter an absolute path.

Required when using the jsonPath credentials mode.

When using this property, be sure to mount the specified file or directory.

Hashicorp Vault properties

The following properties are grouped in the Vault section of the file. Configure the properties as needed:

Vault Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the Vault credential store.

Do not change the default value.

credentialStore.<cstore ID>.config.pathKey.separator Optional. Separator to use in the name argument that credential functions use.

Use the following format for the name argument:

<path><separator><key>
For example, if you keep the default ampersand (&), the format for the name argument is:
<path>&<key>
credentialStore.<cstore ID>.config.addr Required. Vault server URL entered in the following format:
https://<host name>:<port number>

Use HTTPS to avoid unencrypted communication.

credentialStore.<cstore ID>.config.authMethod Required. Authentication method that Data Collector uses to authenticate with Vault.
Specify one of the following authentication methods:
  • appId
  • appRole
  • azure
  • google
Important: The App ID authentication backend has been deprecated by Hashicorp and will be removed in a future release. As a result, do not use App ID authentication for new installations.

Default is appRole.

credentialStore.<cstore ID>.config.role.id Required for App Role authentication. Vault Role ID that Data Collector uses to authenticate with Vault. The Role ID is configured within Vault by your Vault administrator.

The Data Collector Vault integration relies on Vault's App Role authentication backend.

credentialStore.<cstore ID>.config.secret.id Required for App Role authentication. Vault Secret ID that Data Collector uses to authenticate with Vault. The Secret ID is configured within Vault by your Vault administrator.

To protect the Secret ID, store the Secret ID in an external location and then use a function to retrieve the Secret ID.

By default, the property uses the file function to retrieve the Secret ID from the vault-secret-id file in the Data Collector configuration directory.

credentialStore.<cstore ID>.config.azure.role Required for Azure authentication. Name of the Vault role defined for Data Collector.

credentialStore.<cstore ID>.config.google.role

Name of the Vault role defined for Data Collector. Specify either a role or audience for Google Cloud Platform authentication.

credentialStore.<cstore ID>.config.google.audience

Audience value that Data Collector uses to authenticate with Vault. Specify either a role or audience for Google Cloud Platform authentication.
credentialStore.<cstore ID>.config.azure.subscriptionId Required for Azure authentication. Subscription ID of the Azure subscription where Data Collector is hosted.
credentialStore.<cstore ID>.config.azure.resourceGroupName Required for Azure authentication. Name of the resource group defined in the Vault role for Data Collector.
credentialStore.<cstore ID>.config.azure.vmName Required for Azure authentication. Name of the Azure VM where Data Collector is running.
credentialStore.<cstore ID>.config.azure.resource Required for Azure authentication. Name of the resource defined in the Azure authentication configuration.
credentialStore.<cstore ID>.config.app.id

Deprecated. App ID for App ID authentication.

Important: The App ID authentication backend has been deprecated by Hashicorp and will be removed in a future release. As a result, do not configure this property for new installations.

credentialStore.<cstore ID>.config.version

Vault Secret Engine version number.

Data Collector supports versions 1 and 2.

Default is 1.

credentialStore.<cstore ID>.config.mountPoint

Mount point for Vault authentication.

If this property is not configured, Data Collector uses the default mount point for the authentication method.

credentialStore.<cstore ID>.config.lease.renewal.interval.sec Optional. Seconds to wait before checking for leases that need renewal.

Default is 60.

credentialStore.<cstore ID>.config.lease.expiration.buffer.sec Optional. Buffer for expiring leases. Data Collector renews leases that expire in less than the specified number of seconds.

Default is 120.

credentialStore.<cstore ID>.config.open.timeout Optional. Timeout to establish an HTTP connection to Vault in milliseconds.

Default is 0 for no limit.

credentialStore.<cstore ID>.config.proxy.address Optional. Proxy URL. Configure to use a proxy to access Vault.
credentialStore.<cstore ID>.config.proxy.port Optional. Proxy port. Configure to use a proxy to access Vault.
credentialStore.<cstore ID>.config.proxy.username Optional. Proxy username. Configure to use a proxy to access Vault.
credentialStore.<cstore ID>.config.proxy.password Optional. Proxy password. Configure to use a proxy to access Vault.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.read.timeout Optional. Milliseconds to wait for data before timing out.

Default is 0 for no limit.

credentialStore.<cstore ID>.config.ssl.enabled.protocols Optional. SSL/TLS-enabled protocols. Versions TLSv1.2 or later are recommended.

Default is TLSv1.2,TLSv1.3.

credentialStore.<cstore ID>.config.ssl.truststore.file Optional. Path to a Java truststore file. Required when using a private CA or certificates not trusted by the Java default truststore.

When using this property, be sure to mount the specified file or directory.

credentialStore.<cstore ID>.config.ssl.truststore.password Optional. Password for the truststore file.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.ssl.verify Optional. Whether to verify that the Vault server hostname matches its certificate.

Default is true. Setting this property to false is not recommended.

credentialStore.<cstore ID>.config.ssl.timeout Optional. Timeout for the SSL/TLS handshake in milliseconds.

Default is 0 for no limit.

credentialStore.<cstore ID>.config.timeout Optional. Timeout to read from Vault in milliseconds, after a connection has been established.

Default is 0 for no limit.

credentialStore.<cstore ID>.config.enforceEntryGroup This property is ignored at this time.
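The interaction of lease.renewal.interval.sec and lease.expiration.buffer.sec can be sketched as a periodic check that selects leases close to expiry. The Python function below is a hypothetical illustration under the assumption that each lease exposes its remaining lifetime in seconds; the lease IDs are made up and the function is not a Data Collector or Vault API.

```python
from typing import Dict, List


def leases_to_renew(seconds_until_expiry: Dict[str, float],
                    expiration_buffer_sec: float = 120) -> List[str]:
    """Return the IDs of leases that expire in less than the buffer.

    Intended to be evaluated once per renewal interval (60 seconds by default).
    """
    return sorted(
        lease_id
        for lease_id, remaining in seconds_until_expiry.items()
        if remaining < expiration_buffer_sec
    )


# Made-up leases: only the one with 30 seconds left falls inside the default buffer.
print(leases_to_renew({"db-creds": 30, "api-token": 120, "queue-creds": 500}))
```

Keeping the buffer larger than the renewal interval gives each lease at least one check before it expires.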

Java Keystore properties

The following properties are grouped in the Java Keystore section of the file. Configure the properties as needed:

Java Keystore Property Description
credentialStore.<cstore ID>.def Defines the implementation of the Java Keystore credential store.

Do not change the default value.

credentialStore.<cstore ID>.config.keystore.type Format of the Java keystore file:
  • JCEKS
  • PKCS12

Default is PKCS12.

credentialStore.<cstore ID>.config.keystore.file Path and name of the Java keystore file. Enter an absolute path to the file, or a path relative to the Data Collector configuration directory.

Default is jks-credentialStore.pkcs12.

When using this property, be sure to mount the specified file or directory.

credentialStore.<cstore ID>.config.keystore.storePassword Password that Data Collector uses to access the Java keystore file.

You must change the default value before using the keystore file.

To protect the password, store the password in an external location and then use a function to retrieve the password.

credentialStore.<cstore ID>.config.keystore.file.min.refresh.millis Milliseconds that Data Collector waits before reloading the keystore file.

Default is 10000, or ten seconds.

Step 4. Mount the credential store properties file

About this task

To enable Data Collector to use the credential store properties file, edit the StreamSets environment to customize the engine run command. Add a mount option to the command, then run the customized command to restart the engine.

Procedure

  1. If the engine is running, stop the engine.
    1. Determine the container ID for the engine:
      <docker|podman> ps

      For example, use the following command for Docker: docker ps

    2. Copy the ID of the container that you want to update.
    3. Stop the engine:
      <docker|podman> stop <container_id>
  2. On the Manage tab of your project, click the StreamSets tool.
  3. For the environment, click Options > Edit environment.
  4. Expand the Advanced configuration section.
  5. In the Docker command options section, click Add value.
  6. Add the following option and specify the path to the credential stores properties file:
    --mount type=bind,source=<path to credential-stores.properties>,target=/etc/sdc/credential-stores.properties,readonly
    Important: Make sure that all files exist in the expected locations.
    For example, the following option provides "${HOME}"/sdc/credential-stores.properties as the location of the credential stores properties file:
    --mount type=bind,source="${HOME}"/sdc/credential-stores.properties,target=/etc/sdc/credential-stores.properties,readonly

    If you need more mount options for credential store properties, add them as well.

  7. Save your changes.
  8. For the environment, click Options > Get run command, and then copy the command.

    Notice that the copied command includes your customization.

  9. Run the customized engine command.
    The command starts a Data Collector engine container that includes the credential store properties file and any other files or paths that you mounted.
    Important: Review Additional requirements and details for more requirements before calling secrets from stages in your flows.
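If you assemble several mount options, generating them from one helper keeps the format consistent. The following Python sketch builds option strings in the same shape as the examples in this procedure. The function bind_mount_option and the sample source paths are hypothetical; only the --mount syntax and the target paths come from this topic.

```python
def bind_mount_option(source: str, target: str, readonly: bool = True) -> str:
    """Return a --mount option for a bind mount, optionally read-only."""
    option = f"--mount type=bind,source={source},target={target}"
    return option + ",readonly" if readonly else option


# Made-up local paths; the targets match the ones used in this topic.
options = [
    bind_mount_option("/home/user/sdc/credential-stores.properties",
                      "/etc/sdc/credential-stores.properties"),
    bind_mount_option("/home/user/sdc/gcp-creds.json",
                      "/etc/sdc/gcp-creds.json"),
]
for option in options:
    print(option)
```

Each printed line can be added as a separate value in the Docker command options section.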

Additional requirements and details

In addition to the general credential store configuration steps, note the following details for specific credential store types:

Azure Key Vault prerequisites

Before Data Collector can connect to the Microsoft Azure Key Vault credential store system, you must complete the following prerequisite tasks:

Register Data Collector with Azure Active Directory
Use the Azure portal to register Data Collector as an application in Azure Active Directory. When an application such as Data Collector accesses keys or secrets in an Azure key vault, the application must use an authentication token from Azure Active Directory.
The registration process assigns Data Collector the following values, which you specify when you configure credential store properties:
  • Application ID
  • Authentication key
For more information about registering applications in Azure Active Directory, see the Azure Key Vault documentation.
Authorize Data Collector to use keys in the Azure key vault
Use the Azure portal to authorize Data Collector to use the keys, or secrets, in the Azure key vault. Azure Key Vault requires that applications be authorized to access each key vault.
For information about authorizing applications to use keys, see the Azure Key Vault documentation.
Set the AZURE_AUTHORITY_HOST environment variable, as needed
Set the AZURE_AUTHORITY_HOST environment variable on the Data Collector workstation if you do not connect to the default Azure public cloud: https://login.microsoftonline.com.
When you set the environment variable, you specify the cloud environment endpoint that you want Data Collector to connect to, such as an Azure government cloud or a private cloud.
Use the following command to set the environment variable:
export AZURE_AUTHORITY_HOST=<Azure_endpoint>
For example, to connect to Azure China you might use the following command:
export AZURE_AUTHORITY_HOST=https://login.chinacloudapi.cn/
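The fallback behavior described for AZURE_AUTHORITY_HOST can be sketched in Python: use the variable when it is set, otherwise use the default Azure public cloud endpoint. The function resolve_authority_host is hypothetical and only illustrates the rule stated here; it does not show how Data Collector or the Azure libraries read the variable.

```python
import os
from typing import Mapping

# Default Azure public cloud endpoint, as stated in this topic.
DEFAULT_AUTHORITY_HOST = "https://login.microsoftonline.com"


def resolve_authority_host(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured authority host, or the public cloud default."""
    value = env.get("AZURE_AUTHORITY_HOST", "").strip()
    return value or DEFAULT_AUTHORITY_HOST


# With the Azure China value from the example command.
print(resolve_authority_host({"AZURE_AUTHORITY_HOST": "https://login.chinacloudapi.cn/"}))
```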

Google authentication requirement

Data Collector must authenticate with Google Secret Manager using Google credentials.

When you configure the credential store properties, you configure Data Collector to use one of the following credential modes:

Default
Data Collector authenticates with Google Secret Manager using the credentials file defined in the GOOGLE_APPLICATION_CREDENTIALS environment variable.
Set the environment variable on the Data Collector machine.

For more information about using default credentials, see the Google Cloud documentation.

JSON
Data Collector authenticates with Google Secret Manager using JSON-formatted credential information specified in the credential store configuration properties. You copy the JSON content from a Google Cloud service account credentials file.
Enter the JSON content in plain text. If the content includes multiple lines of text, add a backslash (\) at the end of each line.
JSON path
Data Collector authenticates with Google Secret Manager using a Google Cloud service account credentials file stored on the Data Collector machine.

Enter the path to the file in the credential store configuration properties. Enter a path relative to the Data Collector resources directory or enter an absolute path.

For information about generating a service account credential file, see the Google Cloud Platform documentation.
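For the JSON credentials mode, the backslash rule for multi-line content can be applied with a short script instead of by hand. The Python sketch below is a hypothetical helper, and the sample JSON is made up and far shorter than a real service account credentials file. It assumes that the final line needs no trailing backslash because nothing follows it.

```python
def to_properties_value(text: str) -> str:
    """Append a backslash to every line except the last.

    The trailing backslash marks the next line as a continuation of the
    same property value.
    """
    lines = text.strip().splitlines()
    last = len(lines) - 1
    return "\n".join(
        line if index == last else line + "\\"
        for index, line in enumerate(lines)
    )


# Made-up sample content, not a real credentials file.
sample = """{
  "type": "service_account",
  "project_id": "example-project"
}"""
print(to_properties_value(sample))
```

Paste the resulting text as the value of the credentialsJson property.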

Adding secrets to Java Keystore (Java only)

Use the stagelib-cli jks-credentialstore command to add secrets to the Java keystore file. You can add multiple secrets to the file.

Use the command from the Data Collector installation directory as follows:
bin/streamsets stagelib-cli jks-credentialstore add -i <cstore ID> -n <secret name> -c <secret value>
For example, the following command adds a secret named OracleDBPassword with the value 278yT6u to the devjks Java keystore credential store:
bin/streamsets stagelib-cli jks-credentialstore add -i devjks -n OracleDBPassword -c 278yT6u
Note: The stagelib-cli jks-credentialstore command also includes delete and list subcommands that you use to manage the secrets defined in the keystore file. For information on using these commands, see jks-credentialstore command reference (Java only).

jks-credentialstore command reference (Java only)

The stagelib-cli jks-credentialstore command provides subcommands to add, list, and delete secrets in the Java keystore credential store.

Any changes made to the Java keystore file take effect immediately. For example, if you change the value of an existing secret in the file, running flows that require a new connection to the external system use the updated secret.
Note: In previous releases, the jks-cs command provided the same subcommands to add, list, and delete secrets in the Java keystore credential store. However, the jks-cs command is now deprecated and will be removed in a future release.
You can use the following subcommands with the stagelib-cli jks-credentialstore command:
add
Adds a secret to the Java keystore credential store.
Use the command from the Data Collector installation directory as follows:
bin/streamsets stagelib-cli jks-credentialstore add \
(-i <cstore ID> | --id <cstore ID>) \
(-n <secret name> | --name <secret name>) \
(-c <secret value> | --credential <secret value>)
Add Option Description
-i <cstore ID> or --id <cstore ID> Required. Unique ID for the credential store.

The default ID for a Java keystore is jks.

-n <secret name> or --name <secret name> Required. Name of the secret to add to the Java keystore credential store.

If the name includes non-alphanumeric characters, use single quotation marks around the name.

-c <secret value> or --credential <secret value> Required. Value to add to the Java keystore credential store.

If the value includes non-alphanumeric characters, use single quotation marks around the value.

For example, the following command adds a secret named OracleDBPassword with the value df35yT_&5 to the devjks Java keystore credential store:

bin/streamsets stagelib-cli jks-credentialstore add -i devjks -n OracleDBPassword -c 'df35yT_&5'
delete
Deletes a secret from the Java keystore credential store.
Use the command from the Data Collector installation directory as follows:
bin/streamsets stagelib-cli jks-credentialstore delete \
(-i <cstore ID> | --id <cstore ID>) \
(-n <secret name> | --name <secret name>)
Delete Option Description
-i <cstore ID> or --id <cstore ID> Required. Unique ID for the credential store.

The default ID for a Java keystore is jks.

-n <secret name> or --name <secret name> Required. Name of the secret to delete from the Java keystore credential store.

If the name includes non-alphanumeric characters, use single quotation marks around the name.

For example, the following command deletes a secret named SQLServerDBPassword from the devjks Java keystore credential store:
bin/streamsets stagelib-cli jks-credentialstore delete -i devjks -n SQLServerDBPassword
list
Lists the names of all secrets defined in the Java keystore credential store. The command does not list the values.
Use the command from the Data Collector installation directory as follows:
bin/streamsets stagelib-cli jks-credentialstore list \
(-i <cstore ID> | --id <cstore ID>)
List Option Description
-i <cstore ID> or --id <cstore ID> Required. Unique ID for the credential store.

The default ID for a Java keystore is jks.

For example, the following command lists the names of all secrets defined in the devjks Java keystore credential store:
bin/streamsets stagelib-cli jks-credentialstore list -i devjks
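When you script these subcommands, building the command as an argument list avoids quoting mistakes with non-alphanumeric secret values. The Python sketch below assembles the documented add, delete, and list invocations. The function jks_command is hypothetical; the command path, subcommands, and short options are the ones listed in this reference.

```python
import shlex
from typing import List, Optional


def jks_command(subcommand: str, cstore_id: str,
                name: Optional[str] = None,
                value: Optional[str] = None) -> List[str]:
    """Return the argument list for a jks-credentialstore subcommand."""
    if subcommand not in ("add", "delete", "list"):
        raise ValueError(f"unknown subcommand: {subcommand}")
    argv = ["bin/streamsets", "stagelib-cli", "jks-credentialstore",
            subcommand, "-i", cstore_id]
    if subcommand in ("add", "delete"):
        if name is None:
            raise ValueError("a secret name is required")
        argv += ["-n", name]
    if subcommand == "add":
        if value is None:
            raise ValueError("a secret value is required")
        argv += ["-c", value]
    return argv


# shlex.join shows the equivalent shell command with quoting applied.
print(shlex.join(jks_command("add", "devjks", "OracleDBPassword", "df35yT_&5")))
```

The list can be passed to subprocess.run with the working directory set to the Data Collector installation directory, so that no shell quoting is needed.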

Calling secrets from the flow

Specify credential functions in stage properties to retrieve keys or secrets from a credential store.

You can use credential functions in any stage property that displays an eye icon. For example:

[Image: Stage properties with eye icons.]

Important: When you use a credential function in a stage property, the function must be the only value defined in the property.

For details about credential functions, see Credential functions.