Amazon S3 connection

To access your data in Amazon S3, create a connection asset for it.

Amazon S3 (Amazon Simple Storage Service) is a service by Amazon Web Services (AWS) that provides object storage through a web service interface.

For other types of S3-compliant connections, you can use the Generic S3 connection.

Create a connection to Amazon S3

To create the connection asset, you need these connection details based on your deployment:

Common connectivity

Bucket: The name of the bucket that contains the files. If your AWS credentials have permission to list buckets and to access all buckets, you need to supply only the credentials. If your credentials cannot list buckets and can access only a particular bucket, you must specify that bucket.
Region: Amazon Web Services (AWS) region. If you specify an Endpoint URL that is not for the AWS default region (us-west-2), enter a value for Region.
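The Bucket and Region rules above can be sketched as a small pre-flight check. This is only an illustration: ConnectionDetails, endpoint_region, and missing_fields are hypothetical names, not part of the product, and the host pattern assumes the common regional endpoint form s3.<region>.amazonaws.com. Any endpoint that does not match that form is treated conservatively as needing an explicit Region.

```python
import re
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse

DEFAULT_REGION = "us-west-2"  # default region named in the text above

# Assumption: regional S3 endpoint hosts look like s3.<region>.amazonaws.com.
_REGIONAL_HOST = re.compile(r"^s3[.-]([a-z0-9-]+)\.amazonaws\.com$")


@dataclass
class ConnectionDetails:
    """Hypothetical container for the connection fields (not a product API)."""
    bucket: Optional[str] = None
    region: Optional[str] = None
    endpoint_url: Optional[str] = None


def endpoint_region(endpoint_url: str) -> Optional[str]:
    """Return the region embedded in a regional S3 endpoint host, if any."""
    host = urlparse(endpoint_url).hostname or ""
    match = _REGIONAL_HOST.match(host)
    return match.group(1) if match else None


def missing_fields(details: ConnectionDetails, can_list_buckets: bool) -> List[str]:
    """List the fields that the rules above would require but that are empty."""
    missing = []
    # Credentials that cannot list buckets need an explicit bucket name.
    if not can_list_buckets and not details.bucket:
        missing.append("bucket")
    # An endpoint outside the default region needs an explicit region.
    if details.endpoint_url and not details.region:
        if endpoint_region(details.endpoint_url) != DEFAULT_REGION:
            missing.append("region")
    return missing
```

For example, calling missing_fields with only an endpoint URL for eu-central-1 and credentials that cannot list buckets reports both the bucket and the region as missing.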

Select Server proxy to access the Amazon S3 data source through a proxy server. Depending on its setup, a proxy server can provide load balancing, increased security, and privacy. The proxy server settings are independent of the authentication credentials and the personal or shared credentials selection. The proxy server settings cannot be stored in a vault.

Proxy host: The proxy URL. For example, https://proxy.example.com.
Proxy port number: The port number to connect to the proxy server. For example, 8080 or 8443.
The Proxy username and Proxy password fields are optional.
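To show how the four proxy fields relate to each other, the following sketch combines them into a single proxy URL of the kind that many HTTP clients accept. The function name build_proxy_url is hypothetical, and how the platform itself applies these settings is not described here, so treat this as an illustration of the field semantics only.

```python
from typing import Optional
from urllib.parse import quote, urlsplit


def build_proxy_url(host: str, port: int,
                    username: Optional[str] = None,
                    password: Optional[str] = None) -> str:
    """Combine Proxy host, port, and optional credentials into one URL."""
    # Accept a bare host name as well as a full URL such as https://proxy.example.com.
    parts = urlsplit(host if "://" in host else "http://" + host)
    if not parts.hostname:
        raise ValueError("Proxy host must contain a host name")
    if not 1 <= port <= 65535:
        raise ValueError("Proxy port number must be between 1 and 65535")

    auth = ""
    if username:
        # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
        auth = quote(username, safe="")
        if password:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{parts.scheme}://{auth}{parts.hostname}:{port}"
```

With the example values from the field descriptions, build_proxy_url("https://proxy.example.com", 8443) yields a URL that carries the same scheme, host, and port. The user name and password are added only when they are supplied.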

Credentials

Choose your authentication methods based on your deployment:

Common connectivity

Choose the authentication method:

Basic credentials

Access key: The access key ID (username) for authorizing access to AWS.
Secret key: The password that is associated with the access key ID for authorizing access to AWS.

Temporary credentials

Access key: The access key ID (username) for authorizing access to AWS.
Secret key: The password that is associated with the access key ID for authorizing access to AWS.
Session token: The session token for the temporary credential.

Trusted role credentials

Access key: The access key ID (username) for authorizing access to AWS.
Secret key: The password that is associated with the access key ID for authorizing access to AWS.
Role ARN: The Amazon Resource Name (ARN) of the role that the connection assumes.
Role session name: A name to identify the session for S3 administrators. For example, you can use your IAM name.
External ID: The external ID that the organization must supply to use the role.
Duration seconds: The duration in seconds of the temporary security credentials.

For Credentials, you can use secrets if a vault is configured for the platform and the service supports vaults. For information, see Using secrets from vaults in connections.

For setup instructions for the Amazon S3 account owner, see Setting up temporary credentials or a Role ARN for Amazon S3.
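The Trusted role credentials fields correspond closely to the request parameters of the AWS STS AssumeRole operation (RoleArn, RoleSessionName, ExternalId, DurationSeconds). The sketch below builds and checks such a parameter set. The function name assume_role_params is hypothetical, the limits are the ones that AWS documents for AssumeRole as far as this example assumes them (900 to 43200 seconds, session names of 2 to 64 characters), and whether the connection calls STS in exactly this way is not stated in this topic.

```python
import re
from typing import Any, Dict, Optional

# Assumed limits, taken from the AWS STS AssumeRole parameter descriptions.
_ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")
_SESSION_NAME = re.compile(r"^[\w+=,.@-]{2,64}$", re.ASCII)
MIN_DURATION, MAX_DURATION = 900, 43200


def assume_role_params(role_arn: str,
                       role_session_name: str,
                       external_id: Optional[str] = None,
                       duration_seconds: int = 3600) -> Dict[str, Any]:
    """Map the connection fields to AssumeRole-style request parameters."""
    if not _ROLE_ARN.match(role_arn):
        raise ValueError("Role ARN does not look like an IAM role ARN")
    if not _SESSION_NAME.match(role_session_name):
        raise ValueError("Role session name has invalid characters or length")
    if not MIN_DURATION <= duration_seconds <= MAX_DURATION:
        raise ValueError("Duration seconds is outside the assumed 900-43200 range")

    params: Dict[str, Any] = {
        "RoleArn": role_arn,
        "RoleSessionName": role_session_name,
        "DurationSeconds": duration_seconds,
    }
    # External ID is sent only when the role's trust policy asks for one.
    if external_id:
        params["ExternalId"] = external_id
    return params
```

If you use the AWS SDK for Python, a dictionary like this could be passed to the STS client's assume_role call. The credentials that STS returns consist of an access key, a secret key, and a session token, which are the same three values that the Temporary credentials method asks for.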

Federal Information Processing Standards (FIPS) compliance

This connection is FIPS-compliant and can be used on a FIPS-enabled cluster.

Amazon S3 setup

See the Amazon Simple Storage Service User Guide for the setup steps.

Restriction

Folder names cannot contain the slash symbol (/) because the slash is the delimiter for the file structure.
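A short sketch of why the restriction exists: because the slash is the delimiter, any slash in a path is read as a folder boundary. The helper split_key is a hypothetical name used only for illustration.

```python
from typing import List, Tuple


def split_key(key: str) -> Tuple[List[str], str]:
    """Split a path into its folder names and the final file name."""
    *folders, name = key.split("/")
    return folders, name


# A folder intended to be called "Q1/Q2" is read as two nested folders.
folders, name = split_key("reports/Q1/Q2/data.csv")
print(folders)  # ['reports', 'Q1', 'Q2']
print(name)     # data.csv
```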

Supported file types

The Amazon S3 connection supports structured and unstructured file formats.

The connection supports the following structured file types: Avro, CSV, Delimited text, Excel, JSON, ORC, Parquet, SAS, SAV, SHP, and XML.

The connection supports modes for reading and writing binary data. You can use these modes to read and write unstructured data formats such as DOC, DOCX, MD, PDF, PPT, PPTX, and TXT.

Table formats

In addition to Flat file, the Amazon S3 connection supports these Data Lake table formats: Delta Lake and Iceberg.

Learn more

Related connection: Generic S3 connection