Amazon S3 connection
To access your data in Amazon S3, create a connection asset for it.
Amazon S3 (Amazon Simple Storage Service) is an object storage service from Amazon Web Services (AWS) that is accessed through a web service interface.
For other types of S3-compliant connections, you can use the Generic S3 connection.
Create a connection to Amazon S3
To create the connection asset, you need these connection details:
- Bucket name that contains the files
- Endpoint URL: Include the region code. For example, https://s3.<region-code>.amazonaws.com. For the list of region codes, see AWS service endpoints.
- Region: Amazon Web Services (AWS) region. If you specify an Endpoint URL that is not for the AWS default region, then you should also enter a value for Region.
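As a rough illustration of the Endpoint URL format described above, the following Python sketch builds the URL from a region code. The function name `s3_endpoint_url` and the loose format check are assumptions made for this example; they are not part of the connection itself, and the sketch covers only the URL form shown in this topic.

```python
import re

# Loose shape check for region codes such as "us-east-1" or "eu-west-2".
# This pattern is an assumption for illustration, not an official AWS rule.
_REGION_CODE = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def s3_endpoint_url(region_code: str) -> str:
    """Return an endpoint URL in the form https://s3.<region-code>.amazonaws.com."""
    code = region_code.strip().lower()
    if not _REGION_CODE.match(code):
        raise ValueError(f"Unexpected region code format: {region_code!r}")
    return f"https://s3.{code}.amazonaws.com"


if __name__ == "__main__":
    print(s3_endpoint_url("us-east-1"))
```

If the resulting URL is not for the AWS default region, the same region code would also go in the Region field.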
Credentials
At minimum, the credentials are the combination of Access key and Secret key.
For Credentials, you can use secrets if a vault is configured for the platform and the service supports vaults. For information, see Using secrets from vaults in connections.
If the Amazon S3 account owner has set up temporary credentials or a Role ARN (Amazon Resource Name), enter the values provided by the Amazon S3 account owner for the applicable authentication combination:
- Access key, Secret key, and Session token
- Access key, Secret key, Role ARN, Role session name, and optional Duration seconds
- Access key, Secret key, Role ARN, Role session name, External ID, and optional Duration seconds
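The credential combinations above can be checked before you fill in the connection form. The following Python sketch is one way to do that, under stated assumptions: the field names (`access_key`, `role_arn`, and so on) and the function `match_combination` are hypothetical labels chosen for this example and do not correspond to any platform API.

```python
from typing import Any, Dict, Optional

# Hypothetical field names; the connection form uses the labels listed above.
_BASE = {"access_key", "secret_key"}
_OPTIONAL_DURATION = {"duration_seconds"}

# Each entry: (description, required fields, optional fields).
COMBINATIONS = [
    ("access key and secret key", _BASE, set()),
    ("temporary credentials", _BASE | {"session_token"}, set()),
    ("role ARN", _BASE | {"role_arn", "role_session_name"}, _OPTIONAL_DURATION),
    (
        "role ARN with external ID",
        _BASE | {"role_arn", "role_session_name", "external_id"},
        _OPTIONAL_DURATION,
    ),
]


def match_combination(fields: Dict[str, Any]) -> Optional[str]:
    """Return the matching combination's description, or None if no combination fits."""
    # Treat None and empty strings as "not provided".
    provided = {name for name, value in fields.items() if value not in (None, "")}
    for description, required, optional in COMBINATIONS:
        if required <= provided <= (required | optional):
            return description
    return None
```

For example, supplying a Role ARN without a Role session name matches none of the combinations, so the function returns `None`.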
For setup instructions for the Amazon S3 account owner, see Setting up temporary credentials or a Role ARN for Amazon S3.
Choose the method for creating a connection based on where you are in the platform
In a project
Click Add to project > Connection. See Adding a connection to a project.
In a catalog
Click Add to catalog > Connection. See Adding a connection asset to a catalog.
In a deployment space
Click Add to space > Connection. See Adding connections to a deployment space.
In the Platform assets catalog
Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use Amazon S3 connections in the following workspaces and tools:
Analytics projects
- Data Refinery (Watson Studio or Watson Knowledge Catalog)
- DataStage (DataStage service)
- Decision Optimization (Watson Studio and Watson Machine Learning)
- Metadata import (Watson Knowledge Catalog). You must create the Amazon S3 connection in a project, and then select it from the existing connections when you create a metadata import. You cannot create the Amazon S3 connection from within the metadata import.
- Notebooks (Watson Studio). Use the insert-to-code function to get the connection credentials and load the data into a data structure. See Load data from data source connections.
- SPSS Modeler (SPSS Modeler service)
Catalogs
- Platform assets catalog
- Other catalogs (Watson Knowledge Catalog)
Data Virtualization service
You can connect to this data source from Data Virtualization.
Amazon S3 setup
See the Amazon Simple Storage Service User Guide for the setup steps.
Supported file types
The Amazon S3 connection supports these file types: Avro, CSV, Delimited text, Excel, JSON, ORC, Parquet, SAS, SAV, SHP, and XML.
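To screen object keys in a bucket against the file types listed above, a small lookup can help. The following Python sketch is illustrative only: this topic lists type names, not file extensions, so the extension-to-type mapping and the function name `guess_file_type` are assumptions, and the connection might identify file types differently.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed mapping from common file extensions to the supported type names.
# The extensions are a guess for illustration; only the type names come
# from the list of supported file types.
EXTENSION_TO_TYPE = {
    ".avro": "Avro",
    ".csv": "CSV",
    ".txt": "Delimited text",
    ".xls": "Excel",
    ".xlsx": "Excel",
    ".json": "JSON",
    ".orc": "ORC",
    ".parquet": "Parquet",
    ".sas7bdat": "SAS",
    ".sav": "SAV",
    ".shp": "SHP",
    ".xml": "XML",
}


def guess_file_type(object_key: str) -> Optional[str]:
    """Return the assumed supported type for an S3 object key, or None if unknown."""
    suffix = PurePosixPath(object_key).suffix.lower()
    return EXTENSION_TO_TYPE.get(suffix)
```

A key such as `sales/2023/orders.parquet` maps to Parquet under this assumed mapping, while a key with an unlisted extension returns `None`.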
Learn more
Related connection: Generic S3 connection
Parent topic: Supported connections