Adding data from a connection to an analytics project (Watson Studio and Watson Knowledge Catalog)
A connected data asset is a pointer to data that is accessed through a connection to an external data source. You create a connected data asset by specifying a connection, any intermediate structures or paths, and a relational table or view, a set of partitioned data files, or a file. When you access a connected data asset, the data is dynamically retrieved from the data source.
You create a connected data asset based on a specific relational table or view, a set of partitioned data files, or a file that is accessed through the connection to the data source.
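The "pointer" idea can be illustrated with a minimal Python sketch. This is not the Watson Studio API: the `Connection` and `ConnectedDataAsset` classes, the `fetch` callable, and the in-memory `source` dictionary are all hypothetical names used only to show that the asset stores a connection and a path, and that the data is fetched from the source on every access instead of being copied into the project.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for an external data source (not a real IBM API).
source: Dict[str, List[dict]] = {"SALES/ORDERS": [{"id": 1}]}


@dataclass(frozen=True)
class Connection:
    """Hypothetical connection: knows how to fetch rows for a given path."""
    name: str
    fetch: Callable[[str], List[dict]]


@dataclass(frozen=True)
class ConnectedDataAsset:
    """Hypothetical asset: stores only a connection and a path, never the data."""
    name: str
    connection: Connection
    path: str  # intermediate structures plus the table, view, folder, or file

    def read(self) -> List[dict]:
        # Data is retrieved from the source each time the asset is accessed.
        return self.connection.fetch(self.path)


conn = Connection(name="demo-db", fetch=lambda path: list(source[path]))
asset = ConnectedDataAsset(name="Orders", connection=conn, path="SALES/ORDERS")

first = asset.read()
source["SALES/ORDERS"].append({"id": 2})  # the source changes after the asset exists
second = asset.read()                     # the next access sees the new row
```

Because the asset holds no copy of the data, a change in the source is visible on the next read without the asset being re-created.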
You can also add a folder asset that is accessed through a connection in the same way. See Add a folder asset to a project.
Partitioned data assets have previews and profiles and can be masked like relational tables. However, you cannot yet shape and cleanse partitioned data assets with the Data Refinery tool.
Partitioned data is recognized and treated like a relational table if the files meet these requirements:
- The files have a prefix of part-.
- The files are in a single folder within IBM Cloud Object Storage that contains no other files.
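The two requirements above can be checked before you add a folder, for example against a list of object keys that you have already retrieved from a bucket. The following Python sketch is only an illustration of the rule, not the check that the product performs: the function name `is_partitioned_folder` is hypothetical, and treating nested subfolders as disqualifying is an assumption.

```python
from typing import Iterable

PARTITION_PREFIX = "part-"


def is_partitioned_folder(object_keys: Iterable[str], folder: str) -> bool:
    """Return True if `folder` holds only files whose names start with 'part-'.

    `object_keys` are full object keys such as 'sales/part-0000.csv'.
    Assumption: objects in nested subfolders count as "other files".
    """
    prefix = folder.rstrip("/") + "/"
    names = []
    for key in object_keys:
        if not key.startswith(prefix):
            continue  # object lives in a different folder
        relative = key[len(prefix):]
        if relative:  # skip the zero-length folder placeholder, if any
            names.append(relative)
    if not names:
        return False  # an empty folder is not a partitioned data set
    return all(
        "/" not in name and name.startswith(PARTITION_PREFIX) for name in names
    )


keys = ["sales/part-0000.csv", "sales/part-0001.csv", "logs/part-0000.csv"]
print(is_partitioned_folder(keys, "sales"))
print(is_partitioned_folder(keys + ["sales/_SUCCESS"], "sales"))
```

In this sketch, a single stray file in the folder (such as a marker file) is enough to make the check fail, which mirrors the "contains no other files" requirement.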
You can add data and COBOL copybook assets from mainframes to catalogs in IBM Cloud Pak for Data by using a connection to Data Virtualization Manager for z/OS. The process is similar to adding these types of assets to a catalog. For more information, see Adding COBOL copybook assets.
To add a data asset from a connection to a project:
- Click Add to project > Connected data.
- Select an existing connection asset as the source of the data. If you don’t have any connection assets, return to Add to project, select Connection, and create a connection asset.
- If necessary, enter your personal credentials for locked data connections that are marked with a key icon. This is a one-time step that permanently unlocks the connection for you. After you have unlocked the connection, the key icon is no longer displayed. See Adding connections to projects.
- Select the data you want and click Select. For partitioned data, select the folder that contains the files. If the files are recognized as partitioned data, you see the message: This folder contains a partitioned data set.
- Add a name and description.
- Click Create. The asset appears on the project Assets page.
When you click the asset name, you can see this information about the connected data asset:
- The asset name and description
- The tags for the asset
- The name of the person who created the asset
- The size of the data
- The date when the asset was added to the project
- The date when the asset was last modified
- A preview of relational data
- A profile of relational data
Next steps
- Refine the data
- Analyze the data with notebooks
- Analyze the data with models
- Publish the data asset to a catalog