Identical data assets
To consistently govern connected data assets that represent the same physical resource (identical data assets) across multiple governed catalogs and select projects, you need an accurate and consistent view of the specific set of asset properties (shared properties) that such assets reference. When shared properties or their values are updated, the changes are immediately visible on all identical data assets across the specified workspaces. Depending on your asset role, you might be able to view or edit the shared properties.
Data assets in projects and data products are identified as identical data assets if they were added to the project from a governed catalog.
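As a rough mental model only, identical data assets can be pictured as separate objects that all hold a reference to one shared properties record, so a change made through any of them is visible through the others. The sketch below is illustrative and not the product's implementation; the class names, the workspace names, and the `description` property are assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SharedPropertiesRecord:
    """Hypothetical single record that holds the shared properties."""
    values: dict = field(default_factory=dict)


@dataclass
class DataAsset:
    """Hypothetical connected data asset in one workspace."""
    name: str
    workspace: str
    shared: SharedPropertiesRecord  # a reference to the record, not a copy


# One record, referenced by two assets in different workspaces.
record = SharedPropertiesRecord({"description": "Customer table"})
in_catalog = DataAsset("CUSTOMERS", "governed-catalog", record)
in_project = DataAsset("CUSTOMERS", "project-a", record)

# An update made through the catalog asset is visible through the
# project asset, because both point at the same record object.
in_catalog.shared.values["description"] = "Customer master table"
print(in_project.shared.values["description"])
```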
Limitations
- Only connected data assets in governed catalogs that enforce the usage of shared properties across workspaces can be recognized as identical data assets and reference the same shared properties record.
- For existing governed catalogs, all connected data assets with the same resource or identity key automatically reference a shared properties record.
- For new governed catalogs, you must specify that you want to enforce shared properties across workspaces when you create a new catalog. You can't change this setting after the catalog is created.
The shared properties option isn't available for the following assets:
- Data assets in ungoverned catalogs.
- Data assets created from locally uploaded files.
- SQL Query assets (data assets with the Query subtype).
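The limitations above can be read as a single eligibility check. The following is a minimal sketch under that reading; `AssetInfo`, its field names, and `can_use_shared_properties` are hypothetical and don't correspond to any product API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetInfo:
    """Hypothetical summary of a data asset; all field names are illustrative."""
    is_connected: bool                        # connected data asset
    in_governed_catalog: bool                 # False for ungoverned catalogs
    catalog_enforces_shared_properties: bool  # catalog-level setting
    from_local_upload: bool                   # created from a locally uploaded file
    subtype: str = ""                         # "Query" marks an SQL Query asset


def can_use_shared_properties(asset: AssetInfo) -> bool:
    """Return True only if none of the listed limitations applies."""
    if not asset.in_governed_catalog:
        return False
    if not asset.catalog_enforces_shared_properties:
        return False
    if asset.from_local_upload or asset.subtype == "Query":
        return False
    return asset.is_connected


table = AssetInfo(True, True, True, False)
query = AssetInfo(True, True, True, False, subtype="Query")
print(can_use_shared_properties(table), can_use_shared_properties(query))
```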
If you're working with Git-based projects, you must first check out a branch in the Git project before you can publish assets from a catalog to that project.
Data assets and unique identifiers
You can add connected data assets that represent the same physical asset residing in a remote data source to one or more catalogs, projects, and deployment spaces. As a result, the same physical asset is represented in different workspaces by multiple connected data assets. Such assets are called identical data assets. They reference the same shared properties and are assigned the same unique identifier, either an identity key or a resource key.
If a data source definition exists for the connections or data sources, identity keys are used to identify identical data assets across workspaces. If no data source definitions exist in the system, resource keys are used instead.
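The key selection described above, and the way it groups assets across workspaces, might be sketched as follows. This is an illustration only: the `ConnectedAsset` class, the function names, and the key values such as `rk-1` are hypothetical placeholders, and real keys are generated by the system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectedAsset:
    """Hypothetical connected data asset; key values are opaque placeholders."""
    name: str
    workspace: str
    resource_key: str
    identity_key: str


def unique_identifier(asset, has_data_source_definition):
    """Pick the identity key when a data source definition exists,
    otherwise fall back to the resource key."""
    if has_data_source_definition:
        return ("identity", asset.identity_key)
    return ("resource", asset.resource_key)


def group_identical(assets, has_data_source_definition):
    """Group assets by unique identifier and keep only the groups with
    more than one member, that is, the identical data assets."""
    groups = defaultdict(list)
    for asset in assets:
        key = unique_identifier(asset, has_data_source_definition)
        groups[key].append(asset)
    return {key: members for key, members in groups.items() if len(members) > 1}


assets = [
    ConnectedAsset("CUSTOMERS", "catalog-1", "rk-1", "ik-1"),
    ConnectedAsset("CUSTOMERS", "project-a", "rk-1", "ik-1"),
    ConnectedAsset("ORDERS", "catalog-1", "rk-2", "ik-2"),
]

# No data source definitions in this example, so resource keys are used.
for key, members in group_identical(assets, False).items():
    print(key, [m.workspace for m in members])
```

In this sketch the two `CUSTOMERS` assets end up in one group because they carry the same resource key, while `ORDERS` has no counterpart in another workspace and is left out.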