Profiles of assets (Watson Knowledge Catalog)
Data assets that contain textual data have profiles. The profile of a data asset includes generated metadata and statistics about the textual content of the data. You can see the profile when you open the data asset in a catalog or project and go to the asset’s Profile page. All catalog or project members can see data asset profiles.
The profile of a data asset that contains relational or structured data shows information about each column in the data set, based on the first 5000 rows of data. The profile shows the frequency of the inferred data classes and statistics about the data for each column. Data classes describe the contents of the data in the column: for example, city, account number, or credit card number. Data classes are necessary to mask data with policies. The data classes appear for each column on the asset’s Overview page as well as on the Profile page.
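As a rough illustration of column-based profiling, the following Python sketch reads at most the first 5000 rows of a tabular data set and counts, for each column, how often each inferred data class occurs. It is not the service's implementation. The names `profile_columns`, `infer_class`, and `CLASSIFIERS`, and the two regular-expression classifiers, are hypothetical and stand in for the much richer set of data classes that the product provides.

```python
import csv
import io
import re
from collections import Counter

ROW_LIMIT = 5000  # the profile is based on the first 5000 rows of data

# Hypothetical, highly simplified classifiers (assumption for illustration only).
# Order matters: the first matching pattern wins.
CLASSIFIERS = {
    "credit card number": re.compile(r"^\d{4}([ -]?\d{4}){3}$"),
    "integer": re.compile(r"^-?\d+$"),
}


def infer_class(value):
    """Return the name of the first classifier that matches, else 'text'."""
    for name, pattern in CLASSIFIERS.items():
        if pattern.match(value):
            return name
    return "text"


def profile_columns(rows, row_limit=ROW_LIMIT):
    """Build a per-column profile from an iterable of dict rows.

    Only the first `row_limit` rows are examined.
    """
    working = {}
    for index, row in enumerate(rows):
        if index >= row_limit:
            break
        for column, value in row.items():
            stats = working.setdefault(
                column, {"rows": 0, "classes": Counter(), "distinct": set()}
            )
            stats["rows"] += 1
            stats["classes"][infer_class(value)] += 1
            stats["distinct"].add(value)

    # Summarize: class frequencies, the most frequent class, and basic statistics.
    return {
        column: {
            "rows": stats["rows"],
            "distinct_values": len(stats["distinct"]),
            "class_frequency": dict(stats["classes"]),
            "inferred_class": stats["classes"].most_common(1)[0][0],
        }
        for column, stats in working.items()
    }


# Small in-memory sample standing in for a structured data file.
sample = io.StringIO(
    "city,card\n"
    "Boston,4111 1111 1111 1111\n"
    "Austin,5500-0000-0000-0004\n"
)
result = profile_columns(csv.DictReader(sample))
```

In this sketch, `result["card"]["inferred_class"]` is the class that matched most often in the `card` column, which loosely mirrors how the profile reports the frequency of inferred data classes for each column.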
These types of relational and structured data are profiled by column:
- Data assets from a connection to the data sources listed here.
- Partitioned data assets that consist of partitioned files in a folder of the local file system.
- Data assets from files in a folder of your local file system with these formats:
However, structured data files are not profiled when data assets do not explicitly reference them, such as in these circumstances:
- The files are within a folder asset. Files that are accessible from a folder asset are not treated as assets and are not profiled.
- The files are within an archive file. The data asset references the archive file, not the compressed files inside it, so the compressed files are not profiled.
In governed catalogs, profiles for data assets are created by default.
In projects and in catalogs without data protection rule enforcement, you must manually create profiles for data assets.