Data classes

Data classes describe the type of data contained in the fields or columns of a data asset, for example, city, phone number, or credit card number. IBM Knowledge Catalog provides a set of predefined data classes.

Data classes are algorithms that help organizations assign business terms to data elements based on the syntax of the data. They are used during metadata enrichment to increase the accuracy of business term assignment recommendations, and they can be seen as a syntactic counterpart to the semantic business terms.
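As a rough illustration of what classifying by syntax means, the following Python sketch matches column values against regular expressions and assigns a class when enough values match. It is not how IBM Knowledge Catalog implements data classes. The class names, the patterns, the `classify_column` function, and the 0.8 threshold are all assumptions made for this example.

```python
import re

# Hypothetical data classes: each name maps to a pattern that describes
# the syntax of a valid value. These are illustrative, not product definitions.
DATA_CLASS_PATTERNS = {
    "US Phone Number": re.compile(r"\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}"),
    "Email Address": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def classify_column(values, threshold=0.8):
    """Return (class_name, match_ratio) for the best-matching class,
    or None if no class reaches the (assumed) threshold."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None

    best = None
    for name, pattern in DATA_CLASS_PATTERNS.items():
        # Count values whose whole content matches the class pattern.
        matches = sum(1 for v in non_empty if pattern.fullmatch(v))
        ratio = matches / len(non_empty)
        if ratio >= threshold and (best is None or ratio > best[1]):
            best = (name, ratio)
    return best


phone_column = ["555-123-4567", "(555) 123-4567", "555.123.4567",
                "555 123 4567", "n/a"]
result = classify_column(phone_column)
```

In this sketch, four of the five values match the phone pattern, so the column is classified even though one entry is a placeholder. A threshold below 100% is a design choice that tolerates a few malformed values.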

Data classes can be used to define actionable rules, such as data protection rules and data quality rules. They also support data quality analysis by helping to find suspicious entries that might be incorrect.
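To make the data quality use case concrete, the following Python sketch flags values that do not match the pattern of a class already assigned to a column. It is an illustration under stated assumptions, not the product's data quality logic. The `POSTAL_CODE` pattern and the `find_suspicious` helper are hypothetical names introduced for this example.

```python
import re

# Hypothetical pattern for a column that was classified as a
# five-digit postal code. Illustrative only.
POSTAL_CODE = re.compile(r"\d{5}")


def find_suspicious(values, pattern):
    """Return (row_index, value) pairs for values that do not
    fully match the pattern of the assigned data class."""
    return [(i, v) for i, v in enumerate(values) if not pattern.fullmatch(v)]


column = ["10001", "94105", "9410", "6060I", "73301"]
suspicious = find_suspicious(column, POSTAL_CODE)
```

Here the entries `9410` (too short) and `6060I` (a letter in place of a digit) are reported with their row positions, so they can be reviewed rather than silently accepted.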

When you create custom data classes, you can provide matching data that specifies how data is classified automatically. You can also add related artifacts, such as classifications and business terms. These business terms are then suggested for assignment whenever the data class is assigned to a column in a data asset.

In governed catalogs, data assets that contain tabular data are automatically profiled and assigned data classes. In ungoverned catalogs, you can choose to profile a relational data asset and select which data classes to assign. Profiles for unstructured data assets are created automatically when you add such assets to a project or to a catalog, regardless of whether the catalog enforces policies. All catalog users can see the data classes in the asset preview on the Overview and Profile pages of the asset.

Learn more