Where you can find the migrated data (IBM Knowledge Catalog)
After the migration, migrated data is available in core IBM Knowledge Catalog components that provide functionality equivalent to the removed legacy features.
- General overview
- User role mapping
- Migrated data connections
- Migrated import areas
- Migrated database assets
- Migrated data models
- Migrated business intelligence assets
- Migrated extended data sources
- Migrated extension mapping documents
- Migrated OpenIGC assets
- Migrated lineage information
- Migrated automated discovery jobs and results
- Migrated data quality projects
- Migrated customizations
- Migrated glossary assets
For information about known issues related to migration and migrated data, see Known issues for migration from InfoSphere Information Server.
General overview
The following table gives a general overview of how InfoSphere Information Server features can be mapped to IBM Knowledge Catalog features. The table also shows which Cloud Pak for Data user permissions must be included in a user role to work with the IBM Knowledge Catalog features. For more information about user permissions, see Predefined roles and permissions in IBM® Software Hub. For additional access requirements, see the individual feature information.
| InfoSphere Information Server feature | IBM Knowledge Catalog feature | Required Cloud Pak for Data user permissions |
|---|---|---|
| Automation rules | No replacement | None. |
| Custom attribute administration | Custom properties and relationships for governance artifacts and catalog assets | Manage glossary. This permission is not granted with any predefined role. To create custom asset types or custom properties and relationships for assets, the user must also have the Manage catalogs permission, which is granted with the predefined Administrator role. |
| Custom asset display for information assets | No replacement | None. |
| Data discovery (automated discovery and quick scan) | Metadata import and metadata enrichment in projects | The role in the project determines whether a user can view or manage metadata imports and metadata enrichment. For metadata imports that target a catalog, the user running the import must also be a catalog collaborator with the Admin or Editor role. |
| Data quality projects | Metadata enrichment and data quality rules in projects. Data quality is an optional feature that must be explicitly enabled. See Optional features and the component you need to enable. | |
| Information assets view including lineage | Assets in catalogs, including lineage | Access catalogs. This permission is granted with the following predefined roles: |
| Administration for information assets lineage | No replacement | None. |
| Metadata import with a bridge or connector | Metadata import in projects | Manage asset discovery. This permission is granted with the following predefined roles: The role in the project determines whether a user can view or manage metadata imports. |
| Glossary assets | Governance artifacts | |
User role mapping
If the InfoSphere Information Server system is configured to use an LDAP user registry and both the InfoSphere Information Server system and the IBM Software Hub system are connected to the same LDAP server, user information can be migrated. InfoSphere Information Server user roles are mapped to roles in Cloud Pak for Data as shown in the following table. If several roles are listed in the Cloud Pak for Data column for an InfoSphere Information Server role, migrated users will have all of these roles assigned. For more information, see Predefined roles and permissions in IBM Software Hub.
For migration purposes, the following user roles are created in Cloud Pak for Data when the migration toolkit is installed: wkc_catalog_owner_role, wkc_catalog_editor_role, and wkc_catalog_viewer_role.
These roles are used for granting the proper catalog collaborator role to migrated users and groups. Do not assign or delete any of these roles.
- Limitations
- Information Governance Catalog catalog permissions and Information Governance Catalog glossary development permissions allow restricting access to categories to specific users or user groups. However, these permissions cannot be mapped to the roles and permissions model in Cloud Pak for Data and are not migrated. Access to the migrated categories and the terms that they contain is granted based on the Information Governance Catalog security role that a user or group has.
| InfoSphere Information Server role | Roles and permissions in Cloud Pak for Data |
|---|---|
| Common Metadata Administrator Internal name: | |
| Information Analyzer Data Administrator | |
| Information Analyzer Project Administrator | |
| Information Analyzer User | |
| Information Governance Catalog Glossary Administrator | |
| Information Governance Catalog Glossary Author | |
| Information Governance Catalog Glossary Basic User | |
| Information Governance Catalog Information Asset Administrator | |
| Information Governance Catalog Information Asset Assigner | |
| Information Governance Catalog Information Asset Author | |
| Information Governance Catalog Information Asset Reviewer | |
| Information Governance Catalog User | |
| Rules Administrator | |
| Rules Author | |
| Rules Manager | |
| Rules User | |
Migrated data connections
You can find the migrated connections in the catalog that is specified in the import_params.yaml file as the migration target, usually the default catalog. Go to and search for the catalog name.
The migrated connections are not added to the Platform assets catalog.
In general, any required connection properties except for credentials are migrated for each connection. You must edit each migrated connection and update credentials manually. See Post-migration tasks.
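As a minimal sketch (not part of the migration toolkit), you can list the connection assets in the migration target catalog with the Watson Data API to check which migrated connections still need credentials. The host name, user, password, and catalog ID below are placeholders, and the endpoint availability and response field names should be verified against your Cloud Pak for Data version.

```python
# Sketch only: list connection assets in the migration target catalog so you can
# review which migrated connections still need credentials.
# Assumptions: the Watson Data API /v2/connections endpoint is available on your
# Cloud Pak for Data cluster; CPD_HOST, CATALOG_ID, <user>, and <password> are
# placeholders that you replace with your own values.
import requests

CPD_HOST = "https://cpd.example.com"   # placeholder
CATALOG_ID = "<target-catalog-id>"     # placeholder

# Obtain a bearer token (Cloud Pak for Data username/password flow; adjust to your auth setup).
auth = requests.post(
    f"{CPD_HOST}/icp4d-api/v1/authorize",
    json={"username": "<user>", "password": "<password>"},
    verify=False,  # only if the cluster uses a self-signed certificate
)
token = auth.json()["token"]

# List connection assets in the migration target catalog.
resp = requests.get(
    f"{CPD_HOST}/v2/connections",
    params={"catalog_id": CATALOG_ID},
    headers={"Authorization": f"Bearer {token}"},
    verify=False,
)
resp.raise_for_status()
for conn in resp.json().get("resources", []):
    entity = conn.get("entity", {})
    # Print the connection name and data source type for manual review.
    print(entity.get("name"), "-", entity.get("datasource_type"))
```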
If a connection that was added to the catalog from a project or created from a platform connection has the same name and resource key as the migrated connection, the connection assets are not merged. Instead, a second connection asset is created for the migrated connection.
- For connections to data sources for which a connector is available in Cloud Pak for Data, ODBC connections are migrated to connections of the proper type. These connections are created with shared credentials.
- For connections to data sources for which no connector is available in Cloud Pak for Data, ODBC connections are migrated to connections of the type Generic JDBC. These connections are also created with shared credentials.
If a JDBC driver for the data source is available in the platform, no further action is required.
If a JDBC driver for the data source is not available in the platform, upload the required driver .jar as described in Importing JDBC driver files for data sources and update the migrated connection. To find out whether a driver is available, edit the connection and click Test connection. If a driver is not available, the test fails, and the error message states that the driver class was not found. Important: By default, uploading JDBC driver files is disabled and users cannot view the list of JDBC drivers in the web client. An administrator must enable users to upload or view JDBC drivers as described in Enabling users to upload JDBC drivers.
Connections for the IBM Cognos TM1 Connector are migrated to connections of the type IBM Planning Analytics. You must update the new connection with the SSL certificate used with the IBM Cognos TM1 Connector.
Connections for the Apache HBase Connector and the File Connector - Engine tier are migrated to connections of the type Generic JDBC. Upload the required driver .jar as described in Importing JDBC driver files for data sources and update the migrated connection.
Migrated import areas
A single new project is created that contains all migrated import areas. The new project is named Legacy migration - metadata asset manager import areas-suffix. The suffix can be specified as an import parameter. Otherwise, the default suffix migration is used. You can find the new project in .
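As an illustration only, the following sketch shows how the resulting project name could be derived from a suffix value read from import_params.yaml. The key name suffix used here is a hypothetical placeholder, not necessarily the parameter name documented for the migration toolkit.

```python
# Illustrative sketch only: derive the expected name of the project that holds
# migrated import areas. The YAML key "suffix" is a hypothetical placeholder;
# check the migration toolkit documentation for the actual parameter name.
# Requires the PyYAML package (pip install pyyaml).
import yaml

params_text = """
suffix: migration   # hypothetical key; "migration" is the documented default suffix
"""
params = yaml.safe_load(params_text)

suffix = params.get("suffix") or "migration"
project_name = f"Legacy migration - metadata asset manager import areas-{suffix}"
print(project_name)
# -> Legacy migration - metadata asset manager import areas-migration
```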
- Each legacy import area is mapped to a metadata import JSON file. The import area name is mapped to a metadata import entity name. The import area description becomes the description of the metadata import entity. The import area's Assets to import information defines the metadata import scope.
- An equivalent metadata import job is created but not automatically run.
- For the data connection, a connection asset is created in the project and in the catalog that is used as the migration target unless an equivalent connection asset already exists.
- For connections that were not created from platform connections, no credentials are migrated. Test each connection and update it with appropriate credentials if required. Then, you can use the connection for further imports. See Post-migration tasks.
- The shared imported assets are added to the catalog that is used as the migration target. They are not added to the project.
Import areas with connections for the MITI bridges IBM Cognos and Tableau are migrated, but only the connections can be used further.
Migrated database assets
Migrated assets of the type database table and data file are added as data assets to the catalog that is set as the migration target, usually the default catalog. Migrated assets of the type database schema, data file folder, and data file field are stored as properties of the respective data assets.
Migrated database table aliases are added to the migration target catalog as data assets with the table type ALIAS.
Database tables with the same name in the same schema that are imported from the same source are merged by default after migration, even though they are stored under different hosts or different databases in the legacy metadata repository.
Data files with the same name and the same path that reside on the same source server but were imported to the legacy metadata repository under different hosts are merged after migration.
Migrated assets of the type Host and Database are stored as connection details of connection assets in the target catalog, usually the default catalog.
Primary and foreign keys are migrated for data assets that are part of data quality projects and can be accessed through a metadata enrichment asset that includes the corresponding asset.
Migrated data models
Migrated logical and physical data model assets are stored in the migration target catalog, usually the default catalog.
Migrated business intelligence assets
| Source asset type | Asset type of the migrated asset |
|---|---|
| BI Report | Report |
| BI Report Query | Report query |
| BI Report Query Item | Report query item |
Migrated extended data sources
Migrated extended data source assets are stored in the migration target catalog, usually the default catalog.
Migrated extension mapping documents
| Source asset type | Asset type of the migrated asset |
|---|---|
| Extension Mapping Document | Lineage mapping group |
| Extension Mapping | Lineage mapping |
Migrated OpenIGC assets
Migrated OpenIGC asset types show up as filter criteria in the migration target catalog, usually the default catalog. You can view or update these asset types only by using the Watson Data APIs, for example, List all asset types defined for an account, catalog, project or space.
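For example, the following minimal sketch calls the Watson Data API endpoint mentioned above to list the asset types, including migrated OpenIGC types, that are defined in the migration target catalog. The host, token, and catalog ID are placeholders, and the response field names should be verified against your API version.

```python
# Sketch only: list asset types in the migration target catalog by using the
# Watson Data API "List all asset types defined for an account, catalog,
# project or space" endpoint (GET /v2/asset_types).
# CPD_HOST, TOKEN, and CATALOG_ID are placeholders.
import requests

CPD_HOST = "https://cpd.example.com"   # placeholder
TOKEN = "<bearer-token>"               # placeholder; obtain through your Cloud Pak for Data auth flow
CATALOG_ID = "<target-catalog-id>"     # placeholder

resp = requests.get(
    f"{CPD_HOST}/v2/asset_types",
    params={"catalog_id": CATALOG_ID},
    headers={"Authorization": f"Bearer {TOKEN}"},
    verify=False,  # only if the cluster uses a self-signed certificate
)
resp.raise_for_status()
for asset_type in resp.json().get("resources", []):
    # Print each asset type name and description for review.
    print(asset_type.get("name"), "-", asset_type.get("description", ""))
```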
Migrated OpenIGC assets are stored in the migration target catalog. On an asset's Overview page, you can see all migrated properties and property values in the Details section. Within this section, the property groups are structured as defined in the OpenIGC asset type definition. Boolean properties are not migrated. Therefore, this section might be created but will be empty.
Asset type containment information is migrated in the form of contextual relationships, depending on the asset type definition.
The source and the target catalog support different sets of languages. Translated labels can be migrated for Brazilian Portuguese (pt-BR), Chinese-Simplified (zh-CN), Chinese-Traditional (zh-TW), French (fr), German (de), Italian (it), Spanish (es), Japanese (ja), Korean (ko), and Russian (ru). Labels translated into Arabic (ar) or Hebrew (he) are not migrated.
Migrated lineage information
Migrated lineage information is available on the individual asset's Lineage tab in the migration target catalog.
Migrated automated discovery jobs and results
- For each data quality project that is referenced in an automated discovery job, a new project is created. The name of the new project consists of the name of the data quality project plus a suffix as specified in the import parameters file or the default suffix migration, for example, SourceDQProjectName-migration. You can find the new project in .
All members of the data quality project are added to the new project as collaborators.
- For each automated discovery job, a metadata import asset and a metadata enrichment asset are created in the corresponding project, along with their associated jobs.
- Metadata import asset
- A metadata import asset is used to add assets from a connection to a project or a catalog without any type of analysis. Analysis is done in metadata enrichment.
The metadata import asset is created with the data scope and for the connection defined in the automated discovery job. All assets are added to the new project. No metadata import job is run. A connection asset is also created if it doesn't exist.
The naming convention is MDI_AD_name_of_autodiscovery_job.
- Metadata enrichment asset
- A metadata enrichment asset defines the scope of the data assets to be enriched and the settings to use. Metadata enrichment can be run on one data connection per metadata enrichment asset. The data assets to be analyzed must be available in this project, either imported through metadata import or added otherwise. Each data asset in an individual project can be in at most one metadata enrichment asset. A metadata enrichment job is created from this configuration and can be run automatically or manually. You can also schedule enrichment jobs and set up recurring runs.
The metadata enrichment asset is created with the data scope of the automated discovery job. For assets that were added to the data quality project by means other than automated discovery jobs, no metadata enrichment asset is created during migration, nor are such assets automatically added to any other metadata enrichment asset. You must manually add them to existing metadata enrichment assets or create entirely new ones.
The naming convention is MDE_AD_name_of_autodiscovery_job. Both naming patterns are illustrated in the sketch after this list.
The enrichment options of the new metadata enrichment are set as follows:
- Enrichment objectives are selected based on the discovery options selected for the automated discovery job. Tip: If the automated discovery job was configured for data discovery only, that is, without any analysis option, the new metadata enrichment is created with the Profile data enrichment option. Make sure to review such metadata enrichment. You might want to select additional enrichment options before you run the enrichment.
- All available categories are selected, regardless of the categories that the assigned data classes and terms belong to. Before you run the metadata enrichment for the first time, edit the metadata enrichment asset and update the list of selected categories. You might want to select a subset of categories if many categories are defined. You can check the assigned and suggested terms in the enrichment results to find out which categories to choose.
- The sampling settings of the data quality project are used, except for the Use every Nth value up to maximum number of records allowed option. This sampling option is mapped to the default sampling setting for metadata enrichment, which is Basic. If the data quality project is set up without sampling, the new metadata enrichment assets are also created with the default sampling setting Basic. Metadata enrichment does not provide an option to run without sampling.
- Scheduling is disabled.
All business term and data class assignments are migrated. Manually and automatically assigned terms show up as such after the migration. Rejected terms are also migrated.
Profiling results are migrated.
- Permissions are granted at the project level:
- The data quality project owner gets the Admin role in the new project.
- Any other data quality project collaborators get the Editor role in the new project.
- The user who runs the migration becomes the owner of the assets that are created in the new project.
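The naming conventions above make the migrated artifacts easy to locate programmatically. The following small helper is illustrative only (not part of the migration toolkit); it derives the expected project, metadata import, and metadata enrichment names from a data quality project name, an automated discovery job name, and the configured suffix.

```python
# Illustrative helper: predict the names of migrated artifacts based on the
# documented naming conventions. The suffix defaults to "migration" unless a
# different value was specified in the import parameters file.
def expected_names(dq_project_name: str, discovery_job_name: str,
                   suffix: str = "migration") -> dict:
    return {
        # Project created for the data quality project referenced by the job.
        "project": f"{dq_project_name}-{suffix}",
        # Metadata import asset created for the automated discovery job.
        "metadata_import": f"MDI_AD_{discovery_job_name}",
        # Metadata enrichment asset created for the automated discovery job.
        "metadata_enrichment": f"MDE_AD_{discovery_job_name}",
    }

# Example usage with placeholder names:
print(expected_names("SourceDQProject", "CustomerDataDiscovery"))
# {'project': 'SourceDQProject-migration',
#  'metadata_import': 'MDI_AD_CustomerDataDiscovery',
#  'metadata_enrichment': 'MDE_AD_CustomerDataDiscovery'}
```

You can use the derived names, for example, to search for the migrated assets in the Cloud Pak for Data user interface or through the APIs.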
Migrated data quality projects
For each data quality project, a new project is created in IBM Knowledge Catalog.
The name of the new project consists of the name of the migrated project plus a suffix. You can specify a suffix in the import_params.yaml file. If you omit this parameter, the default suffix migration is used, for example, SourceProjectName-migration.
- Some special characters in data quality project names are replaced during migration. See Certain special characters in data quality project names are not preserved.
- Data quality project names that have more than 100 characters are replaced. See Import of data quality projects with names longer than 100 characters fails.
| Source item | Migrated item |
|---|---|
| Data rule definition | Data quality definition |
| Data rule | Data quality rule created from a data quality definition. By default, rules with bindings to columns in virtual tables that are not built based on SQL statements are not migrated. You can set the |
| Rule set | Data quality rule created from multiple data quality definitions. By default, rules with bindings to columns in virtual tables that are not built based on SQL statements are not migrated. You can set the |
| Rule history and results | Rule history and results of the corresponding data quality rule |
| Global logical variables bound to constants | DataStage parameter sets |
| SQL virtual table | SQL virtual tables are migrated as SQL-based data assets (data assets of the type Query). You can set the migrate_sql_vt_as_sql_asset option of the DQ_PERF_CONFIG parameter to false to revert to the previous behavior. See Optional export parameters. |
| Supported relationships between assets in data quality projects and governance artifacts | Relationships on the respective data assets and data quality rules in the new project |
| Quality rules | Data quality rule created from a data quality definition. The naming convention for new rules is |
| Primary and foreign keys of migrated assets | Primary and foreign keys can be accessed through the metadata enrichment that includes the migrated asset. |
| Data quality project setting | Equivalent new setting |
|---|---|
| Steward | None. Setting is not migrated. |
| Enable drill-down security | Project collaborator with the required access to the data source. |
| Null threshold | Null threshold in enrichment default settings. |
| Cardinality settings | Cardinality settings in enrichment default settings. |
| Frequency distribution settings | No equivalent setting. Setting is not migrated. |
| Data classification settings | No individual selection of data classes. Selection of categories in the metadata enrichment configuration determines which data classes are applied. |
| Data quality threshold | Data quality threshold in enrichment default settings. |
| Data quality dimensions | Not available yet. |
| Ignore new data quality dimensions that are installed | None. Setting is not migrated. |
| Enable automation rules | No longer applicable. |
| Primary key settings: Minimum uniqueness allowed | None. Setting is not migrated. |
| Compound keys and relationships: Search for compound keys relationships | Can be configured for each individual run of primary key or in-depth key relationship analysis. |
| Foreign key settings: Maximum percentage of allowed orphan values | None. Setting is not migrated. |
| Minimum percentage of common distinct values | None. Setting is not migrated. |
| Minimum confidence for the relationships | None. Setting is not migrated. |
| Limit columns to speed up analysis | None. Setting is not migrated. |
| Sampling settings | No project-level sampling settings. Sampling can be configured individually for each metadata enrichment. The metadata enrichment assets that are created for migrated automated discovery jobs are configured with the default sampling setting, which is Basic. The sampling settings of the data quality project are mapped to the equivalent settings in the metadata enrichment assets created for the migrated automated discovery jobs except for the Use every Nth value up to maximum number of records allowed option. This sampling option is mapped to the default sampling setting for metadata enrichment, which is Basic. If the data quality project is set up without sampling, the new metadata enrichment assets are also created with the default sampling setting Basic. Metadata enrichment does not provide an option to not use sampling. |
| Engine settings | No longer applicable. |
| Database settings | No longer applicable. |
| Automatically register data rule output tables as data assets | Can be configured individually for each data quality rule. |
| Maximum length for system generated columns | None. Setting is not migrated. |
| Users and group settings (user groups were not supported) | Data quality project roles as such are not migrated. The project owner gets the Admin role in the new project. Any other data quality project collaborators get the Editor role in the new project. The owner of any migrated assets in the new project is the user who ran the migration. User groups are supported in the new project. |
| Retain the (DataStage) analysis jobs and the job logs | None. Setting is not migrated. |
| Automatically delete output tables for data rules and rule sets | None. Setting is not migrated. |
Migrated customizations
The following customizations are migrated:
- Custom property and relationship definitions
- To access the definitions of custom properties and relationships, go to .
- Custom property and relationship values
- These values show up in the Details section of the asset or artifact properties.
Migrated glossary assets
The following table shows where in the Cloud Pak for Data user interface you can find glossary assets that were migrated from InfoSphere Information Server.
| Asset type in InfoSphere Information Server | Location in Cloud Pak for Data |
|---|---|
| Categories | |
| Data classes | The data classes are added to a separate category named Migrated DataClasses that is created during migration. |
| Information governance rules | |
| Information governance policies | |
| Labels | Labels are migrated as tags on the artifacts or assets that had the labels assigned in InfoSphere Information Server. |
| Terms | |
- Term properties

| Term in InfoSphere Information Server | Business term in Cloud Pak for Data |
|---|---|
| Name | Name |
| Parent Category | Primary category |
| Short Description | Description |
| Long Description | Description |
| Referencing Categories | Secondary categories |
| Labels | Tags |
| Stewards | Stewards |
| Governed by Rules | Related content |
| Abbreviation | Abbreviation |
| Additional Abbreviation | Abbreviation |
| Example | Description. The migrated content is prefixed with Example. |
| Usage | Description. The migrated content is prefixed with Usage. |
| Is a Type Of | Is a type of |
| Has Types | Has a type of |
| Is Of | Is a part of |
| Has A | Has a part of |
| Synonyms | Synonyms |
| Preferred Synonym | Synonyms |
| Related Terms | Other related business terms |
| Assigned Terms | Other related business terms |
| Assigned to Terms | Other related business terms |
| Assigned Assets | An assigned category is added as a secondary category. An assigned data class is added as a related artifact. Information about assigned information assets is not migrated. |
| Custom attribute values of type Text, Predefined Values, Date, Number | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| Term history and development log | Activity log. The initial (create) entry is tagged with and has the comment A term was created. |

- Category properties

| Category in InfoSphere Information Server | Category in Cloud Pak for Data |
|---|---|
| Name | Name |
| Short Description | Description |
| Long Description | Description |
| Subcategories | Subcategories |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment An artifact was created. |

- Rule properties

| Information governance rule in InfoSphere Information Server | Governance rule in Cloud Pak for Data |
|---|---|
| Name | Name. All migrated governance rules are added to the [uncategorized] category. Rules with duplicate names are automatically renamed: the names are suffixed with an underscore (_) and a consecutive number. |
| - | Primary category: [uncategorized] |
| Short Description | Description |
| Long Description | Description |
| Labels | Tags |
| Stewards | Stewards |
| Related Rules | Related rules |
| Referencing Policies | Parent policies |
| Governs Assets | Related artifacts when it's a term |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment An artifact was created. |

- Policy properties

| Information governance policy in InfoSphere Information Server | Policy in Cloud Pak for Data |
|---|---|
| Name | Name. All migrated policies are added to the [uncategorized] category. Policies with duplicate names are automatically renamed: the names are suffixed with an underscore (_) and a consecutive number. |
| - | Primary category: [uncategorized] |
| Parent Policy | Parent policy |
| Short Description | Description |
| Long Description | Description |
| Labels | Tags |
| Stewards | Stewards |
| Subpolicies | Subpolicies |
| Information Governance Rules | Rules |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment An artifact was created. |

- Data class properties

| Data class in InfoSphere Information Server | Data class in Cloud Pak for Data |
|---|---|
| Name | Name |
| - | Primary category: Migrated DataClasses |
| Parent Policy | Parent policy |
| Short Description | Description |
| Long Description | Description |
| Example | Examples |
| Labels | Tags |
| Stewards | Stewards |
| Enabled | Enabled |
| Type | Matching method |
| Minimum Data Length | Minimum length of data value |
| Maximum Data Length | Maximum length of data value |
| Provider | Provider |
| Priority | Priority |
| Scope | Scope of code |
| Threshold | Threshold |
| Assigned to Terms | Related artifacts |
| Implements Rules | Related artifacts |
| Governed by Rules | Related artifacts |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment A data class was created. |