Where you can find the migrated data (IBM Knowledge Catalog)
After the migration, migrated data is available in core IBM Knowledge Catalog components that provide functionality equivalent to the removed legacy features.
- General overview
- User role mapping
- Migrated data connections
- Migrated import areas
- Migrated database assets
- Migrated data models
- Migrated business intelligence assets
- Migrated extended data sources
- Migrated extension mapping documents
- Migrated OpenIGC assets
- Migrated lineage information
- Migrated automated discovery jobs and results
- Migrated data quality projects
- Migrated customizations
- Migrated glossary assets
For information about known issues related to migration and migrated data, see Known issues for migration from InfoSphere Information Server.
General overview
The following table gives a general overview of how InfoSphere Information Server features can be mapped to IBM Knowledge Catalog features. The table also shows which Cloud Pak for Data user permissions must be included in a user role to work with the IBM Knowledge Catalog features. For more information about user permissions, see Predefined roles and permissions in IBM® Software Hub. For additional access requirements, see the individual feature information.
| InfoSphere Information Server feature | IBM Knowledge Catalog feature | Required Cloud Pak for Data user permissions |
|---|---|---|
| Automation rules | No replacement | None. |
| Custom attribute administration | Custom properties and relationships for governance artifacts and catalog assets | Manage glossary. This permission is not granted with any predefined role. To create custom asset types or custom properties and relationships for assets, the user must also have the Manage catalogs permission, which is granted with the predefined Administrator role. |
| Custom asset display for information assets | No replacement | None. |
| Data discovery (automated discovery and quick scan) | Metadata import and metadata enrichment in projects | The role in the project determines whether a user can view or manage metadata imports and metadata enrichment. For metadata imports that target a catalog, the user running the import must also be a catalog collaborator with the Admin or Editor role. |
| Data quality projects | Metadata enrichment and data quality rules in projects. Data quality is an optional feature that must be explicitly enabled. See Optional features and the component you need to enable. | |
| Information assets view including lineage | Assets in catalogs, including lineage | Access catalogs. This permission is granted with the following predefined roles: |
| Administration for information assets lineage | No replacement | None. |
| Metadata import with a bridge or connector | Metadata import in projects | Manage asset discovery. This permission is granted with the following predefined roles: The role in the project determines whether a user can view or manage metadata imports. |
| Glossary assets | Governance artifacts | |
User role mapping
If the InfoSphere Information Server system is configured to use an LDAP user registry and both the InfoSphere Information Server system and the IBM Software Hub system are connected to the same LDAP server, user information can be migrated. InfoSphere Information Server user roles are mapped to roles in Cloud Pak for Data as shown in the following table. If several roles are listed in the Cloud Pak for Data column for an InfoSphere Information Server role, migrated users will have all of these roles assigned. For more information, see Predefined roles and permissions in IBM Software Hub.
For migration purposes, the following user roles are created in Cloud Pak for Data when the migration toolkit is installed: wkc_catalog_owner_role, wkc_catalog_editor_role, and wkc_catalog_viewer_role.
These roles are used for granting the proper catalog collaborator role to migrated users and groups. Do not assign or delete any of these roles.
- Limitations
- Information Governance Catalog catalog permissions and Information Governance Catalog glossary development permissions allow restricting access to categories to specific users or user groups. However, these permissions cannot be mapped to the roles and permissions model in Cloud Pak for Data and are not migrated. Access to the migrated categories and the terms that they contain is granted based on the Information Governance Catalog security role that a user or group has.
| InfoSphere Information Server role | Roles and permissions in Cloud Pak for Data |
|---|---|
| Common Metadata Administrator Internal name: | |
| Information Analyzer Data Administrator | |
| Information Analyzer Project Administrator | |
| Information Analyzer User | |
| Information Governance Catalog Glossary Administrator | |
| Information Governance Catalog Glossary Author | |
| Information Governance Catalog Glossary Basic User | |
| Information Governance Catalog Information Asset Administrator | |
| Information Governance Catalog Information Asset Assigner | |
| Information Governance Catalog Information Asset Author | |
| Information Governance Catalog Information Asset Reviewer | |
| Information Governance Catalog User | |
| Rules Administrator | |
| Rules Author | |
| Rules Manager | |
| Rules User | |
Migrated data connections
You can find the migrated connections in the catalog that is specified in the import_params.yaml file as the migration target, usually the default catalog. Go to and search for the catalog name.
The migrated connections are not added to the Platform assets catalog.
In general, any required connection properties except for credentials are migrated for each connection. You must edit each migrated connection and update credentials manually. See Post-migration tasks.
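As a minimal sketch (not part of the migration toolkit), you can list the connection assets in the migration target catalog with the Watson Data API to check which migrated connections still need credentials. The host name, user, password, and catalog ID below are placeholders, and the endpoint availability and response field names should be verified against your Cloud Pak for Data version.

```python
# Sketch only: list connection assets in the migration target catalog so you can
# review which migrated connections still need credentials.
# Assumptions: the Watson Data API /v2/connections endpoint is available on your
# Cloud Pak for Data cluster; CPD_HOST, CATALOG_ID, <user>, and <password> are
# placeholders that you replace with your own values.
import requests

CPD_HOST = "https://cpd.example.com"   # placeholder
CATALOG_ID = "<target-catalog-id>"     # placeholder

# Obtain a bearer token (Cloud Pak for Data username/password flow; adjust to your auth setup).
auth = requests.post(
    f"{CPD_HOST}/icp4d-api/v1/authorize",
    json={"username": "<user>", "password": "<password>"},
    verify=False,  # only if the cluster uses a self-signed certificate
)
token = auth.json()["token"]

# List connection assets in the migration target catalog.
resp = requests.get(
    f"{CPD_HOST}/v2/connections",
    params={"catalog_id": CATALOG_ID},
    headers={"Authorization": f"Bearer {token}"},
    verify=False,
)
resp.raise_for_status()
for conn in resp.json().get("resources", []):
    entity = conn.get("entity", {})
    # Print the connection name and data source type for manual review.
    print(entity.get("name"), "-", entity.get("datasource_type"))
```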
If a connection that was added to the catalog from a project or created from a platform connection has the same name and resource key as the migrated connection, the connection assets are not merged. Instead, a second connection asset is created for the migrated connection.
- For connections to data sources for which a connector is available in Cloud Pak for Data, ODBC connections are migrated to connections of the proper type. These connections are created with shared credentials.
- For connections to data sources for which no connector is available in Cloud Pak for Data, ODBC connections are migrated to connections of the type Generic JDBC. These connections are also created with shared credentials.
If a JDBC driver for the data source is available in the platform, no further action is required.
If a JDBC driver for the data source is not available in the platform, upload the required driver .jar as described in Importing JDBC driver files for data sources and update the migrated connection. To find out whether a driver is available, edit the connection and click Test connection. If a driver is not available, the test fails, and the error message states that the driver class was not found. Important: By default, uploading JDBC driver files is disabled and users cannot view the list of JDBC drivers in the web client. An administrator must enable users to upload or view JDBC drivers as described in Enabling users to upload JDBC drivers.
Connections for the IBM Cognos TM1 Connector are migrated to connections of the type IBM Planning Analytics. You must update the new connection with the SSL certificate used with the IBM Cognos TM1 Connector.
Connections for the Apache HBase Connector and the File Connector - Engine tier are migrated to connections of the type Generic JDBC. Upload the required driver .jar as described in Importing JDBC driver files for data sources and update the migrated connection.
Migrated import areas
A single new project is created that contains all migrated import areas. The new project is named Legacy migration - metadata asset manager import areas-suffix. The suffix can be specified as an import parameter. Otherwise, the default suffix migration is used. You can find the new project in .
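As an illustration only, the following sketch shows how the resulting project name could be derived from a suffix value read from import_params.yaml. The key name suffix used here is a hypothetical placeholder, not necessarily the parameter name documented for the migration toolkit.

```python
# Illustrative sketch only: derive the expected name of the project that holds
# migrated import areas. The YAML key "suffix" is a hypothetical placeholder;
# check the migration toolkit documentation for the actual parameter name.
# Requires the PyYAML package (pip install pyyaml).
import yaml

params_text = """
suffix: migration   # hypothetical key; "migration" is the documented default suffix
"""
params = yaml.safe_load(params_text)

suffix = params.get("suffix") or "migration"
project_name = f"Legacy migration - metadata asset manager import areas-{suffix}"
print(project_name)
# -> Legacy migration - metadata asset manager import areas-migration
```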
- Each legacy import area is mapped to a metadata import JSON file. The import area name is mapped to a metadata import entity name. The import area description becomes the description of the metadata import entity. The import area's Assets to import information defines the metadata import scope.
- An equivalent metadata import job is created but not automatically run.
- For the data connection, a connection asset is created in the project and in the catalog that is used as the migration target unless an equivalent connection asset already exists.
- For connections that were not created from platform connections, no credentials are migrated. Test each connection and update it with appropriate credentials if required. Then, you can use the connection for further imports. See Post-migration tasks.
- The shared imported assets are added to the catalog that is used as the migration target. They are not added to the project.
Import areas with connections for the MITI bridges IBM Cognos and Tableau are migrated, but only the connections can be used further.
Migrated database assets
Migrated assets of the type database table and data file are added as data assets to the catalog that is set as the migration target, usually the default catalog. Migrated assets of the type database schema, data file folder, and data file field are stored as properties of the respective data assets.
Migrated database table aliases are added to the migration target catalog as data assets with the table type ALIAS.
Database tables with the same name in the same schema that are imported from the same source are merged by default after migration, even though they are stored under different hosts or different databases in the legacy metadata repository.
Data files with the same name and the same path that reside on the same source server but were imported to the legacy metadata repository under different hosts are merged after migration.
Migrated assets of the type Host and Database are stored as connection details of connection assets in the target catalog, usually the default catalog.
Primary and foreign keys are migrated for data assets that are part of data quality projects and can be accessed through a metadata enrichment asset that includes the corresponding asset.
Migrated data models
Migrated logical and physical data model assets are stored in the migration target catalog, usually the default catalog.
Migrated business intelligence assets
| Source asset type | Asset type of the migrated asset |
|---|---|
| BI Report | Report |
| BI Report Query | Report query |
| BI Report Query Item | Report query item |
Migrated extended data sources
Migrated extended data source assets are stored in the migration target catalog, usually the default catalog.
Migrated extension mapping documents
| Source asset type | Asset type of the migrated asset |
|---|---|
| Extension Mapping Document | Lineage mapping group |
| Extension Mapping | Lineage mapping |
Migrated OpenIGC assets
Migrated OpenIGC asset types show up as filter criteria in the migration target catalog, usually the default catalog. You can view or update these asset types only by using the Watson Data APIs, for example, List all asset types defined for an account, catalog, project or space.
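For example, the following minimal sketch calls the Watson Data API endpoint mentioned above to list the asset types, including migrated OpenIGC types, that are defined in the migration target catalog. The host, token, and catalog ID are placeholders, and the response field names should be verified against your API version.

```python
# Sketch only: list asset types in the migration target catalog by using the
# Watson Data API "List all asset types defined for an account, catalog,
# project or space" endpoint (GET /v2/asset_types).
# CPD_HOST, TOKEN, and CATALOG_ID are placeholders.
import requests

CPD_HOST = "https://cpd.example.com"   # placeholder
TOKEN = "<bearer-token>"               # placeholder; obtain through your Cloud Pak for Data auth flow
CATALOG_ID = "<target-catalog-id>"     # placeholder

resp = requests.get(
    f"{CPD_HOST}/v2/asset_types",
    params={"catalog_id": CATALOG_ID},
    headers={"Authorization": f"Bearer {TOKEN}"},
    verify=False,  # only if the cluster uses a self-signed certificate
)
resp.raise_for_status()
for asset_type in resp.json().get("resources", []):
    # Print each asset type name and description for review.
    print(asset_type.get("name"), "-", asset_type.get("description", ""))
```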
Migrated OpenIGC assets are stored in the migration target catalog. On an asset's Overview page, you can see all migrated properties and property values in the Details section. Within this section, the property groups are structured as defined in the OpenIGC asset type definition. Boolean properties are not migrated. Therefore, this section might be created but will be empty.
Asset type containment information is migrated in the form of contextual relationships, depending on the asset type definition.
The source and the target catalog support different sets of languages. Translated labels can be migrated for Brazilian Portuguese (pt-BR), Chinese-Simplified (zh-CN), Chinese-Traditional (zh-TW), French (fr), German (de), Italian (it), Spanish (es), Japanese (ja), Korean (ko), and Russian (ru). Labels translated into Arabic (ar) or Hebrew (he) are not migrated.
Migrated lineage information
Migrated lineage information is available on the individual asset's Lineage tab in the migration target catalog.
Migrated automated discovery jobs and results
- For each data quality project that is referenced in an automated discovery job, a new project is created. The name of the new project consists of the name of the data quality project plus a suffix as specified in the import parameters file or the default suffix migration, for example, SourceDQProjectName-migration. You can find the new project in .
All members of the data quality project are added to the new project as collaborators.
- For each automated discovery job, a metadata import asset and a metadata enrichment asset are created in the corresponding project, along with their associated jobs.
- Metadata import asset
- A metadata import asset is used to add assets from a connection to a project or a catalog without any type of analysis. Analysis is done in metadata enrichment.
The metadata import asset is created with the data scope and for the connection defined in the automated discovery job. All assets are added to the new project. No metadata import job is run. A connection asset is also created if it doesn't exist.
The naming convention is MDI_AD_name_of_autodiscovery_job.
- Metadata enrichment asset
- A metadata enrichment asset defines the scope of the data assets to be enriched and the settings to use. Metadata enrichment can be run on one data connection per metadata enrichment asset. The data assets to be analyzed must be available in this project, either imported through metadata import or added otherwise. Each data asset in an individual project can be in at most one metadata enrichment asset. A metadata enrichment job is created from this configuration and can be run automatically or manually. You can also schedule enrichment jobs and set up recurring runs.
The metadata enrichment asset is created with the data scope of the automated discovery job. For assets that were added to the data quality project by means other than automated discovery jobs, no metadata enrichment asset is created during migration, nor are such assets automatically added to any other metadata enrichment asset. You must manually add them to existing metadata enrichment assets or create entirely new ones.
The naming convention is MDE_AD_name_of_autodiscovery_job. Both naming patterns are illustrated in the sketch after this list.
The enrichment options of the new metadata enrichment are set as follows:
- Enrichment objectives are selected based on the discovery options selected for the automated discovery job. Tip: If the automated discovery job was configured for data discovery only, that is, without any analysis option, the new metadata enrichment is created with the Profile data enrichment option. Make sure to review such metadata enrichment. You might want to select additional enrichment options before you run the enrichment.
- All available categories are selected, regardless of the categories that the assigned data classes and terms belong to. Before you run the metadata enrichment for the first time, edit the metadata enrichment asset and update the list of selected categories. You might want to select a subset of categories if many categories are defined. You can check the assigned and suggested terms in the enrichment results to find out which categories to choose.
- The sampling settings of the data quality project are used, except for the Use every Nth value up to maximum number of records allowed option. This sampling option is mapped to the default sampling setting for metadata enrichment, which is Basic. If the data quality project is set up without sampling, the new metadata enrichment assets are also created with the default sampling setting Basic. Metadata enrichment does not provide an option to run without sampling.
- Scheduling is disabled.
All business term and data class assignments are migrated. Manually and automatically assigned terms show up as such after the migration. Rejected terms are also migrated.
Profiling results are migrated.
- Permissions are granted at the project level:
- The data quality project owner gets the Admin role in the new project.
- Any other data quality project collaborators get the Editor role in the new project.
- The user who runs the migration becomes the owner of the assets that are created in the new project.
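The naming conventions above make the migrated artifacts easy to locate programmatically. The following small helper is illustrative only (not part of the migration toolkit); it derives the expected project, metadata import, and metadata enrichment names from a data quality project name, an automated discovery job name, and the configured suffix.

```python
# Illustrative helper: predict the names of migrated artifacts based on the
# documented naming conventions. The suffix defaults to "migration" unless a
# different value was specified in the import parameters file.
def expected_names(dq_project_name: str, discovery_job_name: str,
                   suffix: str = "migration") -> dict:
    return {
        # Project created for the data quality project referenced by the job.
        "project": f"{dq_project_name}-{suffix}",
        # Metadata import asset created for the automated discovery job.
        "metadata_import": f"MDI_AD_{discovery_job_name}",
        # Metadata enrichment asset created for the automated discovery job.
        "metadata_enrichment": f"MDE_AD_{discovery_job_name}",
    }

# Example usage with placeholder names:
print(expected_names("SourceDQProject", "CustomerDataDiscovery"))
# {'project': 'SourceDQProject-migration',
#  'metadata_import': 'MDI_AD_CustomerDataDiscovery',
#  'metadata_enrichment': 'MDE_AD_CustomerDataDiscovery'}
```

You can use the derived names, for example, to search for the migrated assets in the Cloud Pak for Data user interface or through the APIs.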
Migrated data quality projects
For each data quality project, a new project is created in IBM Knowledge Catalog.
The name of the new project consists of the name of the migrated project plus a suffix. You can specify a suffix in the import_params.yaml file. If you omit this parameter, the default suffix migration is used, for example, SourceProjectName-migration.
- Some special characters in data quality project names are replaced during migration. See Certain special characters in data quality project names are not preserved.
- Data quality project names that have more than 100 characters are replaced. See Import of data quality projects with names longer than 100 characters fails.
| Source item | Migrated item |
|---|---|
| Data rule definition | Data quality definition |
| Data rule | Data quality rule created from a data quality definition. By default, rules with bindings to columns in virtual tables that are not built based on SQL statements are not migrated. You can set the |
| Rule set | Data quality rule created from multiple data quality definitions. By default, rules with bindings to columns in virtual tables that are not built based on SQL statements are not migrated. You can set the |
| Rule history and results | Rule history and results of the corresponding data quality rule |
| Global logical variables bound to constants | DataStage parameter sets |
| SQL virtual table | SQL virtual tables are migrated as SQL-based data assets (data assets of the type Query). You can set the migrate_sql_vt_as_sql_asset option of the DQ_PERF_CONFIG parameter to false to revert to the previous behavior. See Optional export parameters. |
| Supported relationships between assets in data quality projects and governance artifacts | Relationships on the respective data assets and data quality rules in the new project |
| Quality rules | Data quality rule created from a data quality definition. The naming convention for new rules is |
| Primary and foreign keys of migrated assets | Primary and foreign keys can be accessed through the metadata enrichment that includes the migrated asset. |
| Data quality project setting | Equivalent new setting |
|---|---|
| Steward | None. Setting is not migrated. |
| Enable drill-down security | Project collaborator with the required access to the data source. |
| Null threshold | Null threshold in enrichment default settings. |
| Cardinality settings | Cardinality settings in enrichment default settings. |
| Frequency distribution settings | No equivalent setting. Setting is not migrated. |
| Data classification settings | No individual selection of data classes. Selection of categories in the metadata enrichment configuration determines which data classes are applied. |
| Data quality threshold | Data quality threshold in enrichment default settings. |
| Data quality dimensions | Not available yet. |
| Ignore new data quality dimensions that are installed | None. Setting is not migrated. |
| Enable automation rules | No longer applicable. |
| Primary key settings: Minimum uniqueness allowed | None. Setting is not migrated. |
| Compound keys and relationships: Search for compound keys relationships | Can be configured for each individual run of primary key or in-depth key relationship analysis. |
| Foreign key settings: Maximum percentage of allowed orphan values | None. Setting is not migrated. |
| Minimum percentage of common distinct values | None. Setting is not migrated. |
| Minimum confidence for the relationships | None. Setting is not migrated. |
| Limit columns to speed up analysis | None. Setting is not migrated. |
| Sampling settings | No project-level sampling settings. Sampling can be configured individually for each metadata enrichment. The metadata enrichment assets that are created for migrated automated discovery jobs are configured with the default sampling setting, which is Basic. The sampling settings of the data quality project are mapped to the equivalent settings in the metadata enrichment assets created for the migrated automated discovery jobs except for the Use every Nth value up to maximum number of records allowed option. This sampling option is mapped to the default sampling setting for metadata enrichment, which is Basic. If the data quality project is set up without sampling, the new metadata enrichment assets are also created with the default sampling setting Basic. Metadata enrichment does not provide an option to not use sampling. |
| Engine settings | No longer applicable. |
| Database settings | No longer applicable. |
| Automatically register data rule output tables as data assets | Can be configured individually for each data quality rule. |
| Maximum length for system generated columns | None. Setting is not migrated. |
| Users and group settings (user groups were not supported) | Data quality project roles as such are not migrated. The project owner gets the Admin role in the new project. Any other data quality project collaborators get the Editor role in the new project. The owner of any migrated assets in the new project is the user who ran the migration. User groups are supported in the new project. |
| Retain the (DataStage) analysis jobs and the job logs | None. Setting is not migrated. |
| Automatically delete output tables for data rules and rule sets | None. Setting is not migrated. |
Migrated customizations
The following customizations are migrated:
- Custom property and relationship definitions
- To access the definitions of custom properties and relationships, go to .
- Custom property and relationship values
- These values show up in the Details section of the asset or artifact properties.
Migrated glossary assets
The following table shows where in the Cloud Pak for Data user interface you can find glossary assets that were migrated from InfoSphere Information Server.
| Asset type in InfoSphere Information Server | Location in Cloud Pak for Data |
|---|---|
| Categories | |
| Data classes | The data classes are added to a separate category named Migrated DataClasses that is created during migration. |
| Information governance rules | |
| Information governance policies | |
| Labels | Labels are migrated as tags on the artifacts or assets that had the labels assigned in InfoSphere Information Server. |
| Terms | |
- Term properties

| Term in InfoSphere Information Server | Business term in Cloud Pak for Data |
|---|---|
| Name | Name |
| Parent Category | Primary category |
| Short Description | Description |
| Long Description | Description |
| Referencing Categories | Secondary categories |
| Labels | Tags |
| Stewards | Stewards |
| Governed by Rules | Related content |
| Abbreviation | Abbreviation |
| Additional Abbreviation | Abbreviation |
| Example | Description. The migrated content is prefixed with Example. |
| Usage | Description. The migrated content is prefixed with Usage. |
| Is a Type Of | Is a type of |
| Has Types | Has a type of |
| Is Of | Is a part of |
| Has A | Has a part of |
| Synonyms | Synonyms |
| Preferred Synonym | Synonyms |
| Related Terms | Other related business terms |
| Assigned Terms | Other related business terms |
| Assigned to Terms | Other related business terms |
| Assigned Assets | An assigned category is added as a secondary category. An assigned data class is added as a related artifact. Information about assigned information assets is not migrated. |
| Custom attribute values of type Text, Predefined Values, Date, Number | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| Term history and development log | Activity log. The initial (create) entry is tagged with and has the comment A term was created. |

- Category properties

| Category in InfoSphere Information Server | Category in Cloud Pak for Data |
|---|---|
| Name | Name |
| Short Description | Description |
| Long Description | Description |
| Subcategories | Subcategories |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment An artifact was created. |

- Rule properties

| Information governance rule in InfoSphere Information Server | Governance rule in Cloud Pak for Data |
|---|---|
| Name | Name. All migrated governance rules are added to the [uncategorized] category. Rules with duplicate names are automatically renamed: the names are suffixed with an underscore (_) and a consecutive number. |
| - | Primary category: [uncategorized] |
| Short Description | Description |
| Long Description | Description |
| Labels | Tags |
| Stewards | Stewards |
| Related Rules | Related rules |
| Referencing Policies | Parent policies |
| Governs Assets | Related artifacts when it's a term |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment An artifact was created. |

- Policy properties

| Information governance policy in InfoSphere Information Server | Policy in Cloud Pak for Data |
|---|---|
| Name | Name. All migrated policies are added to the [uncategorized] category. Policies with duplicate names are automatically renamed: the names are suffixed with an underscore (_) and a consecutive number. |
| - | Primary category: [uncategorized] |
| Parent Policy | Parent policy |
| Short Description | Description |
| Long Description | Description |
| Labels | Tags |
| Stewards | Stewards |
| Subpolicies | Subpolicies |
| Information Governance Rules | Rules |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment An artifact was created. |

- Data class properties

| Data class in InfoSphere Information Server | Data class in Cloud Pak for Data |
|---|---|
| Name | Name |
| - | Primary category: Migrated DataClasses |
| Parent Policy | Parent policy |
| Short Description | Description |
| Long Description | Description |
| Example | Examples |
| Labels | Tags |
| Stewards | Stewards |
| Enabled | Enabled |
| Type | Matching method |
| Minimum Data Length | Minimum length of data value |
| Maximum Data Length | Maximum length of data value |
| Provider | Provider |
| Priority | Priority |
| Scope | Scope of code |
| Threshold | Threshold |
| Assigned to Terms | Related artifacts |
| Implements Rules | Related artifacts |
| Governed by Rules | Related artifacts |
| Custom attribute values | Details |
| Notes | Comments in the activity log. Such comments have this format: Subject: note_label Type: note_type Status: note_status Note: note_content |
| - | Activity log. The initial (create) entry in the activity log is tagged with and has the comment A data class was created. |