Integrate enterprise metadata with IBM InfoSphere and Cognos

Increase the reliability and agility of your data and establish common semantics across your enterprise

Knowledge about the quality and correctness of the huge volumes of data that drive day-to-day activities for enterprises and organizations is essential for effective decision making. Use this tutorial to learn how to gain visibility into your metadata, which in turn will lead to increased trust in data reliability, increased agility, and improved common understanding throughout your enterprise. This tutorial describes the significance of business and technical metadata integration and shows how heterogeneous metadata in an enterprise can be integrated using various IBM products. After a brief overview of the business issues and the integration solution, the tutorial provides a step-by-step guide showing you how to integrate metadata using tools from the IBM® InfoSphere™ and Cognos® product suites.


Ramya H. Prasad (, IT Specialist, IBM China

Ramya Prasad photoRamya Prasad is a business intelligence professional at IBM GBS with over four years of experience in the IT industry. She has extensive experience working with BI solutions such as DataStage, Abinitio, and Cognos. She also has deep database skills on Oracle , Teradata and DB2. Her domain of experience includes the financial and industrial sectors. As an accomplished innovator, she has one filed patent and three published disclosures.

Rishi S. Balaji, Advisory IT Architect, IBM

Rishi Balaji photoRishi Balaji is an Advisory IT Architect for the banking industry at the IBM Global Business Solution Center. He has more than nine years of experience in the IT industry, and has worked on design and development of applications based on service-oriented architecture as well as traditional J2EE technologies.

29 July 2010

Also available in Chinese


Why integrate your metadata?

Enterprises and organizations deal with huge volumes of data on a daily basis. This data drives many day-to-day activities such as business decisions and regulatory reporting. In order for you to use data effectively, your IT systems must provide visibility into your metadata. This visibility leads to increased trust in data reliability, increased agility, and improved common understanding throughout your enterprise.

Some of the common problems faced by businesses and technical users in regard to understanding data are:

  • Lack of trusted information
    • Where does the data come from?
    • How reliable is it?
  • Lack of agility
    • Was the data modified recently?
    • If I change the data, will other systems be affected?
  • Lack of common semantics
    • Do "customer" and "client" conceptually mean the same thing?
    • What should I call a specific entity so that the rest of the enterprise understands what I mean?

Solution overview

These problems can be solved through an enterprise architecture that integrates business and technical metadata from various sources. At a high level, metadata refers to information about data, such as its definition, structure, source, etc. For a more detailed description of metadata, refer to the "Integrating heterogeneous metadata" developerWorks article that is linked to in the Resources section.

A typical data integration architecture involves data from various source systems being extracted, transformed, and loaded (ETL) into data warehouses, marts, and cubes. To satisfy the needs of business users, the data is further integrated by reporting tools and visually presented as dashboard views. Technical users, on the other hand, work with the IT aspects of the integration. Based on this architecture, metadata integration (which is the solution to the issues in context) deals with integrating the following technical and business data:

  • Data about what source systems were used to populate the data warehouse, marts, and cubes
  • Data about what ETL jobs were used to perform the transformations
  • Data about what data mart tables were used to populate a specific report
  • Data about what specific terms in a report mean (business metadata)

The key to achieving such integration is to build an enterprise-level metadata repository that acts as a single source of truth for all metadata requirements. When such a repository is in place, different applications and users can link to this repository to build lineage and traceability solutions that help answer the questions listed in the previous section.

Figure 1 shows the outline of an enterprise metadata integration solution. The labels above the dotted lines refer to the various types of metadata (report names and packages, database tables, ETL jobs, and data source names).

Figure 1. Enterprise metadata integration solution
Data sources provide the foundation for the data warehouse, which provides the foundation for data marts, which then produce reports. All these levels provide input to the metadata repository.

Note: The figure above shows the technical perspective of the integration architecture. Therefore, it does not show business metadata being populated into the metadata repository.

IBM tooling for metadata integration

The solution described in this tutorial is based on IBM InfoSphere Information Server V8.1, Cognos 8 Business Intelligence (BI) V8.4, and Import Export Manager V8.1, fixpack 1.

IBM InfoSphere Information Server is a data integration software platform that helps organizations derive useful information and value to business from simple or complex data sourced from multiple systems. The platform is composed of various components that profile, cleanse, transform, and integrate data, in order to deliver useful and meaningful information. The components of InfoSphere Information Server V8.1 that this tutorial uses to demonstrate metadata integration are:

  • Metadata repository
  • InfoSphere Metadata Workbench (referred to in the rest of this tutorial simply as Workbench)
  • Import Export Manager
  • InfoSphere Business Glossary
  • InfoSphere DataStage

The InfoSphere metadata repository acts as a centralized data store for all the metadata that is available across the various other components such as DataStage and Business Glossary. Workbench plays a critical role in establishing the automated link between the data from various sources and provides useful lineage reports and impact analysis details. The Import Export Manager component consists of bridges (MetaBroker is also a bridge) that help in importing and exporting metadata to and from the metadata repository. It supports various data sources such as data files, database tables, data models, business glossary, Cognos reports, etc.

Figure 2 shows the relationship between the business user and technical user perspectives of the Cognos BI and InfoSphere integration.

Figure 2. Cognos BI and InfoSphere integration
Shows end user perspective (ERS Cognos Report, Metadata Workbench, and Business Glossary) as a layer above technical perspective (Report metadata, Cognos datastore, DataStage, ETL Metadata, and Metadata Repository)

The integration means that business users can use web links to navigate to the business glossary and lineage reports from the business reports.

For technical users, the integration means that:

  • The report metadata from Cognos BI, ETL job metadata from DataStage, and the warehouse metadata from Database, need to be mapped to each other to enable traceability through lineage reports.
  • The glossary data needs to be populated in the metadata repository to enable the glossary integration.
  • For the linkage between the Cognos BI reports and the InfoSphere tools (Glossary and Workbench) to work, Cognos BI must be configured with the right URIs.

Steps for integrating metadata

This tutorial explains how you can implement this process of data mapping and configuration by following these steps:

  • Use Import Export Manager to bring metadata about data files, data tables, business terms, reports, and models into Workbench.
  • Establish manual and automated links between the metadata.
  • Configure Cognos BI for InfoSphere integration.

The first step to integrate your metadata is to pull the metadata into the repository. MetaBrokers and bridges play a critical role in importing metadata from various tools into the metadata repository. You can also use these tools to export metadata from the repository to InfoSphere Information Server.

There are various types of bridges available for importing and exporting metadata for database tables, reports, models, user information, etc. This tutorial shows you how to use these bridges to import metadata into Workbench.

Importing metadata into Workbench

Importing metadata into Workbench involves the following tasks:

Importing database objects

Follow these steps to use ODBC 3.0 MetaBroker to import the metadata that describes the database schema and tables into Workbench:

  1. Open the Import Export Manager tool, select ODBC 3.0 MetaBroker for the source as shown in Figure 3, and click Start.
    Figure 3. Import Export Manager showing ODBC 3.0 metabroker
    shows selecting ODBC 3.0 MetaBroker from Import Export Manager tool

    Alternatively, from DataStage Designer you can use the Via bridges import option to display the Bridge Selection dialog as shown in Figure 4. From the dialog, select ODBC 3.0 MetaBroker and click OK.

    Figure 4. DataStage and QualityStage Designer showing bridge selection
    shows selecting ODBC 3.0 MetaBroker from Bridge Selection dialog
  2. Use the ODBC Data Source Administrator tool to set up the ODBC DSN name as the system DSN as shown in Figure 5.
    Figure 5. ODBC Data source administrator showing the configured system DSN
    shows the data sources
  3. From the Parameter Selection dialog, select the Verbose option as shown in Figure 6 and click OK.
    Figure 6. ODBC 3.0 MetaBroker parameters
    shows selecting Verbose optional parameter
  4. From the Schema\Table selection list pop-up window, select the preferred database schema and table that need to be imported as shown in Figure 7 and click OK.
    Figure 7. Database schema\table selection screen
    Shows selecting DB2BDW from the schema\table selection list
  5. After the processing completes, click Filter on the status pop-up window.
  6. From the Metadata Selection dialog, select specific tables or the entire schema as shown in Figure 8 and click OK.
    Figure 8. Metadata selection-database filter option
    Shows selecting Data collection, data connection, data schema, database, and host system from the metadata options
  7. The database metadata is now imported into the metadata repository and is available for use by all components of InfoSphere Information Server. Repeat these steps to import other database schemas and tables. You can view the imported database metadata in Workbench as shown in the Figure 9.
    Figure 9. Workbench showing data servers and tables
    Metadata Workbench screen with explorer view of data servers and tables

Importing and publishing sequential data files

The source data for DataStage jobs is either in the form of database tables or sequential files. Also, the lookup data for jobs can be in the form of sequential files. It is important to import the metadata about these files into the metadata repository for its use and for dependency analysis. To do this, follow these steps:

  1. From DataStage and QualityStage Designer, select Import > Table Definitions > Sequential File Definitions as shown in Figure 10.
    Figure 10. DataStage designer showing sequential file table definition option
    shows selecting Import -> Table Definitions from DataStage screen
  2. From the Import Meta Data (Sequential) dialog, enter details such as source directory, file name, and project folder as shown in Figure 11 and click Import to start the import process.
    Figure 11. Import metadata — sequential file
    shows selecting the dirctory, filetype, and filename
  3. From the Format tab of the Define Sequential Meta Data dialog, select the required delimiter as shown in Figure 12. Also, select the Fixed-width columns and First line is column names options as needed.
    Figure 12 Define sequential metadata format
    shows selecting comma as the delimiter
  4. From the Define tab of the Define Sequential Meta Data dialog, select the column name to use as the key as shown in Figure 13 and click OK to complete the import.
    Figure 13. Sequential file metadata — column definition
    shows selecting the PARTITION_KEY column as the Key
  5. Repeat these steps to import all the required sequential files.

Publishing the sequential file table definition to the metadata repository

Publishing the sequential file table definitions helps you obtain data dependencies. It also helps you understand the jobs that are dependent on the selected sequential files and gives a complete end-to-end data lineage report from source sequential file to Cognos reports. To publish sequential file table definitions, follow these steps:

  1. From the DataStage Designer, navigate to your newly created table definition from sequential files. Right click the table definition and select Shared Table Creation Wizard as shown in Figure 14. Select the respective table definition and click Next.
    Figure 14. Shared Table Creation Wizard selection option
    shows selecting Shared Table creation wizard from the context menu
  2. From the Create or Associate Tables screen of the Shared Table Creation Wizard, select Create New. From the Create New Table dialog, specify the host system and directory path of the file as shown in Figure 15 and add a note to document the intended use of the file. Click OK, confirm the settings, and create the shared table definition.
    Figure 15. Shared table creation wizard
    Shows creating new table
  3. You can now view the published sequential files in Workbench as shown in Figure 16.
    Figure 16. Workbench — Data File properties
    shows properties: file name, imported from, location, path, type, short description, long description, business term, data steward, data server, and contains elements

    The structure of an imported sequential file upon publication as a shared table includes the data file, data file structure, and data file field as shown in Figure 17.

    Figure 17. Structure of imported sequential file
    Figure showing various elements of the sequential file in Workbench: Data file, data file structure,and data file field

Importing Cognos OLAP models and reports into Workbench

The bridges to import, report, and model are:

  • Cognos 8 BI Reporting — Content Manager for reporting the multi-models, reports, etc.
  • Cognos 8 BI Reporting — Content Manager Packages for importing the models
  • Cognos 8 BI Reporting — Content Manager Report Studio for importing only the reports

Follow these steps to use the Import Export Manager to import the Cognos reports and OLAP models into Workbench:

  1. Figure 18 shows the Import Export Manager with the Cognos BI bridges highlighted in the Source selection list. Based on your requirements to import the model, report, or both, select the appropriate bridge and click Start.
    Figure 18. Import Export Manager showing Cognos bridge option
    Import Export Manager with Cognos bridge options highlighted in the Source selection list
  2. On the parameter selection pop-up window, enter values for the required parameters version, login server URL, and repository subset:
    1. If you are using Cognos BI 8.4, you must select Cognos 8.3 to 8.4 for the version parameter. Otherwise, it will not provide the lineage from Job (source) to Cognos report.
    2. Specify your Cognos dispatch URL information for the login server URL.
    3. Specify your Cognos content folder/package information for the repository subset. You can find this information in the Cognos Report property section as follows:
      1. Select Cognos Content > Report location to go to the Cognos Report folder.
      2. Click More.
      3. Click Set properties.
      4. Click View the search path, ID and URL as shown in Figure 19.
        Figure 19. Cognos search path
        option to Viewt he search path, ID and URL
      5. Copy the content of the Search path box.
      6. Click Close and paste the search path content into the Repository subset field. Do the same for all the reports in Cognos.
  3. Enter optional parameters (for example, host system name, database name, schema name, DBMS name, and DBMS server instance name) as shown in Figure 20.
    Figure 20. Import export manager showing the parameters for Cognos 8 BI reporting content manager
    shows optional paramaters and additional parameters
  4. When you are finished entering parameters, click OK to begin the import process.
  5. When the import process is finished and you receive a "Completed successfully" message in the status pop-up window, select Filter > Add All > OK.
  6. From the Parameter Selection pop-up window, enter the Metadata Server parameters as shown in Figure 21. The mandatory parameters are host name, port number, username, and password. If you want the bridge to check for duplicate objects with the same name before import, select the Check for Duplicates option. If a duplicate exists, the import fails and you would then need to execute the import activity again.
    Figure 21. Import Export Manager — parameter selection for Metadata Server
    shows the parameters
  7. Upon successful completion, you may have to repeat the steps to import other reports and models.

    You can import all the reports in a particular package by specifying //reports on the repository subset parameter as shown in the example below:

    Repository subset=/content/package[@name='DEMO ERS']//reports

    Once you have successfully imported the Cognos report and model into the metadata repository, you can view them from Workbench as shown in the Figure 22.

    Figure 22. Metadata Workbench showing the BI reports imported into repository
    shows a list of the BI reports that have been imported into repository

Viewing DataStage jobs in Workbench

All components of the InfoSphere product suite share the same metadata repository. Therefore, the DataStage jobs are present in the same metadata repository and need not be imported.

You can view the DataStage jobs from Workbench as shown in Figure 23.

Figure 23. Metadata Workbench showing the DataStage jobs and stages
Metadata Workbench showing the list of data stage jobs and detail for a particular job

Importing Business Glossary

You can import a Business Glossary into Workbench in three ways:

  • Create terms and categories manually from the Information Server console.
  • Import a csv files containing the glossary terms.
  • Import terms and categories from the glossary model in InfoSphere Data Architect, a modeling tool.

For more details on importing a Business Glossary, refer to the "Sharing a common vocabulary across InfoSphere Foundation Tools" developerWorks article that is linked to in the Resources section.

Figure 24 shows an example of how an imported business glossary appears in Workbench.

Figure 24. Metadata Workbench showing imported business terms
screen shows example business terms such adjusted financial collateral amount and adjusted guarantee amount

Creating stewards in Workbench

Stewards are users who are responsible for a particular IT asset. When you create stewards, you assign them specific assets that they are responsible for.

By using the Glossary tab of the Information Server web console, you can also create stewards who are the owners of an IT asset in Workbench. These stewards are the individuals who should be contacted in case of emergency or if someone needs additional information about the asset.

Figure 25 shows details of a steward user.

Figure 25. Information Server web console Glossary tab, showing stewards
shows contact information for the data steward

Figure 26 shows the assets that the steward manages.

Figure 26. Information Server web console Glossary tab showing asset managed by steward
shows list of assets that the steward is responsible for

Any user who has Information Server administration rights can add a steward, edit steward details, and add or remove assets managed by a steward.

You can view steward details in Workbench as shown in Figure 27.

Figure 27 Metadata Workbench showing the steward details
Workbench showing steward details such as ID, title, contact information, etc.

Mapping metadata in Workbench

In order to view lineage and impact analysis reports, internally map heterogeneous metadata that has been imported from various sources into Workbench to other related metadata. This section of the tutorial shows you how to perform the following types of mapping that are required to view lineage reports:

Mapping database instances

There may be two instances of the same database listed on the Data servers tab.

  • The first instance corresponds to the manual import of database objects.
  • The second instance is created when a Cognos report is imported with its database schema.

Link the DataStage jobs and Cognos reports by creating the following mappings between the databases mentioned above:

  • Cognos reports to the database tables
  • Database tables to the DataStage jobs

On the Advanced tab of Workbench, select Data source Identity as shown in Figure 28 and map both the databases.

Figure 28. Data Source Identity selection on Advanced tab of Metadata Workbench
Advanced tab of Metadata Workbench with Data Source Identity selection highlighted

Mapping database alias and parameters

On the Advanced tab of Workbench, select Database Alias as shown in Figure 29 and map all the variables to the respective database if they are not mapped automatically.

Figure 29. Database Alias selection on Advanced tab of Metadata Workbench
Advanced tab of Metadata Workbench with Database Alias selection highlighted

Automated mapping of DataStage stages to physical database tables

After completing the steps described above, select Automated Services on the Advanced tab of Workbench. Select the project for which you want the Workbench to automatically detect the linkage and click Run as shown in Figure 30. When the job is finished, the job status changes from "in progress" to "complete."

Figure 30. Automated Services selection on Advanced tab of Metadata Workbench
Advanced tab of Metadata Workbench with Automated Services selection highlighted

Automated services analyze the following types of DataStage stages:

  • Most stages that connect to databases and data source files
  • Passive stages that connect to sources of data

Automated services do not analyze stages from active stages that simply transform data, such as the transformer, aggregator, sort, and join stages.

Manual binding of DataStage job to physical database tables

There may be instances when automated services do not bind the DataStage stages to the data items. In these cases, select Data Item binding on the Advanced tab of Workbench as shown in Figure 31 to bind DataStage stages to physical data items.

Figure 31. Data Item Binding selection on Advanced tab of Metadata Workbench
Advanced tab of Metadata Workbench with Data Item Binding selection highlighted

Viewing reports in Metadata Workbench

The data lineage and job lineage reports show the movement of data within a job or across multiple jobs. Impact analysis reports show the dependencies between assets.

You can start a report from the task list of an asset information page or from the context menu of an asset in a results list. You can track the flow from source to target or from target to source. This section of the tutorial describes these three types of reports:

Data lineage

Data lineage reports show the flow of data to or from a selected metadata asset, through stages and stage columns, across one or more jobs, into databases and business intelligence (BI) reports. Open any report in the Workbench as shown in Figure 32 and click Data Lineage. This displays a Report Finder-Data Lineage window similar to what is shown in Figure 33.

Figure 32. Metadata Workbench showing imported Cognos reports
Metadata Workbench showing imported Cognos reports with report options highlighted

From the Report Finder-Data Lineage window, select the Where does data come from? radio button and click Create Report. This displays a Data Lineage report screen that shows the BI Report Collection, Database Tables, DataStage stages, and Data File element linked to the selected Cognos BI report.

Figure 33. Metadata Workbench showing data lineage for a BI report
Example of a BI report shown in the Metadata Workbench

To see a graphical view of the data lineage, click the Graph View button at the bottom of the Data Lineage report screen. Select the appropriate database column or DataStage stage to be traced back and click Graph View to see a graphical view similar to what is shown in Figure 34.

Figure 34. Metadata Workbench showing graphical view of data lineage
Example of a data lineage graphical view

Figure 35 shows an example of the linkage to the actual source data file.

Figure 35. Metadata Workbench showing data lineage report with linkage to data source file
Example of a data lineage report showing linkage between the data source file and DataStage job and physical database table

Job lineage

Job lineage shows the flow of data to or from a selected metadata asset across one or more jobs, database tables, database views, data file elements, business intelligence (BI) reports, and information services operations. From the BI report you have open in Workbench (Figure 32), select Job Lineage. This displays a Report Finder-Job Lineage window similar to what is shown in Figure 36. From the Report Finder-Job Lineage window, select the Where does data come from? radio box and click Create Report. This displays a Job Lineage report screen that shows information about the linked DataStage jobs along with DB tables and BI report collection.

Figure 36. Metadata Workbench showing a job lineage for BI report
Example of a job lineage for BI report shown in Metadata Workbench

To see a graphical view of the job dependency similar to what is shown in Figure 37, click the Graph View button at the bottom of the Job Lineage report screen.

Figure 37. Metadata Workbench showing job lineage graphical report
Example of Metadata Workbench showing a job lineage graphical report

Impact analysis

The impact analysis report displays the other assets that depend on the selected asset or the other assets that the selected asset depends on. To view the impact analysis report, select an asset, such as a physical database table, and click Impact Analysis. This displays data such as dependent jobs, tables, reports, and models in an impact analysis report similar to what is shown in Figure 38.

Figure 38. Metadata Workbench showing Impact analysis report
Example of an impact analysis report shown in the Metadata Workbench

Figure 39 shows an example of the graphical view of the dependency between DataStage stages and the database tables.

Figure 39. Graphical view of Impact Analysis
Example of the graphical view of and impact analysis report

Configuring Cognos 8 BI for lineage reports and Business Glossary

To view a lineage report, select the Lineage link that appears on the pop-up menu of the Cognos report. You can view the glossary definition of the report field by selecting the Glossary link. Figure 40 shows an example of these links.

Figure 40. Cognos report showing lineage and glossary option
Example of a Cognos report with the Lineage and Glossary menu options

To configure lineage and glossary from Cognos reports to InfoSphere Metadata Workbench, you need to set the URI value by navigating to Cognos Administrator > Configuration > Dispatcher and Services Properties as shown in Figure 41.

Figure 41. IBM Cognos Administrator showing selection of Dispatcher and Services Configuration Properties
Screenshot of IBM Cognos Administrator with selction path made for Dispatcher and Services Configuration Properties

To configure lineage, set Metadata Information Services URI to a value of:
where <ip> is the Workbench server IP address.

To configure glossary, set the Business Glossary URI to a value of:
Replace localhost with the server IP address.

Viewing the glossary for any report field

As shown in Figure 40, you can open any report in Cognos Connection and right click on any term to view a link to a glossary. Click the Glossary link, login on the InfoSphere Business Glossary login screen, and go to the list of Business Glossary terms and categories screen. The terms and categories screen contains a set of extended search results that includes terms for all the values that match the words in the term you selected. For example, the term:
Value adjustments and provisions associated with the original amount
would be split into values, adjustments, provisions, original, etc. as shown in Figure 42

Figure 42. Glossary search result for selected report field
Example of returned glossary search results contining a list of eight terms and their definitions

Viewing lineage for any report field from Cognos 8 BI

To view a data lineage report, open any report in Cognos Connection or any other Cognos 8 BI product component (for example, Framework Manager, Report Studio, etc.) and select the lineage link on the pop-up menu as shown in Figure 40. This takes you to the InfoSphere Metadata Workbench and shows a data lineage report for the selected BI report field similar to what is shown in Figure 43.

Figure 43. BI report field data lineage in Metadata Workbench
Example of a BI report field data lineage shown in the Metadata Workbench


This tutorial showed you how to enhance the business intelligence system of an enterprise with lineage reports and a business glossary by using IBM Cognos and InfoSphere products to integrate the technical and business metadata of your the enterprise.


Special thanks to Marc Haber and Hayden Marchant from IBM Metadata Development at the Israel Software Lab for reviewing this tutorial and for their guidance and support on metadata products. Thanks also to Chintamani M. Chhatre who is the Industry Architect for Banking and Financial Markets at Global Business Solution Center, India for his support in publishing this tutorial.



Get products and technologies

  • Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.



developerWorks: Sign in

Required fields are indicated with an asterisk (*).

Need an IBM ID?
Forgot your IBM ID?

Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.


All information submitted is secure.

Dig deeper into Information management on developerWorks

Zone=Information Management, Big data and analytics
ArticleTitle=Integrate enterprise metadata with IBM InfoSphere and Cognos