Aligning business and IT: A solution for a master data-driven business glossary

From a governance perspective, it is crucial that an InfoSphere® Master Data Management (MDM) Server's logical domain definitions and their dependencies on physical assets are easily accessible throughout the organization. The integration between InfoSphere Data Architect (IDA) and InfoSphere Business Glossary (BG) helps accelerate this process by establishing a business vocabulary based on approved logical data model (LDM) entities and attributes. The current IDA/BG integration, however, does not retain the dependencies on their physical model definitions.
Using MDM Server's logical and physical data models as an example, this article provides a solution for generating an LDM-based Business Glossary that retains its relationships to the underlying physical data model. Maintaining these relationships is critical to the overall success and value of the Business Glossary for both business and IT.

Beate Porst (porst@us.ibm.com), InfoSphere Integration Architect, IBM

Beate Porst is an architect in the IBM Information Management Engineering and Solution group at IBM's Silicon Valley Laboratory. Her current focus is on developing reusable architectural patterns and solutions to support richer metadata integration among IBM Information Management products. Beate has more than 10 years of experience in Information Management, holding engineering and product management roles in DB2, InfoSphere Federation Server, and InfoSphere Information Server. Beate holds a master's degree in Computer Science from the University of Rostock, Germany.



20 January 2011

Introduction

A Master Data Management (MDM) repository, such as the one used by IBM InfoSphere MDM Server, is critical not only for the well-understood reason that it holds the managed data, but also because of the data's logical domain definitions. If the MDM data is regarded as the single version of the truth, then the metadata should, to a large degree, also be considered the approved terminology for the organization in which the MDM system is applied. As a consequence, it is important that the MDM Server's domain definitions and their relationships to enterprise assets are easily accessible throughout the organization.

To support its SOA and object model centricity, MDM Server provides its supported domain definitions in the form of logical and physical data models, which enables customization and extension of these domains.

From a data modeling perspective, a Logical Data Model (LDM) typically represents an independent, business-driven view of an organization's data. LDM Entities and Attributes and their descriptions are often either derived from, or later become, an organization's business vocabulary. A physical data model (PDM), on the other hand, which carries the .dbm file extension in InfoSphere Data Architect, takes into account the constraints and facilities of a given Database Management System. In a project life cycle, a PDM is typically derived from an LDM.

To ensure traceability between the logical and physical representation, data modeling tools like IDA provide functionality to automatically establish links (dependencies) between a logical model object like an Attribute, and a physical model object like a Column during the transformation process.

As noted earlier, the business information inherent in a logical data model, together with the information about where these business definitions are physically deployed, is highly relevant for the entire business process life cycle. Therefore, organizations should strive to make this information available to a larger audience. However, a data modeling tool is not a suitable interface for reaching a large or business-driven audience.

To make business definitions and their relationships to enterprise assets accessible throughout the organization for everyday communication, IBM provides IBM InfoSphere Business Glossary (BG), a component of IBM's InfoSphere Information Server (IS) platform. BG supports customers in building and governing an authoritative dictionary of their terms and relationships, including references to IT assets that implement, deploy, or further describe business terminology. Although a business glossary could be created from scratch, organizations usually want to leverage existing assets and terminology, including those contained in the MDM Server data models.

What is the challenge?

IBM provides a number of integration points between IS and IDA to accelerate the process of establishing a business vocabulary based on approved logical entities and attributes. For example, logical models can be imported into BG, where they are represented as categories and terms (IDA MetaBroker), or Business Glossary terms can be assigned directly to data model objects within the IDA Eclipse environment using the BG Eclipse plugin. Despite this broad integration, some scenarios are not supported out of the box.

In this article's use case of a "master-data" driven Business Glossary, the goal is not just to generate a Business Glossary from an LDM, but more importantly to retain the existing dependencies between the LDM-based Business Glossary and the underlying physical data model after their import into Information Server, as shown in Figure 1. Preserving those relationships is critical to the overall success and value of the Business Glossary for both business and IT audiences.

Figure 1. Use-case goal
Image of the use case goal

The challenge from an Information Server point of view is that only inter-model relationships between a PDM and an NDM (an IDA glossary model) are automatically re-established. The MDM Server dependencies, however, exist between a PDM and an LDM. This means that, although the models themselves can be imported into Information Server, the inherent model dependencies cannot be imported and would therefore be lost.

What does the solution provide?

As shown in Figure 2, a solution is provided that generates a set of compliant and semantically equivalent IDA models (an NDM/PDM model pair) based on an existing LDM/PDM model pair like your MDM Server LDM and PDM set. These generated NDM and PDM models will then provide the required inter-model relationships to allow for automatic re-creation of the inherent dependencies in Information Server.

Figure 2. Solution outline
Image of graphical solution outline
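
To make the idea of redirecting dependencies more concrete, the following Python sketch models it with plain dictionaries. It is not the Dependency Mapping utility's implementation and it does not read IDA model files. The function name, the data layout, the entity name PRODUCT, and the column names PARTY_ID and LEGACY_CODE are assumptions made purely for illustration.

```python
def redirect_dependencies(pdm_dependencies, glossary_terms):
    """Re-point column dependencies from LDM attributes to glossary terms.

    pdm_dependencies: dict of column name -> (entity, attribute) in the LDM.
    glossary_terms:   set of (category, term) pairs present in the NDM.
    Returns (redirected, unmatched).
    """
    redirected = {}
    unmatched = []
    for column, ldm_target in pdm_dependencies.items():
        if ldm_target in glossary_terms:
            # A semantically equivalent glossary object exists, so the
            # dependency can point at it instead of the LDM attribute.
            redirected[column] = ("NDM",) + ldm_target
        else:
            # No equivalent term: report it rather than guess a target.
            unmatched.append(column)
    return redirected, unmatched


# Hypothetical sample data; real models contain many more objects.
pdm_dependencies = {
    "name": ("PRODUCT", "Product Name"),
    "PARTY_ID": ("PARTY", "Party ID"),
    "LEGACY_CODE": ("PARTY", "Legacy Code"),
}
glossary_terms = {("PRODUCT", "Product Name"), ("PARTY", "Party ID")}

redirected, unmatched = redirect_dependencies(pdm_dependencies, glossary_terms)
```

In this sketch, a column whose logical attribute has no counterpart in the glossary model is reported as unmatched instead of being silently dropped, which makes gaps between the two models visible.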

Solution outline

  1. Validate System Requirements
  2. Download and unzip the provided Dependency Mapping package
  3. Generate an IDA Glossary model
  4. Execute the Dependency Mapping utility
  5. Export IDA data model to Information Server
  6. Review exported assets in Business Glossary

Step 1: Validate system requirements

In order to successfully execute the described solution, your system must meet the following system requirements:

  • A client system running a supported Windows operating system (Windows XP SP2, Windows Vista, Windows 7, or Windows Server 2003) which has the following components installed:
    • IDA v7.5.2 or higher (including feature: Information Server Integration)
    • Information Server 8.1.x or 8.5 Metadata Broker for IDA
    • Java JRE v1.5 or higher, with java.exe contained in the system path
  • The system must be able to connect to an Information Server 8.1 or 8.5 server to import into Information Server.
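
The Java requirement above can be checked from a script. The following Python sketch parses the banner that `java -version` prints and compares it with the 1.5 minimum. The function names are illustrative, and the parsing assumes the common `version "x.y..."` banner format, which may differ for some JVM vendors.

```python
import re
import subprocess

MIN_VERSION = (1, 5)  # the article requires a JRE at version 1.5 or higher


def parse_java_version(version_output):
    """Extract (major, minor) from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def java_meets_requirement(version_output, minimum=MIN_VERSION):
    """Return True when the reported version is at least `minimum`."""
    version = parse_java_version(version_output)
    return version is not None and version >= minimum


def read_java_version_output(java_exe="java"):
    """Run `java -version`; the JVM writes its version banner to stderr."""
    result = subprocess.run([java_exe, "-version"],
                            capture_output=True, text=True)
    return result.stderr or result.stdout
```

Calling `java_meets_requirement(read_java_version_output())` performs the check against the java.exe found on the system path. If no java.exe is on the path, `subprocess.run` raises `FileNotFoundError`, which is itself a sign that the requirement is not met.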

Step 2: Download and unzip the Dependency Mapping package

Download the DependencyMapper_SourceCode zip file and unpack it into your C:\ folder. This creates a sub-folder named DeveloperWorks.

Although the DeveloperWorks directory does not have to be located in the root directory, or even on your Windows C:\ drive, this location is recommended because the sample.properties file used by the mapping utility uses absolute path names. If you extract the package into a different location, you need to adjust the path settings in the sample.properties file.

After you unzip the file, you should see a DeveloperWorks directory containing the following subdirectories:

  • DependencyMapping -- this folder contains the Dependency Mapping utility files
  • IDA_Workspace -- this IDA workspace folder contains a Data Design project named DeveloperWorks, which includes two sample data models representing small excerpts of the Customer data models provided by InfoSphere MDM Server:
    • a Logical Data Model named: MDM_CUSTOMER_SAMPLE.ldm
    • a Physical Data Model named: MDM_CUSTOMER_SAMPLE.dbm
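
If you unpack the package somewhere other than C:\, the absolute paths in sample.properties have to be changed by hand. The following Python sketch shows one way to rebase them in bulk. The property keys in the usage note are hypothetical, because the article does not list the actual keys, and the handling of doubled backslashes assumes the file follows the usual Java properties conventions.

```python
def _path_styles(path):
    """Return the spellings a Windows path may take in a properties file."""
    plain = path.replace("/", "\\")
    return [
        plain.replace("\\", "\\\\"),  # escaped (doubled) backslashes
        plain.replace("\\", "/"),     # forward slashes
        plain,                        # single backslashes
    ]


def rebase_property_paths(properties_text, old_root, new_root):
    """Replace the old root folder with the new one on every property line."""
    rebased = []
    for line in properties_text.splitlines():
        # Lines starting with '#' or '!' are comments and are left alone.
        if not line.lstrip().startswith(("#", "!")):
            for old, new in zip(_path_styles(old_root), _path_styles(new_root)):
                if old in line:
                    # Keep the spelling style the line already uses.
                    line = line.replace(old, new)
                    break
        rebased.append(line)
    return "\n".join(rebased)
```

For example, a line such as `pdm.file=C:/DeveloperWorks/IDA_Workspace/DeveloperWorks/MDM_CUSTOMER_SAMPLE.dbm` (with `pdm.file` being a made-up key) would be rewritten to start with `D:/models/DeveloperWorks` when called with `r"C:\DeveloperWorks"` and `r"D:\models\DeveloperWorks"`. Review the result before running the utility, since the sketch only performs text substitution.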

Step 3: Generate an IDA Glossary model

You will use IDA to generate an IDA Glossary Model file (NDM) from the MDM_CUSTOMER_SAMPLE logical data model.
The functionality to transform an LDM into an equivalent NDM is provided through the Information Server MetaBroker for IDA and the Information Server integration in IDA.
In order to generate an LDM-equivalent IDA Glossary model, you are actually "pretending" to export an LDM file to the Metadata Server.

  1. Start IDA and launch workspace: C:\DeveloperWorks\IDA_Workspace.
  2. In the Data Project Explorer, expand Data Models. Right-click on file MDM_CUSTOMER_SAMPLE.ldm and select Export to open the Export wizard.
  3. Expand the Data folder and select Export a Glossary Model to the Metadata Server as shown in Figure 3. Then click Next.
    Figure 3. Export a Glossary / LDM to Metadata Server
    Image showing export of a Glossary / LDM to Metadata Server
  4. From the Select the Model to Export screen, as shown in Figure 4, click the Transform and export a logical or domain model into a glossary model in Metadata Server radio button to make LDM files visible. Select MDM_CUSTOMER_SAMPLE.ldm from the DeveloperWorks project and click Next.
    Figure 4. Select the Logical model
    Select the Logical Model
  5. From the Transformation Messages window, click Finish.
  6. From the Status window, click Cancel as shown in Figure 5.

    The IDA Glossary file has already been generated at this point. Therefore, you can safely cancel the remainder of the export process.

    The generated IDA Glossary can be found in the %TEMP% folder.

    Figure 5. Transformation Status
    Image showing transformation Status
  7. Open Windows Explorer, type %TEMP% in the Address field, and then press Enter, as shown in Figure 6.
    Figure 6. Navigate to the %TEMP% directory
    Navigate to the %TEMP% directory
  8. In your %TEMP% folder, locate the generated IDA Glossary file (Figure 7). The file will have a generated name with an .ndm extension, for example, lTon_1289524238643.ndm.
    Figure 7. Generated IDA Glossary File
    Image shows the generated IDA Glossary File
  9. Rename the file to MDM_CUSTOMER_SAMPLE.ndm.
  10. Copy the renamed glossary file into your IDA DeveloperWorks project folder, for example, C:\DeveloperWorks\IDA_Workspace\DeveloperWorks.
  11. Return to the IDA Data perspective. Right-click the DeveloperWorks project in the Data Project Explorer view and select Refresh, as shown in Figure 8.
    Figure 8. Refresh the IDA Project to see the copied Glossary File
    Refresh the Project to see the copied Glossary File
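
The manual locate, rename, and copy steps above can also be scripted. The following Python sketch copies the most recently modified .ndm file from a folder into the project folder under the expected name. The function name is illustrative, and the sketch assumes that the newest .ndm file in %TEMP% is the one IDA has just generated, which you should verify if other exports have been run.

```python
import os
import shutil


def copy_latest_glossary(temp_dir, project_dir,
                         target_name="MDM_CUSTOMER_SAMPLE.ndm"):
    """Copy the newest .ndm file in temp_dir into project_dir as target_name."""
    candidates = [
        os.path.join(temp_dir, name)
        for name in os.listdir(temp_dir)
        if name.lower().endswith(".ndm")
    ]
    if not candidates:
        raise FileNotFoundError("no .ndm file found in " + temp_dir)
    # Pick the file with the latest modification time.
    newest = max(candidates, key=os.path.getmtime)
    destination = os.path.join(project_dir, target_name)
    shutil.copyfile(newest, destination)
    return destination


# Example call on the sample setup (not executed here):
# copy_latest_glossary(os.environ["TEMP"],
#                      r"C:\DeveloperWorks\IDA_Workspace\DeveloperWorks")
```

Unlike the manual steps, this leaves the generated file in %TEMP% and writes a renamed copy. You still need to refresh the project in IDA afterwards so that the new file becomes visible.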

Step 4: Execute the Dependency Mapping utility

You will now execute the Dependency Mapping utility (MapModels.bat) to redirect dependency links. In your MDM Server models (including the sample models), dependencies point from objects in the physical model to objects in the logical model, as shown in Figure 9. After running the mapping utility, the dependencies in the physical model will point to semantically equivalent objects in the IDA glossary model generated in the previous step.
Note: The mapping utility does not modify the original data model files. Instead, a new PDM is generated, and the original model is renamed using the pattern: <PDM File Name>_ORIGINAL.dbm.

Figure 9. Dependency between Column: name and Attribute: Product Name
Image shows the PDM / LDM Dependency
  1. Open a Windows command-line interpreter (cmd.exe) and navigate to: C:\DeveloperWorks\DependencyMapping.
    • If you are using your own data model files, or have unpacked the zip file into a different location, you can provide these settings by adjusting the parameters in the sample.properties file.
    • If java.exe version 1.5 or higher is not in your system path, open the MapModels.bat file in an editor and set the JAVA_EXE environment variable to the complete path of your java.exe.
  2. Execute the MapModels.bat batch file from within the DependencyMapping folder as shown in Figure 10.
    • The mapping utility will not overwrite the original physical data model. Instead the original data model is renamed to MDM_CUSTOMER_SAMPLE_ORIGINAL.dbm, and the physical data model generated by the mapping tool will now be named MDM_CUSTOMER_SAMPLE.dbm.
    • At the end of the mapping process, both the original and generated files can be found in your IDA Data Design project: DeveloperWorks. Refresh the project to see the changes.
    Figure 10: Executing the mapping utility
    Image of the mapping utility screen being executed
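
After the utility has run, you can confirm that both files described above are present. The following Python sketch derives the backup name from the <PDM File Name>_ORIGINAL.dbm pattern and checks the project folder for both files. The function names are illustrative and are not part of the mapping utility.

```python
import os


def original_backup_name(pdm_file_name):
    """Apply the <PDM File Name>_ORIGINAL.dbm renaming pattern."""
    base, extension = os.path.splitext(pdm_file_name)
    return base + "_ORIGINAL" + extension


def mapping_outputs_present(project_dir,
                            pdm_file_name="MDM_CUSTOMER_SAMPLE.dbm"):
    """Return True if the generated PDM and the renamed original both exist."""
    expected = [pdm_file_name, original_backup_name(pdm_file_name)]
    return all(
        os.path.isfile(os.path.join(project_dir, name)) for name in expected
    )
```

For the sample project, `mapping_outputs_present(r"C:\DeveloperWorks\IDA_Workspace\DeveloperWorks")` should return True once MapModels.bat has finished. The check only looks at file names, so it cannot tell whether the dependencies inside the generated model were redirected correctly.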

Step 5: Export generated IDA data models to Information Server

In this task you will export the IDA Glossary and modified physical data model to the Information Server (also called the Metadata Server). Both files can be exported simultaneously by using the Physical Model to the Metadata Server export wizard.

  1. To open the export wizard, in the IDA Data Project Explorer, right-click MDM_CUSTOMER_SAMPLE.dbm and then click Export.
  2. Expand the Data folder and select Export a Physical Model to the Metadata Server, as shown in Figure 11. Then click Next.
    Figure 11: Export the PDM and Glossary to Metadata Server
    Image showing export the PDM and Glossary to Metadata Server
  3. From the Select the Model to Export screen, as shown in Figure 12, you can do the following:
    • Select MDM_CUSTOMER_SAMPLE.dbm in the DeveloperWorks project.
    • Select the Set classified objects for business terms based on check box. This selection is very important because it ensures that both the glossary and the physical model are exported to the Metadata Server and that the objects remain linked.
    • In the Host System name field, set a Host name under which the imported physical model will be grouped in Information Server.
    • Click Finish.
    Figure 12: Select the Model to Export
    Image shows how to select a model to export.
  4. From the following Status window, click Select All.
  5. From the Parameter Selection window, set your Information Server connection parameters, as shown in Figure 13, to the following:
    • Host Name / Port Number: the host name and port number of your Metadata Server.
    • User Name / Password: an Information Server Administrator User Id and Password.
    • To speed up the import process, clear the Check for Duplicates check box if you know that the models have not been imported before.
    • Click OK to start the import into the Metadata Server.
    Figure 13: Information Server connection parameter
    Image shows the Information Server connection parameter screen.

Step 6: Review the exported assets in Business Glossary

In this final step, you will use the InfoSphere Business Glossary Browser to verify that the glossary and physical model were successfully exported.

  1. Open a web browser and enter the Business Glossary Browser URL in the address field.
    • The Business Glossary Browser uses the following URL pattern: http://<InformationServer host name>:<InformationServer port number>/bg
    • For host and port number, use the same host name and port number as used during the export process, for example: http://localhost:9080/bg
  2. To authenticate from the login page, you can use the same user name and password that you used during the export step.
  3. After you have successfully logged in to the Business Glossary, click Category Tree. Underneath the top-level category MDM Core Table.ER1 you will find the entities (now Categories) and attributes (now Terms) of the MDM_CUSTOMER_SAMPLE logical data model, as shown in Figure 14.
    Figure 14: Imported MDM Customer Sample Glossary
    Image shows Imported MDM Customer Sample Glossary
  4. Click the PARTY category, which lists the contained terms on the right side of the Category Tree view.
  5. From the right side, click the term Party ID. This will open the detailed page for the Party ID term.
  6. By expanding the Assigned Assets section you can see that Party ID contains a dependency to its physical implementation, as shown in Figure 15.
    Figure 15: Association between a term (Attribute) and its physical column
    Image shows association between a Term (Attribute) and its physical column
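
The Business Glossary Browser URL pattern from step 1 can be assembled and probed before you open a browser. The following Python sketch builds the URL from a host name and port number and optionally tests whether a server answers at that address. The function names are illustrative, and the reachability check only confirms that something responds over HTTP, not that the glossary content was imported.

```python
import urllib.error
import urllib.request


def business_glossary_url(host, port):
    """Build the URL using the pattern http://<host>:<port>/bg."""
    return "http://{}:{}/bg".format(host, port)


def is_reachable(url, timeout=5):
    """Return True if a server answers at url, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, for example with a login redirect or error.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, unknown host, or timeout.
        return False
```

For the example values used in this article, `business_glossary_url("localhost", 9080)` yields the same address as shown in step 1.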

Conclusion

It is very important to be able to leverage your MDM Server's logical domain definition as approved business vocabulary, and have the ability to retain the dependency to the physical asset for auditing and data analysis purposes. You can achieve this by using the solution described in this article.
Although the article uses MDM Server data models for its scenario, the provided solution can be applied to any IDA 7.5.x LDM/PDM model pair.

Acknowledgments

The author would like to thank Paul van Run for his feedback and review of this article.


Download

Description: Java Source Code for Dependency Mapping Tool
Name: DependencyMapper_SourceCode.zip
Size: 84KB
