A Master Data Management (MDM) repository, such as the one used by IBM InfoSphere MDM Server, is critical not only for the data it manages, which is well understood, but also for the logical domain definitions of that data. If the MDM data is regarded as the single version of the truth, then the metadata should, to a large degree, also be considered the approved terminology of the organization in which the MDM system is applied. As a consequence, it is important that the MDM Server's domain definitions and their relationships to enterprise assets be easily accessible across the entire organization.
In keeping with its SOA and object-model-centric design, MDM Server provides its supported domain definitions as logical and physical data models so that these domains can be customized and extended.
From a data modeling perspective, a Logical Data Model (LDM) typically represents an independent, business-driven view of an organization's data. LDM entities and attributes and their descriptions are often either derived from, or later become, an organization's business vocabulary. A Physical Data Model (PDM), on the other hand, which carries the .dbm extension in InfoSphere Data Architect (IDA), takes into account the constraints and facilities of a given database management system. In a project life cycle, a PDM is typically derived from an LDM.
To ensure traceability between the logical and physical representations, data modeling tools like IDA can automatically establish links (dependencies) between a logical model object, such as an Attribute, and a physical model object, such as a Column, during the transformation process.
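As a rough illustration of what such a traceability link amounts to, the following sketch models a dependency as a pair of object references. The class names, field names, and file names are hypothetical and do not reflect how IDA stores its models; the Column and Attribute names are borrowed from the example shown later in Figure 9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelObject:
    """A named object inside a data model file (hypothetical structure)."""
    model_file: str  # for example "example.ldm" or "example.dbm"
    kind: str        # for example "Attribute" or "Column"
    name: str


@dataclass(frozen=True)
class Dependency:
    """A traceability link pointing from a physical object to a logical object."""
    source: ModelObject  # the physical object, such as a Column
    target: ModelObject  # the logical object, such as an Attribute


def targets_of(dependencies, source):
    """Return every object that the given source object depends on."""
    return [d.target for d in dependencies if d.source == source]


column = ModelObject("example.dbm", "Column", "name")
attribute = ModelObject("example.ldm", "Attribute", "Product Name")
links = [Dependency(source=column, target=attribute)]

print(targets_of(links, column)[0].name)
```

Keeping the link as a separate record, rather than as a field on the Column, mirrors the idea that a dependency can later be pointed at a different target without changing either model object.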
As noted earlier, both the business information inherent in a logical data model and the information about where these business definitions are physically deployed are relevant to the entire business process life cycle. Organizations should therefore strive to make this information available to a larger audience. However, a data modeling tool is not a suitable interface for reaching a large or business-driven audience.
To make business definitions and their relationships to enterprise assets accessible throughout the organization for everyday communication, IBM provides IBM InfoSphere Business Glossary (BG), a component of IBM's InfoSphere Information Server (IS) platform. BG supports customers in building and governing an authoritative dictionary of their terms and relationships, including references to IT assets that implement, deploy, or further describe business terminology. Although a business glossary could be created from scratch, organizations usually want to leverage existing assets and terminology, including those contained in the MDM Server data models.
What is the challenge?
IBM provides a number of integration points between IS and IDA to accelerate the process of establishing a business vocabulary based on approved logical entities and attributes. For example, logical models can be imported into BG, where they are represented as categories and terms (IDA MetaBroker), or Business Glossary terms can be assigned directly to data model objects within the IDA Eclipse environment using the BG Eclipse plugin. Despite this broad integration, some scenarios are not supported out of the box.
In this article's use case of a "master-data" driven Business Glossary, the goal is not just to generate a Business Glossary from an LDM but, more importantly, to retain the existing dependencies between the LDM-based Business Glossary and the underlying physical data model after they are imported into Information Server, as shown in Figure 1. Preserving those relationships is critical to the overall success and value of the Business Glossary for both business and IT audiences.
Figure 1. Use-case goal
The challenge from an Information Server point of view is that only inter-model relationships between a PDM and an IDA Glossary Model (NDM) are automatically re-established. The MDM Server dependencies, however, exist between a PDM and an LDM. This means that, although the models themselves can be imported into Information Server, the inherent model dependencies cannot be imported and would therefore be lost.
What does the solution provide?
As shown in Figure 2, a solution is provided that generates a set of compliant and semantically equivalent IDA models (an NDM/PDM model pair) based on an existing LDM/PDM model pair like your MDM Server LDM and PDM set. These generated NDM and PDM models will then provide the required inter-model relationships to allow for automatic re-creation of the inherent dependencies in Information Server.
Figure 2. Solution outline
The solution consists of the following steps:
- Validate System Requirements
- Download and unzip the provided Dependency Mapping package
- Generate an IDA Glossary model
- Execute the Dependency Mapping utility
- Export IDA data model to Information Server
- Review exported assets in Business Glossary
Step 1: Validate system requirements
In order to successfully execute the described solution, your system must meet the following system requirements:
- A client system running a supported Windows operating system (Windows XP SP2, Windows Vista, Windows 7, or Windows Server 2003) with the following components installed:
- IDA v7.5.2 or higher (including the Information Server Integration feature)
- Information Server 8.1.x or 8.5 MetaBroker for IDA
- Java JRE v1.5 or higher, with java.exe on the system path
- The system must be able to connect to an Information Server 8.1 or 8.5 server in order to import into Information Server.
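The Java requirement can be checked from a script before you continue. The following is a minimal sketch, not part of the provided package. It assumes that the banner printed by java -version contains the word version followed by a quoted version string, which is the common format but is not guaranteed for every JVM; the function names are made up for this example.

```python
import re
import shutil
import subprocess


def parse_java_version(banner):
    """Extract (major, minor) from the banner printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def meets_minimum(version, minimum=(1, 5)):
    """Return True if the parsed version is at least the required version."""
    return version is not None and version >= minimum


def java_on_path_is_recent_enough():
    """Check the java executable found on the system path, if there is one."""
    java = shutil.which("java")
    if java is None:
        return False
    # Common JVMs write the version banner to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return meets_minimum(parse_java_version(result.stderr or result.stdout))
```

Calling java_on_path_is_recent_enough() returns False both when no java executable is found on the path and when the version found is older than 1.5.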
Step 2: Download and unzip the Dependency Mapping package
Download the DependencyMapper_SourceCode zip file and unpack it into your C:\ folder. This creates a sub-folder named DeveloperWorks.
Although the DeveloperWorks directory does not have to be located in the root directory, or even on your Windows C:\ drive, this is recommended because the sample.properties file used by the mapping utility uses absolute path names. If you extract the package to a different location, you need to adjust the path settings in the sample.properties file.
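If you unpack the package somewhere other than C:\, the path settings could be adjusted with a small script instead of by hand. The following is a minimal sketch, not part of the provided package. Because this article does not show the contents of sample.properties, the sketch makes no assumption about property names and simply replaces the old root folder wherever it appears. It does assume that paths may be written with doubled backslashes, single backslashes, or forward slashes, and that the file uses the ISO 8859-1 encoding that Java properties files traditionally use. The property name model.dir and the target folder D:\Work are examples only.

```python
from pathlib import Path


def _spellings(path):
    """Return the path with doubled backslashes, single backslashes, and slashes."""
    single = path.replace("/", "\\")
    return [single.replace("\\", "\\\\"), single, single.replace("\\", "/")]


def retarget_paths(text, old_root, new_root):
    """Replace old_root with new_root in each spelling, keeping the spelling used."""
    for old, new in zip(_spellings(old_root), _spellings(new_root)):
        text = text.replace(old, new)
    return text


def retarget_properties_file(path, old_root, new_root):
    """Rewrite a properties file in place and keep a .bak copy of the original."""
    source = Path(path)
    original = source.read_text(encoding="latin-1")
    source.with_name(source.name + ".bak").write_text(original, encoding="latin-1")
    source.write_text(retarget_paths(original, old_root, new_root), encoding="latin-1")


# "model.dir" is a hypothetical property name used only for this demonstration.
line = r"model.dir=C:\\DeveloperWorks\\IDA_Workspace"
print(retarget_paths(line, r"C:\DeveloperWorks", r"D:\Work\DeveloperWorks"))
```

Review the rewritten file before running the mapping utility, since a plain text replacement cannot tell a path from any other value that happens to contain the same characters.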
After you unzip the file, you should see a DeveloperWorks directory containing the following subdirectories:
- DependencyMapping -- this folder contains the Dependency Mapping utility files
- IDA_Workspace -- this IDA workspace folder contains a Data Design project named DeveloperWorks, which includes two sample data models representing tiny snippets of the Customer data models provided by InfoSphere MDM Server:
- a Logical Data Model named MDM_CUSTOMER_SAMPLE.ldm
- a Physical Data Model named MDM_CUSTOMER_SAMPLE.dbm
Step 3: Generate an IDA Glossary model
You will use IDA to generate an IDA Glossary Model file (NDM) from the MDM_CUSTOMER_SAMPLE logical data model.
The functionality to transform an LDM into an equivalent NDM is provided through the Information Server MetaBroker for IDA and the Information Server integration in IDA.
In order to generate an LDM-equivalent IDA Glossary model, you are actually "pretending" to export an LDM file to the Metadata Server.
- Start IDA and launch workspace: C:\DeveloperWorks\IDA_Workspace.
- In the Data Project Explorer, expand Data Models. Right-click on file MDM_CUSTOMER_SAMPLE.ldm and select Export to open the Export wizard.
- Expand the Data folder and select Export a Glossary Model to the Metadata Server, as shown in Figure 3. Then click Next.
Figure 3. Export a Glossary / LDM to Metadata Server
- From the Select the Model to Export screen, as shown in Figure 4, click the Transform and export a logical or domain model into a glossary model in Metadata Server radio button to make LDM files visible. Select MDM_CUSTOMER_SAMPLE.ldm from the DeveloperWorks project and click Next.
Figure 4. Select the Logical model
- From the Transformation Messages window, click Finish.
- From the Status window, click Cancel as shown in Figure 5.
The IDA Glossary file has already been generated at this point, so you can safely cancel the remainder of the export process.
The generated IDA Glossary can be found in the %TEMP% folder.
Figure 5. Transformation Status
- Open Windows Explorer, type %TEMP% in the Address field, and press Enter, as shown in Figure 6.
Figure 6. Navigate to the %TEMP% directory
- In your %TEMP% folder, locate the generated IDA Glossary file (Figure 7). The file has a generated name with an .ndm extension.
Figure 7. Generated IDA Glossary File
- Rename the file to MDM_CUSTOMER_SAMPLE.ndm.
- Copy the renamed glossary file into your IDA DeveloperWorks project folder, which should be something like, C:\DeveloperWorks\IDA_Workspace\DeveloperWorks.
- Return to the IDA Data perspective. Right-click the DeveloperWorks project in the Data Project Explorer view and select Refresh, as shown in Figure 8.
Figure 8. Refresh the IDA Project to see the copied Glossary File
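The manual locate, rename, and copy steps above could also be scripted. The following is a minimal sketch under the assumption that the most recently modified .ndm file in the temporary folder is the one IDA just generated; if other tools write .ndm files there, check the result. The function names are made up for this example, and the sketch copies the file under its new name rather than renaming it in place.

```python
import shutil
import tempfile
from pathlib import Path


def newest_glossary_file(temp_dir):
    """Return the most recently modified .ndm file in temp_dir, or None."""
    candidates = list(Path(temp_dir).glob("*.ndm"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def copy_glossary(project_dir, temp_dir=None, new_name="MDM_CUSTOMER_SAMPLE.ndm"):
    """Copy the newest generated glossary into project_dir under new_name."""
    temp_dir = temp_dir or tempfile.gettempdir()
    source = newest_glossary_file(temp_dir)
    if source is None:
        raise FileNotFoundError(f"no .ndm file found in {temp_dir}")
    destination = Path(project_dir) / new_name
    shutil.copy2(source, destination)
    return destination
```

With the default installation location, the call would be copy_glossary(r"C:\DeveloperWorks\IDA_Workspace\DeveloperWorks"). You still need to refresh the project in IDA afterwards so that the copied file becomes visible.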
Step 4: Execute the Dependency Mapping utility
You will now execute the Dependency Mapping utility (MapModels.bat) to redirect dependency links.
In your MDM Server models (including the sample models), dependencies point from objects in the physical model to objects in the logical model, as shown in Figure 9. After running the mapping utility, the dependencies in the physical model will point to semantically equivalent objects in the IDA glossary model generated in the previous step.
Note: The mapping utility will not modify the original data model files. Instead, a new PDM is generated. The original model is renamed using the pattern: <PDM File Name>_ORIGINAL.dbm.
Figure 9. Dependency between Column: name and Attribute: Product Name
- Open a Windows command-line interpreter (cmd.exe) and navigate to:
- If you are using your own data model files, or have unpacked the zip file into a different location, you can provide these settings by adjusting the parameters in the sample.properties file.
- If java.exe version 1.5 or higher is not in your system path, open the MapModels.bat file in edit mode and set the JAVA_EXE environment variable to the complete path of your java.exe.
- Execute the MapModels.bat batch file from within the DependencyMapping folder, as shown in Figure 10.
- The mapping utility will not overwrite the original physical data model. Instead the original data model is renamed to MDM_CUSTOMER_SAMPLE_ORIGINAL.dbm, and the physical data model generated by the mapping tool will now be named MDM_CUSTOMER_SAMPLE.dbm.
- At the end of the mapping process, both the original and generated files can be found in your IDA Data Design project: DeveloperWorks. Refresh the project to see the changes.
Figure 10: Executing the mapping utility
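Conceptually, the redirection performed in this step replaces each logical target of a dependency with its glossary equivalent, and the original file is kept under the _ORIGINAL naming pattern. The following sketch illustrates that idea only. It is not the implementation of the provided Java utility, whose internals this article does not describe, and the object identifiers in the demonstration are hypothetical.

```python
from pathlib import PureWindowsPath


def backup_name(pdm_file):
    """Apply the <PDM File Name>_ORIGINAL.dbm naming pattern to a file name."""
    path = PureWindowsPath(pdm_file)
    return str(path.with_name(path.stem + "_ORIGINAL" + path.suffix))


def redirect_dependencies(dependencies, ldm_to_ndm):
    """Return new (source, target) pairs whose targets are glossary objects.

    dependencies: iterable of (physical_object, logical_object) pairs
    ldm_to_ndm:   mapping from each logical object to its glossary equivalent
    Targets without a known equivalent are left unchanged.
    """
    return [(source, ldm_to_ndm.get(target, target)) for source, target in dependencies]


# Hypothetical identifiers, loosely based on the objects named in Figure 9.
dependencies = [("dbm/Column/name", "ldm/Attribute/Product Name")]
equivalents = {"ldm/Attribute/Product Name": "ndm/Term/Product Name"}

print(backup_name("MDM_CUSTOMER_SAMPLE.dbm"))
print(redirect_dependencies(dependencies, equivalents))
```

Returning a new list instead of changing the input in place reflects the behavior described in the note above, where the original model is preserved and a new PDM is generated.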
Step 5: Export generated IDA data models to Information Server
In this task you will export the IDA Glossary and modified physical data model to the Information Server (also called the Metadata Server). Both files can be exported simultaneously by using the Physical Model to the Metadata Server export wizard.
- To open the export wizard, in the IDA Data Project Explorer, right-click MDM_CUSTOMER_SAMPLE.dbm and then click Export.
- Expand the Data folder and select Export a Physical Model to the Metadata Server, as shown in Figure 11. Then click Next.
Figure 11: Export the PDM and Glossary to Metadata Server
- From the Select the Model to Export screen, as shown in Figure 12, do the following:
- Select MDM_CUSTOMER_SAMPLE.dbm in the DeveloperWorks project.
- Select the Set classified objects for business terms based on check box. This is very important because it ensures that both the glossary and the physical model are exported to the Metadata Server and that their objects remain linked.
- In the Host System name field, set a Host name under which the imported physical model will be grouped in Information Server.
- Click Finish.
Figure 12: Select the Model to Export
- From the following Status window, click Select All.
- From the Parameter Selection window, set your Information Server connection parameters, as shown in Figure 13, to the following:
- Host Name / Port Number: the host name and port number of your Metadata Server.
- User Name / Password: the user ID and password of an Information Server administrator.
- To speed up the import process, clear the Check for Duplicates check box if you know that the models have not been imported before.
- Click OK to start the import into the Metadata Server.
Figure 13: Information Server connection parameter
Step 6: Review the exported assets in Business Glossary
In this final step, you will use the InfoSphere Business Glossary Browser to verify that the glossary and physical model were successfully exported.
- Open a web browser and enter the Business Glossary Browser URL in the address field.
- The Business Glossary Browser uses the following URL pattern: http://<InformationServer host name>:<InformationServer port number>/bg
- For host and port number, use the same host name and port number as used during the export process, for example: http://localhost:9080/bg
- To authenticate from the login page, you can use the same user name and password that you used during the export step.
- After you have successfully logged in to the Business Glossary, click Category Tree. Underneath the top-level category MDM Core Table.ER1 you will find the entities (now Categories) and attributes (now Terms) of the MDM_CUSTOMER_SAMPLE logical data model, as shown in Figure 14.
Figure 14: Imported MDM Customer Sample Glossary
- Click the PARTY category, which lists the contained terms on the right side of the Category Tree view.
- From the right side, click the term Party ID. This will open the detailed page for the Party ID term.
- By expanding the Assigned Assets section, you can see that Party ID has a dependency to its physical implementation, as shown in Figure 15.
Figure 15: Association between a term (Attribute) and its physical column
It is very important to be able to leverage your MDM Server's logical domain definition as approved business vocabulary, and to retain the dependency to the physical asset for auditing and data analysis purposes. You can achieve this by using the solution described in this article.
Although the article depicts MDM Server data models for its scenario, the provided solution can be applied against any IDA 7.5.x LDM/PDM model pair.
The author would like to thank Paul van Run for his feedback and review of this article.
Download: Java Source Code for Dependency Mapping Tool, DependencyMapper_SourceCode.zip (84KB)