Implementing a Transaction Hub MDM pattern using IBM InfoSphere Master Data Management Server

Learn to use the Transaction Hub Master Data Management (MDM) pattern to implement your MDM systems. This typical architectural pattern is described in terms of capabilities of the MDM Reference Architecture. Get an introduction to the IBM® InfoSphere™ MDM Server and see why this software solution is a good choice for implementing this architecture pattern. Finally, learn about an upcoming book, Enterprise Master Data Management: An SOA Approach to Managing Core Information, that deals with these topics in greater detail.

Eberhard Hechler (ehechler@de.ibm.com), Executive IT-Architect, IBM

Eberhard Hechler is a Senior Certified IT Architect (SCITA) and Executive IT-Architect, who joined the IBM Boeblingen Lab, Germany in 1983 as a junior programmer. He has worked in software development, performance optimization and benchmarking, solution architecture and design, software product planning, management, technical consultancy and technical alliance management. In 1992, Eberhard began to work with DB2 for MVS, focusing on testing and performance measurements of new DB2 versions. Since 1999, his focus was on Information Management and DB2 UDB on distributed platforms. He is currently a Technical Enablement Architect for IBM’s Information Platform and Solutions, working with System Integrators throughout Europe. Eberhard holds a Master’s Degree in Mathematics (Diplom-Mathematiker) from Hamburg University.



Martin Oberhofer (martino@de.ibm.com), Senior Technical Consultant for Master Data Management, EMC

Martin Oberhofer joined the IBM Silicon Valley Labs in the US at the beginning of 2002 as a software engineer. He currently works as Senior Technical Consultant and member of the world-wide IBM Software Group Master Data Management Center of Excellence. His areas of expertise are Master Data Management, database technologies, Java development and IT system integration. He provides MDM architecture and MDM solution workshops to customers and major system integrators across Europe. The synchronization and distribution of master data into the operational IT environment is his focus area, particularly master data exchange with SAP application systems. He holds a Master’s degree in mathematics from the University of Constance/Germany.



Paul van Run (pvanrun@ca.ibm.com), Senior Technical Staff Member, MDM Architect, IBM

Paul van Run has almost 10 years of experience in MDM and 15 years in IT. At DWL he was part of the Research and Development leadership team developing DWL Customer, one of the first dedicated CDI products in the market. After the acquisition of DWL by IBM in 2005, he became a Senior Technical Staff Member (STSM), and he is responsible for the architecture of the IBM Master Data Management products: WebSphere Customer Center and WebSphere Product Center, both market leaders in their segments. Before coming to DWL, Paul worked as a software developer in the Insurance Industry for an ING Group subsidiary in Canada. Paul holds a Master’s degree in Information Science from the Technical University of Eindhoven, the Netherlands, and a Master’s degree in Computer Science from the University of Waterloo, Canada.



20 March 2008

Introduction

Master data such as customers, products, accounts or locations is the foundation of critical business decisions and is used in important business processes. Today, many companies have their master data scattered and inconsistent across numerous front- and back-office systems and lack a consistent, complete, and accurate enterprise view of this data. This inefficient handling of data costs businesses millions of dollars in lost revenue. Master Data Management (MDM) solves these issues and improves revenue by enabling cross- and up-sell opportunities.

Implementing an MDM solution successfully requires a well-designed MDM system as a core component of the solution. In the first part of this article, we introduce the Transaction Hub MDM pattern as a specific type of MDM architecture pattern, and describe the key capabilities of an MDM system designed with this pattern. The Transaction Hub MDM pattern is one of three MDM hub patterns that can implement an MDM system. The two related patterns are the Registry Hub pattern and the Coexistence Hub pattern. Table 1 compares the three MDM hub patterns to provide the context for the detailed discussion of the Transaction Hub pattern.

Table 1: Comparison of the three MDM hub patterns
MDM hub pattern | Registry Hub | Coexistence Hub | Transaction Hub
--- | --- | --- | ---
Purpose | Central reference | Harmonization | Transactional access
System type | System of reference | System of reference | System of record
Method of use | Operational | Operational, collaborative, analytical | Operational, collaborative, analytical
Type of access | Read-only | Read-only (sometimes write) | Read/write access
Correctness | Only key attributes are materialized (cleansed); all other attributes remain unchanged in source applications with low quality | Full master data model materialized and master data is cleansed and de-duplicated on initial load. Once changed, correctness is delayed in MDM system due to delayed propagation from source systems | Given at all times since access exclusively uses MDM business services
Completeness | Only through reference to the source system through virtualization/federation | Complete due to full materialization on initial load | Complete due to full materialization on initial load
Consistency | No overall consistency because master data remains inconsistent for non-key attributes in source systems | Converging consistency: Multiple source systems and MDM system updates master data requiring conflict resolution | Converging to absolute consistency: Master data is only changed through MDM business services and then propagated to consuming applications asynchronously

Later in this article, the features of the IBM InfoSphere MDM Server are mapped against the required set of capabilities. This mapping helps support the notion that the IBM InfoSphere MDM Server is the right software solution to implement the Transaction Hub MDM pattern.

The transaction hub MDM architecture pattern

First it is important that you understand the value provided by the Transaction Hub MDM pattern. In general terms, an architecture pattern is a solution to a reoccurring problem in a given context; the same is true for MDM architecture patterns. The Transaction Hub MDM pattern presented here has, as part of the pattern description, the context where the pattern is usually applied, the problem which is solved by the pattern and the solution description. The pattern description concludes by demonstrating which forces and constraints an IT architect needs to balance when applying this pattern and then discusses the optimal conditions for using this pattern.

The value of the Transaction Hub MDM architecture pattern

One of the key values of implementing the Transaction Hub MDM pattern is the tight integration of the MDM system into the transactional environment. This goes beyond the capabilities of traditional data warehouse implementations, and integrates usage and business services against master data with the operational legacy systems. The following advantages show why implementing a transactional hub pattern is so valuable:

Correct, consistent and complete master data

  • Only an MDM system built with the Transaction Hub MDM pattern can act as an enterprise-wide system of record because it provides correct, consistent and complete master data at all times. This is a major difference to an MDM system built with the Registry Hub or the Coexistence Hub pattern. These patterns can only function as a system of reference where consistency across the data model for master data is either not possible at all, as with the registry hub, or only achieved with a certain delay, as with the coexistence hub.
  • Correct, consistent and complete master data supports critical business decisions and enables a variety of crucial business optimizations. For example, correct, consistent and complete household-related information enables efficient cross- and up-selling strategies.

SOA enablement

  • Without MDM, SOA enablement can lead to proliferation of low-quality data through SOA services. MDM delivers data quality, and thereby allows for a successful SOA enablement.
  • Decoupling of master data from applications: An MDM system built with the Transaction Hub MDM pattern decouples master data from applications. It provides consistent access to master data through service-oriented interfaces and offers a large number of out-of-the-box business services, which can be invoked through a variety of technologies (JMS, Web services, RMI, and so on). Since the MDM system maintains high-quality master data, proliferation of low-quality master data is eliminated. Without an MDM system, an SOA-only approach carries the risk of dealing with low-quality master data.
  • Reuse of services: The Transaction Hub MDM pattern enables the construction of an MDM system which manages all master data in real-time. Thus, all applications which consume master data can reuse the same service interface. Standardizing which services are used for which tasks related to master data results in consistent treatment of master data.
  • Orchestration of business services: Business services provided by the MDM system can be orchestrated as part of large business processes in an SOA. The MDM system provides global transaction support for its business services. For example, if a large business process does not complete successfully, the service transactions invoked from the MDM system can be rolled back seamlessly as part of a global transaction.

Reduction of risk and fraud

  • High-quality customer information is the foundation for any effective kind of risk mitigation related to customers. Efficient credit scoring, anti-money laundering (AML), anti-terrorist financing (ATF) or homeland security processes require the capability to identify a customer specifically. This capability is enabled by an MDM system built with the Transaction Hub MDM pattern.
  • If the customer identity is known and relationships between customers are maintained accurately, fraud detection and, thus, mitigation is enabled. For example, fraudulent insurance claims can be detected much more easily if you know your customers through access to high-quality customer master data.

Time-to-market advantage

  • If master data, such as product or customer information, is available in a central place, applications consuming master data just need to be integrated with a single system. This is a significant time-to-market advantage if a new application needs to be deployed because integration is needed with a single system only.

Reduced total cost of ownership (TCO)

  • If applications consuming master data need to be integrated with multiple systems for accessing master data, this is costly and creates many point-to-point connections between an application and the master data source systems. If this is done for a large number of applications, the costs grow quickly. Applying the Transaction Hub MDM pattern enables master-data-consuming applications to get all relevant master information from a single place. This reduces the number of point-to-point connections, integration complexity and maintenance costs which results in a reduced TCO.
  • Introducing a central MDM system often reduces the number of redundant, inconsistent copies of master data in the IT environment. This reduces storage as well as administrative costs.
  • Cost of high-availability infrastructure can be limited to the MDM system instead of having to provide it to multiple source systems.
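The effect on point-to-point connections can be made concrete with a little arithmetic. The following is a minimal sketch under the simplifying assumption that, without a hub, every consuming application needs master data from every source system; the function names and the example numbers are illustrative only.

```python
def point_to_point_links(consumers: int, sources: int) -> int:
    """Worst case without a hub: every consumer integrates with every source."""
    return consumers * sources


def hub_links(consumers: int) -> int:
    """With a transaction hub: each consumer integrates once, with the hub."""
    return consumers


if __name__ == "__main__":
    # Hypothetical landscape: 12 consuming applications, 5 master data sources.
    print(point_to_point_links(12, 5))  # 60
    print(hub_links(12))                # 12
```

In this hypothetical landscape the number of interfaces to build and maintain drops from 60 to 12; real landscapes rarely reach the worst case, so the saving is an upper bound.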

Improved performance

  • The Transaction Hub pattern stores all master data in a single place, which improves performance because master data attributes come from a single source and do not have to be federated, transformed and aggregated across several systems. Thus, applications that consume master data can access and edit it more quickly and efficiently than master data managed with a registry or coexistence pattern.

Improved availability

  • When the Registry Hub MDM pattern is used, the MDM system depends on the availability of all source systems; if even one of these source systems fails, the view that allows search on master data may no longer be available. Because a transaction hub does not rely on the availability of many source systems, it has greater availability.
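A rough calculation illustrates the availability argument. The sketch below assumes that the federated registry view needs every source system to be up and that source outages are statistically independent; both are simplifying assumptions, and the 99% figure is purely hypothetical.

```python
def registry_view_availability(source_availabilities: list[float]) -> float:
    """Availability of a federated view that needs every source to be up.

    Assumes that source outages are statistically independent.
    """
    result = 1.0
    for availability in source_availabilities:
        result *= availability
    return result


if __name__ == "__main__":
    # Hypothetical: five source systems, each available 99% of the time.
    combined = registry_view_availability([0.99] * 5)
    print(f"{combined:.4f}")  # 0.9510
```

Under these assumptions, five sources at 99% each yield a combined view that is available only about 95% of the time, whereas a transaction hub depends on its own availability alone.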

Improved reporting

  • Unless correct, complete and consistent master data is available, reports based on operational data can be unreliable. The Transaction Hub MDM pattern enables reports by running analytics on the MDM system itself. Again, because the master data is coming from a single source, the reporting is much more accurate.

Improved data governance and streamlined compliance

  • An MDM system built with the Transaction Hub MDM pattern acts as an enterprise-wide system of record. Data governance policies can therefore be applied to all master data consistently and managed effectively in a single place, which improves data governance. Business rules on master data can also be enforced more easily and more consistently since all master data is in the same system.
  • Finding out who changed what, and when, is easy when all the information comes from a single place; this helps ensure compliance with industry standards.

Context

When an MDM solution is considered, master data is usually scattered in a silo-like IT environment as Figure 1 shows.

Figure 1: Silo-like IT environment

This prevents a 360-degree view of master data entities like products and customers, which causes a significant business problem. We mention just two examples:

  • If a customer is registered through the sales Line of Business (LOB), the application uses a function to create the master data record. If the same customer would register through the eCommerce platform, a different function would be used to create the master data record. This could lead to a record "Mr. Robert Smith" in the sales LOB system and a "Mr. Bob Smith" (using a nickname for Robert) in the eCommerce platform system. Thus, it’s hard to tell how many customers this enterprise truly has since the number of duplicates is unknown.
  • Similarly, different systems in an IT environment will have different information on products, making it hard or even impossible to accurately report revenue on products.

Given this deployment context, applying the Transaction Hub MDM pattern to construct an MDM system is a two-step process. First, a Master Data Integration (MDI) phase has to be implemented. This usually requires data model discovery in all source systems across all IT silos, followed by data profiling to determine the actual state of the data quality. This is a prerequisite for accurately sizing the effort to extract, cleanse, standardize, de-duplicate, transform and, finally, load the master data into the MDM system. Some applicable architecture patterns for the MDI can be found in Resources. Second, the MDM system built using the Transaction Hub MDM pattern requires integration with all application systems that need access to master data. The transaction interception pattern, ESB and messaging patterns, and information synchronization patterns are typical means of implementing this integration.

Problem statement

As seen in Figure 1, the stovepipe-like IT environment does not allow a 360 degree view of master data. From a business perspective, this redundant, inconsistent and incomplete master data can lead to:

  • Lost revenue opportunities
  • Inaccurate customer and product revenue reports
  • Inefficiencies in the supply chain
  • Costly exchange of product master data between suppliers and retailers

From a technical perspective, the IT architect designing an MDM solution with a system created by applying the Transaction Hub MDM pattern faces the following challenges:

  • Improvement of master data quality during the MDI phase
  • Integration of the MDM System with the legacy systems (for instance, LOB applications) and other down-stream systems which consume and act on master data

From an organizational and political perspective, the IT Architect may be challenged by:

  • Political issues between LOB owners caused by a refinement of spheres of influence if a horizontal data governance board is introduced to centrally govern the MDM system

Solution

The Transaction Hub MDM pattern captures a specific instantiation of the MDM services component as Figure 2 shows. In this case, the instantiation of the MDM services component means that the Transaction Hub MDM pattern requires a full-blown implementation of this component with all capabilities. An MDM system built with the Registry Hub or Coexistence Hub pattern would only have a sub-set of these functions. Thus, it is important to understand the functional aspects of the Transaction Hub MDM pattern in terms of the capabilities of the MDM services component. A full description of each of these capabilities can be found in a later section in this article.

Figure 2: Capabilities of an MDM system built with the Transaction Hub MDM pattern

The MDM services component is composed of the following components shown in Figure 2:

Interface services: These services provide a consistent entry point to invoke MDM services through a variety of technologies such as JMS, Web services or RMI supporting a common request and response model. The interface services invoke the same MDM services regardless of how the service is called. The interface services should provide a real-time interface as well as a batch interface for initial loads and delta loads of master data into the MDM system. Furthermore, support for mechanisms such as publish/subscribe is part of this component to ease integration into the existing IT environment. Security and privacy services from the base services component are invoked by the interface services for authorization before invoking services from the lifecycle management component. In addition, the interface services have the ability to accept multiple request message formats through support for pluggable parsers. Furthermore, service composition should be supported.
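The flow described for the interface services (parse the request, authorize it, then invoke the lifecycle service) can be sketched as follows. This is a minimal illustration, not the API of any product: the parser registry, the permission table, the service name getCustomer and the function handle_request are all hypothetical.

```python
import json
from typing import Callable

# Pluggable parsers: each turns a raw request in one message format into
# the common request model (a plain dict in this sketch).
PARSERS: dict[str, Callable[[str], dict]] = {
    "json": json.loads,
    "kv": lambda raw: dict(pair.split("=", 1) for pair in raw.split(";")),
}

# Hypothetical service-level permissions (stand-in for the security services).
ALLOWED = {"alice": {"getCustomer"}}

# Hypothetical lifecycle management services, keyed by service name.
SERVICES: dict[str, Callable[[dict], dict]] = {
    "getCustomer": lambda req: {"id": req["id"], "name": "Robert Smith"},
}


def handle_request(raw: str, fmt: str, user: str) -> dict:
    """Single entry point shared by every transport adapter."""
    request = PARSERS[fmt](raw)                  # 1. parse into the common model
    service = request["service"]
    if service not in ALLOWED.get(user, set()):  # 2. authorize before dispatch
        return {"status": "DENIED", "service": service}
    result = SERVICES[service](request)          # 3. invoke the lifecycle service
    return {"status": "OK", "result": result}
```

Because every transport adapter (JMS listener, Web service endpoint, RMI stub) would call the same handle_request function, a request for the same service returns the same response regardless of the message format in which it arrived.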

Lifecycle management services: Lifecycle management services provide business and information services for all master data domains such as customer, product, account or location to create, access and manage master data held within the master data repository. They are invoked by the interface services and invoke the lower-level services as needed. Examples of these business services include services to create customers, to maintain product master data, or to manage location and account master information.

Data quality management services: The services in this group can be divided into two groups:

  • Data validation and cleansing services provide capabilities to specify and enforce data integrity rules. Furthermore, standardization and normalization services for names or address information help to maintain high quality master data. Search results are also better on standardized master data.
  • Reconciliation services provide matching services, conflict resolution services and merge, collapse and split services. Matching services check whether or not a new customer is a duplicate to an existing customer. Collapse and split services are used by data stewards to reconcile duplicates.
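How standardization supports matching can be shown with the "Robert Smith" and "Bob Smith" example used earlier in this article. The sketch below is deliberately simplistic and deterministic: the nickname table and the function names are hypothetical, and a production matching service would typically compare further attributes (address, date of birth) and score the similarity rather than require exact equality.

```python
# Hypothetical nickname table; a real deployment would use a much larger one.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william"}
TITLES = {"mr", "mrs", "ms", "dr"}


def standardize_name(raw: str) -> tuple[str, ...]:
    """Lower-case, drop titles and punctuation, map nicknames to formal names."""
    tokens = [token.strip(".,").lower() for token in raw.split()]
    tokens = [token for token in tokens if token and token not in TITLES]
    return tuple(NICKNAMES.get(token, token) for token in tokens)


def is_suspected_duplicate(new_name: str, existing_name: str) -> bool:
    """Flag two records as suspected duplicates if standardized names match."""
    return standardize_name(new_name) == standardize_name(existing_name)


if __name__ == "__main__":
    print(is_suspected_duplicate("Mr. Bob Smith", "Mr. Robert Smith"))  # True
```

A suspected duplicate found this way would then be handed to a data steward, who decides whether to collapse the two records.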

Master data event management services: The master data event management services provide the ability to create business rules to react to certain changes on master data and to trigger notification events. Examples could be notifications on cross- and up-sell events or life events (anniversaries, birthdays, retirements, address changes, and the like). Supporting master data governance events could also notify data stewards if a record appears to be a duplicate of an existing record.
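The idea of business rules that react to master data changes can be sketched as a list of named conditions evaluated against each change event. All class, rule and attribute names below are hypothetical and chosen only to mirror the examples in the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChangeEvent:
    entity: str      # for example "customer"
    attribute: str   # the attribute that changed
    old: object
    new: object


@dataclass
class Rule:
    name: str
    condition: Callable[[ChangeEvent], bool]


def evaluate(rules: list[Rule], event: ChangeEvent) -> list[str]:
    """Return the names of all notifications triggered by this change."""
    return [rule.name for rule in rules if rule.condition(event)]


# Hypothetical rules, not taken from any product.
RULES = [
    Rule("address-change",
         lambda e: e.attribute == "address" and e.old != e.new),
    Rule("retirement",
         lambda e: e.attribute == "employment_status" and e.new == "retired"),
]
```

Calling evaluate(RULES, ChangeEvent("customer", "address", "1 Main St", "2 Oak Ave")) yields ["address-change"], which a notification component could then publish to interested subscribers.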

Hierarchy and relationship management services: Hierarchy services create and maintain hierarchies. An example of a hierarchy is a product hierarchy where products are classified into categories of the hierarchy. Another example would be a hierarchy for an organization representing a business partner or a customer. In this case, the hierarchy would consist of parent-child relationships between individual departments of the organization. Relationship management services provide the capability to create and maintain relationships and groups. A few examples for relationships are the relationships between customers representing a household or the relationship between products to establish a cross-sell relationship.
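A parent-child hierarchy of the kind described above can be represented with two simple maps. The sketch below is an in-memory illustration with hypothetical names; it ignores persistence, versioning and validation (for example, cycle detection) that a real hierarchy service would need.

```python
from collections import defaultdict


class Hierarchy:
    """Parent-child hierarchy, for example of product categories."""

    def __init__(self) -> None:
        self._children: dict[str, list[str]] = defaultdict(list)
        self._parent: dict[str, str] = {}

    def add(self, child: str, parent: str) -> None:
        """Register child directly below parent."""
        self._children[parent].append(child)
        self._parent[child] = parent

    def ancestors(self, node: str) -> list[str]:
        """Path from the node's parent up to the root."""
        path = []
        while node in self._parent:
            node = self._parent[node]
            path.append(node)
        return path

    def descendants(self, node: str) -> list[str]:
        """All nodes below the given node, depth first."""
        result = []
        for child in self._children.get(node, []):
            result.append(child)
            result.extend(self.descendants(child))
        return result
```

With "Laptops" under "Computers" and "Computers" under "Electronics", ancestors("Laptops") returns the category path used for classification, and descendants("Electronics") returns everything that rolls up into that category.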

Authoring services: Authoring services are used to define or extend the definition of master data entities, hierarchies, relationships and groupings. Check-in/check-out functions on attribute level support the collaborative authoring in workflows such as New Product Introduction.

Base services: The base services component provides services in the following four groups:

  • Privacy and security services implement authorization on four different levels:
    • Service level: determines who is allowed to use the service
    • Entity level: determines who is allowed to read/write a particular entity
    • Attribute level: determines who can read/write which attribute
    • Record level: determines who can update which particular records
  • Audit logging services have the ability to write a complete history of all transactions and events which occurred for a complete trace on what happened in the MDM system. The information traced by the audit logging services can also be used for problem determination or to comply with certain legal requirements.
  • The workflow services support the workflows needed for collaborative authoring of master data in processes like New Product Introduction. They enable business rules and delegation of tasks to external components.
  • Search services allow you to look up and retrieve master data. The search services should be configurable regarding the amount of detail they return (for example, just the basic customer information such as name and address or the full customer record with all attributes). The search services support pre-defined queries or ad-hoc queries using wildcard characters, for example.
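The four authorization levels listed above combine conjunctively: an update goes through only if every level grants it. The following sketch shows this with a hypothetical Policy structure; the field names and the sample service, entity and record identifiers are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical permissions for one user, one set per authorization level."""
    services: set[str] = field(default_factory=set)
    entities: set[str] = field(default_factory=set)
    attributes: set[tuple[str, str]] = field(default_factory=set)
    record_ids: set[str] = field(default_factory=set)


def may_update(policy: Policy, service: str, entity: str,
               attribute: str, record_id: str) -> bool:
    """An update is allowed only if all four levels grant it."""
    return (
        service in policy.services                     # service level
        and entity in policy.entities                  # entity level
        and (entity, attribute) in policy.attributes   # attribute level
        and record_id in policy.record_ids             # record level
    )
```

For example, a data steward whose policy covers the service updateCustomer, the entity customer, the attribute address and the record C-1 may change that address, but not the birth date on the same record or the address on record C-2.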

Master data repository: The master data repository can consist of one or multiple databases and has the following parts:

  • The metadata: This part of the repository has all relevant metadata stored such as a description of the data model for the master data.
  • The master data: This part of the repository is where the master data is physically stored.
  • The history data: The history data is a complete history on all the master data entity changes in the repository. This enables point-in-time queries against the MDM data.
  • The reference data: Here lookup tables such as country codes, measurement units for products, marital status, and the like are stored.
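The point-in-time queries enabled by the history data can be illustrated with a small in-memory store that keeps every version of a value together with its effective date. The class and method names are hypothetical, and a real repository would implement this with history tables in the database rather than in memory.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional


class HistoryStore:
    """Keeps every version of a value together with its effective date."""

    def __init__(self) -> None:
        # record id -> list of (effective_date, value), sorted by date
        self._versions: dict[str, list[tuple[date, str]]] = {}

    def record_change(self, record_id: str, effective: date, value: str) -> None:
        versions = self._versions.setdefault(record_id, [])
        versions.append((effective, value))
        versions.sort(key=lambda version: version[0])

    def as_of(self, record_id: str, when: date) -> Optional[str]:
        """Return the value that was valid on the given date, if any."""
        versions = self._versions.get(record_id, [])
        dates = [effective for effective, _ in versions]
        index = bisect_right(dates, when)
        return versions[index - 1][1] if index else None
```

If a customer's city is recorded as "Hamburg" from 1 January 2006 and as "Boeblingen" from 1 March 2008, as_of for a date in 2007 returns "Hamburg", and as_of for a date before 2006 returns None because no version existed yet.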

An MDM system built with the Transaction Hub MDM pattern provides all capabilities for the MDM services component of the MDM Reference Architecture as outlined above. Many of these capabilities are dependent on the persistent nature of the data storage in a transaction MDM hub and are not possible in some of the hubs based on other patterns. Now, learn to deploy the Transaction Hub MDM pattern for an MDM system in the IT environment as Figure 3 shows.

Figure 3: MDM system built with the Transaction Hub MDM pattern

MDM solutions are rarely greenfield deployments: there is always an existing IT environment into which the MDM system has to be integrated. The master data is scattered and inconsistent across existing applications when the MDM solution is deployed.

Thus in a first step (1), the master data from the source application system has to be extracted, cleansed, standardized, de-duplicated, transformed and loaded into the MDM system (2). These steps are performed in the Master Data Integration phase. In an ideal world, once the MDM system built with the Transaction Hub MDM pattern is complete, all redundant copies of the master data in the source application systems can be deleted as indicated by the white color of the master data parts of the persistence. Furthermore, the source applications are "MDM enabled".

This means that whenever a transaction (3) is invoked on the source application system which affects transactional data (for example, billing data) and master data, the master data portion of this transaction invokes a master data service of the MDM system for processing. Only the transactional part is processed locally. This requires changes in the application system to intercept the transactions (see the transaction interception pattern described here). Applications consuming master data (4) invoke the MDM services to retrieve master data in a read-only way. An MDM UI (5) at enterprise level is used to create and change master data. An MDM UI can be part of an enterprise portal implementation, for example. The key imperative is that all changes to master data by any source system are only performed through services of the MDM system. This guarantees the required level of master data consistency at all times.
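The split between the master data portion and the transactional portion of an intercepted transaction can be sketched as follows. The classes MdmService and BillingApplication, and the choice of name and address as master data attributes, are hypothetical; the sketch also leaves out the global transaction that would be needed to keep both parts atomic.

```python
class MdmService:
    """Stand-in for the MDM system's business services (hypothetical)."""

    def __init__(self) -> None:
        self.customers: dict[str, dict] = {}

    def update_customer(self, customer_id: str, changes: dict) -> None:
        self.customers.setdefault(customer_id, {}).update(changes)


class BillingApplication:
    """Source application that keeps only transactional data locally."""

    MASTER_ATTRIBUTES = {"name", "address"}  # assumed master data attributes

    def __init__(self, mdm: MdmService) -> None:
        self.mdm = mdm
        self.invoices: list[dict] = []

    def process(self, customer_id: str, payload: dict) -> None:
        # Intercept: split the payload into master and transactional parts.
        master = {k: v for k, v in payload.items()
                  if k in self.MASTER_ATTRIBUTES}
        local = {k: v for k, v in payload.items()
                 if k not in self.MASTER_ATTRIBUTES}
        if master:
            # The master data portion is routed to the MDM service.
            self.mdm.update_customer(customer_id, master)
        if local:
            # Only the transactional portion is processed locally.
            self.invoices.append({"customer_id": customer_id, **local})
```

A request that carries both a new address and an invoice amount thus results in one call to the MDM service and one locally stored invoice, and the application never writes master data itself.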

Forces

This section describes some of the challenges an IT architect needs to consider when applying this pattern.

Master Data Integration

Since the Transaction Hub MDM pattern requires a full materialization of the complete data model for master data as well as a complete load of all master data entities, the Master Data Integration phase for building a transaction hub is a key undertaking. Applying the Transaction Hub MDM pattern could fail if the Master Data Integration phase is not properly executed.

Using leading-edge data model discovery and profiling tools on all source systems mitigates the risk of underestimating the effort needed to cleanse and harmonize the master data from a variety of legacy systems. Otherwise, similar to data warehousing projects, the overall project could be at risk of exceeding its budget and schedule. Furthermore, loading a huge amount of master data into the MDM system might require bypassing the MDM services to write directly to the database tables because the load window (often only a weekend) might be too short to load all records through the MDM services interface. However, this procedure runs the risk that, if the loaded master data violates integrity constraints enforced by the MDM system, the violations surface only much later and can no longer be traced to the initial load. Therefore, careful testing of such a procedure is very important. The MDI phase is usually underestimated and is taken into consideration rather late in the overall MDM project. The authors, therefore, often highlight the importance of MDI in the very early planning phases of any MDM project.
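One way to reduce the risk of a direct database load is to run the batch through the same integrity rules before it is written, so that violations are caught while they can still be traced to the load. The sketch below illustrates the idea with three hypothetical rules; in practice the rules would have to mirror whatever the MDM system itself enforces.

```python
from typing import Callable

# Hypothetical integrity rules; the real ones are those the MDM system enforces.
RULES: dict[str, Callable[[dict], bool]] = {
    "id-present": lambda r: bool(r.get("id")),
    "name-present": lambda r: bool((r.get("name") or "").strip()),
    "country-code-valid": lambda r: r.get("country") in {"DE", "US", "CA", "NL"},
}


def validate_batch(
    records: list[dict],
) -> tuple[list[dict], list[tuple[dict, list[str]]]]:
    """Split a batch into loadable records and rejects with violated rules."""
    clean, rejected = [], []
    for record in records:
        violations = [name for name, check in RULES.items() if not check(record)]
        if violations:
            rejected.append((record, violations))
        else:
            clean.append(record)
    return clean, rejected
```

Only the clean records would be handed to the bulk loader; the rejected records, together with the names of the rules they violate, go back to the cleansing step.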

Consistency

When an MDM solution is deployed, an IT architect needs to address master data consistency requirements across all systems in the IT environment using master data. There are two types of consistency -- absolute consistency and convergent consistency. Absolute consistency means that, at any point in time, the master data is consistent across the entire IT environment. Convergent consistency means that consistency is only achievable with a certain time delay. For example, if the Coexistence Hub pattern were used for the MDM system, a master data change in an application system might be synchronized with the MDM system only with a delay. If the required master data synchronization is improved by reducing the delay time of change propagation, consistency converges towards absolute consistency – hence the name. Usually the Transaction Hub MDM pattern is used with the objective of achieving absolute consistency, which might mean:

  • If an application cannot be re-engineered in such a way to remove the local, redundant copy of master data (for example, SAP applications), then two-phase commit protocols need to be applied whenever an MDM service is invoked to update the MDM system and the application system synchronously.
  • If the service transaction changing master data is triggered on a source application, this transaction has to be intercepted. Again, synchronous communication with the MDM system is needed with a two-phase commit mechanism.

In both cases, the performance of the MDM services decreases due to the two-phase commit mechanism. Therefore, consistency requirements need to be balanced against performance requirements. In addition, changing applications for transaction interception and deploying two-phase commit infrastructure for global transactions could be significantly more expensive than asynchronous integration with convergent consistency. Keeping the balance between integration costs and gains from absolute consistency is another task for an IT architect considering this pattern. If there is an application which cannot be changed at all, then maybe only an MDM system built with the Coexistence Hub pattern is possible until this application is not needed any longer or a new version exists which can be changed.
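The two-phase commit mechanism referred to above can be sketched in a few lines. This is a teaching illustration with hypothetical class names, not a transaction manager: it has no logging, timeouts or recovery, all of which a real coordinator needs.

```python
class Participant:
    """A resource taking part in a global transaction (hypothetical)."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote on whether the local change can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled back"


def two_phase_commit(participants: list[Participant]) -> bool:
    """Commit everywhere only if every participant votes yes in phase 1."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:                   # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:                       # phase 2: roll back everywhere
        p.rollback()
    return False
```

With the MDM system and an application system as participants, a single "no" vote rolls back both, which is what keeps the two copies of the master data identical. Each phase is a synchronous round trip to every participant, which is where the performance cost comes from.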

High availability and disaster recovery

The Transaction Hub MDM pattern creates a very powerful MDM system acting as a system of record on an enterprise level – all applications requiring master data depend on it. However, this central system also creates a single point of failure. Thus an IT architect should pay special attention when designing the operational model for the MDM solution to address high availability and disaster recovery requirements for the transaction hub.

Compared to Registry Hubs or Coexistence Hubs, the Transaction Hub MDM pattern usually has stronger requirements regarding all facets of continuous availability (including high availability and continuous operations to cope with unplanned and planned outages respectively). The reason is that no application system that needs access to master data can function without the MDM system. Recovery time without proper facilities would also be longer than for a registry hub because more data is involved. In addition, the transaction hub may have been used to augment the master data with information that is only available in the MDM system, not in the source systems; this implies that the MDM system can no longer simply be rebuilt from the sources (as it can for a pure registry hub).

Evolution of MDM systems

When an MDM solution is considered, it is usually not for a single project across the entire enterprise. The IT architect in collaboration with business stakeholders selects the area where the return on investment (ROI) can be achieved in the easiest way. This often means two things:

  • A Registry Hub pattern or Coexistence Hub pattern is used to build the MDM system since they are often easier and quicker to deploy
  • In the first project, only the LOB systems with the business processes that benefit the most from the MDM system are integrated, so that the ROI materializes

However, adding transactional MDM capabilities unlocks the full benefits of MDM. So, adding more and more transactional capabilities to the initial MDM system which might have been built as a registry or coexistence hub improves the business benefits further. This, over time, transforms the initial MDM system into a transaction hub -- an evolution that is a key characteristic of MDM.

Business processes change – thus master data changes. It should be an inherent characteristic of the MDM system to be adaptable to this evolutionary nature of master data. The ability to react to changing requirements, additional domains, new business service demands, changes in integrity and business rules or required configuration changes regarding the operational model quickly and at low cost distinguishes state of the art MDM solutions from others. Thus, an IT architect performing software selection for an MDM system should keep an eye on this evolutionary aspect as well.

Conclusion

Implementing an MDM system using the Transaction Hub MDM pattern is appropriate if one or more of the following conditions apply:

  • A system of record for master data is needed
  • High performance for master data access is required
  • Master data must be available without latency
  • Out-of-the-box business services are wanted
  • A persistent master data model is needed
  • A starting point for implementing an SOA is anticipated or required

The risk areas are:

  • Introducing an MDM system built with the Transaction Hub MDM pattern creates an enterprise-wide single point of failure. The IT architect designing a solution with this MDM architecture pattern therefore needs to take special care with the operational model, ensuring the required business continuity through proper high-availability and disaster-recovery measures.
  • The effort for MDI is often underestimated. The overall project schedule might be jeopardized unless accurate estimates for MDI are derived early on.
  • Application integration: If a transaction that changes master data is still triggered on the application side instead of on the MDM system side, the transaction interception pattern might be required. Using this pattern usually means a deep intrusion into the application system, which can be a challenge.
  • Using the Transaction Hub MDM pattern means standardizing the data model and the services that access master data across all lines of business. Unless strong executive backing is given for the MDM project, the project might be at risk for political reasons when LOB owners disagree. This is closely related to data governance aspects such as who owns the data model for master data or the master data itself.

Part 2: Using IBM InfoSphere MDM Server to apply the pattern

The IBM InfoSphere MDM Server is based on a high-level architecture as Figure 4 shows:

Figure 4: IBM InfoSphere MDM Server overview

The MDM Central Transaction Server is a solution with the following three major components:

A J2EE MDM application designed with SOA principles consisting of:

  • An SOA service interface: This interface supports invocation of the MDM business services through a variety of technologies such as Web services, JMS, CICS or RMI to name a few.
  • MDM business services: The solution has more than 800 out-of-the-box business services supporting both simple and complex business processes on master data. The MDM business services operate on a proven, comprehensive data model. All business services are open and extensible using the MDM design workbench.
  • MDM integrity: These are functions built into the MDM business services that prevent duplicates and enforce master data integrity rules.
  • MDM intelligence: These are functions built into the MDM business services that enable events based on business rules such as notifications. An example would be to notify a bank clerk in case of risk events (a credit score below a critical value) or life events (a master data change indicating that the customer might need a mortgage).
  • MDM data governance: These are functions built into the MDM business services that allow authorization on service and attribute level. This enables privacy and security of data.
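To make the data governance function more tangible, the following is a minimal sketch in Python of authorization on service level and on attribute level. It is not the MDM Server API: the role names, the `POLICY` table, and the functions `authorize_service`, `filter_attributes`, and `get_party` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePolicy:
    """Hypothetical policy: services a role may call, attributes it may read."""
    services: frozenset
    readable_attributes: frozenset


# Illustrative policy table; roles, services, and attributes are assumptions.
POLICY = {
    "teller": RolePolicy(
        services=frozenset({"getParty"}),
        readable_attributes=frozenset({"name", "address"}),
    ),
    "risk_officer": RolePolicy(
        services=frozenset({"getParty", "updateParty"}),
        readable_attributes=frozenset({"name", "address", "credit_score"}),
    ),
}


def authorize_service(role: str, service: str) -> bool:
    """Service-level check: may this role invoke the service at all?"""
    policy = POLICY.get(role)
    return policy is not None and service in policy.services


def filter_attributes(role: str, record: dict) -> dict:
    """Attribute-level check: return only the attributes the role may read."""
    policy = POLICY.get(role)
    allowed = policy.readable_attributes if policy else frozenset()
    return {key: value for key, value in record.items() if key in allowed}


def get_party(role: str, record: dict) -> dict:
    """Apply both checks before any master data leaves the service."""
    if not authorize_service(role, "getParty"):
        raise PermissionError(f"role {role!r} may not call getParty")
    return filter_attributes(role, record)
```

In this sketch a teller can call `get_party` but never sees the `credit_score` attribute, while an unknown role is rejected before any data is read.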

The runtime is a J2EE compliant application server. WebSphere Application Server and BEA WebLogic are currently supported.

The master data repository of the MDM Central Transaction Server is always a relational database. DB2 for Linux, UNIX and Windows, DB2 for z/OS and Oracle are currently supported. The master data repository stores the master data, the metadata, the history data and the reference data.

The MDM design workbench is a development tool that can either modify existing services or add additional services. Based on a data model for a new master data entity, this tool is able to generate the necessary business services to manage this entity. This supports a model-driven development approach.
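The model-driven idea can be illustrated with a small Python sketch that derives add, get, and update services from an entity model. This is only an analogy under stated assumptions: the function `generate_services`, the in-memory store, and the naming scheme are hypothetical and do not reflect what the workbench actually generates.

```python
def generate_services(entity: str, attributes: dict) -> dict:
    """Derive basic services for an entity from its model.

    `attributes` maps attribute names to Python types. The in-memory
    dict stands in for a repository table (an assumption of this sketch).
    """
    store = {}

    def validate(data: dict) -> None:
        unknown = set(data) - set(attributes)
        if unknown:
            raise ValueError(f"unknown attributes: {sorted(unknown)}")
        for name, value in data.items():
            if not isinstance(value, attributes[name]):
                raise TypeError(f"{name} must be {attributes[name].__name__}")

    def add(key: str, data: dict) -> str:
        validate(data)
        if key in store:
            raise KeyError(f"{entity} {key!r} already exists")
        store[key] = dict(data)
        return key

    def get(key: str) -> dict:
        return dict(store[key])

    def update(key: str, data: dict) -> dict:
        validate(data)
        store[key].update(data)
        return dict(store[key])

    # Service names follow the entity name, for example addProduct.
    return {f"add{entity}": add, f"get{entity}": get, f"update{entity}": update}
```

Calling `generate_services("Product", {"sku": str, "price": float})` returns three callables whose validation rules come entirely from the model, which is the essence of the model-driven approach.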

The MDM applications which are shipped with the IBM InfoSphere MDM Server are:

  • MDM Data Stewardship User Interface: This intuitive user interface allows easy, collaborative creation of hierarchies and groups. It also supports the typical tasks of data stewards to split and merge suspect duplicates.
  • MDM Event Management Client: This client allows triggering events and scheduling processing on a customer level.
  • MDM Batch Job Processor: This client is designed to manage batch processing efficiently. It supports multi-threading, pacing and logging for batch jobs.
  • MDM Fast Track Server: This is a component used when DB2 z/OS is used continually.
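The three batch characteristics named above (multi-threading, pacing, and logging) can be sketched with the Python standard library. This is a generic illustration, not the MDM Batch Job Processor itself; the function `run_batch` and its parameters `workers` and `max_per_second` are assumptions made for this example.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("batch")


def run_batch(records, handler, workers=4, max_per_second=None):
    """Process records on a thread pool and return (succeeded, failed).

    Pacing: at most `max_per_second` records are submitted per second.
    Logging: every failed record is logged with its error.
    """
    interval = 1.0 / max_per_second if max_per_second else 0.0
    futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for record in records:
            futures.append((record, pool.submit(handler, record)))
            if interval:
                time.sleep(interval)  # throttle the submission rate
    # Leaving the with-block waits until all submitted work has finished.
    succeeded, failed = [], []
    for record, future in futures:
        error = future.exception()
        if error is None:
            succeeded.append(future.result())
        else:
            log.error("record %r failed: %s", record, error)
            failed.append(record)
    return succeeded, failed
```

Pacing matters in this setting because a batch load competes with online transactions for the same hub; throttling submissions is one simple way to protect real-time response times.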

We will map the features of the MDM Central Transaction Server to the Transaction Hub MDM pattern to show why the IBM InfoSphere MDM Server is well-suited to implement this pattern. Before we can complete this mapping, it is important that you get a more detailed understanding of how the services work.

AddParty service example

Figure 5 shows a description of how the AddParty service works. This is just a brief introduction to show you how the MDM business services, which are available with the MDM Central Transaction Server, work.

Figure 5: Illustration of the AddParty service

We assume the AddParty service is called to add a person or customer and works as follows:

  • An application invokes the AddParty service (1) through a Web service call.
  • The SOA service interface receives the call through the Web service interface (2) and invokes the AddParty service (3). It also establishes the transactional context for the AddParty service and can invoke this service as part of a global transaction using a two-phase commit protocol. For Web services, this could mean support for the WS-AtomicTransaction standard.
  • The AddParty service (3) invokes a security service for authorization (4).
  • Assuming authorization succeeds, the business rule MustSearch (5) for duplicates forces the invocation of a SearchPerson service (6) to check for duplicates.
  • This service uses a PartyMatching service (7) to check whether or not the new customer is a duplicate or a suspected duplicate.
  • Assuming that the new customer is not marked as a suspect (8), normal processing continues.
  • In this case, this means that the AddPerson service (9) is called. This is a composite service consisting of several fine-grained services (10) which are all invoked.
  • One of these fine-grained services, the AddPartyAddress service (11), invokes a data validation service (12) for address standardization and verification.
  • Assuming all fine-grained services complete successfully, the AddPerson service call now creates a new customer record in the master data repository (13). Due to a business rule "NewPartyAdded" (14) an event notification is triggered creating a JMS message (15) as notification to the SOA service interface (2).

The AddParty service has now successfully created a new customer. As shown, the AddParty service is a coarse-grained service that invokes a large number of other services when it is executed.
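The control flow of the steps above can be summarized in a short Python sketch. It is a simplified, in-memory illustration and not the MDM Server implementation: the `Hub` class, the naive matching in `is_suspect`, and the address normalization are hypothetical, the notification list merely stands in for a JMS queue, and transaction handling (two-phase commit) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """In-memory stand-ins for the hub's resources (assumptions of this sketch)."""
    repository: list = field(default_factory=list)      # master data repository
    notifications: list = field(default_factory=list)   # stands in for a JMS queue
    authorized_users: set = field(default_factory=lambda: {"app1"})


def standardize_address(address: str) -> str:
    """Toy address standardization: collapse whitespace, upper-case."""
    return " ".join(address.split()).upper()


def is_suspect(hub: Hub, person: dict) -> bool:
    """Naive matching: same normalized name and standardized address."""
    key = (person["name"].strip().lower(), standardize_address(person["address"]))
    return any((p["name"].lower(), p["address"]) == key for p in hub.repository)


def add_party(hub: Hub, user: str, person: dict) -> dict:
    if user not in hub.authorized_users:          # step (4): authorization
        raise PermissionError(f"{user!r} may not call AddParty")
    if is_suspect(hub, person):                   # steps (5)-(8): search and match
        return {"status": "SUSPECT_DUPLICATE"}
    record = {                                    # steps (9)-(12): fine-grained services
        "name": person["name"].strip(),
        "address": standardize_address(person["address"]),
    }
    hub.repository.append(record)                 # step (13): persist
    hub.notifications.append(                     # steps (14)-(15): event notification
        {"event": "NewPartyAdded", "party": record["name"]}
    )
    return {"status": "ADDED", "party": record}
```

The point of the sketch is that the caller issues a single coarse-grained call, while authorization, duplicate checking, standardization, persistence, and notification all happen inside the hub.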

Mapping analysis

For the mapping analysis, the structure of the MDM services component of the MDM Reference Architecture is used, as Table 2 illustrates. The Available column indicates whether or not IBM InfoSphere MDM Server supports the capabilities required, and the last column names the MDM Server component that provides them.

Table 2: Mapping analysis
MDM pattern component | Function group | Available | MDM Server component
Interface services | Real-time request/response | Yes | SOA service interface
 | Batch | Yes |
 | Support for invocation by various means such as JMS, Web services or RMI | Yes |
Lifecycle management services | Out-of-the-box MDM services for all master data domains | Yes | MDM business services
Data quality management services | Data validation and cleansing services for data integrity and master data standardization | Yes | MDM integrity
 | Reconciliation services | Yes |
Master data event management services | Business rules creation and enforcement | Yes | MDM intelligence
 | Trigger notification based on events (life events, risk events, cross- and up-sell events, and so on) | Yes |
Hierarchy and relationship management services | Hierarchy services | Yes | MDM business services
 | Relationship services | Yes |
Authoring services | Authoring services | Yes | MDM business services
Base services | Privacy and security services | Yes | MDM data governance
 | Audit logging services | Yes | MDM business services
 | Workflow services | Yes |
 | Search services | Yes |
Master data repository | Metadata | Yes | Master data repository
 | Master data | Yes |
 | History data | Yes |
 | Reference data | Yes |

As you can see from the table above, the IBM InfoSphere MDM Server is software well-suited to implement the Transaction Hub MDM pattern. Of course, the IBM InfoSphere MDM Server can be used to implement the Registry and Coexistence Hub MDM patterns as well since they only need a subset of the functions required by a transaction hub. Furthermore, the MDM Central Transaction Server can address several aspects discussed in the forces section of the Transaction Hub MDM pattern description.

For example, regarding high availability and disaster recovery, the MDM Central Transaction Server benefits from the following:

  • The WebSphere Application Server supports vertical and horizontal clustering techniques with a load-balancing mechanism, which makes the MDM application highly available.
  • Similarly, the DB2 database has key features for high availability: DB2 for z/OS supports the Parallel Sysplex feature for maximum availability. DB2 on the distributed platforms has, for example, the High Availability and Disaster Recovery (HADR) feature and a hot standby option using data mirroring techniques on the storage level. Depending on the available hardware and the high availability requirements, the IT architect can choose the option that fits best.
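When comparing such operational models, a rough availability estimate can help. The following Python sketch uses the standard textbook formulas for redundant (parallel) and chained (serial) components, under the simplifying assumption that failures are independent. The function names are hypothetical and any numbers passed in are illustrative, not measurements of any product.

```python
def parallel_availability(node_availability: float, nodes: int) -> float:
    """Availability of a cluster that is up while at least one node is up.

    Assumes independent node failures, which real clusters only approximate.
    """
    if not 0.0 <= node_availability <= 1.0 or nodes < 1:
        raise ValueError("availability must be in [0, 1] and nodes >= 1")
    return 1.0 - (1.0 - node_availability) ** nodes


def serial_availability(*tiers: float) -> float:
    """Availability of a chain of tiers that must all be up at the same time."""
    result = 1.0
    for availability in tiers:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be in [0, 1]")
        result *= availability
    return result
```

For instance, `serial_availability(parallel_availability(0.99, 2), 0.999)` estimates a two-node application server cluster in front of a single database tier. The result is dominated by the weaker tier, which is why both the application server and the database need their own high-availability measures.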

For example, regarding the evolutionary nature of MDM Systems, the MDM Central Transaction Server offers:

  • Easy extension of existing services or addition of new services using the MDM Design Workbench
  • Support for many open-industry standards enables flexible integration with existing and new application systems as the need arises

Summary

Applying best practices and reusable assets such as proven architecture patterns is crucial for the successful deployment of Master Data Management solutions. This article showed you how the Transaction Hub MDM pattern is a great resource for implementing an MDM System. You have also seen why the IBM InfoSphere MDM Server is the right software solution for implementing this pattern in a concrete project.


Book Preview

The Transaction Hub MDM architecture pattern described in this article is an adaptation of the full MDM architecture pattern description in the following book: Enterprise Master Data Management: An SOA Approach to Managing Core Information. The book was written by Allen Dreibelbis, Eberhard Hechler, Ivan Milman, Martin Oberhofer, Paul van Run and Dan Wolfson and will be published in June 2008 by Pearson Publishing (ISBN-10: 0132366258, ISBN-13: 9780132366250). The book covers many of the key aspects for understanding what is meant by Master Data Management, the business value of Master Data Management and how to architect an Enterprise Master Data Management solution. It provides a comprehensive guide to architecting a Master Data Management solution that includes a reference architecture, solution blueprints, architectural principles, patterns and properties of MDM systems. The book describes the relationship between MDM and Service Oriented Architectures and the importance of data governance for managing master data. It presents this material in a vendor- and product-agnostic way, focusing on the principles and methodologies for designing the right architecture for an MDM solution. For a chapter-by-chapter description, see the sidefile.

Published: March 20, 2008