InfoSphere MDM for master data governance with MDM workflow

Use MDM workflow to improve your master data governance

This article explains the important role that master data management (MDM) workflow plays when considering master data governance within an MDM implementation.


Jay Limburn, Senior Technical Staff Member, IBM

Jay Limburn is an IBM Senior Technical Staff Member and IBM Senior Inventor currently working as a Software Architect on IBM's core MDM product set in IBM Software Group's Information Management division based in Hursley, UK. Jay is chief architect for the client applications and interfaces division, where he is responsible for designing capabilities that empower business users with master data driven applications, encompassing data governance, business process management, data stewardship, and mobile. Jay has a solid background in MDM technologies and has additional areas of expertise in model-driven development, solution delivery and design, Eclipse, and J2EE technologies. Jay has become a recognized industry expert in his field, presenting on MDM and data governance related topics at a number of conferences, and has produced a number of articles on this subject. Jay has also filed 10 patents in these areas.

Trey Anderson, Product Manager, IBM

Trey Anderson has had various leadership positions in engineering, professional services, and product management. As an MDM product manager, Trey is focused on strategic integrations and solutions between InfoSphere MDM and other IBM technologies. He is currently responsible for defining the GTM strategy and solution for IBM BPM. Trey graduated from Texas A&M University with a degree in accounting and management information systems. He enjoys exploring Austin’s various nature trails with his two boys.

Harsha Kapre, Product Manager, IBM

Harsha Kapre serves as a product manager for IBM's InfoSphere Master Data Management offering. He has ownership of the product strategy and roadmap specifically for Collaborative MDM with a focus on the Product Domain. In his previous role, Harsha was the MDM Client Advocate responsible for client relationships and references worldwide. Harsha has been with IBM for over 14 years and holds a degree in Electrical Engineering and Computer Science from the University of California at Berkeley.

18 July 2013



MDM workflow plays an important role when considering master data governance within an MDM implementation. The authors of this article are leading experts in the field of master data governance and have spent many hours working with large organizations looking to become successful with their master data implementations. Drawing on experience across many industry segments and many different styles of MDM, this article is designed to help enterprise architects understand why the workflow capabilities of their MDM systems need to be appropriately considered. As you will discover throughout this article, to maximize the benefits that MDM can bring to your business, you should consider MDM workflow both as internal to the MDM system and as external across the business, where it acts as a provider of data to broad enterprise processes.

MDM styles

Typically, MDM use cases can be separated into four different styles. Each style can be used in isolation; however, an implementation of one style typically evolves over time into a combination of styles once the business stakeholders further appreciate the benefits that a master data implementation brings.

Master data comprises the key pieces of information against which a business operates. Customers, products, accounts, patients, suppliers, vehicles, employees, and assets are just a few examples of master data domains. This important data is usually fragmented across multiple systems. Information relating to the same customer, product, or supplier can be inconsistent across these multiple systems, leading to bad decisions based upon incomplete data, poor customer service as a result of inaccurate data, or data becoming untrusted and underutilized by lines of business.

IBM InfoSphere® MDM provides a rich and dynamic set of capabilities for creating a trusted view of master data within an organization. It provides industry leading functionality in the areas of matching, suspect duplicate processing, data stewardship, security, and performance. IBM InfoSphere MDM also provides the single foundation on which mission critical data is pushed out to lines of business for consumption, ensuring business users have absolute trust in their data. This means those users can make decisions quickly, knowing that they have the most accurate information on hand to make those decisions.

Collaborative MDM

Collaborative MDM provides capabilities for creating and managing master data where authoring and collaboration are key requirements. Collaborative MDM is typically associated with products, but it is appropriate for managing any domain with use cases for authoring, enrichment, and workflow. Master data created within collaborative MDM is typically exported to consuming systems, or channels, that leverage the trusted information for specific initiatives. The master data can also be exported to a multidomain operational system to support larger concurrent transactions and relationships with other domains. In industries such as retail where a large number of products can be introduced and updated and streamlining business processes is important, a collaborative authoring approach to master data is typically appropriate.

In this implementation style, it is typical for large product catalogs to be created and new product introductions to occur frequently as part of seasonal adjustments or changing market needs. Typically, there is also a requirement for the solution to remain agile to keep up with rapidly changing conditions. A collaborative MDM style allows the MDM system to adapt easily to changing requirements and for new products to be introduced with proper governance (approvals, process flows, hierarchy management, attribute changes, and so on). IBM InfoSphere MDM Collaborative Edition provides a rich platform that addresses these use cases. It allows product authors to create new products and enrich information received from suppliers. This information flows through customizable workflows to manage the introduction and maintenance of those products. Additional master data governance can be enforced through business rules, validations, and approval processes for new product hierarchies and new attributes. Check in/check out capabilities and parallel processing features ensure data integrity is enforced, allowing multiple authors to manipulate portions of the catalogs or products at the same time.

In some cases, a collaborative MDM style is integrated with an operational style to allow the point of consumption by the lines of business to take advantage of high volume transactions and relationships between different domains.

Registry MDM

A registry style of MDM is often considered the first and most simplistic entry point into creating a master data system. In this style, data from multiple sources can be referenced and loaded into the master data hub. Once in the hub, the records can be matched and linked providing the single consistent golden view of the data. In this style, source systems remain as the system of record and are used to capture any updates to the data. The matched MDM data can be considered as a system of reference, allowing systems to see a real-time calculated view of master data (temporary golden view) or identify the records from source systems that comprise the master data entity (index of linked records). Often, registry MDM implementations are seen as less disruptive to the business because the data sources stay intact and the MDM system becomes an additional system that consumes the existing data and a point of consumption for the lines of business.

Implementation styles vary; however, often the source systems import a delta of any updates into the MDM registry on a daily basis. The updated data is then again matched into the existing MDM data and the matching and duplication services within the MDM system update the golden view of the data.
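The registry flow described above can be illustrated with a minimal Python sketch. It is not InfoSphere MDM code: the SourceRecord type, the entity_id link key, and the "most recently updated value wins" survivorship rule are all assumptions made for illustration. The sources remain the systems of record; the registry only stores their records, takes in a daily delta, and computes the golden view on demand.

```python
from dataclasses import dataclass


@dataclass
class SourceRecord:
    source: str        # owning source system (remains the system of record)
    record_id: str     # identifier within that source
    entity_id: str     # hypothetical link key assigned by a matching step
    attributes: dict   # attribute name -> value
    updated: str       # ISO date of the last update in the source


def apply_delta(registry: dict, delta: list) -> None:
    """Upsert a delta of source records; registry keys are (source, record_id)."""
    for rec in delta:
        registry[(rec.source, rec.record_id)] = rec


def golden_view(registry: dict, entity_id: str) -> dict:
    """Compute a golden view on demand.

    Assumed survivorship rule: the most recently updated value wins.
    """
    linked = sorted(
        (r for r in registry.values() if r.entity_id == entity_id),
        key=lambda r: r.updated,
    )
    view = {}
    for rec in linked:  # later records overwrite earlier ones
        view.update(rec.attributes)
    return view
```

Because the golden view is recalculated from the linked records each time it is requested, a new delta from any source changes the view without the registry ever taking ownership of the data.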

IBM InfoSphere MDM Standard Edition provides an industry-leading platform for creating a registry MDM implementation within an organization. Offering advanced automatic matching and de-duplication capabilities, it can create a single consistent golden view of data spread out across an organization. IBM InfoSphere MDM Standard Edition also provides rich capabilities for manual intervention by a team of data stewards to provide an additional layer of governance of the automatic matching and de-duplication algorithms.

As part of the IBM InfoSphere platform, it provides an easy entry point into your first MDM engagement and provides a basis for maturing your MDM implementation over time.

Operational MDM

Operational MDM styles take the MDM implementation one step further than a registry implementation. In an operational MDM implementation, the MDM system becomes the system of record. Data from source systems is moved from the existing data sources into the MDM system, and ownership of that data moves from the existing data source to the MDM system. Once within the operational MDM system, the data is not only read by other lines of business systems, but it is also updated by any system that is required to make changes to the data.

IBM InfoSphere MDM Advanced Edition includes the capability to create a high performing, secure, and robust platform for creating an operational MDM system. Taking advantage of the same matching and de-duplication capabilities that exist within the Standard Edition, the Advanced Edition provides a platform that allows changes to data stored within the MDM hub to be integrated quickly into a golden view of the data. Also provided is a rich set of flexible business services that provide a secure and robust way for data within the operational system to be consumed by the lines of business.

Hybrid MDM

In a hybrid MDM implementation, the MDM system becomes the system of record for part of the business. Data from source systems is moved from the existing data sources into the MDM system, and ownership of that data moves from a portion of the existing data sources to the MDM system. Alternatively, all the existing data sources could maintain their ownership of the master data; however, the hybrid system of record would provide ownership for existing or new processes and applications that don't have master data to contribute to the MDM systems.

IBM InfoSphere MDM Advanced Edition builds upon the registry style MDM capabilities provided by IBM InfoSphere MDM Standard Edition and complements them with the ability to implement a hybrid style of MDM or, if required, a fully operational style of MDM.

Workflow and master data governance

Workflow provides the coordination of human and system interaction with master data. Master data governance is the practice of improving data quality through policy enforcement. An effective master data governance strategy requires MDM workflow to coordinate the interactions with the data to enforce data quality.

MDM workflow

All editions of InfoSphere MDM provide workflow capabilities appropriate to support these use cases. MDM workflow is provided natively within the MDM engine and integrated with an embedded version of IBM Business Process Manager. IBM Business Process Manager is an industry leading BPM product that lets you rapidly construct, test, and deploy business processes within an organization. It also provides advanced capabilities to integrate with other systems.

Using an embedded version of IBM BPM makes it easy to enrich business processes with mastered data, ensuring that those processes always operate against an organization's most trusted and accurate information. It also applies a layer of data governance to the MDM system: business process orchestration guides a data steward through the steps required to remediate data inaccuracies, keeping the data as accurate as possible. The MDM Application Toolkit, provided by IBM as a component of the IBM InfoSphere Master Data Management platform, provides advanced accelerators that can be used to facilitate rapid construction of business processes that use master data.

When you are considering a master data management implementation, you can significantly benefit from also considering business process management (BPM). When considering the two strategies together either as a green field project or to complement an existing MDM or BPM implementation, you should consider leveraging the benefits of aligning the implementations.

Types of data governance

When tied to MDM workflow, data governance can be categorized into two main areas:

Passive data governance

Passive data governance is the application of governance for data quality as an ongoing operational task within an organization. Typically, a team of data stewards responds to events triggered by the MDM system, alerting them to the fact that a data quality issue has been detected and needs addressing, or that a business policy has been violated by a piece of data and manual intervention is required to remediate that data to maintain integrity over that data. Passive data governance is the process of reactively addressing data quality issues that have already arisen within the master data.

Active data governance

Active data governance is the application of governance for data quality at the point of consumption or creation by the users of the data. Whether the lines of business are consuming data stored within the MDM system or creating master data for the first time, enriching enterprise processes with master data and letting business users act on that data can drastically improve the quality of, and trust in, your data. Business users, through their direct interaction with the persons, organizations, products, or accounts that the master data represents, are the experts on that data and are therefore best placed to take the correct action with it as part of an enforced business process. Effective business processes are important when adopting active data governance to ensure that no undue risk is introduced as part of a new creation process; however, any active data governance implementation should always be backed by a solid passive data governance strategy.

Figure 1 shows the difference between active and passive data governance:

Figure 1. Difference between active and passive data governance

Typical use cases and implementation styles

Operational hub with MDM workflow

Within an operational MDM implementation, MDM workflow can be implemented using the embedded IBM Business Process Manager capabilities.

For data stewardship

Applying MDM workflow to data stewardship tasks within an operational environment provides a layer of governance around the steps that are required to monitor and remediate data quality issues within the operational hub. Let's look at a scenario concerning suspect duplicate processing.

The MDM system has identified a number of customer records that could relate to the same individual, but the score of the possible match indicates that human intervention is required to either confirm or reject the match and allow the records to be collapsed if required. MDM workflow can be used to respond to the suspect match event within the MDM system. On receiving the suspect match event from the MDM system, MDM workflow can kick off a suspect duplicate processing task and coordinate the steps that should be taken by individuals or systems within the organization ensuring that the suspect duplicate within the MDM system is remediated in the correct manner.

Figure 2 depicts the role of MDM workflow for operational data stewardship:

Figure 2. Role of MDM workflow for operational data stewardship

The MDM workflow that dictates how this suspect duplicate event should be handled within the MDM system enforces a layer of data governance to any changes made to the system. It does this by:

  • Ensuring that the task is routed to the best individual to make any change to the data.
  • Ensuring that any approvers required to approve the data are notified and are taking appropriate action in a timely manner.
  • Enforcing any business rules that need to be applied to ensure the quality of the data.
  • Providing a platform to manage and track effectively the steps and actions taking place within an organization as a result of a particular suspect duplicate processing event.
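The suspect duplicate scenario can be sketched in a few lines of Python. This is an illustration only and does not use any InfoSphere MDM or IBM BPM API: the two thresholds, the StewardTask type, and the steward_for routing function are hypothetical. The sketch shows a match score being routed either to an automatic outcome or to a tracked task that a data steward resolves with an approver's sign-off.

```python
from dataclasses import dataclass, field

AUTO_LINK = 0.90  # assumed: at or above this score, records are linked automatically
REVIEW = 0.60     # assumed: from REVIEW up to AUTO_LINK, a data steward must decide


@dataclass
class StewardTask:
    record_ids: tuple
    score: float
    assignee: str
    status: str = "open"
    history: list = field(default_factory=list)  # simple audit trail


def route_match(record_ids, score, steward_for):
    """Return 'link', 'ignore', or a StewardTask that needs human review."""
    if score >= AUTO_LINK:
        return "link"
    if score < REVIEW:
        return "ignore"
    # Route the task to the individual best placed to make the decision.
    task = StewardTask(tuple(record_ids), score, steward_for(record_ids))
    task.history.append("created")
    return task


def resolve(task, decision, approver):
    """Record the decision and the approver so the remediation can be tracked."""
    if decision not in ("collapse", "keep_separate"):
        raise ValueError(f"unknown decision: {decision}")
    task.status = decision
    task.history.append(f"{decision} approved by {approver}")
    return task
```

The history list stands in for the tracking that a workflow engine would provide; in a real process each entry would correspond to a step completed by a person or a system.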

For enterprise use cases

Let's look at an enterprise-level use case to understand how MDM workflow can improve data quality within the MDM system at the point of creation or consumption by the lines of business.

A large bank wants to let individuals register for a new offering. A large number of these individuals are expected to be existing customers, so there is obvious value in reusing existing customer data from the master data system where it exists. In addition, the organization wants to ensure that for each new customer, a complete and comprehensive master record is created based on what other lines of business require - not just the minimum information for the particular product purchase or account being created. No existing system can house this complete record, and this information is not owned by one division; it is owned by the enterprise.

Using MDM workflow, the customer onboarding process for this new offering can be defined so that the organization's existing master data is leveraged through each step of the process and duplicate customer records are not created in the system. Using MDM workflow, data governance is applied at the point of creation of the master data to ensure that the MDM system is searched to locate customers with similar credentials before a new customer record is created. If a match is found, then the existing customer data is used to complete the registration for the new offering; otherwise, a new customer record is created and the business process can continue to complete the rest of the required steps.

Figure 3. Role of MDM workflow for enterprise use cases

By enriching enterprise processes, such as customer onboarding, with master data:

  • Data quality is enforced at the point of entry following the principles of active data governance.
  • A greater level of data quality in the system reduces the overhead on data stewards who may have otherwise had to complete a suspect duplicate processing task to resolve duplicate customer records.
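The search-before-create step in the onboarding process can be sketched as follows. This is a simplified illustration, not product code: the hub is a plain list, the field names are hypothetical, and an exact comparison on a normalized name plus date of birth stands in for the far more capable matching that an MDM system would perform.

```python
def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def onboard(hub: list, applicant: dict) -> tuple:
    """Search the hub first; create a record only when no match is found.

    Returns (customer_record, created_flag).
    """
    key = (normalize(applicant["name"]), applicant["date_of_birth"])
    for customer in hub:
        if (normalize(customer["name"]), customer["date_of_birth"]) == key:
            # Reuse the existing master record and add only attributes it lacks.
            for attr, val in applicant.items():
                customer.setdefault(attr, val)
            return customer, False
    record = dict(applicant)
    hub.append(record)
    return record, True
```

The created flag lets the surrounding business process branch: an existing customer continues with their mastered data, while a new customer continues through the remaining data capture steps.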

Registry hub with MDM workflow

Within a registry MDM implementation, MDM workflow can be implemented using the embedded IBM Business Process Manager capabilities.

For data stewardship

Applying MDM workflow to data stewardship tasks within a registry MDM environment provides a layer of governance around the steps that are required to monitor and remediate data quality issues within the registry hub and its sources. Let's look at a scenario concerning changes made to the data sources of a registry style of MDM and how MDM workflow can be used to enforce the quality of the data within the other sources.

As mentioned previously, in a registry style implementation of MDM, ownership of the data sources remains outside of the MDM system, and a snapshot of the data held within the sources is brought into the MDM system, allowing a golden view of that data to be created. Changes to the sources are brought into the MDM system and rolled up into the golden view; however, updates to the golden view are not automatically reflected in the other data sources. You can introduce MDM workflow here to govern the steps that should be taken to determine whether an update to a source outside of the MDM system is required, and also to manage the steps that should be taken to apply that update.

For example, a medical provider has implemented an MDM system that pulls in data from three different data sources to provide a golden view of a patient. A change of address is made to the 'Doctors Office' source, and this change of address is pushed into the master data system and rolled into the golden view. This ensures that any applications that read from the MDM registry will use the newly updated address. However, the 'Hospital' source, which is used directly by a consultant working at the hospital, is unaware of the change of address. This could lead to important correspondence being sent to the incorrect address, unless the patient calls the hospital to update their address. Using MDM workflow, you can define a business process that notifies the data source owner of the 'Hospital' system that the address update has occurred and should be applied in the 'Hospital' data source. You can implement business rules within the business process to determine which data sources should be notified when an attribute within a particular source is updated and what action should be taken. The business process can then further enforce the steps that should be taken to ensure that the data quality issue is resolved.

Figure 4 shows the role of MDM workflow as part of a registry-based data stewardship use case:

Figure 4. Role of MDM workflow as part of a registry-based data stewardship use case

Implementing MDM workflow in this scenario:

  • Ensures that systems or users that don't utilize the master data registry can still benefit from updates to other systems within the master data registry.
  • Allows complex rules to be implemented to determine the conditions under which a particular source should be updated.
  • Provides a platform that allows governance to be applied to the update of the data source to ensure satisfactory completion of the update.
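The notification rules in the patient address example might look like the following Python sketch. The rule table and function are hypothetical; the article names only the 'Doctors Office' and 'Hospital' sources, so the third source ("pharmacy") is invented here purely to fill out the example.

```python
# Hypothetical rule table: attribute -> sources whose owners should be notified
# when that attribute changes. "pharmacy" is an invented third source.
NOTIFY_RULES = {
    "address": {"doctors_office", "hospital", "pharmacy"},
    "phone": {"doctors_office", "hospital"},
}


def notifications_for(attribute: str, changed_in: str) -> list:
    """List pending notifications, excluding the source that made the change."""
    targets = NOTIFY_RULES.get(attribute, set()) - {changed_in}
    return [
        {"source": s, "attribute": attribute, "status": "pending"}
        for s in sorted(targets)
    ]
```

Each returned entry represents a task for a data source owner. A workflow engine would track the status of each task until the owner confirms that the source has been updated.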

For enterprise use cases

Customer centricity increases customer satisfaction by improving the quality of customer service at the point of engagement. To realize this opportunity, these key points of customer interaction often require a cross enterprise customer view.

Organizations struggle to provide consistent levels of services and properly promote offers as users have limited information available at the point of the engagement. These system users typically follow a business process that only has access to one or a few systems within the organization. This is a very limited view considering the number of customer accounts stored in various systems and different geographies. In addition, the many customer records and their many accounts are not easily linked. Manually connecting these customers and accounts across many systems (links) and within a system (duplicates) is time-consuming and error-prone.

Let's look at an example: A large organization wants to provide a comprehensive view to individuals that work closely with customers (for example, customer service representatives (CSR) and account managers). With a complete view of the customer, CSRs can identify and provide the correct level of service (in other words, bronze, silver, or gold). Presenting relevant cross/up-sell offers requires that account managers have visibility into past purchases, perhaps across channels and partners, and understand lifetime value.

The first step to realize these opportunities is ensuring master data from across the organization is centralized in MDM. MDM's powerful matching capabilities identify and link customers from various systems (matches) and within the same system (duplicates). Workflow can then be introduced to interact with the master data at the point of engagement. These processes start by searching MDM to find the single, trusted version of the customer. The same advanced matching algorithms used by the MDM system to link customer records across systems are employed to help users find customer records through searching. As a result of this, the correct customer record is identified regardless of errors in the customer data or errors entered by the user in the search screen. The enterprise process that is being executed then becomes much more valuable to the business as it is working against a set of complete trusted data that represents the customer considering all of the information we hold that represents them.

Enriching enterprise processes with master data in this scenario has:

  • Ensured that the business users are able to provide the most effective service to their customers by having a complete view of their customers available to them.
  • Ensured that the business users have all the information they require to further cross-sell and up-sell services to that customer.
  • Ensured that the correct customer is identified within the system quickly, regardless of data entry errors and inconsistencies.
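Error-tolerant customer search can be illustrated with Python's standard library. This sketch uses difflib.SequenceMatcher as a simple stand-in for the matching algorithms in an MDM system, which are considerably more sophisticated; the customer structure and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher


def search(customers: list, query: str, threshold: float = 0.8) -> list:
    """Return (score, customer) pairs at or above threshold, best match first."""
    q = query.lower().strip()
    scored = [
        (SequenceMatcher(None, q, c["name"].lower()).ratio(), c)
        for c in customers
    ]
    hits = [pair for pair in scored if pair[0] >= threshold]
    # Sort on the score only, so the customer dicts are never compared.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)
```

With this approach, a search for "Jonathon Smith" still finds a customer stored as "Jonathan Smith", because the similarity score tolerates the single mistyped character.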

Collaborative authoring with MDM workflow

Within a collaborative MDM implementation, MDM workflows can be implemented using either the embedded IBM Business Process Manager workflow engine or the integrated collaborative workflow engine, which has been optimized for high collaboration use cases such as product authoring.

For data stewardship

Applying MDM workflow to data stewardship within a collaborative style of MDM offers the ability to manage the interaction between the users and the MDM system until the data quality issue has been remediated. This is a particularly important characteristic within a collaborative MDM environment where common data stewardship tasks such as product matching and de-duplication require interaction across multiple lines of business and numerous approval steps before the remediation can be considered complete. The embedded workflow engine provided by IBM Business Process Manager offers the capabilities that are most suited to these types of scenarios.

For enterprise use cases

For a product introduction process, the collaborative workflow engine integrated within InfoSphere MDM Collaborative Edition provides an enterprise-ready, flexible workflow engine that has been optimized against the Collaborative Edition data model for collaborative authoring of MDM data. In use cases that involve collaborative creation of master data, the collaborative workflow engine should be strongly considered as the preferred MDM workflow capability.

The collaborative workflow engine, shown in Figure 5, includes APIs that can be leveraged by larger enterprise-wide business processes, allowing product introduction workflows to act as micro-processes to larger business processes defined across an organization. IBM Business Process Manager may be a suitable choice in this scenario.

Figure 5. Role of the integrated collaborative workflow engine as part of an enterprise use case

Implementing MDM workflow in this scenario:

  • Provides a structured, orchestrated set of steps that must be completed to control the introduction of new products and offerings.
  • Utilizes a workflow capability that has been optimized for high collaboration environments and frequently changing data models.
  • Allows the creation of the master data within the collaborative MDM system to be controlled as part of a broader enterprise wide process.
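A product introduction workflow of this kind can be pictured as an ordered set of steps with checks between them. The Python sketch below is illustrative only and does not reflect the Collaborative Edition workflow engine or its APIs; the step names and the ProductIntroduction class are hypothetical.

```python
# Hypothetical step order for a new product introduction workflow.
STEPS = ["draft", "enrich", "validate", "approve", "published"]


class ProductIntroduction:
    def __init__(self, sku: str):
        self.sku = sku
        self.step = STEPS[0]
        self.attributes = {}

    def advance(self, required=()):
        """Move to the next step only if the required attributes are present."""
        missing = [a for a in required if a not in self.attributes]
        if missing:
            raise ValueError(f"cannot leave '{self.step}': missing {missing}")
        index = STEPS.index(self.step)
        if index == len(STEPS) - 1:
            raise ValueError("workflow already complete")
        self.step = STEPS[index + 1]
        return self.step
```

A broader enterprise process could treat the whole object as one micro-process, starting it and waiting for the "published" step, without needing to know about the individual authoring steps.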

In scenarios where collaborative MDM implementations require data stored within the MDM system to be surfaced directly to participate in a business process, the advanced integration capabilities provided by IBM Business Process Manager make it the preferred and most robust option.

Choosing an implementation style

The decision concerning which MDM implementation style to choose and the role of MDM workflow within that implementation style can be a complex one. Here, we cover some of the key items that should be considered when making this choice.

What domain can your master data typically be associated with?

  • Purely product data would tend to suggest a collaborative MDM solution due to its advanced product catalogue and hierarchy support.
  • Party, account, or product data that includes cross-domain dependencies would typically suggest an operational or hybrid style of MDM, which can use the industry leading out-of-the-box data models supplied by the Advanced Edition of InfoSphere MDM.

What are the workflow requirements of the system?

  • Workflow requirements surrounding product introduction and authoring are typically best suited to a collaborative style of MDM, such as that provided by InfoSphere MDM Collaborative Edition.
  • Workflow requirements that involve product introduction or authoring as part of an enterprise process may be best suited to an implementation that uses InfoSphere MDM Collaborative Edition and IBM Business Process Manager embedded workflow.
  • Workflow requirements in high volume, high transaction environments, regardless of domain type, are typically best served by a combination of InfoSphere MDM Standard or Advanced Editions with IBM Business Process Manager embedded workflow.
  • Workflow requirements surrounding data stewardship, where data governance is applied to enforce the quality of your master data, typically fit well with the capabilities provided by the Standard or Advanced Editions in conjunction with IBM Business Process Manager embedded workflow.
  • Workflow requirements concerning master data consumption are typically addressed as part of an enterprise business process management strategy and are typically a good fit for the Standard or Advanced Editions of InfoSphere MDM.
  • What are the workflow requirements concerning monitoring of active workflows, mobile support, and social interaction of users? The IBM BPM based embedded workflow engine provides rich capabilities to address these types of requirements.

What makes sense for my existing IT infrastructure?

  • Large organizations typically already have existing investment in BPM and MDM. What makes most sense from an IT perspective? Can the existing IT investment in BPM/MDM be leveraged?

Organization strategic direction

  • It is common for organizations to have corporate edicts in place aimed at streamlining technologies and vendors, or at reducing data center costs by minimizing the number of machines, and so on. Quite often, these non-functional requirements will have an overriding effect on the implementation decision and should be considered fully.


Bringing MDM and BPM projects together ensures that trusted data is available to an organization's business processes and that data governance is fully enforced to improve the quality of data. This is an important consideration that should not be ignored if investment in such projects is going to yield the business benefits that are required.

This article provided an overview of the different styles of MDM that can be implemented and the important role MDM workflow can play to complement each MDM style. We covered some high-level use cases and also provided important considerations that should be fed into the decision process.

MDM implementations vary between industries and with organization size and structure, so these considerations are designed to provide information that guides the decision process down the correct path. However, there are no hard-and-fast rules that determine which implementation style is the right one, because different implementations can provide a robust solution to a number of common use cases. Therefore, the information in this article should be used as a guide to formulate a discussion with the business stakeholders, the IT organization, and the IBM subject matter experts to ensure the optimum solution is chosen and implemented for maximum effect.




