An MDM solution enables an enterprise to govern, create, maintain, use, and analyze consistent, complete, contextual, and accurate master data information for all stakeholders, such as line of business systems, data warehouses, and trading partners. It provides a customizable framework of components that control the lifecycle management of master data, quality and integrity of the data, and stateless services to control the consumption and distribution of data. An MDM solution:
- Provides business value by standardizing the way that data is used across an enterprise, treating master data as a unique corporate asset
- Provides the authoritative source for master data within the enterprise
- Provides high value actionable services over the data that create business value, such as by triggering data governance policies to resolve name conflicts and triggering actions based upon changes to data, such as when a name or an address changes.
An MDM solution is more than maintaining a central repository of master data within the enterprise. The MDM reference architecture provides a resilient, adaptive architecture to enable and ensure high performance and sustained value. Some of the key architecture drivers that influence the design for the solution architecture are the following:
- It should provide a framework to manage and maintain master data as an authoritative source, and securely deliver accurate, up-to-date master data across the enterprise to authorized users and systems.
- It should support the need to coordinate and manage the lifecycle of master data.
- It should make accurate critical business information available as a service that can be used in the context of a business process at the right time by any authorized user, application, or process within the enterprise.
- It should provide the ability to cleanse data being used operationally, to improve the quality and make it more consistent for use in the operational environment.
- It should support the need to make master data active by detecting changes and generating operations to manage master data, implement data governance policies, and create business value.
Links to more information regarding MDM offerings from IBM can be found in the Resources section.
Before you dive into MDM architecture patterns, embark on a little excursion to clarify what is meant by architectures, patterns, architecture patterns, master data, MDM, and MDM solutions. Although the terms MDM solutions and MDM solution patterns are used, this article concentrates on MDM architecture patterns.
- Master data: At a very high level, there are essentially three different types of data management systems: transactional data management systems, such as order entry systems processing transactional data; analytical data management systems, such as data warehouses processing analytical data; and master data management systems processing master data.
- Transactional data describes a customer's operational activities and state with regard to business transactions, such as payments, deposits/withdrawals, invoices, claims, transfers, and service and sales activities. Transactional data represents the business actions and activities that occur over a period of time or even at a given point in time.
- Analytical data is related to, for instance, a customer's behavior and performance from a historical and future perspective. This analytical data is prepared and then used to generate reports for sales and marketing analytical purposes, profitability analysis per business segment, and insight into a buyer's behavior.
- Master data is the enterprise-wide data and facts describing key business entities. There are many different types of master data, such as facts and relationships among customers, employees, partners, and suppliers, as well as details, facts, hierarchies of products, items, materials, and bill of materials. Master data could also be facts and relationships of locations, entities, devices, and equipment. In the customer case, for instance, this data contains relatively static attributes of customers, such as attributes on the identity, detailed profile, and relationships among customers.
- Master data integration (MDI): MDI is a set of disciplines, strategies, technologies, and solutions, used to perform the following tasks:
- Understand: Understanding and analyzing the data, including discovering, modeling, and governing the information quality and structure of master data.
- Cleanse: Standardization, merging, and correcting master information.
- Transform and move: Transform, enrich, place, and synchronize master information.
- Federate: Provide virtualized access to master data in a heterogeneous data landscape.
- MDM: MDM is a set of disciplines, strategies, technologies, and solutions, used to create and maintain consistent, complete, contextual, and accurate business data for all stakeholders (users and applications) across and beyond the enterprise. These disciplines and strategies can be combined with information management products and services in order to provide a single view of all different types of master data, such as customers and products. An important component of MDM is the set of processes to build and maintain such a single view over the lifetime of the MDM system. This can be achieved through a physical or logical master information hub. This includes policies and procedures for access, update, data and information governance, and overall synchronization and coordination of this information hub with other participating systems enterprise-wide. This view could even be expanded to multiple enterprises. MDI is considered part of any MDM solution. In some cases, customers are interested solely in implementing MDI solutions to consolidate, merge, migrate, and assess their master data.
- Architecture: The term architecture, when used in an IT context, can be ambiguous unless it is further qualified with terms such as enterprise architecture, solution architecture, or application architecture. These architectures address specific situations or problems to be solved, whereas other generalized architecture definitions, such as a reference architecture, technical architecture, or product architecture, may be used as a starting point for defining one of the situational architectures. The term architecture blueprint is defined as a prescriptive approach for developing adaptable architectures rapidly by leveraging proven reusable assets, methodologies, patterns, and models. Enterprise architecture is the linkage among business strategy, IT strategy, and IT implementation.
- MDM architecture: MDM architecture captures the best practices for implementing an MDM solution. This MDM architecture comprises a detailed set of architectural views that describe MDM solutions that have been repeatedly built and deployed in a consistent, high-quality, supportable fashion with cross-industry and industry-specific application. MDM architectures should also consider the various Enterprise Master Data business and technical strategies, master data implementation approaches, and MDM methods of use. This architecture is described using an MDM reference architecture, technical architecture, MDM architecture patterns, and design templates that, when tailored, solve a class of customer problems.
- Patterns: In general, patterns are artifacts that have been used, tested, and successfully proven in the majority of recurring situations (80:20 rule). These situations can be characterized by the relation between problems and solutions in a given context. This is illustrated in Figure 1. There are, of course, different types of patterns, such as business patterns that are focusing on processes, workflows, procedures, and so forth. There are deployment patterns, security patterns, application patterns, and patterns that deal with operational and runtime aspects. Lastly, there are architecture patterns, which are artifacts that deal with recurring situations in the context of various architectures, such as enterprise, reference, integration, application architectures, and so forth. Patterns should be captured in guidelines, best practices and services, blueprints and frameworks, and most importantly in tools to accelerate usage and deployment.
Figure 1. Patterns
- Architecture patterns: Architecture patterns, in general, should describe proven and prescriptive models for a system, along with definitions of the typical elements and subsystems that make up the system. These patterns can be used to address various architectural problem spaces when applied to a specific context. Basic and composite architecture patterns can serve as a template or stereotype for further completion, refinement, adaptation, and customization. Finally, these architecture patterns should be captured in best practices, blueprints, recipes, and most importantly in tools (such as Rational Software Architect) to facilitate reuse.
- MDM architecture pattern: Architecture patterns, when applied to an MDM context, should address the entire scope and the architectural aspects of MDM, for example MDI. Consideration of the various Enterprise Master Data business and technical strategies, master data implementation approaches, and MDM methods of use, are also included in the MDM architecture pattern.
- MDM solution pattern: MDM solution patterns describe solutions to recurring MDM problems. These solutions are usually composites of one or more MDM-specific or other architecture patterns.
- MDM Solution: An MDM solution is comprised of the MDM components described above, which are complemented with the following:
- MDM strategy: This MDM strategy is a value-enabling combination of business and technical components. It needs to include the business participation, business motivation, and overall guidelines from the business. Of course, this is customized for a particular customer situation.
- Architecture(s): This is a wide set of different architectures, such as a systems architecture, a reference architecture, a technical architecture. It also includes a product and component mapping at the end.
- Product and technologies: Yes, products and technologies (IBM is well positioned here, see the Resources section) are an essential ingredient of any MDM solution. This includes MDM products to build and operate information hubs, and the entire MDI scope of functions. The MDI technologies also include a set of enterprise application integration (EAI), enterprise information integration (EII), and extract, transform, and load (ETL) technologies.
- Best practices: These best practices address the integration, build, deployment, governance, and operational aspects. Furthermore, there also needs to be master data best practices (such as to address master data categorization needs). These best practices usually fall into the domain expertise of any System Integrator.
Value proposition and solution goals
Composing MDM architecture and MDM solution patterns into a comprehensive MDM solution, the key value propositions are:
- Advantage through single version of the truth (consistency and quality advantage): The MDM solution functions as the authoritative source for master data within the enterprise, by decoupling the data from individual, isolated applications that may have inconsistent values for that data.
- Holistic life cycle management for master data: Building an MDM solution provides actionable services over master data that create business value, such as by triggering data governance policies to resolve name conflicts and triggering actions based upon changes to data, such as when a name or an address changes.
- Reusability advantage: As highlighted in the Introduction section, a very broad range of consumers needs to access master data and its related functionality. The MDM Solution creates an authoritative source of trusted information for master data entities. It also provides a set of standardized services for this data that can then be leveraged across and beyond the enterprise. By doing so, it is guaranteed that the most critical information in an enterprise is treated consistently.
- Improved governance: The MDM solution includes specific capabilities to guarantee the appropriate governance for master data entities.
An architecture principle is a comprehensive and fundamental law, doctrine, or assumption that provides overarching guidance for development of a solution. A good architecture principle is not outdated by advancing technology and has objective reasons for advancing it instead of alternatives. The following principles are core architecture principles that should be considered for guiding the development of an MDM solution.
- The MDM solution should provide the ability to decouple information from enterprise applications and processes to make it available as a strategic asset for use by the enterprise. This is a fundamental concept of Information on Demand, founded upon Service Oriented Principles to deliver information at the right time, in the right context, to the right application or user.
- The MDM solution should provide the enterprise with an authoritative source for master data that manages information integrity and controls the distribution of master data across the enterprise in a standardized way that enables reuse. The primary motivation for this principle is to centralize the management of master data to reduce data management costs and improve the accuracy and completeness of that data.
- The MDM solution should provide the flexibility to accommodate changes to master data schema, business requirements, and regulations, and support the addition of new master data. This improves the ability of a business to quickly respond to business changes that may require the addition of new master data elements or changes to existing master data.
- The MDM solution should be designed with the highest regard to preserve the ownership of data, integrity, and security of the data from the time it is entered into the system until retention of the data is no longer required. The objective of this principle is to ensure that core business data that is critical to the success of the enterprise is secure, and to comply with privacy laws and regulations.
- The MDM solution should be based upon industry accepted open computing standards to support the use of multiple technologies and techniques for interoperability with external systems and systems within the enterprise. This guides development of the architecture to remain open and flexible so it can easily integrate with a variety of vendor software that may already exist within the enterprise and future "unknown technologies."
- The MDM solution should be based upon an architectural framework and reusable services that can leverage existing technologies within the enterprise. This principle guides the architectural decisions to leverage existing investments in technologies, such as those that facilitate connectivity and interoperability or information integration where it makes sense in order to implement an MDM solution.
- The MDM solution should support incremental implementation so that it can demonstrate immediate value.
MDM methods of use
MDM is a set of software, information standards, and governance infrastructure that enables your enterprise to create, maintain, use, and analyze consistent, complete, contextual, and accurate information for all stakeholders. MDM requires capabilities to rationalize master data across enterprise applications, treat master data as a unique corporate asset, and bridge structured as well as unstructured data. MDM can be a dramatic paradigm shift within an enterprise because it requires a pro-active enterprise view of master data, and must provide new technologies and governance to manage and use master data across multiple data domains and with multiple methods of use that include collaborative, operational, and analytical.
MDM supports the management of master data throughout its lifecycle. This requires the ability to collaborate, define, and publish master data, operational processes to manage and maintain master data throughout its transactional stages, and analytical capabilities to provide better insight and leverage embedded information. Multi-Form MDM is a term used to address the fact that MDM supports multiple styles of use for master data (collaborative, operational, and analytical) and spans multiple data domains, such as customer and product. It is not uncommon for multiple methods of use to be applied even to the same data domain within a large enterprise environment.
Method of use: Collaborative style
The collaborative style of MDM supports the definition, creation, and synchronization of master data. This style is often associated with the creation, augmenting, or altering of master data to support processes, such as the new product introduction and definition process or data stewardship. There are always business processes associated with maintaining master information, whether it's setting up new products to be sold, hiring new employees, or managing suppliers. The MDM system participates in such processes, either driving the entire process or being called by another system.
Figure 2. Collaborative MDM
Collaborative MDM provides the ability to maintain information in one place that is typically maintained across many internal applications, using a single master process to ensure that the information is complete and validated. Collaborative MDM requires services to support workflow and check-in, check-out services to control the creation, management, and quality of master data. After the information is complete and validated, collaborative MDM supports the integration and the synchronization of master data with legacy systems, enterprise applications, and data repositories within the enterprise, and the exchange and synchronization of information with business partners.
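The check-in and check-out control described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `CollaborativeHub` and `MasterRecord` names, the in-memory storage, and the "name and category are required" completeness rule are all hypothetical and do not come from any MDM product.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MasterRecord:
    """A master data record under collaborative authoring (hypothetical model)."""
    record_id: str
    attributes: Dict[str, str] = field(default_factory=dict)
    checked_out_by: Optional[str] = None
    validated: bool = False


class CollaborativeHub:
    """Toy hub that serializes edits through check-out and check-in."""

    # Assumed completeness rule: both attributes must be present and non-empty.
    REQUIRED = ("name", "category")

    def __init__(self) -> None:
        self.records: Dict[str, MasterRecord] = {}

    def check_out(self, record_id: str, user: str) -> MasterRecord:
        # Create the record on first use, then reserve it for one user.
        record = self.records.setdefault(record_id, MasterRecord(record_id))
        if record.checked_out_by not in (None, user):
            raise RuntimeError(
                f"{record_id} is checked out by {record.checked_out_by}")
        record.checked_out_by = user
        return record

    def check_in(self, record_id: str, user: str,
                 changes: Dict[str, str]) -> MasterRecord:
        record = self.records[record_id]
        if record.checked_out_by != user:
            raise RuntimeError(
                f"{user} does not hold the check-out for {record_id}")
        record.attributes.update(changes)
        # The record counts as validated only once it is complete.
        record.validated = all(record.attributes.get(k) for k in self.REQUIRED)
        record.checked_out_by = None
        return record
```

In this sketch, only a record whose `validated` flag is set would be a candidate for synchronization with downstream systems; a real workflow would add approval steps and richer validation.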
Method of use: Operational style
The operational style of MDM supports the consumption of master data by operational systems to perform transactions, and the MDM repository is considered the authoritative source of master data. Furthermore, in operational mode, master data is leveraged by applications through services, where services provide control over master data creation, management, quality, and access. For example, as part of a process to add a new customer, a Line of Business (LOB) system would consume an MDM service to validate if this customer is a unique customer or an existing customer. The MDM service would cleanse and standardize the new customer information and perform matching logic against the MDM repository to determine if the customer already exists within the LOB system or within the enterprise.
If it is determined that the customer is a new customer for that LOB, the LOB system could commit the new customer information to its transactional database. The MDM system would now have the new customer information in the MDM repository as well as the LOB system. After the information has been successfully processed, operational MDM would support the integration and the synchronization of new master data with legacy systems, enterprise applications, and data repositories within the enterprise, and the exchange and synchronization of information with business partners.
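The add-customer flow described above can be sketched as follows. This is an illustration only: `MdmService`, `Customer`, and `standardize` are hypothetical names, and exact matching on standardized values is a deliberately simple stand-in for the matching logic a real MDM service would apply.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Customer:
    name: str
    postal_code: str


def standardize(customer: Customer) -> Customer:
    """Cleanse and standardize: collapse whitespace and normalize case."""
    name = " ".join(customer.name.split()).title()
    postal_code = customer.postal_code.replace(" ", "").upper()
    return Customer(name, postal_code)


class MdmService:
    """Toy MDM service backed by an in-memory repository (an assumption)."""

    def __init__(self) -> None:
        self._repository: Dict[Tuple[str, str], str] = {}
        self._next_id = 1

    def add_customer(self, candidate: Customer) -> Tuple[str, bool, Customer]:
        """Return (master_id, is_new, standardized_record)."""
        clean = standardize(candidate)
        key = (clean.name, clean.postal_code)
        existing_id = self._repository.get(key)
        if existing_id is not None:
            # The customer already exists in the enterprise: report the match.
            return existing_id, False, clean
        master_id = f"C{self._next_id:04d}"
        self._next_id += 1
        self._repository[key] = master_id
        return master_id, True, clean
```

A line of business system would call `add_customer` before committing. If `is_new` is true, it commits the standardized record to its own transactional database; otherwise it reuses the returned master identifier.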
Figure 3. Operational MDM
MDM systems are used to provide a complete view of a master data object without persisting all of the information within the MDM system itself. Operational MDM provides business and information services to use and maintain master data within the MDM system as well as the ability to reference master data across multiple systems. MDM services can be consumed to maintain cross-reference links to master data consisting of both structured and unstructured data across heterogeneous systems, and to provide a complete view of a master data object, such as a person. For example, registry information in the MDM repository can be used to consume a federated query service to create a virtual record consisting of structured and unstructured data that spans heterogeneous systems, and return the results to an authorized user, application, or process.
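The registry idea in the preceding paragraph, where the hub stores cross-reference links and assembles a virtual record on demand, can be sketched like this. The `RegistryHub` class, the modeling of source systems as lookup functions, and the "first linked system wins" conflict rule are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Each source system is modeled as a lookup function: local key -> attributes.
SourceLookup = Callable[[str], Dict[str, str]]


class RegistryHub:
    """Holds only cross-reference links; attribute data stays in the sources."""

    def __init__(self, sources: Dict[str, SourceLookup]) -> None:
        self._sources = sources
        self._xref: Dict[str, List[Tuple[str, str]]] = {}

    def link(self, master_id: str, system: str, local_key: str) -> None:
        """Register that a source system holds data for this master object."""
        self._xref.setdefault(master_id, []).append((system, local_key))

    def virtual_record(self, master_id: str) -> Dict[str, str]:
        """Assemble a federated view by querying every linked system."""
        record: Dict[str, str] = {"master_id": master_id}
        for system, local_key in self._xref.get(master_id, []):
            for attr, value in self._sources[system](local_key).items():
                # Assumed rule: the first linked system wins on conflicts.
                record.setdefault(attr, value)
        return record
```

Nothing beyond the links is persisted in the hub here, which mirrors the point that a complete view can be provided without storing all of the information in the MDM system itself.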
Operational MDM is especially important in a Service-Oriented Architecture (SOA). MDM systems include libraries of common services on master data that other systems can call (for example, one centralized procedure that any application can call to query customer information, to adjust the price of a product, or to create a new supplier) in order to ensure information quality and consistency. MDM provides common services to support information-centric procedures across all applications. MDM enables companies to realize internal efficiencies by reducing the cost and complexity of processes that use master data. It reduces manual translation and analysis to improve repeatability and speed to insight. In addition, MDM improves the ability to share, consolidate, and analyze business information quickly, both globally and regionally. It also makes it possible to assemble new, composite applications based on accurate master information and reusable business processes rapidly.
Method of use: Analytical style
In analytical MDM, the MDM system serves as the accurate, clean source of master data that provides the dimensional source for analytical environments. Analytical MDM also addresses the need to augment MDM operational services with in-line decision support analytics. In-line decision support analytics can be used to support regulatory compliance, perform conflict management, and detect threat and fraud, thereby reducing the risk of increased costs and mitigating potential damage to an organization's reputation. In-line analytics is the analytical activity that takes place on a transactional basis with an understanding of how the master data is being used by the application consuming the MDM service. For example, identity analytics can be used to detect threat and fraud scenarios or to support anti-money-laundering (AML) activities in order to mitigate risk and adhere to regulatory compliance.
Figure 4. Analytical MDM
Analytical MDM also enables accurate business intelligence, and allows accurate objects and structures to be automatically synchronized with data warehouses and analytic applications. Historically, data warehousing initiatives attempted to address data quality problems downstream from applications. Data warehousing does not fix the business processes that create inaccurate master data in the applications, nor does it correct the master data back in the applications. MDM gives businesses a way to correct bad data and the processes that create bad data at the source. Conversely, data derived from analysis in the data warehouse (for example, lifetime customer value, cross-sell, and up-sell suggestions) could be important data to persist in the MDM system from a data warehouse feed.
Implementing an Enterprise MDM solution is an iterative process that requires the ability to deliver value to the business in incremental stages in order to meet the needs for all stakeholders. An MDM system that continues to deliver sustained value to the enterprise requires the ability to provide Multi-Form MDM support for the management of master data throughout its lifecycle and support the needs of all stakeholders.
MDM architecture patterns attributes
Attributes are used to further describe and characterize the various types of architecture patterns. For MDM architecture patterns, a proposed set of attributes is outlined in the following table:
Table 1. Attributes used for MDM architecture pattern description
|Name||Name of the MDM architecture pattern|
|Type||Type of pattern|
|Methods of use||MDM style where pattern occurs often|
|Objective||Primary objective that the pattern tries to achieve|
|Context||Deployment context of pattern|
|Solution||Possible solution space|
|Results||Advantages and disadvantages of using the pattern|
|Relations||Related patterns or known sub-types|
|MDM solutions||One to two most important MDM solutions where the pattern is used|
|Comments||Useful additional considerations|
The name of the pattern is the unique identifier of this pattern and is used whenever the pattern is discussed. The type of pattern identifies to which group of MDM patterns the pattern belongs. The methods of use section links the pattern to one or more of the three styles of MDM usage described earlier, where the pattern is most often encountered. The objective briefly summarizes the primary objective of this pattern. The problem section lists the most important problem or problems the pattern addresses. Forces are reasons why the problem(s) the pattern tries to solve are difficult. The context provides information about the assumptions of the deployment context of the pattern. For example, here you might find whether it is typically deployed in an SOA architecture or a non-SOA architecture and how the environment might affect the deployment of the pattern. The solution provides more details on the cases in which the pattern is feasible to deploy, outlining the solution space. The results section outlines the advantages and disadvantages encountered when the pattern is used. The relations section describes the relations the pattern might have to other patterns. For example, here you would find information on patterns leveraged by this pattern or details on why this pattern is related to, but different from, a known pattern. This section also lists known sub-types of this pattern. The MDM solutions section lists the MDM solutions where this pattern is often used. Finally, any other relevant comments are found in the comments section.
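If you want to keep pattern descriptions in a machine-readable catalog, the attributes from Table 1 map naturally onto a small record type. The sketch below is one possible shape, not a prescribed format: the `PatternDescription` class and the `missing_attributes` helper are hypothetical names, and the example values are shortened from the transaction interception summary later in this article.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class PatternDescription:
    """One field per attribute row in Table 1."""
    name: str
    type: str
    methods_of_use: Tuple[str, ...]
    objective: str
    context: str
    solution: str
    results: str
    relations: str
    mdm_solutions: str
    comments: str = ""


def missing_attributes(pattern: PatternDescription) -> List[str]:
    """List the attributes that are still empty in a description."""
    return [f.name for f in fields(pattern) if not getattr(pattern, f.name)]


# Example entry with values abbreviated from Table 2.
transaction_interception = PatternDescription(
    name="MDM transaction interception pattern",
    type="MDM application integration pattern",
    methods_of_use=("Operational",),
    objective="Support construction of transactional MDM hub",
    context="Can be deployed with or without SOA",
    solution="Applicable when an application keeps changing master data "
             "after the hub is built",
    results="Enables the transactional hub; can be a complex EAI effort",
    relations="MDM message-based integration pattern (weaker version)",
    mdm_solutions="Transactional MDM solution pattern",
)
```

A helper such as `missing_attributes` gives a quick completeness check when new patterns are added to the catalog.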
MDM architecture pattern taxonomy
MDM architecture patterns help to accelerate the deployment of MDM solutions, and enable organizations to govern, create, maintain, use, and analyze consistent, complete, contextual, and accurate master data for all stakeholders, such as LOB systems, data warehouses, and trading partners. As composite patterns, MDM patterns sometimes leverage information integration patterns and provide additional capabilities, such as governance, master information life cycle management, and master information business services. The MDM architecture pattern specification helps data, information, and application architects make informed decisions on enterprise architecture and document decision guidelines.
Given the terminology described in the above sections, MDM architecture patterns play at the intersection between MDM architectures (with the consideration of various Enterprise Master Data technical strategies, master data implementation approaches, and MDM methods of use) on one side, and architecture patterns (as the proven and prescriptive artifacts, samples, models, recipes, and so forth) on the other side.
Since there are multiple MDM architecture patterns, a pattern taxonomy helps to classify them into different categories, helping architects to find the patterns needed to solve the problem at hand faster. The following are three proposed categories for MDM architecture patterns:
- MDM application integration patterns: All patterns in this category are EAI-centric patterns focusing on processes motivated by MDM use cases. Here, you see patterns that are related to publication/subscription-based integration and message-based integration techniques. Most importantly, these patterns deal with transaction interceptions supporting the deployment of the transactional MDM solution pattern.
- MDM information integration patterns: These patterns are also EAI patterns, but with a clear focus on information integration. These patterns deal with the build and the operational phase of MDM solutions, and are therefore related to ETL, initial data load, and building MDM hubs. In addition, EII-related patterns are often leveraged by these MDM information integration patterns as well, for instance, to address hub-style information synchronization or hybrid (bi-directional) synchronization with transactional systems.
- MDM enterprise system deployment patterns: The MDM patterns in this group are more related to the various deployment scenarios, where the focus is on the integration of other enterprise systems, such as analytical systems, data warehouses, data marts, enterprise resource planning (ERP), customer relationship management (CRM), product lifecycle management (PLM), and other systems. The focus here is not on transaction interception to allow for MDM hub implementations, but the focus is on collaboration with other systems to, for instance, enrich a customer information hub with analytical systems.
Below is a list of patterns in these three categories. As MDM solutions become more mainstream in the future, and the areas of deployment broaden, this list is expected to expand with new patterns or grow with the identification of new sub-types of known patterns.
- MDM application integration patterns:
- MDM transaction interception pattern
- MDM publish/subscribe pattern
- MDM message-based integration pattern
- MDM information integration patterns:
- MDI pattern
- MDM information synchronization pattern
- MDM enterprise system deployment patterns:
- MDM business intelligence (BI) analytical pattern
- MDM data warehouse pattern
- MDM multiple system pattern
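The taxonomy above is a two-level structure: three categories, each with its member patterns. A minimal sketch of that structure as a lookup table follows; the `TAXONOMY` constant and the `category_of` helper are hypothetical names introduced only for this illustration.

```python
from typing import Dict, List, Optional

# Category -> member patterns, exactly as listed in the taxonomy.
TAXONOMY: Dict[str, List[str]] = {
    "MDM application integration patterns": [
        "MDM transaction interception pattern",
        "MDM publish/subscribe pattern",
        "MDM message-based integration pattern",
    ],
    "MDM information integration patterns": [
        "MDI pattern",
        "MDM information synchronization pattern",
    ],
    "MDM enterprise system deployment patterns": [
        "MDM business intelligence (BI) analytical pattern",
        "MDM data warehouse pattern",
        "MDM multiple system pattern",
    ],
}


def category_of(pattern_name: str) -> Optional[str]:
    """Return the category a pattern belongs to, or None if it is unknown."""
    wanted = pattern_name.strip().lower()
    for category, patterns in TAXONOMY.items():
        if any(p.lower() == wanted for p in patterns):
            return category
    return None
```

Because the list is expected to grow, keeping it as data rather than hard-coding it into logic makes new patterns and sub-types a one-line addition.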
Now each of these patterns will be sketched to provide insight into their major purpose and typical use case scenarios. A fully detailed description, including implementation considerations and technology mapping, is beyond the scope of this initial article on MDM patterns.
MDM transaction interception pattern
The MDM transaction interception pattern is relevant for application systems integration, such as SAP, in the context of the transactional MDM solution pattern. The assumptions for using this pattern are as follows:
- The application system using master data exists and is used after the MDM hub is built.
- Some or all of the users maintain and process either a subset or all attributes of the master data records through the UI of the existing application. The reason for this could be that the project cost does not allow for developing a new UI and workflows as part of the MDM project, and the number of users that would require training on the new master data application front-end is too high.
- Some or all of the applications dealing with master data have a local database storing this information and maybe non-master data.
- A transactional MDM solution pattern with enterprise-wide enforcement regarding validation and business rules, providing master data functions as a single version of the truth, and provides master data to downstream systems. All changes of master data are executed direct or indirect against this system.
If most of these assumptions are given, you will have the need to intercept the business transactions. When a transactional MDM hub is deployed, the transaction interception pattern would provide the following real-time or near real-time integration. Before the application business transaction commits the change of master data, the transactional MDM hub is notified (such as through messaging). Then the MDM hub performs validation or de-duplication, as needed, commits it locally to the transactional MDM hub database, and informs (such as through messaging) the business application that the master data change can be committed. Of course, the notification to the application system must, in this case, include any changes the central MDM system applied to the record received from the business application, which means the business application might commit a (slightly) different version of the master data record compared to the version that it has sent to the MDM hub. Only after the business application receives the answer from the transactional MDM hub does it commit the change to its local system. Note that a "commit" on the application system is not necessarily in the sense of a database or application commit. For example, if you use a status for a master data record, then you could do the following:
If the application system creates a new master data record, the transaction interception mechanism reports this event to the transactional MDM hub. Nonetheless, right after the interception occurs, the application transaction commits the change to its database, marking the new master data record with the status created. As soon as the response from the transactional MDM hub arrives, this record, created locally by the application, is updated with the validated information from the hub. Only once this operation completes does the new master data record become visible to all users of the application through a change of status, for example from created to active.
Table 2. Summary of the MDM transaction interception pattern
|Name||MDM transaction interception pattern|
|Type||MDM application integration pattern|
|Methods of use||Operational|
|Objective||Support construction of transactional MDM hub|
|Context||This pattern can be deployed in an SOA. However, SOA is not a prerequisite for it, and it can also be used outside of an SOA.|
|Solution||Whenever an enterprise-wide transactional MDM hub is deployed, but a slave application system continues to change master data after the hub is built, this pattern might be applicable.|
|Results||The advantage of this pattern is the possibility to deploy the transactional MDM hub solution pattern if applications exist that cannot be separated from their data. The major disadvantage is that depending on the application, the deployment of this pattern is a complex EAI effort.|
|Relations||The MDM message-based integration pattern might be considered a weaker version of this one.|
|MDM solutions||It is often encountered when the transactional MDM solution pattern is deployed.|
|Comments||This pattern is often encountered when SAP application systems require integration in the context of the transactional MDM solution pattern.|
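The status-based variant of the interception flow described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the class names `MdmHub` and `Application`, the `intercept` method, the name-based de-duplication rule, and the `hub_id` key scheme are all assumptions made for the example, and a direct method call stands in for the messaging layer.

```python
from dataclasses import dataclass, field


@dataclass
class MdmHub:
    """Hypothetical transactional hub: validates, de-duplicates, commits."""
    records: dict = field(default_factory=dict)

    def intercept(self, record: dict) -> dict:
        # Validation and standardization (illustrative rule only).
        validated = dict(record, name=record["name"].strip().title())
        # De-duplication: reuse the hub key if the name is already known.
        for key, existing in self.records.items():
            if existing["name"] == validated["name"]:
                validated["hub_id"] = key
                return validated
        key = f"H{len(self.records) + 1}"
        validated["hub_id"] = key
        self.records[key] = validated  # the hub commits first
        return validated


@dataclass
class Application:
    """Hypothetical business application with its own local store."""
    hub: MdmHub
    local: dict = field(default_factory=dict)

    def create(self, local_id: str, record: dict) -> dict:
        # Persist locally, but keep the record hidden through its status.
        self.local[local_id] = dict(record, status="created")
        # Interception: notify the hub and wait for its answer.
        answer = self.hub.intercept(record)
        # Apply the hub's (possibly changed) version, then activate it.
        self.local[local_id] = dict(answer, status="active")
        return self.local[local_id]


hub = MdmHub()
app = Application(hub)
row = app.create("A1", {"name": "  jane DOE "})
print(row)
```

Note that the application ends up storing the hub's standardized version of the name rather than the value it originally sent, which is the behavior the pattern requires.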
MDM publish/subscribe pattern
This pattern is relevant for integrating pure downstream systems, such as an eCommerce Web site or a print catalog system, which consume master data but do not themselves create or modify master data. An MDM system implemented with the registry MDM solution pattern, the hybrid MDM solution pattern, or the transactional MDM solution pattern publishes changes to master data on queues to which the downstream systems subscribe.
Table 3. Summary of MDM publish/subscribe pattern
|Name||MDM publish/subscribe pattern|
|Type||MDM application integration pattern|
|Methods of use||Collaborative|
|Objective||Integrate downstream systems, such as print solutions and eCommerce systems, which read master data, but which do not modify it.|
|Problem||Downstream systems require read access to high quality, up-to-date master data.|
|Forces||Relational data from an MDM system is usually only one source of master data information for printing and an eCommerce system, and usually contains pointers for unstructured data from content management systems that need integration as well.|
|Solution||This pattern can always be used whenever a downstream system requires only read access to master data.|
|Results||The advantage of this pattern is that downstream systems use high quality, consistent master data.|
|Relations||The MDM message-based integration pattern is related to this one.|
|MDM solutions||This pattern is used in the retail MDM solution pattern.|
|Comments||This pattern is usually implemented with messaging middleware.|
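As a rough sketch of the publish/subscribe idea, the following in-memory example lets two read-only downstream systems receive every product change the hub publishes. The `MdmPublisher` class, the topic name, and the sample record are hypothetical; a real deployment would use messaging middleware rather than in-process callbacks.

```python
from collections import defaultdict


class MdmPublisher:
    """Hypothetical stand-in for the queues of a messaging middleware."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, change: dict) -> int:
        # Deliver the change to every subscriber of the topic.
        for handler in self._subscribers[topic]:
            handler(dict(change))  # each subscriber gets its own copy
        return len(self._subscribers[topic])


# Downstream systems only consume master data; they never write back.
web_shop = {}
print_catalog = {}

publisher = MdmPublisher()
publisher.subscribe("product", lambda c: web_shop.update({c["id"]: c}))
publisher.subscribe("product", lambda c: print_catalog.update({c["id"]: c}))

delivered = publisher.publish("product", {"id": "P1", "name": "Desk lamp"})
print(delivered, web_shop, print_catalog)
```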
MDM message-based integration pattern
This pattern is often used for MDM systems that are used mainly for referential purposes. In such a setup, the MDM system functions as a referential repository only, with the lowest level of validation and business rule enforcement, representing the smallest common set across all systems. This pattern only triggers a message from the application systems processing master data to the central MDM system, stating that a certain change to master data was performed, in order to keep the central, referential MDM hub up to date. The update on the central MDM hub happens after the fact, which means the application system has already persisted the change locally. This pattern is, for example, applicable whenever business application systems such as Siebel or SAP continue to function as the master system for the processing of master data and a central MDM system is only used as a reference master data system.
Table 4. Summary of MDM message-based integration pattern
|Name||MDM message-based integration pattern|
|Type||MDM application integration pattern|
|Methods of use||Collaborative|
|Objective||Support construction of a referential or registry MDM system using the referential MDM solution pattern or the registry MDM solution pattern.|
|Problem||A centralized MDM system is needed for reference purposes or to support a central registration process for customers or products.|
|Context||This pattern requires messaging infrastructure and should be fairly easy to deploy in an SOA with an ESB and transformation services between the application systems and the central MDM system.|
|Solution||If this pattern is chosen, usually only MDM solutions using the referential MDM solution pattern or the registry MDM solution pattern are possible. It is certainly not a good enough approach to build a transactional MDM hub.|
|Results||The advantage of using this pattern is that application users can continue to work with their applications as before, and no training is required. The disadvantage of this pattern is that the central MDM system is not transactional and the master data might not be up to date to the latest version in the application systems.|
|Relations||This pattern is a weaker version of the MDM transaction interception pattern.|
|MDM solutions||This pattern is often used to build MDM solutions using the referential MDM solution pattern or the registry MDM solution pattern.|
|Comments||If the approach for a central MDM system is taken where this pattern is used, usually the master data is still stored in a redundant copy within each database for each application, keeping the storage costs high.|
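The after-the-fact nature of this pattern can be illustrated with a small sketch. The names `app_update`, `hub_drain`, `outbox`, and `hub_reference` are made up for the example, and a `deque` stands in for a real message queue; the point is only that the application commits first and the hub catches up later.

```python
from collections import deque

hub_reference = {}   # hypothetical referential MDM hub store
outbox = deque()     # stands in for a message queue


def app_update(local_db: dict, key: str, record: dict) -> None:
    # The application stays the master: it commits locally first...
    local_db[key] = record
    # ...and only then emits a notification about the change.
    outbox.append({"key": key, "record": dict(record)})


def hub_drain() -> int:
    # The hub applies the notifications after the fact.
    applied = 0
    while outbox:
        msg = outbox.popleft()
        hub_reference[msg["key"]] = msg["record"]
        applied += 1
    return applied


crm = {}
app_update(crm, "C1", {"name": "Acme", "city": "Austin"})
stale = "C1" not in hub_reference  # the hub lags until messages are processed
applied = hub_drain()
print(stale, applied, hub_reference)
```

The window in which `stale` is true corresponds to the disadvantage noted in the table: the central MDM system is not transactional and can lag behind the application systems.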
The master data integration (MDI) pattern describes the master data integration required for building an MDM hub. Implementing this pattern leverages other patterns, such as the data consolidation pattern (see the Resources section). What distinguishes this pattern from the base data consolidation pattern, for example, is the integration of metadata management and data governance capabilities on an enterprise scale. This means that if the MDI pattern is applied, the MDM system is not only built using patterns from the ETL space; the technical infrastructure to manage the life cycle of metadata and to manage a centralized, enterprise glossary of terms that improves communication between business and technical employees is deployed as well. Depending on the MDM solution deployed, the cleansing and transformation functions might also need to be reusable after the MDM system is initially built, to ensure that master data is moved from applications to the MDM system in the same (and therefore consistent) way once the MDM system is populated. The IBM Information Server (see the Resources section) enables cleansing and transformation functions to be made available as reusable services. The deployment of these infrastructure components and their integration with the MDM system under construction are the key to successfully applying this pattern. Within an EAI infrastructure, the same cleansing and transformation tasks are thus reused to keep the central MDM system consistent, after construction, with the business and validation rules used for building it, as long as these rules stay valid. This pattern is always used with one or multiple MDM architecture patterns to build MDM solutions.
Table 5. Summary of the MDI pattern
|Type||MDM information integration pattern|
|Methods of use||All MDM styles|
|Objective||Build an MDM system with metadata management and with cleansing and transformation services that can be reused while running the MDM system after construction.|
|Solution||This pattern is part of the entire MDM solution space, since it's the foundation of building any MDM system.|
|Results||This pattern is the foundation for any MDM work -- the better this pattern is deployed, the higher the benefits of the MDM system can be.|
|Relations||This pattern is related to the data-consolidation pattern (see the Resources section).|
|MDM solutions||There is no MDM solution without the usage of this pattern.|
|Comments||This pattern is the basic MDM pattern and functions as a mandatory building block in designing any MDM solution.|
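The reuse of cleansing and transformation logic that this pattern calls for can be sketched as one function shared by the initial bulk load and by later updates. The function names and the cleansing rules below are assumptions chosen for illustration; they do not represent the services of any particular product.

```python
def cleanse_customer(raw: dict) -> dict:
    """Illustrative cleansing and transformation rule set (assumed)."""
    return {
        "name": " ".join(raw["name"].split()).title(),
        "country": raw.get("country", "").strip().upper(),
    }


def initial_load(source_rows: list) -> dict:
    # Bulk, ETL-style population of the hub while it is being built.
    return {row["id"]: cleanse_customer(row) for row in source_rows}


def on_change(hub: dict, row: dict) -> None:
    # Ongoing updates reuse the very same function as a service.
    hub[row["id"]] = cleanse_customer(row)


hub = initial_load([{"id": 1, "name": "  ada   lovelace", "country": " gb"}])
on_change(hub, {"id": 2, "name": "GRACE hopper ", "country": "us "})
print(hub)
```

Because both paths call `cleanse_customer`, records added after construction follow the same rules as records loaded while the hub was built.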
MDM information synchronization pattern
The MDM information synchronization pattern is often encountered when both transactional systems and the central MDM system change master data. The problem with this setup is that, in order to keep the master data consistent, these systems need to be integrated through synchronization. Depending on the requirements, the synchronization can be real-time or near real-time. In addition, at least the following topologies (or a mix thereof) can be encountered:
- Master-slave topology
- Peer-to-peer topology
For example, the MDM system could be the master and the transactional systems the slave systems. If it is the other way round, master data is changed only in the transactional systems and the MDM system is read-only. For the retail industry, there is a use case where this pattern also applies. Global data pools, such as 1Sync, store attributes and hierarchies for the product master data domain. This information is crucial for retailers in order to get the required product attributes that are published by their suppliers into these global data pools. So retailers need to integrate with these global data pools by means of synchronization. For this particular use case, there is a known sub-type of this pattern called the global data synchronization pattern, because the interfaces of the global data pools are standardized and require synchronization infrastructure complying with them. For more information on global data synchronization, see the Resources section.
Table 6. Summary of the MDM information synchronization pattern
|Name||MDM information synchronization pattern|
|Type||MDM information integration pattern|
|Methods of use||The method of use is collaborative for the known sub-type of this pattern called global data synchronization pattern. Otherwise, the method of use is operational.|
|Objective||The key objective is to synchronize a transactional MDM hub (see the MDM solution patterns section) with other systems. In the case of the known sub-type of the global data synchronization pattern, the purpose of this pattern is to synchronize with external data pools, such as 1Sync.|
|Problem||If the master data is changed outside the central MDM system, the transactional systems doing the change and the central MDM system must synchronize. In the retail industry, external global data pools, such as 1Sync, require integration.|
|Forces||If multiple transactional systems change master data in addition to the central MDM system, then keeping all these systems in sync (in real-time) is difficult. It can be even further complicated if a whole set of different technologies is required to accommodate for different interfaces of internal and external transactional systems.|
|Context||This pattern can be used in SOA and non-SOA architectures. Depending on the synchronization requirements (real-time or near real-time), the synchronization technology might be different. The pattern can appear in peer-to-peer and master-slave synchronization topologies.|
|Solution||This pattern is often applicable if one of the following topologies between the central MDM system and the transactional systems is encountered: master-slave topology or peer-to-peer topology.|
|Results||The advantage of this pattern is its flexibility to connect multiple transactional systems in different topologies with a central MDM system.|
|Relations||The global data synchronization pattern is a known sub-type of this pattern.|
|MDM solutions||The MDM retail solution pattern uses the sub-type of this pattern called global data synchronization pattern.|
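The two topologies can be contrasted with a small sketch. The function names, the `version` field, and the "higher version wins" conflict rule are assumptions made for the example; real synchronization infrastructure needs a richer conflict-resolution strategy.

```python
def sync_master_slave(master: dict, slaves: list) -> None:
    # Master-slave: changes flow one way and the master always wins.
    for slave in slaves:
        slave.clear()
        slave.update({k: dict(v) for k, v in master.items()})


def sync_peer_to_peer(a: dict, b: dict) -> None:
    # Peer-to-peer: both sides change data; here the higher version wins
    # (an assumed conflict rule, chosen only for illustration).
    for key in a.keys() | b.keys():
        newest = max((a.get(key), b.get(key)),
                     key=lambda r: r["version"] if r else -1)
        a[key], b[key] = dict(newest), dict(newest)


mdm = {"P1": {"name": "Lamp", "version": 2}}
erp = {"P1": {"name": "Lmap", "version": 1}}
sync_master_slave(mdm, [erp])   # erp is overwritten by the master

shop = {"P1": {"name": "Lamp XL", "version": 3}}
sync_peer_to_peer(mdm, shop)    # the newer shop record propagates to mdm
print(erp, mdm, shop)
```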
MDM BI analytical pattern
This pattern is different from standard information integration patterns used to build data warehouses or data marts. Traditionally, a BI data warehouse receives data from source systems (usually the operational online transaction processing [OLTP] systems) but never provides data back to them. Since a master data hub for the customer or product domain can also feed customer or product core attributes to data warehouses, the question arose whether there are use cases where insight gained in the BI system is relevant for the MDM system as well. Use cases have since been identified that justify a two-way integration between MDM hubs and BI analytical systems. For example, a company that has identified in the BI analytical system the 10 percent of customers who contributed the most over the last quarter or year might want to change some attributes in the MDM hub for these customers, for instance to give them a better customer service response time or a better credit card. Other analytical systems that should be integrated using this pattern in a two-way data exchange are entity analytics solutions (EAS) systems, which feed insight detected in the customer data (for example, for requirements in the "Know Your Customer" [KYC] area) back to the MDM systems. Systems handling anti-money laundering (AML) processes need to feed any insight gained on money transaction inconsistencies back to the MDM system. These analytical systems might even require real-time or near real-time integration with the MDM system.
Table 7. Summary of the MDM BI analytical pattern
|Name||MDM BI analytical pattern|
|Type||MDM enterprise system deployment pattern|
|Methods of use||Analytical|
|Objective||The objective of this pattern is to enhance MDM systems with insight from analytical systems.|
|Problem||Legal requirements, such as:
|Context||This pattern is often deployed when KYC and AML requirements are addressed in financial institutions.|
There are two areas of solutions with MDM systems where this pattern is usually deployed:
|Results||The advantage of this pattern is that the master data is enriched with analytical data, which helps to avoid risks (for example, by not doing business with customers on black lists) or to improve the relationship with special customer segments, leading to higher customer satisfaction.|
|Relations||The MDM data warehouse pattern is related for BI systems that read master data but do not update it.|
|MDM solutions||MDM KYC/AML solutions|
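The top-10-percent example from the description can be sketched as an analytical step followed by a write back to the hub. The revenue figures are made up, and the `service_tier` attribute and the function names are hypothetical; the sketch only shows the two-way data flow.

```python
import math


def top_customers(revenue: dict, share: float = 0.10) -> list:
    # Analytical step: rank customers by revenue and keep the top share.
    count = max(1, math.ceil(len(revenue) * share))
    ranked = sorted(revenue, key=revenue.get, reverse=True)
    return ranked[:count]


def feed_back(hub: dict, customer_ids: list) -> None:
    # Feedback step: write the insight into a master data attribute.
    for cid in customer_ids:
        hub[cid]["service_tier"] = "premium"  # hypothetical attribute


hub = {f"C{i}": {"service_tier": "standard"} for i in range(1, 11)}
revenue = {f"C{i}": i * 1000 for i in range(1, 11)}  # made-up figures

best = top_customers(revenue)
feed_back(hub, best)
print(best, hub[best[0]])
```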
MDM data warehouse pattern
This pattern describes the integration between MDM systems and data warehouses and data marts, where these systems are downstream systems and do not provide updates back to the MDM system. The lack of feedback to the MDM hub distinguishes this pattern from the MDM BI analytical pattern. In addition, this pattern is distinguished from traditional ETL patterns used for building data warehouses, because the master data requires less cleansing and transformation while being fed into the data warehouse.
Table 8. Summary of the MDM data warehouse pattern
|Name||MDM data warehouse pattern|
|Type||MDM enterprise system deployment pattern|
|Methods of use||Operational|
|Objective||Feed master data into data warehouses that require master data read-only.|
|Problem||If master data is centrally managed, the construction of a data warehouse requires the integration of master data from the central MDM system as well as the integration of the non-master data portion from the operational systems.|
|Forces||This problem is difficult to solve because the MDM system must be able to support the bulk extraction of the master data while the data warehouse is built, in addition to serving as the MDM system for all applications. This is particularly challenging if a transactional MDM hub is deployed, because OLTP master data changes then run against the same database while a huge online analytical processing (OLAP)-like extract for the bulk master data load of the data warehouse might occur, which requires special tuning on many available database offerings.|
|Context||The deployment context of this pattern requires traditional ETL backbones between the MDM system and the data warehouse for the data transfer, because messaging infrastructure might not be able to handle the bulk extract from the MDM system to the data warehouse efficiently enough. For smaller amounts of master data transferred from the MDM system to data warehouse systems, messaging infrastructure, such as ESBs in an SOA, might be good enough.|
|Solution||Since most enterprises run data warehouses today, this pattern is likely part of MDM deployments in many companies.|
|Results||The advantage of using this pattern is that the results of data warehousing improve if the latest available, consistent, and complete master data is used.|
|Relations||For smaller amounts of master data extraction, this pattern is related to the MDM message-based integration pattern.|
|MDM solutions||In MDM solutions for data warehousing, this pattern is used.|
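A read-only bulk extract from the hub into a warehouse dimension might look roughly like the sketch below. The batch size, the `customer_key` column, and the function names are assumptions for illustration; chunking is shown only as one simple way to limit the load an extract places on a hub that also serves OLTP traffic.

```python
from itertools import islice


def extract_in_batches(hub: dict, batch_size: int):
    # Read-only bulk extract, chunked to limit load on the hub.
    rows = iter(sorted(hub.items()))
    while batch := list(islice(rows, batch_size)):
        yield batch


def load_dimension(hub: dict, batch_size: int = 2) -> list:
    dimension = []
    for batch in extract_in_batches(hub, batch_size):
        # Master data arrives already cleansed, so it is copied as is.
        dimension.extend({"customer_key": k, **v} for k, v in batch)
    return dimension


hub = {f"C{i}": {"name": f"Customer {i}"} for i in range(1, 6)}
dim_customer = load_dimension(hub)
print(len(dim_customer), dim_customer[0])
```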
MDM multiple system pattern
This pattern is needed if, after a merger or acquisition, at least two central MDM systems require integration. The issue here is that the MDM systems are often built with different technologies from different vendors. Another use case is that, for a set of application systems from a specific vendor, the MDM task can be simplified if these application systems are integrated with the MDM solution from this vendor for this portion of the system landscape. The integration might be simplified with this approach because, instead of connecting each of these application systems to the enterprise-wide MDM system, only the MDM system for this portion of the landscape needs integration with the enterprise-wide MDM system, reducing EAI efforts. Or a line of business (LOB) might already have consolidated all its application systems regarding MDM before the decision is made to implement MDM enterprise-wide. Then, instead of integrating all application systems from this LOB individually with the enterprise-wide MDM system, it might be easier, cheaper, and sufficient to just integrate the MDM system this LOB has already created.
If any of these cases apply, then this pattern is applicable.
Table 9. Summary of the MDM multiple system pattern
|Name||MDM multiple system pattern|
|Type||MDM enterprise system deployment pattern|
|Methods of use||All|
|Objective||Integrate multiple MDM systems.|
|Problem||After mergers and acquisitions, multiple MDM systems require integration.|
|Forces||After mergers and acquisitions, the MDM systems that require integration are often built with different technologies. Budget constraints might not allow you to integrate each application individually with a central MDM system (which could be any one of the MDM systems after the merger), so it is cheaper to just integrate the MDM systems with each other. A similar reasoning might apply if a set of applications is easy to integrate with the MDM solution of the application provider. This MDM system, as well as all other applications, is then integrated with the enterprise-wide MDM system.|
|Context||Architecture-wise, there is no limitation on where this pattern might need to be deployed.|
|Solution||The solution provided with this pattern addresses the need for an enterprise-wide central MDM system.|
|Results||The advantage of this pattern is that there might be cost savings if only MDM systems for certain areas of the system landscape are integrated, instead of all applications individually with only one enterprise-wide MDM system after a merger or acquisition.|
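The cost argument above rests on integrating one MDM system instead of every application behind it. The sketch below shows a single hub-to-hub integration with a cross-reference of keys; the `integrate_hubs` function and the `source:key` naming scheme are assumptions chosen for illustration.

```python
def integrate_hubs(enterprise: dict, lob_hub: dict, source: str) -> dict:
    # One integration per MDM system, not per application behind it.
    xref = {}
    for local_key, record in lob_hub.items():
        enterprise_key = f"{source}:{local_key}"  # assumed key scheme
        enterprise[enterprise_key] = dict(record, source=source)
        xref[local_key] = enterprise_key
    return xref


enterprise = {}
retail_lob = {"7": {"name": "Acme"}}  # hub already consolidated by the LOB
xref = integrate_hubs(enterprise, retail_lob, "retail_lob")
print(xref, enterprise)
```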
MDM solution patterns
None of these categories or types of MDM architecture patterns is sufficient on its own to build and operate MDM systems -- the key to successful MDM solutions is the appropriate composition of chosen MDM architecture patterns. The composition of architecture patterns yields architecture blueprints, which are the architectural underpinning of enterprise MDM systems and solutions. The MDM enterprise system deployment patterns, but also the MDM application and information integration patterns, are the key ingredients for developing these MDM solutions. Where needed, this composition must include further architecture patterns from other architecture pattern domains. The area of MDM solution patterns contains patterns for complete MDM solutions. Their core characteristic is that they usually require a number of individual MDM architecture patterns or other architecture patterns. For example, the MDM retail solution pattern requires the MDI pattern, the global data synchronization pattern, and likely the MDM publish/subscribe pattern to integrate a downstream eCommerce system.
The following are the four key, basic MDM solution patterns:
- Referential MDM solution pattern: This setup of a master data system implies that the application systems are still the master systems of record. Each application system sends out, often in batch mode, updates to the referential master data system (long) after the changes are committed to the local system. The purpose of this master data system is just to hold a reference to all master data records in use across all systems, while the master data itself is managed and physically stored in the application systems. The purpose is only identification of a specific master data record, without support for real-time access.
- Registry MDM solution pattern: At least a skeleton master data system is built to provide a system of record for real-time reference access. The data is still stored in the application systems, but all attributes are linked from the master data system to allow the dynamic assembly of a complete view of a master data record. This view is usually read-only, because modifying access is only possible through the application systems, which are still the master systems for the master data but are integrated into a centralized view through linkages. The purpose of such a master data system is real-time reference access.
- Hybrid MDM solution pattern: The hybrid master data hub physically consolidates, at least to a certain degree, the master data records in a centralized database with a single data model. However, as the word hybrid indicates, the databases serving the applications continue to exist. Therefore, transformations are necessary whenever updates from the local application sources are pushed to the hybrid master data hub, because different applications usually store the same entity with slightly different attributes locally. The hub is used to create and publish a complete view of each master data record, although without a guarantee that the version presented is the latest available. In addition, this master data hub style usually only provides read-only access (write is supported only in rare cases), and the modifying transactions on the master data records are still done through the application systems. This master data hub style allows for data harmonization across systems and can also serve as a central reference system.
- Transactional MDM solution pattern: This style is the true master information hub, where the hub is the master system of record. It physically stores all attributes for all master data records, and supports transactional access to the master data records for all applications from legacy to SOA, usually through service oriented interfaces. Creation, modification, and deletion through the entire lifecycle of a master data record is done through the transactional interfaces of this master data hub, which functions as a single version of the truth for all applications needing access to master data in the system landscape.
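To make the difference between the styles more concrete, the registry style in the list above can be sketched as a hub that stores only links and assembles a view on request. The system names, keys, sample attributes, and the "first linked system wins" rule are all made up for the example.

```python
# Application systems remain the masters and keep the actual attributes.
crm = {"c-17": {"name": "Jane Doe", "email": "jane@example.com"}}
billing = {"b-90": {"name": "J. Doe", "account": "ACC-1"}}
systems = {"crm": crm, "billing": billing}

# The registry holds only links, not the attributes themselves.
registry = {"M1": [("crm", "c-17"), ("billing", "b-90")]}


def assemble_view(master_id: str) -> dict:
    # Build a read-only, complete view at request time.
    view = {}
    for system, local_key in registry[master_id]:
        for attr, value in systems[system][local_key].items():
            view.setdefault(attr, value)  # first linked system wins
    return view


view = assemble_view("M1")
print(view)
```

A transactional hub would instead store all of these attributes itself and accept create, modify, and delete operations directly.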
Further discussion of these MDM solution patterns is outside the scope of this article.
Further publications will dive into the details of the MDM architecture patterns sketched above, particularly focusing on implementation and deployment aspects along with technology mappings. MDM solution patterns and blueprints will be detailed in future work as well.
Resources
- "Information service patterns, Part 1: Data federation pattern" (developerWorks, July 2006): Learn more about information services with the data federation pattern.
- "Information service patterns, Part 2: Data consolidation pattern" (developerWorks, December 2006): Learn more about information services with the data consolidation pattern.
- IBM Information Integration: Learn more about data integration.
- Master Data Management page: Learn more about the SOA-based middleware that provides a uniquely flexible framework to support enterprise structured and unstructured data and business services. Also see the IBM Master Data Management page.
- Learn more about the IBM industry models for Banking, Insurance, Retail and Telecommunications.
- "IBM Global Data Synchronization": Learn more about the global standards-based data synchronization model.
- developerWorks Information Management zone: Learn more about DB2. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
Get products and technologies
- Learn more about the industry's first Information Server platform, the IBM Information Server product. See the overview of the Information Server components as well as the individual components: WebSphere Information Analyzer, WebSphere Business Glossary, WebSphere QualityStage, WebSphere DataStage, WebSphere DataStage MVS Edition, WebSphere Federation Server, WebSphere Replication Server, WebSphere MetaData Server, and WebSphere Information Services Director.
- WebSphere Customer Center: Learn more about the SOA-enabled product for customer data integration (CDI).
- WebSphere Product Center: Learn more about the comprehensive product for product information management (PIM).
- Entity Analytics Solutions page: Learn more about this product.
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in developerWorks blogs and get involved in the developerWorks community.