Phase 4: Designing the data mart
Business users complete the phases described in the first two phases of this tutorial. Designing a data mart is rather IT-oriented, and it is better-suited for IT users (data designers) to complete. The design layer of IIW consists of the atomic part (the enterprise model) and of the analytical part (conformed dimensions model and data mart models). This section describes the analytical part.
The conformed dimensional model is the enterprise-wide repository for analytical data. It contains dimensional data structures based on fact entities that enable easy distribution of analytical data to downstream analytical models, such as data marts.
The structure and business content of the dimensions in the conformed dimensional model are based on and re-use the data warehouse model. The measures in the fact tables support the measures defined in the analytical requirements.
In addition to the data structures and content originated from the data warehouse model, the conformed dimensional model includes the following:
- Additional relationships to build the aggregation paths on which measures can be analyzed
- Additional entities for efficiency of storage and analysis, such as profile and helper entities
In InfoSphere Data Architect, there are several pre-defined IIW data marts logical models, as shown in Figure 15.
Figure 15. Sample cut-out from the IIW data mart structure
Each data-mart model itself can consist of a set of analytical subsets. Figure 5 shows that claim efficiency analysis consists of four analytical packages (hold data type) and one association package (hold relationship entities). This is the standard structure of logical models in IBM InfoSphere Data Architect.
The conformed dimensional model is like a super data mart. However, consider end-user queries with caution because of the following reasons:
- The conformed dimensional model's dimensional structures are organized as snowflakes, which make the queries more complex
- Some fact entities might be too general, which can result in business measure names that are less meaningful.
- The number of dimensions per fact entity might be too high. The granularity of dimensions is as low as possible to allow for most granular analytical requirements without requiring any design maintenance. This could potentially result in too many facts and bad response times on queries.
The conformed dimensional model is based on the following two main design principles:
- Conformed dimensions
- A conformed dimension is a master dimension for which all parties in the enterprise have agreed to the contents. Conformed dimensions enable re-usable aggregation paths for measures across multiple fact tables.
- Conformed facts
- A conformed fact is a measure for which all parties in the enterprise have agreed to the business definition. The fact can be used in analytic computations across separate data sources and with other conformed facts.
The two design principles define consistency across fact tables, improve the quality of analytical results, and facilitate analysis techniques, such as drilling across.
The conformed dimensional model is partly denormalized to ease extracting the dimensional structures to populate downstream data mart models.
When using IBM Industry Models to develop BI solutions, the conformed dimensional model uses the following artifacts:
- Fact entity
- An aggregate entity that regroups a set of measures (facts) that all share the same dimensions. The primary key of a fact entity is defined as the concatenation of all foreign keys of the entities that are used as its dimensions. The fact entity is the core entity of a dimensional data structure. It can serve as the basis to create subsets of the structure that are distributed to data marts.
- Helper entity
- A helper entity manages analysis of a complex hierarchical structure of variable depth of a dimension. A helper entity can be used only if the hierarchical structure is a tree and not a network. This entity contains one instance for each separate path from each node in the hierarchical tree to itself and to every node below it. The geographic area helper entity is an example.
- Supportive entity
- An entity that is used only to support the analytical data structures. The calendar date entity is an example.
- Value group entity
- An entity that groups together attributes used to perform analysis on a fact entity. Each attribute of a value group entity has a set of discrete values on which it is possible to aggregate the measures of a fact entity. The customer profile entity is an example.
- Dimension relationship
- A relationship between a fact entity and a fundamental entity or a classification entity from the data warehouse model.
- Complex attribute
- A measure that is calculated from other measures. The definition includes a formula of which the parameters are attributes.
- Derived attribute
- An attribute of which the value is derived from the value of one or more attributes from the data warehouse model. Typically this value includes fact entity measures that are not complex attributes. Their value is calculated from atomic data in the data warehouse model.
- Other artifacts from the data warehouse model
- The conformed dimensional model re-uses entities from the data
warehouse model, typically to define its dimensions and its
derived attributes. Refer to the artifacts of the enterprise model
in Phase 3. Following are some of the
- Fundamental entity
- Classification entity
- Supportive entity
- Basic attribute
- Relationship attribute
Typically, the customization performed on the analytical requirements in the context of the project scope drives the customization of the conformed dimensional model.
Thanks to the mappings to the analytical requirements, the customization of the conformed dimensional model is quite straightforward for all pre-existing Industry Model elements in analytical requirements. If a pre-existing element of analytical requirements is customized in the context of the scoped project, the customization should be reflected accordingly into the conformed dimensional model. For example, definitions and examples of a scoped customized measure need to be reviewed and customized in the conformed dimensional model.
For model elements that need to be created in the conformed dimensional model, you can find the description of this task in a special task description of the Industry Model Data Warehouse Development Method.
The following two sections offer examples of the rules to follow when you create new conformed dimensional model elements.
If a new fact entity is added to the conformed dimensional model, the primary key needs to be added to the list of attributes (measures) originated from the analytical requirements. This entity is composed of all relationship attributes that are the foreign keys on the dimension relationships. The verb phrases are either for dimension or is dimension of.
- For a snapshot fact entity, the time dimension is replaced by a candidate key attribute, such as a reference date.
- The relationship attributes that compose the primary key should be named according to the dimension they represent. For example: Selling agent ID on a dimension relationship with the channel role fundamental entity.
- This will generate a primary key attribute in the EME Data Project Explorer that you should rename as the fact entity name suffixed with PK"
- You need to add the following technical attributes to the primary
- A valid from date attribute, which represents the transaction time for the beginning of the time period during which the values of this recorded data are true in the source system. The valid from date attribute is also classified as a basic attribute, and it uses a data type timestamp of [TIMESTAMP].
- A valid to date attribute, which represents the transaction time for the end of the time period during which the values of this recorded data are true in the source system. The valid to date attribute is also classified as a basic attribute, and it uses a data type timestamp of [TIMESTAMP].
- You need to map the new fact entity to the analytical requirements it supports.
If a new dimension is added on a fact entity in the conformed dimensional model, the new dimension needs to be defined as a dimension relationship that links a fundamental or classification entity of the data warehouse model to the fact entity. The verb phrases are either for dimension or is dimension of.
- You should define the relationship attribute that represents the newly added dimension in the fact entity as a component of the primary key of the fact entity.
- The relationship attribute's name must reflect the dimension it represents. For example: Selling agent ID on a dimension relationship with the channel role fundamental entity.
Figure 16 shows a cutout of an example data diagram for a dimensional data model for a claim handling performance fact table. This fact table is a predefined fact table in the Claim Efficiency Analysis (CEA) application's conformed dimensional model. The attributes are omitted to improve readability.
Figure 16. Example of dimensional model for claim handling performance
(View a larger version of Figure 16.)
The Claim Efficiency Analysis (CEA) application addresses the factors impacting the efficiency of the claims handling process by doing the following:
- Monitoring incoming recovery payments from third parties and re-insurers
- Assessing the distribution of claims amongst intermediaries and the loading of claim handlers
- Reconciling outstanding claims versus claims estimates
- Performing claims statistical analysis that might influence product development
- Analyzing the distribution of claims across all types of loss events.
The CEA solution enables the analytical fact table claim handling performance to monitor and identify inefficiencies in the claims handling process. Resulting reports help to optimize suppliers' networks and to improve operational efficiency and customer satisfaction.
This section described the conformed dimensional model and how to customize it. A conformed dimensional model can be transferred to the physical model using the InfoSphere Data Architect Wizard.