Phase 3: Designing the data warehouse logical model
Business users complete the first two phases of this tutorial. Designing the IIW logical models is more IT-oriented and is better suited to IT users (data designers). The design layer of IIW consists of an atomic part (the enterprise model) and an analytical part (the conformed dimensions model and the data mart models). This section describes the IIW enterprise model.
The structure of the enterprise model is similar to that of the analysis data model. The model is organized into several packages. Each package consists of a set of artifacts (entities, attributes, and relationships) and diagrams that visualize them, as shown in Figure 13. This is the standard structure of logical models in IBM InfoSphere Data Architect.
Figure 13. Sample cut-out from the enterprise model structure
The data warehouse model is a data model that represents the enterprise-wide repository of atomic data used for informational processing. This model includes the history of the value changes of business information that may vary over time. You might want to keep track of this history for analytical purposes.
The data warehouse model defines the following entity types:
- Fundamental entity
- Contains atomic business information. A fundamental entity is either versioned or non-versioned. It requires an anchor entity if versioned (to maintain its versions) or a root entity if non-versioned.
- Anchor entity
- Acts as an anchor to maintain different versions of an entity instance. Anchors are also used as time-invariant keys (TIKs) in a data warehouse environment.
- Root entity
- Acts as a supertype for non-versioned fundamental entities.
- Associative entity
- Serves as a relationship between two root or anchor entities.
- Population characteristic entity
- Contains information regarding ETL jobs.
- Classification entity
- Instantiates specific semantic attributes of data type enumeration.
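The interplay between an anchor entity and its versioned fundamental entity can be illustrated with a minimal sketch. It is not the IIW implementation: the class names ClaimAnchor and ClaimVersion and the attributes status, valid_from, and valid_to are hypothetical and chosen only for illustration. The anchor holds the time-invariant key, and each version row points back to it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ClaimVersion:
    """One version of a versioned fundamental entity (hypothetical names)."""
    anchor_id: int            # points back to the anchor entity
    status: str               # atomic business information
    valid_from: date          # assumed validity start (inclusive)
    valid_to: Optional[date]  # assumed validity end (exclusive); None = still open


@dataclass
class ClaimAnchor:
    """Anchor entity: a time-invariant key that groups all versions."""
    anchor_id: int
    versions: List[ClaimVersion] = field(default_factory=list)

    def add_version(self, status: str, valid_from: date) -> ClaimVersion:
        # Close the currently open version, if any, before adding a new one.
        if self.versions and self.versions[-1].valid_to is None:
            self.versions[-1].valid_to = valid_from
        version = ClaimVersion(self.anchor_id, status, valid_from, None)
        self.versions.append(version)
        return version

    def as_of(self, day: date) -> Optional[ClaimVersion]:
        # Return the version that was valid on the given day, or None.
        for v in self.versions:
            if v.valid_from <= day and (v.valid_to is None or day < v.valid_to):
                return v
        return None


claim = ClaimAnchor(anchor_id=1)
claim.add_version("OPEN", date(2012, 1, 10))
claim.add_version("SETTLED", date(2012, 3, 5))
```

Calling claim.as_of(date(2012, 2, 1)) returns the version with status "OPEN", while the anchor key stays the same across both versions. This is what makes the anchor usable as a time-invariant key.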
The entities are connected by relationships. The enterprise model defines the following relationship types:
- Design relationship
- Connects two entities, usually one-to-many, for design purposes (such as navigation or better performance).
- Anchor relationship
- Connects the fundamental entity with the related anchor entity.
The entities have attributes. The attribute types defined in the enterprise model are as follows:
- Basic attribute
- Contains business data.
- Candidate key
- Contains the business key that uniquely identifies an entity instance.
- Relationship attribute
- Contains a foreign key attribute.
- Derived attribute
- Contains a value derived (or duplicated) from the value of one or many of the other attributes.
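As a rough illustration of how the four attribute types can sit together on one entity, consider the following sketch. The entity ClaimRecord and all of its attribute names are hypothetical and are not taken from the IIW model. For brevity, the derived attribute is computed on access here, whereas a warehouse would typically store it.

```python
from dataclasses import dataclass


@dataclass
class ClaimRecord:
    """Hypothetical entity showing the four attribute types side by side."""
    claim_number: str       # candidate key: business key identifying the instance
    policy_id: int          # relationship attribute: foreign key to another entity
    reserved_amount: float  # basic attribute: business data
    paid_amount: float      # basic attribute: business data

    @property
    def outstanding_amount(self) -> float:
        # Derived attribute: its value follows from other attributes.
        return self.reserved_amount - self.paid_amount


record = ClaimRecord("CLM-2012-0042", policy_id=7,
                     reserved_amount=1000.0, paid_amount=250.0)
```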
Figure 14 shows a cutout from the data diagram of the claim domain. The attributes are omitted to improve the readability.
Figure 14. Cutout from the enterprise model for claim domain
The approach to customizing the enterprise model is described in detail in the online help installed with Enterprise Model Extender. There are specific techniques for adding or modifying an entity, relationship, or attribute of any type. These techniques are formulated as eight transformation rules. The rules describe the customization approach: adding semantic association entities and semantic associations; transforming super-semantic entities and sub-semantic entities; and handling derived and duplicate attributes.
When customizing the enterprise model, the data designer should keep in mind that the elements of the enterprise model should be linked to the appropriate elements of the analysis data model.
Unlike the analysis data model, the enterprise model is IT-oriented. The enterprise model contains the following features:
- Data versioning
- One of the most important aspects of data warehouse design. The fundamental entities of the enterprise model (EM) carry the following four attributes:
- Primary keys
- The primary keys defined in the EM are 4-byte integers. They are surrogate keys without business semantics. They are not derived from the primary keys of the source systems.
- Data population
- The entities contain the data population attributes that describe the information about the ETL process.
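The idea of surrogate primary keys can be sketched as follows. This is only an illustration under stated assumptions, not the way IIW or any ETL tool generates keys: the class SurrogateKeyMap, the method key_for, and the source system names are hypothetical, and the upper bound assumes that the keys are stored as signed 4-byte integers. The point is that the warehouse key is a plain counter value that is looked up from the source key, never computed from it.

```python
from typing import Dict, Tuple

MAX_INT32 = 2**31 - 1  # largest value of a signed 4-byte integer (assumed key type)


class SurrogateKeyMap:
    """Assigns warehouse keys that carry no business meaning (hypothetical helper)."""

    def __init__(self) -> None:
        self._next_key = 1
        self._keys: Dict[Tuple[str, str], int] = {}

    def key_for(self, source_system: str, source_key: str) -> int:
        # Reuse the surrogate key if this source record was seen before.
        natural = (source_system, source_key)
        if natural in self._keys:
            return self._keys[natural]
        if self._next_key > MAX_INT32:
            raise OverflowError("surrogate key range exhausted")
        # Otherwise hand out the next counter value; it is unrelated to source_key.
        key = self._next_key
        self._keys[natural] = key
        self._next_key += 1
        return key


keys = SurrogateKeyMap()
a = keys.key_for("POLICY_SYS", "CLM-2012-0042")
b = keys.key_for("CLAIMS_SYS", "CLM-2012-0042")
again = keys.key_for("POLICY_SYS", "CLM-2012-0042")
```

The same source key coming from two different source systems receives two different surrogate keys, and a repeated lookup returns the key that was assigned the first time.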
This section described the enterprise model and the approach to customizing it. You can transform the enterprise model into a physical model using the InfoSphere Data Architect wizard.