Before you start
Data warehouse design and data modeling form a well-established discipline that blends computer science and IT practice. The technology matured in the early 1990s, drawing on several approaches developed during that time. The most influential methods were defined by Ralph Kimball (bottom-up) and W. H. Inmon (top-down) (see Resources).
Commercial data modeling products are valuable because of their content-specific knowledge, which is based on practical experience and business expertise. IBM offers an intellectual capital product family in this space called IBM Industry Models. The IBM Industry Models products consist of mature and well-tested pattern frameworks for data modeling (relational and multi-dimensional), packaged for several industries. This article presents an overview of the Insurance Information Warehouse (IIW), which is the part of the IBM Industry Models product family defined for the insurance industry.
This tutorial introduces the method for developing data models for a data warehouse (DWH) using the IBM Industry Model IIW. The tutorial illustrates the approach for developing the core data warehouse (CDW) models (highly normalized data models that hold the atomic data elements) and the data mart (DM) models (de-normalized data models that implement the structure of multi-dimensional data models). Multi-dimensional data models are characterized by the definition of measures, which are stored in fact tables, and by the definition of dimension tables, which define the axes or dimensions of the analysis.
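To make the distinction between fact tables and dimension tables concrete, the following is a minimal sketch of a multi-dimensional structure, written in Python with the standard `sqlite3` module. The table and column names (`fact_claim`, `dim_product`, `dim_month`) and the sample rows are hypothetical illustrations and are not taken from the IIW model content.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Two dimension tables (the axes of the analysis) and one fact table
# that stores the measures. All names here are hypothetical, not IIW content.
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_line TEXT NOT NULL
);
CREATE TABLE dim_month (
    month_id       INTEGER PRIMARY KEY,
    calendar_year  INTEGER NOT NULL,
    calendar_month INTEGER NOT NULL
);
CREATE TABLE fact_claim (
    product_id   INTEGER NOT NULL REFERENCES dim_product(product_id),
    month_id     INTEGER NOT NULL REFERENCES dim_month(month_id),
    claim_count  INTEGER NOT NULL,   -- measure
    claim_amount REAL    NOT NULL    -- measure
);
""")

conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?)",
    [(1, "Motor"), (2, "Household")],
)
conn.executemany(
    "INSERT INTO dim_month VALUES (?, ?, ?)",
    [(201001, 2010, 1), (201002, 2010, 2)],
)
conn.executemany(
    "INSERT INTO fact_claim VALUES (?, ?, ?, ?)",
    [(1, 201001, 3, 4500.0), (1, 201002, 2, 1500.0), (2, 201001, 1, 800.0)],
)


def claim_amount_by_product_line(connection):
    """Aggregate the claim_amount measure along the product dimension."""
    return connection.execute("""
        SELECT p.product_line, SUM(f.claim_amount)
        FROM fact_claim AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.product_line
        ORDER BY p.product_line
    """).fetchall()


totals = claim_amount_by_product_line(conn)
print(totals)
```

The fact table holds only keys and measures, while the descriptive attributes live in the dimension tables. Analysing a measure along a different axis then only requires joining to a different dimension.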
The method described in this tutorial is the IIW roadmap for developing data models. The IIW roadmap is based on the top-down approach, which starts with the capture of business requirements and the definition of the business model (known in IIW terminology as the analysis data model). Defining the business requirements is the prerequisite for all further work. Ideally, this work should be carried out as a joint effort of the data modeler and experts from the business departments. Once the model has been created and approved by the business departments, the phase to create logical models starts.
The design of logical models consists of two steps: designing the DWH logical model (CDW), followed by designing the DM logical model. It is important to follow this sequential approach. Starting a phase before its predecessor is finished might yield undesirable results. Therefore the IIW roadmap, and this tutorial, are divided into the following four phases:
- Phase 1: Capturing IIW business requirements
- Phase 2: Defining the analysis data model
- Phase 3: Designing the data warehouse logical model
- Phase 4: Designing the data mart
These four phases fulfill different goals and offer different deliverables:
- Phase 1: Capturing IIW business requirements
- A complete description of the business requirements that the BI project should solve. The deliverables are a conceptual model and an analytical requirements model.
- Conceptual model
- A model of all the concepts and business terms to be used across the organization
- Analytical requirements model
- Predefined models of business requirements that address specific industry issues. These models are expressed as measures and dimensions.
- Phase 2: Defining the analysis data model
- A conceptual model that represents an ideal picture of the business concepts and how these concepts relate to each other. This model is platform independent and does not take physical aspects of the implementation into account. The deliverable is the analysis data model.
- Analysis data model
- A data model that specifies the normalized data structures required to represent the concepts defined in the conceptual model.
- Phases 3 and 4: DWH and DM design phases
- The business concepts are mapped onto an entity-relationship (ER) logical model (DWH) and onto a multi-dimensional (MD) logical model (DM). These models are the basis for the physical structure of the data in the database. The deliverables are DW design data models and DM design data models.
- DW design data models
- Data models that represent the enterprise-wide repository of atomic and analytical data used for informational processing
- DM design data models
- Dimensional models that implement analytical requirements and are structured to allow specific dimensional analyses
Figure 1 summarizes these deliverables.
Figure 1. Deliverables of the four IIW phases
IIW also defines three model layers:
- The foundation layer contains the conceptual and analytical requirements models.
- The analysis layer covers the analysis data model.
- The design layer contains the DW design and the DM design models.
The diagram in Figure 2 depicts these layers.
Figure 2. IIW model layers
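The relationship between phases, deliverables, and layers described above can also be written down as a small lookup structure. The following Python sketch is only an illustration of that mapping. The `Deliverable` class and the `deliverables_by_layer` helper are hypothetical names introduced here and are not part of IIW or IDA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deliverable:
    name: str
    phase: int   # IIW roadmap phase (1 to 4) that produces the deliverable
    layer: str   # IIW model layer that contains the deliverable


# Mapping taken from the phase and layer descriptions in this section.
DELIVERABLES = [
    Deliverable("Conceptual model", 1, "foundation"),
    Deliverable("Analytical requirements model", 1, "foundation"),
    Deliverable("Analysis data model", 2, "analysis"),
    Deliverable("DW design data models", 3, "design"),
    Deliverable("DM design data models", 4, "design"),
]


def deliverables_by_layer(items):
    """Group deliverable names by layer, keeping the phase order."""
    grouped = {}
    for item in sorted(items, key=lambda d: d.phase):
        grouped.setdefault(item.layer, []).append(item.name)
    return grouped


layers = deliverables_by_layer(DELIVERABLES)
for layer, names in layers.items():
    print(f"{layer}: {', '.join(names)}")
```

Reading the result from top to bottom follows the roadmap: the foundation layer is filled in Phase 1, the analysis layer in Phase 2, and the design layer in Phases 3 and 4.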
The following sections of the tutorial describe the four phases, with examples of each phase using IBM InfoSphere™ Data Architect (IDA). The examples use IBM IIW Model Version 8.2. The IIW model content is imported into IDA with the help of the Enterprise Model Extender (EME) tool, which is a set of plug-in extensions to IDA. To follow along with the tutorial, you will need these products installed.



