Before you start
About this series
This tutorial series, Get started with Informix Warehouse Feature, introduces you to the highlights and capabilities of the new client and server software tools available for you in the Informix Warehouse option. These tools enable you to create and deploy a project for modeling the databases in your Informix data warehousing platform and automating the data transformation and movement processes to easily acquire and integrate data from different sources into your Informix warehouse.
The tutorials in this series are organized following this line of deployment: First, you will use the client component of the Informix Warehouse software — the Design Studio — to create the data warehouse (DW) project around Informix and build the data models for source and target databases. Then, you will continue using Design Studio to create and test the jobs for extracting, loading, and transforming (ELT) data from various sources (typically, OLTP databases and external files) into the target warehouse repository on Informix. Finally, the series illustrates the use of the server components in the Informix Warehouse — the Administration Console and SQW services — to manage and monitor resources associated to your warehouse, including the ELT jobs you programmed before in Design Studio.
About this tutorial
This tutorial is the first part of the Get started with Informix Warehouse Feature series. Get an introduction to the Informix Warehouse option, and learn more about its components and architecture. Review some of the main concepts in data warehousing, and put the Informix Warehouse Feature in context in a data warehousing solution using Informix. Set up and follow an imaginary business scenario that will run the entire tutorial series, where the Informix Warehouse Feature is used to design and populate the new warehouse repository. Finally, start using the Informix Warehouse: Get your hands on the Design Studio tool to create a new DW/ELT project, define the connections to your data sources and targets, create the physical data models for those databases using two different methods, visualize the entity-relationship (ER) diagrams of these models, and deploy the model for your target warehouse in the physical database.
When you have completed this tutorial, you will:
- Know the components and architecture of a solution involving Informix Warehouse
- Be familiar with the functionality of the components in the software
- Be familiar with Informix Warehouse's Design Studio — the client component of Informix Warehouse — to create a new DW/ELT project, connect to the data sources, and design the data models of the source and target (the warehouse) databases involved
This tutorial is written based on an Informix Dynamic Server and
Informix Warehouse installation on a Windows® platform. However,
most of the information in this tutorial applies in the context of a
solution on UNIX® or Linux® too. The tutorial is written
for Informix database users who are familiar with and have working
knowledge of Informix tools such as
dbaccess, SQL language and scripts, Windows
environments, and Eclipse-based IDE/GUI tools. You should also have
skills and experience in relational database design, ER models, data
warehousing, and ETL tools.
This tutorial assumes that you have:
- Informix Warehouse Feature Client installed. For more information about how to install the product, please refer to the Resources section.
- Informix Dynamic Server (IDS) 11.50.xC3 or later installed and a basic working knowledge of IDS. For download information, please refer to the Resources section.
To run the examples in this tutorial, you need a system with at least the Informix Warehouse Feature client installed and an Informix Dynamic Server installed, with a database server instance up and running (local or remote).
This tutorial works on a fictional and simplified example of a business scenario in the Retail industry. You will design and deploy an Informix data mart for Sales and Cost analysis for the retail company Sports Stores. The new warehouse repository will combine and aggregate data from one Informix operational database and a few external files.
You will use the sample databases stores_demo as the source operational (OLTP) system, and sales_demo as the target warehouse (OLAP) system. To create the databases, follow these steps (this tutorial assumes you run these commands as user admin, which will be the schema under which the tables will be defined):
- Drop any databases called stores_demo and sales_demo, if they already exist.
- Create and populate the sample stores_demo database with
the logging option. The
dbaccessdemoscript can be found in $INFORMIXDIR/bin:
- Create an empty sales_demo database with the logging
dbaccess – - > create database sales_demo with log ;
Introduction to Informix Warehouse Feature
The Informix Warehouse Feature simplifies the design and deployment of an Informix warehouse. It allows customers to more easily enable business applications on Informix databases, supplying a state-of-the-art Extract-Load-Transform (ELT) tool in an easy-to-use GUI/Eclipse environment. This platform provides the foundation you need to cost effectively deploy and build next-generation analytic solutions using the IBM Informix Dynamic Server.
Using Informix for a data warehouse is an ideal solution for Informix users who want to build end-to-end business intelligence (BI) and reporting solutions using data from various sources, including IDS. Users can more effectively leverage Informix for BI using Informix Warehouse capabilities to create and populate a data warehouse repository, and then utilize front-end analysis and reporting tools, like IBM Cognos®, to provide BI dashboards and other types of analytic applications and reports on top of the new warehouse repository. Informix users can deploy an end-to-end warehouse solution, simplifying operational complexity and reducing costs by using a single database server product for both operational and warehouse data.
Informix Warehouse has a component-based architecture with client and server pieces. The following software components are provided in Informix Warehouse:
- Informix Warehouse client:
- Informix Warehouse Design Studio
- Informix Warehouse server:
- WebSphere® Application Server
- Informix Warehouse Administration Console
- SQL Warehousing (SQW)
Informix Warehouse Client
Informix Warehouse Design Studio is an Eclipse-based common design environment for defining the source and target databases involved in your DW project, creating and reverse engineering physical data models of your databases, and building SQL-based data flows and control flows to quickly and easily build in-database data movements and transformations into your warehouse.
Design Studio is fully integrated with technology pieces from InfoSphere Data Architect® (IDA) for the graphical environment that allows data modeling of your databases (from scratch—empty template, from templates, or using reverse engineering). Design Studio is also fully integrated with what is called the SQL Warehousing (SQW) Tool, for the graphical environment and SQL-code generation of data and control flows (which are sequences of ELT jobs). Therefore, this tutorial only refers to Design Studio as the single Client component in the Informix Warehouse Feature.
Informix Warehouse Server
Informix Warehouse Administration Console is a Web-based application for administering database and system resources related to your warehouse, as well as deploying, scheduling, and monitoring the control flows previously created in Design Studio through processes called the SQL Warehousing (SQW) services. In order to support these SQW run-time services, Informix Warehouse Console comes with WebSphere Application Server, but you can run the Administration Console on any other Java-based application server. The Administration Console allows you to:
- Manage common resources, such as database connections and machine resources
- Schedule the execution of control flows (sequences of ELT data flows)
- Monitor the execution status
Informix Warehouse Architecture
Typically, you can arrange these Informix Warehouse Feature components along with the source and target database nodes in a three-group architecture like the one illustrated in Figure 1. Figure 1 illustrates the interaction between the different nodes in the platform. Notice that the Informix Warehouse client component relates with both the existing data sources (stored in the supported DBMS —IDS included— or in external files) and the new or existing data target (our datawarehouse, on IDS) for the purpose of designing and testing the ELT jobs. Whenever these jobs are ready to be deployed as approved and automated processes in the warehouse, the Informix Warehouse client component interacts with the Informix Warehouse server component to deploy these ELT jobs as packages or applications. The Informix Warehouse server component takes these packaged ELT jobs and allows you to schedule them for automatic execution on the target IDS data warehouse, and to monitor the status of these jobs execution as well. This architecture diagram also provides a basis for planning your installation across multiple computers, and illustrates the integration points that will be needed between the source and target data nodes and the new Informix Warehouse components (client and server).
Figure 1. Architecture overview of Informix Warehouse Feature with source and target database nodes
Now, focusing only on the new software, you can install the Informix Warehouse client and server components and the IDS for your warehouse repository in different or common machines, depending on whether you look for a multi-tier architecture or whether the operating systems you run on the computers support the different software groups' requisites. For example, you could install the data warehouse server (IDS) and the Informix Warehouse server components on the same computer or on two different computers. Consult the system requirements for the client and server components of the Informix Warehouse when planning your architecture. Figure 2 illustrates the different architecture layouts for this solution:
Figure 2. Logical groups of Informix Warehouse components and IDS data warehouse on different computers
The next section of this tutorial reviews some of the key concepts in the context of a data warehousing solution and the role that the Informix Warehouse Feature plays to help in the implementation of such solutions.
Data warehousing – An overview
In order to quickly respond to the market and keep a competitive advantage, organizations need an integrated and historic view of their businesses. To support this requirement, an IT platform can be created to consolidate the data from the different heterogeneous systems across the company's business functions and operations, and even external sources, into one or more common databases called a data warehouse.
Data warehouses are designed with the business-context in mind (its terms, data, and performance measures) and typically take the form of multi-dimensional databases to better support decision-support applications.
In the broadest sense of the term, a data warehouse refers to a database that contains very large stores of historical data. The data is stored as a series of snapshots and aggregated data at a specific time, which allows a user to reconstruct history, make trend and projection analysis, discover gaps and problems, and make accurate comparisons between different time periods using Business Intelligence (BI) tools available in the market, like IBM Cognos. In addition, by using Data Mining tools, the business analysts can analyze the patterns in the data values and even predict business behavior based on different variables. A data warehouse includes processes in place to clean, integrate, and transform the data that it retrieves before it is loaded into the warehouse database so the repository contains a truthful integrated business view of a specific function or view across multiple functions within the organization.
The term data warehouse can mean different things to different people. A data-warehousing environment can encompass any of the following forms:
- Data warehouse
- Data mart
- Operational data store
Please, refer to the "Overview of Data Warehousing" page in the IDS V11.50 Information Center for further explanation on these types of repositories (see Resources).
Figure 3 summarizes the software technologies and tools that are typically present or needed in the infrastructure of an end-to-end data warehousing solution. It also illustrates how the data flows from the sources to the targets in the process of creating and maintaining the data warehouse. As we move from the operational to the analytical world, the database systems' purpose and design change paradigms from OLTP to OLAP. The data warehouse itself can comprise one or more databases of the repositories types (ODS, data staging repository, data mart, data warehouse) mentioned earlier. Some of these could be the final data warehouse version that the front-end BI tools will use, and some could be used as staging databases in more complex data consolidation projects. Depending on the business need, you may want to put different warehouse repositories available to different business intelligence tools (reporting, dashboard, mashups, OLAP analysis and data mining). Last but not least, the extract-transform-load (ETL) or extract-load-transform (ELT) tools are needed to automate the initial and periodic tasks of consolidating and summarizing data from the different data sources (typically, OLTP database systems, external repositories and files) into the warehousing databases. Depending on the DW project, the ETL/ELT tools and processes can be deployed in several places and can be working in several stages to deal with the complexity of data extraction, transformation, cleansing, aggregation, and loading.
Figure 3. Overview of a typical Data Warehousing Infrastructure
The Informix Warehouse role in an Informix-based DW solution
To be effective, a data warehouse has to contain trusted information. It also has to be integrated, subject-oriented, non-volatile, and time-variant. Therefore, the design of the data warehouse and the periodic process to populate it is very important. Designers must make sure the design is consistent with the business queries that will be conducted within the warehouse. Another important aspect of data warehouses is data acquisition, which can be defined as transferring data from a source to the target warehouse. Data acquisition is one of the most expensive parts of building a data warehouse. This process will often be conducted with an extract-transform-load (ETL) or an extract-load-transform (ELT) tool.
It is here, in the data warehouse design and ELT processes, where Informix Warehouse Feature helps. This tutorial series explains by example how to use the tools available within the Informix Warehouse Feature to accomplish these tasks for Informix warehouse repositories.
Once the initial data has been transferred to the data warehouse, the process must be repeated consistently to incrementally update the warehouse according to the new data and changes in the operational sources. Data acquisition is a continuous process, and the goal is to make sure the warehouse is updated on a regular basis. When this update occurs in real time, the process is then called change data capture. This has become a separate field, and there are a number of products currently available to deal with it, including IBM InfoSphere Change Data Capture® (CDC), which can use Informix as either source or target system.
If very sophisticated and specialized ETL and data cleansing tools are needed to satisfy all types of data integration needs, the IBM InfoSphere DataStage® and QualityStage® products, which also work with Informix, are ideal for this purpose. InfoSphere CDC, DataStage, and QualityStage are products in the IBM InfoSphere® Information Server offering.
Differences between OLTP and OLAP: A change in approaches
IT systems are divided into transactional (OLTP) and analytical (OLAP). In general, you can assume that OLTP systems provide source data to data warehouses — the OLAP systems — that will help analyze it.
OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE) predefined and pre-optimized within the application. The main emphasis for OLTP systems is put on very fast query and transaction processing to support the mission-critical operation of the organization. ERP, SCM, CRM systems to support Manufacturing, Sales, Finance/Accounting, Distribution, Payroll, and Marketing functions are typical examples of OLTP systems.
OLAP (On-line Analytical Processing) is characterized by relatively low volume of transactions. Queries are often very complex, performed ad-hoc, and involve aggregations. For OLAP systems, a response time is an effectiveness measure because the users of these systems are typically business analysts and executives who need a strategic view of several functions of the business.
The design of a data warehouse / OLAP database is fundamentally different than a OLTP database. The data warehouse is specifically designed in the form of dimensional databases, where the tables will be structured in cubes, star, or snowflakes schemas to facilitate fast query times and multi-dimensional analysis. In these schemas, the measures of the business (for example, cost, revenue) are stored in what are called the fact tables, whereas the data based on which we analyze those measures (for example, time, geography, product, customer) is stored in what are called the dimension tables. For more information, you can consult the "IBM Informix Database Design and Implementation Guide – Dimensional Databases" in the IDS V11.50 Information Center (see Resources).
Table 1 summarizes the major differences between OLTP and OLAP systems design. (Click here to see a detailed version of this comparison.) The detailed table is partially based on information from the book Building the Data Warehouse (William H. Inmon; Wiley, John & Sons, Incorporated, March 2002; ISBN: 9780471081302) and on www.rainmakerworks.com.
Table 1. Differences between OLTP and OLAP systems
|Characteristic||Online Transactional Systems (OLTP) Operational Systems||Online Analytical Processing (OLAP) Decision Support Systems (DSS)|
|Main Purpose||To support day-to-day operations, control, and run fundamental business tasks||To support managerial, strategic planning and problem solving, and decision support needs|
|Queries||Short transactions; relatively simple SQL||Longer transactions; complex SQL with analytics|
|Updates||Random updates; few rows accessed||Sequential/bulk updates; many rows accessed|
|Processing speed and response times||Sub-second response time||Seconds to minutes response time|
|Database model||ER Modeling; minimizes redundancy||Dimensional Modeling; okay to have redundancy|
|Data normalization||Normalized data (5NF); minimizes duplicates||De-normalized data (3NF); duplicates are okay|
|Indexes||Few indexes; avoids index maintenance cost in writes||Okay to have more indexes; mostly read-only operations|
|Workload predictability and tuning||Pre-compiled queries; repeated execution of queries||Ad-hoc queries; unpredictable load|
After this brief review of data warehousing concepts and OLAP design, and putting IDS and the Informix Warehouse Feature in context, let's take a look at the fictional business scenario you'll use in this tutorial and the rest of series to illustrate the capabilities of the Informix Warehouse tools.
This tutorial series will step you through the process of designing and deploying a simple Informix data warehouse for a fictional Retail company Sports Stores that uses the well-known Informix sample OLTP database stores_demo as its source operational system. The desired warehouse repository is a Data Mart for Sales and Cost Analysis that will be stored in the newly created OLAP database sales_demo.
To review the steps to prepare these two databases for the hands-on exercises in this tutorial, please refer to the "System requirements" section.
Informix Warehouse can help Sports Stores by providing graphical tools for modeling the data mart tables in the sales_demo database and loading its tables with combined and transformed data extracted from the existing operational data sources (in this case, the stores_demo database — also on Informix — and a few plain-text files).
The existing OLTP source database: The stores_demo database
The stores_demo schema contains the transactional data for the Sports Stores retail. At the beginning of this tutorial, the "System requirements" section describes how to create and populate this database. Figure 4 shows the logical data model of the stores_demo schema. For more information about this OLTP database, you can consult the "IBM Informix Guide to SQL: Reference – Appendix A. The stores_demo Database" in the IDS V11.50 Information Center (see Resources).
Figure 4. Logical data model of the operational database stores_demo
(Click here to see a larger image of Figure 4.)
Table 2 describes the nine tables that are in the physical model:
Table 2. Description of the tables in the physical data model of the stores_demo schema
|Physical table name||Description|
|customer||Contains information about the retail stores' customers that place orders from the distributor.|
|orders||Contains information about orders placed by the customers of the distributor.|
|items||An order can include one or more items. One row exists in the items table for each item in an order.|
|stock||The distributor carries 41 types of sporting goods from various manufacturers. More than one manufacturer can supply an item.|
|catalog||Describes each item in stock. Retail stores use this table when placing orders with the distributor.|
|cust_calls||All customer calls for information on orders, shipments, or complaints are logged. The cust_calls table contains information about these types of customer calls.|
|call_type||The call codes associated with customer calls are stored in the call_type table.|
|manufact||Stores information about the nine manufacturers whose sporting goods are handled by the distributor.|
|state||Contains the names and postal abbreviations for the 50 states of the United States.|
The new OLAP target database: The sales_demo database
The data mart database sales_demo will contain the star schema and the aggregated data for the Sports Store retail chain that is required by BI analysts for sales and cost analysis. Again, the "System requirements" at the beginning of this tutorial describes how to create an empty sales_demo database. Figure 5 shows the intended logical data model of the sales_demo schema, the OLAP database for your sample data mart. Notice the star-form created by the dimension tables around the fact table:
Figure 5. Logical data model of the sales_demo schema
Table 3 describes the dimension tables and fact table in the physical data model of the sales_demo schema:
Table 3. Description of the tables in the physical data model of the sales_demo schema
|Physical table name||Description|
|time||Dimension table. It contains time information about the sale.|
|customer||Dimension table. It contains information about sales customers.|
|product||Dimension table. It contains information about the products sold through the data warehouse.|
|geography||Dimension table. It contains information about the sales district and region.|
|sales||Fact table. It contains information about product sales and has a pointer to each dimension table.|
This tutorial uses Informix Warehouse's Design Studio to design the sales_demo database from scratch. You can use the SQL script code in Listing 1 as guidance:
Listing 1. Sample SQL script code intended for sales_demo database
-- The following statements create the tables for the sales_demo database: create table geography ( district_code serial, district_name char(15), state_code char(2), state_name char(18), region smallint); create table product ( product_code integer, product_name char(31), vendor_code char(3), vendor_name char(15), product_line_code smallint, product_line_name char(15)); create table customer ( customer_code integer, customer_name char(31), company_name char(20)); create table time ( time_code int, order_date date, month_code smallint, month_name char(10), quarter_code smallint, quarter_name char(10), year integer ); create table sales ( customer_code integer, district_code smallint, time_code integer, product_code integer, units_sold smallint, revenue money (8,2), cost money (8,2), net_profit money(8,2));
In real-life scenarios you may need to use a combination of templates and reverse engineering to aid in the process of creating the data model of a large data warehouse.
For more information about this database, you can consult the "IBM Informix Guide to SQL: Reference – Appendix B. The sales_demo and superstores_demo Databases" in the IDS V11.50 Information Center (see Resources).
Based on this imaginary business scenario, this tutorial series will use the different tools in the Informix Warehouse Feature to design and deploy a platform for a data warehousing solution for the retail company Sports Stores.
In the following sections of this tutorial, see how to start using Design Studio (the client tool of the Informix Warehouse) to create a new Eclipse-based workspace. Then, define and test the database connections to both the existing operational sources you later want to extract data from (in our case, just stores_demo, on Informix) and the new datamart you want to design and populate (the currently empty Informix database sales_demo). After this first setup, you will use the information provided in this "Business scenario" section to guide the creation of the tables in the sales_demo schema and complete the data model of this new OLAP system. In order to produce the data model for the existing stores_demo OLTP system, you will use reverse engineering to discover the tables and relationships, so Design Studio can automatically generate this model from the Informix catalog information about the stores_demo database.
Getting started with the Informix Warehouse Client
Launching Design Studio
- Select Start > All Programs > IBM
Informix Warehouse Feature > DesignStudio.
Figure 6. Launching the Informix Warehouse Client from Start menu
Alternatively, you can launch Design Studio's Eclipse IDE by running the command
eclipse.exefrom C:\Program Files\ISWarehouse\ds\ (default location of Design Studio on Windows) or from your actual Design Studio directory.
On UNIX, this tends to be installed at /opt/IBM/ISWarehouse. The Informix Warehouse Feature Client can be launched from /opt/IBM/ISWarehouse/ds/ by executing the command
- When the following Workspace Launcher window appears, change the
name of the workspace to workspace_demo and click
Figure 7. Informix Warehouse Feature Client splash screen
- A Welcome screen is shown after the workspace is created for the
first time. Close the screen, as shown in Figure 8:
Figure 8. Closing the Welcome screen
The newly created (blank) workspace will appear, as depicted in Figure 9:
Figure 9. A blank workspace
Now you're ready to start working with Design Studio. In the next steps you will identify and define the connections and data models of the databases participating in the DW/ELT project.
Defining the database connections
In order to start working with databases in your data warehousing application (ELT project), you need to create and test the database connections to the different databases participating in this project as either source or target systems (in this case, the connections to both sample databases stores_demo and sales_demo).
Create two new database connections (one for sales_demo and another one for stores_demo) using the Informix JDBC driver®. Optionally, you can use the IBM Data Server Driver for JDBC and SQLJ. This tutorial connects using the Informix JDBC driver.
- On the Data Source Explorer tab, right-click on Database
Connections > New…
Figure 10. Creating a new database connection
A New Connection wizard screen appears.
Figure 11. New Connection wizard
- Since this tutorial uses the Informix database and Informix
JDBC® driver, select Informix under the Select a
database manager option and click on the drop-down menu
under the JDBC driver option to select Informix 11.5 –
Informix JDBC Driver Default.
Figure 12. New Connection wizard, part 2
- One database at a time, fill out the connection information for
both stores_demo and sales_demo database
connections. Test the connections to ensure your database
connection information is correct and that you are able to connect
to both databases. (Figures 13 and 14 illustrate the steps for the
stores_demo database only.)
Figure 13. Testing the new connection
- Click on Next to edit the filtering option.
- Uncheck the box next to Disable filter to allow filtering.
Click on the radio button next to Selection. Check the box
next to admin, the user that was used when creating the
database (if you used a different user, check that option
Figure 14. Using filter option to select admin schema
- Click on Finish. A new database connection is created on the Data Source Explorer panel, as depicted in Figure 15.
- Follow the steps above for both the stores_demo and the
sales_demo databases. Figure 15 shows that both
connections are created and alive:
Figure 15. Data Source Explorer showing both connections created and active
You can browse through the objects of the databases on the Data Source Explorer panel on the left side and see the properties of the database connection's objects in the Properties window on the right side of the workspace.
Creating a new DW / ELT project in Design Studio
Now, let's create a new project for our data warehouse and its ELT jobs.
Notice the Data Project Explorer located in the upper left panel, as shown in Figure 16. It should be empty.
Figure 16. Data Project Explorer
- To create the Data Warehousing project, click on File
> New > Data Warehousing Project
Figure 17. Creating Data Warehousing project
The New Project wizard screen appears.
Sales Demo - ELT and DW Projectin the Project name field.
Figure 18. New Project wizard
- Click on Finish. The new project now appears on the left
side, under the Data Project Explorer.
Figure 19. New project is created
In the next step, you're going to generate the data model for the existing OLTP database stores_demo using reverse engineering to discover it and pick up just the tables you need for the DW/ELT project and visualizing them in an ER representation.
Creating physical data model from reverse engineering
Now, let's create the physical data model for the existing stores_demo database using the reverse engineering option. In this section, you'll also create an overview diagram to easily see the Entity-Relationship diagram.
- Right click on the Sales Demo – ELT and DW project, and
select New > Physical Data Model, as
illustrated in Figure 20:
Figure 20. Creating a physical data model
- The New Physical Data Model wizard appears. Change the Destination Folder location to the project name, Sales Demo – ELT and DW project. To do so, you use the Browse… button and select the project.
- Rename the file name in the File name field to
- Change Database to Informix, and change Version to 11.50.
- Select the radio button next to Create from reverse
engineering, then select Next.
Figure 21 illustrates the New Physical Data Model dialog window with the values suggested in the steps above:
Figure 21. New Physical Data Model wizard
- Select the stores_demo connection, and click on Next
Figure 22. Selecting stores_demo connection
- Select the admin schema (or a different schema if you did
not use admin when you created the stores_demo database),
then click on Next.
Figure 23. Selecting the schema
- The next screen asks you to select the database elements. Keep the
default values, and click on Next.
Figure 24. Selecting database elements
- Generate the overview diagram by checking the box next to
Overview in the Generate diagrams area. Also,
check the box next to Infer implicit relationships, then
click on Finish.
Figure 25. Finishing the Physical Data Model wizard
The newly created database model stores_demo_model will appear on the left panel, under the Data Models folder, and the database model Properties will appear on the right panel. Save all the changes.
- If you need to display the visual ER diagram of the discovered
stores_demo data model with the tables under
admin user/schema, go to Data Project Explorer
> Data Diagrams. Then, go to the data diagram for
stores_demo, stores_demo_model.dbm, and
double-click on the schema/user name admin. The ER diagram
will appear on the right side.
Figure 26. Opening the Overview diagram (ER model) of stores_demo database for admin schema
If you do not see the ER model (overview diagram), you might need to go to Data Models > stores_demo_model.dbm > admin schema, right-click on admin, select Add to Overview Diagram, and create a new diagram that includes the tables under admin schema.
- Arrange the model to your visual preferences. The following are
suggestions to help arrange the entities and relationships in the
Overview diagram displayed inside the canvas:
- To zoom out the diagram, click on the Zoom out button on the Palette, located on the right side of the workspace, and then click on the diagram area to zoom it out.
- Use the Select button to drag and drop the entities to make them closer to one another and wherever they would better go in the diagram.
- In order to automatically arrange the relationship connectors to the new order of the entities that you created, select Select All Connectors from the Select drop-down list on the top toolbar (or alternatively use the Menu option: Diagram > Select > All Connectors). All the relationship connectors will be selected.
- Next, select Arrange Selected Connectors from the Arrange drop-down list on the top toolbar. This action automatically arranges the connectors based on the new positions where you placed the entities.
- Now, you can zoom in again and use the Select tool to move the entities and relationships according to your preferences.
- Continue iterating this way until you see an ER diagram you feel comfortable with. Then save the changes.
- An overview diagram like the one in Figure 27 is obtained:
Figure 27. Overview diagram (ER model) for stores_demo after arranging entities and relationships
Modifying the stores_demo diagram
- Open the Overview diagram for stores_demo.
- While you are still in the stores_demo diagram window, you
can remove the ER entities call_type and catalog
because they will not be used to populate the data mart
sales_demo. After removing the two entities, rearrange
the remaining entities so they can be visible in the same space.
Figure 28. Stores_demo diagram showing the relationships
- Save all the changes.
- You could decide to create the missing non-identifying relationship between tables customer and state. To do so, you can use a relationship connector that links state(code) (parent) with customer(state) (child).
Creating physical data model from a template
Let's now create the physical data model for sales_demo from scratch (from an empty template).
- Create a new physical data model using the sales_demo
database and the specifications from before, except select
Create from template this time, then click on
Figure 29. Template option
- The next screen provides an option to use an existing template. Go
with the default (empty template), and click on Finish.
Figure 30. Selecting a template
Both data models have been created under the Data Models folder in the Sales Demo – ELT and DW Project. The data diagrams are created under the Data Diagrams folder in the project.
Figure 31. Location of the data models and data diagrams
- Notice that the sales_demo.dbm does not have a database specified.
It is named Database. Let's change that to
- Click on Database under sales_demo.dbm (you might need to double click on sales_demo.dbm to expand the tree).
- The Properties window appears on the lower right panel.
Change the name "Database" to
sales_demo, and select Save.
Figure 32. Changing the default database name
- Do the same for Schema. To do so, expand the database folder (now sales_demo) and change the schema property name.
The expanded tree of the sales_demo physical data model should now look like Figure 33 (notice the Diagrams folder is also expanded):
Figure 33. Expanded tree of sales_demo physical data model
Creating new tables for sales_demo
The diagrams are representative of the databases themselves in the form of Entity-Relationship model. In this section, explore the diagrams and learn how to manipulate them. You'll also create new tables for the sales_demo database.
- Double-click to open the Diagram1 under the Diagrams
folder. You see a blank diagram with the Palette on the right.
This palette allows you to drag and drop objects on to the blank
canvas to create entities and relationships.
Figure 34. Blank diagram with palette
Explore the Palette and notice the Data and Geometric Shapes folders. You're going to drag and drop the Table object on to the blank canvas to create a table.
- Drag and drop the Table object on to the canvas.
Figure 35. Drag and drop Table object
- Let's now make a fact table called sales. To rename the
table from Table1 to sales, click on the name and
sales. Alternatively, you can click on the Table1 table and the properties window will appear on the lower panel where you can rename Table1 to sales.
- Add the columns to the sales table. Click on the
Columns tab in Properties window, and select the Add
Column button (the plus sign icon) to add the columns to
the sales table. Add the following columns, as illustrated in
Figure 36: customer_code, district_code, time_code, product_code,
units_sold, revenue, cost, net_profit. Make sure Not Null
is specified for customer_code, district_code,
time_code, and product_code. They are primary
keys, but leave the Primary Key check box unchecked for all
of them because you do not want to worry about referential
integrity after the first bulk load for now.
Figure 36. sales fact table
- Repeat the steps for the dimensional tables: geography
(Figure 37), product
(Figure 38), customer
(Figure 39), and time
Figure 37. geography dimensional table
Figure 38. product dimensional table
Figure 39. customer dimensional table
Figure 40. time dimensional table
- Click on Save. Figure 41 illustrates how the diagram will
Figure 41. Updated sales_demo diagram
- Expand the schema to view the newly created tables.
Figure 42. New tables under admin schema with the sales table expanded
You want to update the database in the actual server with the new tables.
- Right-click on the schema (in this case, admin), and click
on Generate DDL....
Figure 43. Generate DDL
- The Generate DDL wizard appears. Use the default option (CREATE
statements only), and click on Next.
Figure 44. Generate DDL wizard screen 1
- The second screen asks you which objects should be included in the
script. Keep the default, and click on Next.
Figure 45. Generate DDL wizard screen 2
- The next screen allows the DDL to be reviewed. Notice the folder
and file name. They can be edited to your preference. This example
uses the default name and folder. You want to run this script
against the server right away, so check the box next to Run DDL
on server, and click on Next.
Figure 46. Generate DDL wizard screen 3
- Select the sales_demo database in the next screen, and
click on Next.
Figure 47. Generate DDL wizard screen 4
- View the summary screen, and click on Finish.
Figure 48. Generate DDL wizard screen 5
- To make sure the script was run successfully, take a look at the
SQL Results view in the lower right panel. It also
displays the job status on the right.
Figure 49. Reviewing the job status
This view is useful if the script execution is not successful. You can expand the status and analyze every DDL statement executed. Sometimes it would say the table already existed in the database server.
You can verify if the tables are created by opening a dbaccess session and viewing the database.
Figure 50. Verifying the tables are created
Congratulations!! You have completed this tutorial! Now you have created your new Informix-based warehouse repository, sales_demo, and you are ready to go through the other tutorials in this series to create and deploy the processes needed to populate your data warehouse.
This first tutorial allowed you to get familiar with the Informix Warehouse Feature so you can easily start using it.
This tutorial also introduced you to the context, architecture, and components of Informix Warehouse. It illustrated with an example of how to start using the Client component of the software, Design Studio, in order to create a DW/ELT project, identify the connections, and create the data models of the databases that will later participate as sources and targets in the data movement and transformation processes that will help to populate your data warehouse.
This tutorial guided you through the process of designing the data models of these databases to include the tables in the scope of the DW solution using two different techniques: Discovering the Entity-Relational model of a database using reverse engineering for existing repositories with tables, or modeling the database from scratch (from an empty template), based on the design we had originally planned for the new warehouse repository, using the graphical environment provided by Design Studio. These data models remain part of the DW/ELT project and are exportable as templates or DDL scripts, which you can validate, analyze impact of the desired changes, and execute them directly against the actual databases using Design Studio.
The following parts of this tutorial series will show you how to continue using Design Studio to now design the data and control flows that will do the task of populating the new data warehouse repository you have created in this tutorial, to finally show you how to deploy and monitor those ELT jobs using the Informix Warehouse Server component, the Administration Console.
- Using Informix for your Warehouse: Consult the main Informix Warehouse page for more information regarding Informix Warehouse.
- "Using Informix for your Data warehouse: Overview" (IBM, March 2009): Learn more about using Informix for your Data warehouse (PDF).
V11.50 Information Center:
Find the information that you need to use the IDS family of products and
- IBM Informix Warehouse information
- Informix Warehouse Installation Guide
- Overview of Data Warehousing
- IBM Informix Database Design and Implementation Guide - Dimensional Databases
- IBM Informix Guide to SQL: Reference – Appendix A. The stores_demo Database
- IBM Informix Guide to SQL: Reference – Appendix B. The sales_demo and superstores_demo Databases
- "Data modeling with InfoSphere Data Architect and Informix Dynamic Server" (developerWorks, March 2009): Learn how to use InfoSphere Data Architect (IDA) as a modeling tool for Informix Dynamic Server (IDS).
- InfoSphere Data Architect Information Center: Find the information that you need to use InfoSphere Data Architect.
- developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Technology bookstore: Browse for books on these and other technical topics.
Get products and technologies
- IBM Informix Dynamic Server Developer Edition: Get a free trial download of Informix Dynamic Server Developer Edition 11.5.
- Informix product family trials and demos: Obtain evaluation and trial software, drivers, fixes and patches for Informix products.
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Guy Bowerman's Informix Application Development blog: Read Guy Bowerman's blog, which covers IDS application development, with digressions into virtualization, cloud computing and other topics.
- Jacques Roy's Informix and computing blog: Read Jacques Roy's blog discussing Informix and computing.
- Fernando Nunes's Informix technology blog: Read Fernando Nunes's Informix technology blog.
- Participate in developerWorks blogs and get involved in the developerWorks community.