Manage dimension tables in InfoSphere Information Server DataStage

How to use the Slowly Changing Dimension stage

Brian Caufield (bcaufiel@us.ibm.com), Software Architect, IBM
Brian Caufield is a software architect at the IBM Silicon Valley Lab. Brian has worked in the DataStage development organization for 10 years and was involved in the design of the Slowly Changing Dimension stage.

Summary:  Information Server DataStage® Version 8.0 introduced the Slowly Changing Dimension (SCD) stage. This tutorial provides step-by-step instructions on how to use the SCD stage for processing dimension table changes. It also shows you how to use the output of the stage to update an associated fact table. The tutorial includes a fully operational download.

Date:  12 Mar 2009
Level:  Intermediate

Before you start

The Slowly Changing Dimension stage was added in the 8.0 release of InfoSphere Information Server DataStage. It is designed to support the activities required to populate and maintain records in star schema data models, specifically dimension table data. The stage encapsulates all of the dimension maintenance logic: finding existing records, generating surrogate keys, checking for changes, and deciding what action to take when changes occur. In addition, you can associate dimension record surrogate key values with source records, which eliminates the need for additional lookups in later processing.
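To make the maintenance logic described above concrete, here is a minimal Python sketch of the same four steps: look up the existing record, generate a surrogate key, check for changes, and act on them. It is an illustration only, not how the SCD stage is implemented. The column names (`name`, `region`), the choice of which column is history-tracked, and the `Dimension` class are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    surrogate_key: int
    business_key: str                    # natural key from the source system
    name: str                            # assumed in-place (overwrite) column
    region: str                          # assumed history-tracked column
    effective_from: date
    effective_to: Optional[date] = None  # None marks the current version


class Dimension:
    """Hypothetical in-memory dimension table, for illustration only."""

    def __init__(self) -> None:
        self.rows: List[DimRow] = []
        self._next_key = 1

    def _current(self, business_key: str) -> Optional[DimRow]:
        # Step 1: find the existing current record for this business key.
        for row in self.rows:
            if row.business_key == business_key and row.effective_to is None:
                return row
        return None

    def _insert(self, business_key: str, name: str, region: str,
                as_of: date) -> DimRow:
        # Step 2: generate a new surrogate key for the inserted row.
        row = DimRow(self._next_key, business_key, name, region, as_of)
        self._next_key += 1
        self.rows.append(row)
        return row

    def apply(self, business_key: str, name: str, region: str,
              as_of: date) -> int:
        """Process one source record and return the surrogate key for it."""
        current = self._current(business_key)
        if current is None:
            # No existing record: insert a new dimension row.
            return self._insert(business_key, name, region, as_of).surrogate_key
        # Step 3: check for changes. Step 4: act on them.
        if current.region != region:
            # History-tracking change: expire the old row, add a new version.
            current.effective_to = as_of
            return self._insert(business_key, name, region, as_of).surrogate_key
        if current.name != name:
            # In-place change: overwrite the value, keep the same key.
            current.name = name
        return current.surrogate_key


dim = Dimension()
k1 = dim.apply("C100", "Acme", "West", date(2009, 1, 1))       # new row
k2 = dim.apply("C100", "Acme Corp", "West", date(2009, 2, 1))  # in-place change
k3 = dim.apply("C100", "Acme Corp", "East", date(2009, 3, 1))  # new version
```

Because `apply` returns the surrogate key of the current dimension row, each source record can carry that key forward, which is the property that removes the need for a later lookup when loading a fact table.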

About this tutorial

This tutorial is designed to introduce you to using the Slowly Changing Dimension stage on the Information Server DataStage parallel canvas. The tutorial uses a simplified example scenario that focuses on Slowly Changing Dimension functionality. Actual business scenarios may require different approaches to the job design used in this tutorial's example. The volume of data processed in the tutorial is intentionally small to make it easier to understand the processing that is taking place.

The material in the SCD_Tutorial.zip file in the Download section is built to run on a Windows platform with a DB2 database. You can modify the material to run on a different platform or to use a different database.


Objectives

In this tutorial, you will learn how to design a job that uses the Slowly Changing Dimension stage to update and load dimension and fact tables. When you finish, you will be able to configure the SCD stage for history-tracking changes and in-place changes, and use the output of the stage to update an associated fact table.


Prerequisites

This tutorial is written for DataStage developers who are familiar with the DataStage Parallel Edition design canvas. You will also benefit from prior knowledge of star schema design concepts (including fact and dimension tables), the use of surrogate keys, and the usual methodology for updating dimension tables.


System requirements

To create the job in this tutorial, you need an Information Server DataStage 8.x installation that is licensed to use the parallel engine. You also need a DataStage Designer client and access to a DataStage project where you can create, import, compile, and run DataStage jobs.

To use the sample scripts in the SCD_Tutorial.zip download, your Information Server must be installed on a Windows® OS with access to a DB2 database. However, you can also modify the scripts to work on other operating systems and with a different database.
