SQL and XQuery tutorial for IBM DB2, appendix: Appendix A

All the tables in the Aroma database and their relationships with each other

Appendix A describes all of the tables in the Aroma database, which consists of two schemas: a simple star schema for retail sales information and a multistar schema for purchasing information. This appendix belongs to the SQL & XQuery Tutorial for IBM DB2 series.

Pat Moffatt (pmoffatt@ca.ibm.com), Information Management Program Manager, IBM Academic Initiative, IBM

Photo of Pat MoffattPat Moffatt is the Information Management Program Manager for the IBM Academic Initiative. Through the Academic Initiative program, she ensures that appropriate Information Management resources are made available to help faculty integrate Information Management software into their curriculum. To learn more about this program, visit www.ibm.com/university/data.



Bruce Creighton (bcreight@ca.ibm.com), Skills Segment Planner, IBM

Bruce CreightonBruce Creighton is a Skills Segment Planner in the Information Management Education Planning and Development department. In this role, he plans investment in educational content and balances the investment between areas where IBM can attain revenue and those where the requirement for skills development are important enough to provide free education.



Jessica Cao, Training Tools Developer, IBM

Jessica CaoJessica Cao is an Arts and Science and Computer Science student at McMaster University. She expects to complete her combined honours degree in April 2009. Jessica is working in IBM Toronto lab's DB2 Information Management Skills Channel Planning and Enablement Program to take advantage of her interest in programming, editing, and writing.



31 August 2006

Also available in Russian Vietnamese

About this appendix

Appendix A describes all of the tables in the Aroma database, which consists of two schemas -- a simple star schema for retail sales information and a multistar schema for purchasing information.

Most of the examples in this article use the tables in the retail schema. The purchasing tables are used in a few examples that require a more flexible schema for adequate illustration.


Connect to a database

You need to connect to a database before you can use SQL statements to query or manipulate data. The CONNECT statement associates a database connection with a username.

Find out from your instructor the database name that you need to be connected to. For this series, the database name is aromadb.

To connect to the aromadb database, type the following command in the DB2 command line processor:

CONNECT TO aromadb USER userid USING password

Replace the userid and password with the user ID and password that you received from your instructor. If no user ID and password are required, simply use the following command:

CONNECT TO aromadb

The following message tells you that you have made a successful connection:

Database Connection Information
Database server      = DB2/NT 9.0.0
SQL authorization ID = USERID
Local database alias = AROMADB

Once you are connected, you can start using the database.


Aroma database: Retail schema

Most of the examples in this guide are based on data from the basic Aroma database, which tracks daily retail sales in stores owned by the Aroma Coffee and Tea Company. The Figure 1 illustrates this basic schema.

Figure 1. Basic schema
Basic schema

The crow's-feet in this diagram indicate a one-to-many relationship between the two tables. For example, each distinct value in the Perkey column of the Period table can occur only once in that table but many times in the Sales table.

Basic aroma schema

The following tables make up the basic Aroma database:

TableDescription
PeriodDefines time intervals such as days, months, and years.
ClassDefines classes of products sold at retail stores.
ProductDefines individual products sold at retail stores, including bulk and packaged coffee and tea, and coffee machines.
MarketDefines the geographical markets of the business.
StoreDefines individual retail stores owned and operated by the Aroma Coffee and Tea Company.
PromotionDefines the types, durations, and values of promotions run on different products.
SalesContains the sales figures for Aroma products during time periods at various stores.
GoldContains customers who have purchased promotion cards.

The Period, Class, Product, Market, Store, Promotion, and Gold tables are examples of typical business dimensions. They are small and contain descriptive data that is familiar to users.

The Sales table is a good example of a fact table. It contains thousands of rows, and its largely additive information is accessed in queries by joins to the dimension tables it references.


Sample data from the Class and Product tables

Query

SELECT * FROM aroma.class;
Result
ClasskeyClass_TypeClass_Desc
1 Bulk_beans Bulk coffee products
2 Bulk_tea Bulk tea products
3 Bulk_spice Bulk spices
4 Pkg_coffee Individually packaged coffee products
5 Pkg_tea Individually packaged tea products
6 Pkg_spice Individually packaged spice products
7 Hardware Coffee mugs, teapots, spice jars, espresso machines
8 Gifts Samplers, gift boxes, and baskets
12 Clothing T-shirts and caps

Query

SELECT * FROM aroma.product;
Result
ClasskeyProdkeyProd_NamePkg_Type
1 0 Veracruzano No pkg
1 1 Xalapa Lapa No pkg
1 10 Colombiano No pkg
1 11 Expresso XO No pkg
1 12 La Antigua No pkg
1 20 Lotta Latte No pkg
1 21 Cafe Au Lait No pkg
1 22 NA Lite No pkg
1 30 Aroma Roma No pkg
1 31 Demitasse Ms No pkg
2 0 Darjeeling Number 1 No pkg
2 1 Darjeeling Special No pkg
2 10 Assam Grade A No pkg
...

Class and Product tables

The Product table describes the products defined in the Aroma database. The Class table describes the classes to which those products belong.

Column descriptions: Class table
Column NameContents
classkeyInteger that identifies exactly one row in the Class table. Classkey is the primary key.
class_typeCharacter string that identifies a group of products.
class_descCharacter string that describes a group of products.
Column Descriptions: Product table
Column NameContents
classkeyForeign-key reference to the Class table.
prodkeyInteger that combines with the Classkey value to identify exactly one row in the Product table. Classkey/prodkey is a two-column primary key.
prod_nameCharacter string that identifies a product. The database contains 59 products. A fully populated database would have many more. Although some Aroma products have the same name, they belong to different classes and have different package types.
pkg_typeCharacter string that identifies the type of packaging for each product.

Sample data from the store and market tables

Query

SELECT * FROM aroma.market;
Result
MktkeyHQ_CityHQ_StateDistrictRegion
1AtlantaGAAtlantaSouth
2MiamiFLAtlantaSouth
3New OrleansLANew OrleansSouth
4HoustonTXNew OrleansSouth
5New YorkNYNew YorkNorth
...

Query

SELECT * FROM aroma.store;
Result
StorekeyMktkeyStore_TypeStore_NameStreetCityStateZip
114SmallRoasters, Los Gatos1234 University AveLos GatosCA95032
214LargeSan Jose Roasting Company5678 Bascom AveSan JoseCA95156
314MediumCupertino Coffee Supply987 DeAnza BlvdCupertinoCA97865
43MediumMoulin Rouge Roasting898 Main StreetNew OrleansLA70125
510SmallMoon Pennies98675 University AveDetroitMI48209
69SmallThe Coffee Club9865 Lakeshore BlvdChicagoIL06060
...

Market and Store tables

The Store table defines the Aroma Coffee and Tea Company stores. The Market table describes the U.S. markets to which each store belongs. Each market is identified by a major metropolitan city. The Market table is an outboard table, like the Class table.

Column descriptions: Market table
Column NameContents
MktkeyInteger that identifies exactly one row in the Market table. Mktkey is the primary key.
hq_cityCharacter string that identifies a city. The Market table defines 17 cities. A fully populated database could have thousands.
StateCharacter string that identifies a state.
DistrictCharacter string that identifies a district based on a major metropolitan city. A global database would contain countries and nations or other geographic dimensions.
RegionCharacter string that identifies a region. The Market table defines only four regions for the entire United States. A comprehensive database would include numerous regions and probably more districts within a region.
Column descriptions: Store table
Column NameContents
storekeyInteger that identifies exactly one row in the Store table. Storekey is the primary key.
MktkeyForeign-key reference to the Market table.
store_typeCharacter string that identifies stores by size.
store_nameCharacter string that identifies a store by name.
street, city, state, zipColumns that identify the address of each store.

Sample data from the Period table

Query

SELECT * FROM aroma.period;
Result
Perkey Date Day Week Month Qtr Year
1 2004-01-01 TH 1 JAN Q1_04 2004
2 2004-01-02 FR 1 JAN Q1_04 2004
3 2004-01-03 SA 1 JAN Q1_04 2004
4 2004-01-04 SU 2 JAN Q1_04 2004
5 2004-01-05 MO 2 JAN Q1_04 2004
6 2004-01-06 TU 2 JAN Q1_04 2004
7 2004-01-07 WE 2 JAN Q1_04 2004
8 2004-01-08 TH 2 JAN Q1_04 2004
9 2004-01-09 FR 2 JAN Q1_04 2004
10 2004-01-10 SA 2 JAN Q1_04 2004
11 2004-01-11 SU 3 JAN Q1_04 2004
12 2004-01-12 MO 3 JAN Q1_04 2004
13 2004-01-13 TU 3 JAN Q1_04 2004
14 2004-01-14 WE 3 JAN Q1_04 2004
15 2004-01-15 TH 3 JAN Q1_04 2004
16 2004-01-16 FR 3 JAN Q1_04 2004
17 2004-01-17 SA 3 JAN Q1_04 2004
18 2004-01-18 SU 4 JAN Q1_04 2004
19 2004-01-19 MO 4 JAN Q1_04 2004
20 2004-01-20 TU 4 JAN Q1_04 2004
...

Period table

The Period table defines daily, weekly, monthly, quarterly, and yearly time periods for 2004, 2005, and the first quarter of 2006.

Column descriptions
Column NameContents
PerkeyInteger that identifies exactly one row in the Period table. Perkey is the primary key.
DateDate value that identifies each day from January 1, 1998 through March 31, 2000.
DayCharacter-string abbreviation of the day of the week.
WeekInteger that identifies each week of each year by number (1 through 53, each new week starting on a Sunday).
MonthCharacter-string abbreviation of the name of each month.
QtrCharacter string that uniquely identifies each quarter (for example, Q1_98, Q3_99).
YearInteger that identifies the year.

Sample data from the Promotion table

Query

SELECT * FROM aroma.promotion;
Result
PromokeyPromo_TypePromo_DescValueStart_DateEnd_Date
01No promotion0.009999-01-019999-01-01
1100Aroma catalog coupon1.002004-01-012004-01-31
2100Aroma catalog coupon1.002004-02-012004-02-29
3100Aroma catalog coupon1.002004-03-012004-03-31
4100Aroma catalog coupon1.002004-04-012004-04-30
5100Aroma catalog coupon1.002004-05-012004-05-31
6100Aroma catalog coupon1.002004-06-012004-06-30
7100Aroma catalog coupon1.002004-07-012004-07-31
8100Aroma catalog coupon1.002004-08-012004-08-31
9100Aroma catalog coupon1.002004-09-012004-09-30
10100Aroma catalog coupon1.002004-10-012004-10-31
11100Aroma catalog coupon1.002004-11-012004-11-30
12100Aroma catalog coupon1.002004-12-012004-12-31
13100Aroma catalog coupon1.002005-01-012005-01-31
14100Aroma catalog coupon1.002005-02-012005-02-28
15100Aroma catalog coupon1.002005-03-012005-03-31
16100Aroma catalog coupon1.002005-04-012005-04-30
17100Aroma catalog coupon1.002005-05-012005-05-31
18100Aroma catalog coupon1.002005-06-012005-06-30
19100Aroma catalog coupon1.002005-07-012005-07-31
20100Aroma catalog coupon1.002005-08-012005-08-31
...

Promotion table

The Promotion table is a dimension table that describes promotions that are run on different products during different time periods. Promotion tables are sometimes referred to as condition tables because they indicate the conditions under which goods are sold.

Column descriptions
Column NameContents
promokeyInteger that identifies exactly one row in the Promotion table. Promokey is the primary key.
promo_typeInteger that identifies the promotion by number (or code).
promo_descCharacter string that describes the promotion type.
valueDecimal number that represents the dollar value of the promotion, such as a price reduction or the value of a coupon.
start_date, end_dateDate values that indicate when each promotion begins and ends.

Sample data from the Sales table

Query

SELECT * FROM aroma.sales;
Result
PerkeyClasskeyProdkeyStorekeyPromokeyQuantityDollars
22011168 34.00
241211169 60.75
2111111640 270.00
2230111616 36.00
2522111611 30.25
2130111630 187.50
2110111625 143.75
24102012 87.00
24112014 115.50
22222018 58.50
2402017 136.00
2502013 74.75
24302014 101.50
22102018 63.00
21223011 99.00
2646306 36.00
25123010 40.00
21113036 279.00
2513011 132.00
25103012 48.00
...

Sales table

Multipart primary key

The Sales table contains a multipart primary key: Each of its five columns is a foreign-key reference to the primary key of another table:

perkey, classkey, prodkey, storekey, promokey

This primary key links the Sales table data to the Period, Product, Store, and Promotion dimensions.

Joins with the Sales table are shown in examples throughout this tutorial series.

Column descriptions
Column NameContents
perkeyForeign-key reference to the Period table.
classkeyForeign-key reference to the Class table.
prodkeyForeign-key reference to the Product table.
storekeyForeign-key reference to the Store table.
promokeyForeign-key reference to the Promotion table.
quantityInteger that represents the total quantity sold (per day).
dollarsDecimal number that represents dollar sales figures (per day).

Sample data from the Gold table

Note: Participants will not find this table in the Aroma database upon initialization. It is meant to be created by participants in Part 6 of this series, Data Definition Language and Control Language (developerWorks, August 2006).

The following results table displays the table as it would be after "The INSERT Statement" section of Part 6.

Query

SELECT * FROM aroma.gold;
Result
Card_idStorekeyFnameLnameEmailStatus
11AdaAlexandercoffee_lover@hotmail.comActive
21BenBowmanmmmcoffee@yahoo.comActive
31CynthiaCheniheartcoffee@gmail.comActive
41DavidDoylecoffee_maniac@msn.comActive
51EmilyEnrightcupAday@hotmail.comActive
61FrankFreemancoffee_addict@gmail.comActive

Gold table

The Gold table contains information about customers who have purchased a promotion card.

Column descriptions
Column NameContents
card_idSMALLINT that identifies the customer with an unique card number.
storekeyForeign-key reference to the Store table.
fnameCharacter string that stores the first name of the customer.
lnameCharacter string that stores the last name of the customer.
emailCharacter string that stores the email of the customer.
statusCharacter string that stores the status of the customer.

Purchasing schema of the Aroma database

A few of the examples in this tutorial series are based on tables used to track product orders that the Aroma Coffee and Tea Company receives from its suppliers. This purchasing schema uses the same Product, Class, and Period dimensions as the retail schema but has two dimensions of its own: Deal and Supplier. The Line_Items and Orders tables both contain facts, but the Orders table can also be queried as a dimension table referenced by the Line_Items table.

Figure 2 illustrates the tables in the purchasing schema.

Figure 2. Tables in purchasing schema
Tables in purchasing schema

Multistar schema

The primary keys of the Line_Items and Orders tables do not match the set of their respective dimension-table foreign keys. Any given combination of dimension table primary keys can point to more than one row in these fact tables. This type of table is known as a multistar fact table or data list.

For example, multiple order numbers in the Orders table can refer to the same set of Supplier, Deal, and Period characteristics:

Order_NoPerkeySupkeyDealkey
369981710070
370081710070

Purchasing tables

The purchasing schema contains similar kinds of facts to those stored in the Sales table, prices, and quantities. The prices are dollar values representing amounts paid to suppliers for whole orders or specific line items within orders. The quantities represent units of a product ordered.

You can use this schema to ask interesting questions about the purchasing history of the Aroma Coffee and Tea Company, for example, which suppliers give the best deal on which products, or which suppliers have the best record for closing orders.

The Aroma Coffee and Tea Company sells the same products throughout its stores as it orders through its suppliers. Therefore, you can write queries that span both schemas to compare what was ordered with what was sold or to calculate simple profit margins.

The following tables make up the purchasing schema of the Aroma database:

PeriodDefines time intervals such as days, months, and years.
ClassDefines classes of products, both sold at retail stores and ordered from suppliers.
ProductDefines individual products, both sold at retail stores and ordered from suppliers.
SupplierDefines the suppliers of products ordered by the Aroma Coffee and Tea Company.
DealDefines the discount deals applied to orders by suppliers.
Line_ItemsContains the line item detail information for product orders, including the price and quantity of each item on each order.
OrdersContains information about product orders, such as the full price of each order and the types of products ordered.

The Supplier and Deal tables are exclusive to the purchasing schema and are referenced by the Orders table.

Tip: The purchasing schema contains data for the first quarter of 2000 only.


Sample data from the Supplier and Deal tables

Query

SELECT * FROM aroma.supplier;
Result
SupkeyTypeNameStreetCityStateZip
1001Coffee/teaCB Imports100 Church StreetMountain ViewCA94001
1002TeaTea Makers, Inc.1555 Hicks Rd.San JoseCA95124
...

Query

SELECT * FROM aroma.deal;
Result
DealkeyDeal_TypeDeal_DescDiscount
01000No deal0.00
1100Orders over $10,000100.00
2100Orders over $20,000500.00
3100Supplier catalog coupon50.00
4100Supplier catalog coupon100.00
37200Supplier coffee special75.00
38200Supplier coffee special50.00
39200Supplier tea special40.00
40200Supplier tea special20.00

The supplier and deal tables

Column descriptions: Supplier table
Column NameContents
supkeyInteger that identifies exactly one row in the Supplier table. Supkey is the primary key.
typeCharacter string that indicates the type of products supplied.
nameCharacter string that identifies the supplier by name.
street, city, state, zipColumns that identify the address of the supplier.
Column descriptions: Deal table
Column NameContents
dealkeyInteger that identifies exactly one row in the Deal table. Dealkey is the primary key.
deal_typeInteger that identifies the type of deal (a code number).
deal_descCharacter string that describes the type of deal.
discountDecimal value that indicates the dollar amount of the deal applied to an order.

Shared dimensions

The purchasing schema shares the Period, Product, and Class tables with the retail schema.

As well as querying the retail and purchasing schemas independently, you can pose some interesting questions that involve tables from both schemas. For example, you can join the Sales and Line_Items tables to compare quantities of products ordered with quantities of products sold. A query like this uses the shared dimensions to constrain products and periods.


Sample data from the Orders and Line_Items tables

Query

SELECT * FROM aroma.orders;
Result
Order_NoPerkeySupkeyDealkeyOrder_TypeOrder_DescClose_DatePrice
3600731100137CoffeeWhole coffee b2006-01-071200.46
3601732100137CoffeeWhole coffee b2006-01-071535.94
360273310010TeaLoose tea, bul2006-01-07780.00
3603740100139TeaLoose tea, bul2006-01-21956.45
360474410050SpicePre-packed spi2006-01-16800.66
360576810032CoffeeWhole-bean and2006-02-1225100.00
360677510032CoffeeWhole-bean and2006-02-1925100.00
360778210032CoffeeWhole-bean and2006-02-2525100.00
360878910032CoffeeWhole-bean and2006-03-0330250.00
360979610032CoffeeWhole-bean and2006-03-1525100.00
...

Query

SELECT * FROM aroma.line_items;
Result
Order_NoLine_ItemPerkeyClasskeyProdkeyReceive_DateQtyPrice
36001731112006-01-0740180.46
360027312102006-01-0750300.00
360037312112006-01-0780240.00
360047312122006-01-07150240.00
360057311202006-01-0760240.00
36011732102006-01-0760240.00
36012732112006-01-0760240.00
360137321102006-01-0760240.00
360147321112006-01-0760240.00
360157321122006-01-0760240.00
360167321312006-01-0770335.94
36021733202006-01-0870130.00
36022733212006-01-0870130.00
...

The Orders and Line_Items tables

The Orders and Line_Items tables contain the purchasing facts.

Column descriptions: Orders table
Column NameContents
order_noInteger that identifies exactly one row in the Orders table. Order_No is the primary key.
perkeyForeign-key reference to the Period table.
supkeyForeign-key reference to the Supplier table.
dealkeyForeign-key reference to the Deal table.
order_typeCharacter string that defines the types of products on the order.
order_descCharacter string that describes the type of order.
close_dateDate value that identifies when the order was completed or closed.
priceDecimal value that indicates the full cost of the order.
Column descriptions: Line_Items table
Column NameContents
order_noInteger that identifies exactly one row in the Orders table. Order_No is the primary key.
line_itemInteger that identifies each item listed on the order by number.
perkeyForeign-key reference to the Period table.
classkeyForeign-key reference to the Class table.
prodkeyForeign-key reference to the Product table.
receive_dateDate value that identifies when the line item was received.
quantityInteger that indicates the quantity of products ordered for each line item.
priceDecimal value that indicates the cost of the line item.

Resources

Learn

Get products and technologies

  • Download DB2 Express-C, a no-charge data server for use in development and deployment of applications .
  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, XML
ArticleID=156793
ArticleTitle=SQL and XQuery tutorial for IBM DB2, appendix: Appendix A
publish-date=08312006