Mining tutorial

In this tutorial, you will learn how to design mining flows and visualize the patterns that mining models discover. You will also learn how to deploy your mining models to a production environment and schedule them to run regularly.

The fictional Sample Outdoors company sells and distributes products to third-party retail stores around the world. The company also sells directly to customers through its online store. The product lines include camping equipment, golf equipment, mountaineering equipment, outdoor protection, and personal accessories. The company's sales database, GSDB, contains a wealth of data, including the company's products, which are organized by product line. The GSDB database also contains the transactional data from customer purchases.

The company wants to mine the data in the GSDB database to discover patterns in transactions. The company needs a solution that returns relevant results quickly and with minimal effort. Data warehousing in Db2® is an enterprise product that can help the fictional Sample Outdoors company by providing optimized data mining functions.

For this tutorial, you will use data warehousing in Db2 to create an association mining model. The model looks for patterns within your data by finding associations between items. By discovering groups of products that are often purchased together, you can recommend products that your customers are likely to be interested in, based on the products that they have already purchased.
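
To make the idea of an association concrete, the following minimal Python sketch counts how often pairs of items appear in the same purchase and derives simple rules from those counts. It is an illustration only and does not show how the mining functions in Db2 work internally. The transactions, the function name pair_rules, and the threshold values are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is the items in one purchase.
transactions = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"golf club", "golf balls"},
    {"sleeping bag", "lantern"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (body, head, support, confidence) tuples for item pairs.

    support    = share of all transactions that contain both items
    confidence = share of transactions with `body` that also contain `head`
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be a useful pattern
        # Each pair can yield a rule in both directions.
        for body, head in ((a, b), (b, a)):
            confidence = count / item_counts[body]
            if confidence >= min_confidence:
                rules.append((body, head, support, confidence))
    return sorted(rules)


rules = pair_rules(transactions)
for body, head, support, confidence in rules:
    print(f"{body} => {head} (support {support:.2f}, confidence {confidence:.2f})")
```

In this sample data, the golf items always appear together, but only in one of the five purchases, so the support threshold filters that pair out. The pairs of camping items appear together more often and are kept as rules.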

To perform the association mining, you must use both the transaction-level data and the product hierarchy data to calculate the required association rules. The mining tool uses the product hierarchy data to automatically determine associations at every level of the hierarchy: between individual products, between product subgroups, between product subgroups and products, between product groups and subgroups, and so on.
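
One common way to find associations across hierarchy levels is to add each item's ancestors to the transaction before counting, so that a purchase of a specific tent also counts as a purchase from its subgroup and its product line. The following Python sketch shows that expansion step under stated assumptions: the parent mapping, the product and subgroup names, and the function names are hypothetical, and the mining tool in this tutorial might handle the hierarchy differently.

```python
# Hypothetical taxonomy, stored as child -> parent.
parent = {
    "Tent A": "Tents",
    "Sleeping bag B": "Sleeping Bags",
    "Tents": "Camping Equipment",
    "Sleeping Bags": "Camping Equipment",
    "Putter C": "Putters",
    "Putters": "Golf Equipment",
}


def ancestors(item, parent):
    """Return every ancestor of item, from nearest to farthest."""
    result = []
    while item in parent:
        item = parent[item]
        result.append(item)
    return result


def expand(transaction, parent):
    """Return the transaction plus all ancestors of its items."""
    expanded = set(transaction)
    for item in transaction:
        expanded.update(ancestors(item, parent))
    return expanded


expanded = expand({"Tent A", "Putter C"}, parent)
print(sorted(expanded))
```

After expansion, a rule such as "Tents => Golf Equipment" can be counted in the same way as a rule between two individual products. A practical implementation would also discard trivial rules that link an item to its own ancestor.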

In the third module, you create mining flows that address a different scenario: you create a prediction model based on transaction data and use the model to predict which newly ordered items a customer is likely to return.

Learning objectives

When you finish the tutorial, you will understand the concepts and know how to do the following tasks:
  • Design a complete mining flow that includes name lookup tables and taxonomy information
  • Explore patterns in your customer data with an association visualizer
  • Add steps to a mining flow that extract association rules into database tables
  • Deploy your mining models to a production environment and schedule them to update regularly
  • Score records in real time
  • Design a mining flow that uses training data to create a prediction model
  • Score a batch of new records

Time required

This tutorial should take approximately three and a half hours to complete.