Integration of InfoSphere Warehouse Data Mining with IBM Cognos Reporting, Part 3 : Invoke mining dynamically from Cognos using a market basket analysis example

Association rules express which items, events or other entities often occur simultaneously in large datasets. This knowledge can be applied, for instance, in market basket analysis to leverage cross-selling potentials by recommending products that are ofter bought together. You can apply association rule mining in InfoSphere Warehouse and export the resulting model to Cognos reports, similar to the way previous articles in this series did with Cluster and Classification models. Since association rule mining is a highly interactive task, a better solution would be to allow the user to call mining directly from a Cognos report, possibly specifying additional parameters. Such an approach can be denoted as dynamic or ad hoc mining. In this article you will learn to achieve this.

Share:

Benjamin G. Leonhardi (bleon@de.ibm.com), Software Engineer, IBM

Author Photo: Benjamin LeonhardiBenjamin Leonhardi is a software engineer for InfoSphere Warehouse data mining at the IBM Research & Development Lab in Boeblingen, Germany. He works on mining visualization, text mining, and mining reporting solutions.



Christoph Sieb (christoph.sieb@de.ibm.com), Senior Software Engineer, IBM

Author Photo: Christoph SiebChristoph Sieb is a senior software engineer at the IBM Research & Development Lab in Boeblingen, Germany. Christoph is responsible for data mining algorithms and framework-related issues of InfoSphere Warehouse data mining. He has more than seven years of experience as a developer and consultant in the area of business intelligence and data mining.



Dr. Michael J. Wurst (mwurst@de.ibm.com), Senior Software Engineer, IBM

Author Photo: Michael WurstMichael Wurst, Ph.D. is a senior software engineer at the IBM Research & Development Lab in Boeblingen, Germany. He holds a Ph.D. in computer science and is responsible for data mining algorithms and tooling in InfoSphere Warehouse. Prior to joining IBM, Michael worked as a co-developer, architect, and consultant for the RapidMiner data mining software.



30 December 2008

Also available in Chinese Russian Vietnamese

Introduction

It is important that, when you are working with large amounts of data, you understand the rules of how different entities are related to each other. Often uncovering these rules is a complex process. Association rules are a very simple but powerful formalism for rules that describe datasets because they express which entities can occur simultaneously with each other.

A traditional example of association rules can be seen in the retail sector. For instance, a general association rule is “IF eggs AND milk THEN sugar” which expresses that many customers who buy eggs and milk when shopping also buy sugar. Such rules can be discovered efficiently from even very large data sets. InfoSphere Warehouse allows you to find these rules conveniently as this article describes in detail.

The previous article of this series described how to first build a mining model in InfoSphere Warehouse and then to publish the results using Cognos. This is fine in some scenarios. Often, however, an analyst wants to influence the analysis interactively, for example, constrain the rules that are found. Going to InfoSphere Warehouse DesignStudio and back to Cognos each time a parameter is changed is not very convenient. Rather you would like to create interactive reports that allow the user to set parameters in a Cognos report, invoke mining in the background automatically and then to feed the results back to Cognos. This article shows you how to do those exact steps.

First, learn about the task of association rule mining and how to achieve it in InfoSphere Warehouse. Then, see how to invoke data mining dynamically from Cognos. Finally, read an example of this mechanism could be used in real time.


Market basket analysis and association rule mining

Association rules and their applications

Association rules describe which items often co-occur. The concept of a transaction is of essential importance in this context. A single visit to a shop could be seen as a transaction, and all of the items bought in the course of this visit are then co-occurring and are called a market basket. Though, also all purchases of a customer over a year could be seen as a transaction. That depends on which kinds of rules you are interested in.

Let’s take a look at a simple example. The following table contains transactions of items. Each transaction contains all the items that a customer bought at a single visit to a store.

Figure 1. Sample transactions
sample transaction

As you can see, a pc is bought quite often. Also, the combination of pc and monitor appears in 20% of the transactions. As you can also see, in 50% of the cases in which a pc and a monitor is bought, also a mouse is purchased. If we take this as a rule: [PC, Monitor]->Mouse, where [Pc,Monitor] is the body and Mouse is the head or target, then the number of transactions that include the body and head divided by the number of all transactions available is called the support (20%) . This measure tells us how often the rule occurs in the data. It doesn't tell us how accurate the rule is though. The confidence describes the percentage of times a body of the rule occurs in a transaction and the head also occurs. This gives information on the accuracy of the rule. Association rule mining requires the user to state a minimal support and confidence. The algorithm then identifies exactly the combinations of items and the rules that fulfill these criteria.

There are many application scenarios in which association rule mining is used. We already mentioned the most prominent one: retail. Association rules are used, for instance, to arrange goods in a supermarket such that products that are often purchased in the same visit to a store are located close to one another. Much more importantly, they are used to leverage cross-selling potential in e-commerce by recommending to users which items could be additionally interesting for them. Another application area is health care. Association rules can be used to find out about health issues that often co-occur such that patients with a given diagnosis can be checked for additional health problems that often come together with this diagnosis. Other application areas include intrusion detection, web log analysis, database access patterns, and so on.

A slight variation of association rules are sequential rules. These rules do not only state which entities co-occur but how events are related concerning their appearance over time. For instance, such rules may state that customers who first bought a notebook and then a bag for it, will probably buy a portable mouse in a time frame of six months. Most of what is presented in the rest of this article applies to sequential rules as well.

Association rule mining in InfoSphere Warehouse

Association rule mining is invoked by calling a stored procedure as all other mining operations in InfoSphere Warehouse. Different transaction formats are supported, and in the following section we will use a transaction format based on the following representation.

Figure 2. Transaction table
Transaction table

To perform mining, you have to tell the algorithm in which columns it finds the actual items and which column contains the information on how to group items into transactions. In this case, the name of this column is TRANSID to group the items according to each customer purchase. We could have also used the customer id as a group id to find associations according to the whole customer purchase history. Using the SQL command IDMMX.BuildRuleModel or the Associations operator in DesignStudio, we invoke the mining. You have to provide a model name, the name of the transaction table, the group column, a minimal support and confidence, and a maximum rule length. Please note that the items are described by ids. The item names can be added with a name mapping table which contains the item ids and a corresponding description. There are many possible additional options like taxonomies, item weights, and so forth. Please refer to the InfoSphere Warehouse tutorial for more details.

Rule models can be visualized in DesignStudio just like classification or clustering models. We can also extract the information contained in this model to a table, just as with clusters. This can be achieved by the command IDMMX.DM_getRules or with the Rule Extractor operator in DesignStudio. The result set contains one rule per record with the following information:

  • head
  • body
  • confidence
  • support
  • lift
  • length

The table with the extracted rules could look like this:

Figure 3. Rules table
Rules table

The section below shows how you can transfer this information to Cognos to visualize it in a convenient way.


Invoking mining procedures from Cognos

In the last article, you saw that it is possible to call DB2 stored procedures from Cognos Reporting by including them as Query Subjects into the Cognos Metadata. This causes the stored procedure to execute once the report is run and then you can use it to extract information from an existing PMML mining model.

In this article, go one step further and create a mining model dynamically in a parametrized stored procedure using the InfoSphere Warehouse Mining SQL API. The same stored procedure then extracts information from the model and returns it to the Cognos Report.

There are actually only two calls involved in the process of generating rules and transferring them to Cognos. One call that builds the rule model in the database and a second one that extracts the rules and returns them as result set. The call that creates a model is, however, a stored procedure that does not have any return value. To call mining from Cognos, it is ideal to do both in a single call to the database.

In order to do this, the basic idea is to create a combined user-defined function that first invokes the model creation process and that then extracts the rules and returns them as a result set. This process is shown in the following figure.

Figure 4. Combined user-defined functions for dynamic mining
Combined user defined functions for dynamic mining

You can create such a combined stored procedure manually. There is, however, the possibility to create it visually using InfoSphere Warehouse. You will learn how to achieve this by following along with the subsequent walk-through example.

Using InfoSphere Warehouse to create complex mining stored procedures

First you must create the complex mining logic. InfoSphere Warehouse includes the Mining Editor which makes it possible to design mining flows (sequences of mining commands) graphically. These can be executed in DesignStudio or deployed to the InfoSphere Warehouse Administration Console to be run on a scheduled basis. But since all mining operations are SQL based, it is also possible to generate an SQL script from a mining flow (excluding some operators like the Mining Visualizers and Text analytics which are not executed in the database). The mining operators like the Association Operator and the Rules Extractor (which is used in the following example) are translated into SQL. The "Generate SQL Code" command of the Mining Editor generates the SQL DDL representing the mining flow and can be included into a stored procedure by returning the result table of the mining flow as the result of the stored procedure. The example shows you how to do this.

After you have created the complex mining logic, the second obstacle is to create, deploy, and test the SQL stored procedure you want to create. InfoSphere Warehouse DesignStudio includes the Data perspective with the Data Development project that provides support for the creation of Java and SQL stored procedures and user-defined functions. A wizard creates a template stored procedure which you can deploy and test directly from DesignStudio.

In the following example, see how to create the mining flow, creating the association model and extracting the rules from it. Then see how to create the SQL code from this flow. In the next step, the tooling is used to create a stored procedure of the SQL code. We will add input parameters to the stored procedure that can later be set in the Cognos Report and will parametrize the created mining model. The extracted rules of the association models are returned by the stored procedure and imported into Cognos as query subjects. Finally, learn how to create a simple Cognos report showing a revenue list for the products and allowing to drill-through to the relevant association rules for those products.


Consuming dynamic Data Mining results in Cognos Reports: An example from the retail sector

In this example, we will perform a market basket analysis on transactional retail data. The data represents market baskets (transactions) from a retail store. The first table contains the transaction ID identifying a single market basket along with the product IDs. To get meaningful results, apply a name-mapping table for the product IDs.

This table can be found in the examples shipped with InfoSphere Warehouse. To import it into your database:

  • Open the DB2 command window
  • Go to the installation directory of your InfoSphere Warehouse installation
  • Navigate to the SQLLIB\samples\dwe\ModelingDB2 subdirectory
  • Connect to your working database (in this example, it is the sample database DWESAMP: db2 CONNECT TO DWESAMP)
  • Execute the db2 -tvf retailImport.db2 command to import the sample tables to your user schema (in our example, this is called IMINER)
  • Also create a new schema, ASSOC, which is used in the following example with the db2 CREATE SCHEMA ASSOC command

The analytical goal is to extract association rules of the form:

Toy car + flash light => Battery (support: 3%,	confidence: 83%)

The above rule states that in 83% of the cases where toy cars and flash lights are bought, batteries are also bought. All products together can be found in 3% of all market baskets (also known as the support).

Instead of pre-computing the rules, the computation is triggered by a Cognos report. Within the report, you can enter parameters to change the mining parameters. In this example, a revenue-per-product list allows you to drill-through to association rules for the specific products. You can define the minimum support and length of the rules. The rules are then computed on demand and returned to the user via another report. The rules can, for example, be used to perform intelligent product placement within the store to increase revenue.

Creating the association rules within InfoSphere Warehouse DesignStudio

First, you must create an association rule model which is stored as PMML from which you can extract the rules into a database table for later Cognos access. For more information on PMML, refer to the PMML standard Web site.

Create a Data Warehouse project:

In the Project Explorer, right click and select New > Data Warehouse project as Figure 5 illustrates.

Figure 5. Create a Data Warehouse project
Open the Data Exploration view

In the following wizard, type the project name, for example, "AdvancedAnalytics." Then press Finish.

Create an empty mining flow:

  • Expand the just created project.
  • Right click on the Mining Flow folder. Select New -> Mining Flow.
  • In the upcoming wizard, type AssocFlow as the mining flow name.
  • In this example, you will work against the database, so leave the default and press Next.
  • Select the DWESAMP database (or the database you selected) and click Finish.

Create the mining flow:

The Mining Flow editor opens. On the right side of the Mining editor you can see a palette with operators. With these operators you can build up a Mining Flow by dragging and dropping them to the editor canvas.

Figure 6. Mining flow inside Design Studio
Mining flow inside Design Studio

To create the association rule mining model and extract the rules to a database table, do the following:

  • In the palette you can find the Sources and Targets section. Select a Table Source operator and drag it on the editor.
  • In the table selection dialog, expand the IMINER schema (or your default schema) and select the RETAIL table. Then press Finish.
  • Create another Table Source operator with a table called RETAIL_NAMES.
  • In the palette you can find the Mining Operators section. Select an Associations operator and drag it onto the editor.
  • Connect the RETAIL table's Output Port with the Associations Input Port, and the RETAIL_NAME table's Output Port with the Associations Names Port.
  • Select the Associations operator.
  • In the properties tab below the mining editor, select the Mining Settings tab on the left side.
  • In the Group column selection list, select TRANSID. All products (ITEMID) with the same TRANSID have been bought in a single market basket.
  • Now select the Name Maps properties page and set the "Item Id Column" to "ITEMID" and the "Item Name Column" to "DESCRIPTION". Both columns come from the RETAIL_NAMES table representing a name lookup table.
  • Proceed to the Column Properties page. Set the "Field Usage Type" of all input columns to Inactive except for the ITEMID column which is set to Active.
  • To the right of the "Field Usage Type," you can specify the Name Mapping for the columns. For the ITEMID column, specify the name mapping to Names. Names is the name of the input port that is connected to the name mapping table RETAIL_NAMES.
  • From the palette under Mining Operators, you can find an Associations Extractor operator. Drag it onto the editor and connect the Model outport of the Associations operator with the Model inport of the extractor. The extractor operator will extract the association rules form the PMML model created by the Associations operator and provides them in a relational table structure.
  • Finally, you have to store the extracted association rules to a physical table. To do this, right click on the "rule" outport of the extractor and select Create Suitable Table.
  • Set the table name to RETAIL_RULES and, for the schema, select the default schema where the RETAIL tables reside. Press Finish.
  • A table target operator for "RETAIL_RULES" should now be shown in the mining flow. To only get current rules for each run, check the Delete previous content check box in the properties of the target table operator.
  • Save the mining flow.

Now, the mining flow is ready to execute.

Execute the mining flow:

This flow generates a model that contains the association rules and stores it as a PMML model in the database. Afterwards, the rules are extracted to a database table so that you can access them later from within Cognos. In the view below the editor, the Execution Status tab is selected and on the right part of the view, you can see the table output of the last operator, that is, the Target Table operator. The table shows the extracted rules, each with an ID, Head and Body information and statistics on each rule. You can visually explore the association rules by right clicking on the Associations operator and select Open Model.

Deploy the association rule mining flow as a DB2 stored procedure

Later, within Cognos, you may want to invoke your previously designed mining flow dynamically with user-defined input. As mentioned above, this input is the minimum support for the rules to extract and their length. To enable dynamic invocation, you should create a stored procedure that takes two parameters and invokes the mining flow with this user input.

Create a new stored procedure within a Data Development project:

To create the Data Development project do the following:

  • Right click inside the "Data Project Explorer" and select New -> Data Development Project. If the Data Development project can not be found, select Project . . . instead and search within the complete project list.
  • Specify the name as Assoc Stored Procedure and press Next.
  • Select Use an existing connection and select the database into which you imported the retail tables. Click Finish.
  • If you are asked whether you want to switch the perspective, select No.

To create the Stored Procedure, do the following:

  • Expand the new project and right click on the Stored Procedure folder. Select New" -> Stored Procedure.
  • Set the name of the procedure to "ASSOC_PROC" and select SQL as the language of the procedure. Click Next.
  • On the next wizard screen, accept the proposed SQL statement and click Next.
  • On the Parameters screen, define two input parameters -- support and length. Press Add, set the name to support and select FLOAT as type. Press OK.
  • Do the same for the second parameter "length" with type "INTEGER".
  • Press Finish.
  • An editor with the stored procedure body opens.

Adapt the stored procedure to execute the previously defined mining flow:

  1. There are two statements in the body of the procedure defining a cursor that enables to return table rows. Adapt the first statement declaring the cursor to:
    DECLARE cursor1 CURSOR WITH RETURN FOR
    SELECT * FROM IMINER.RETAIL_RULES;

    The cursor returns the content of the RETAIL_RULES table containing the association rules. Remember, that IMINER is the default schema in this example and can be different in your setup.
  2. Now, convert the visual mining flow into SQL statements we can paste into the stored procedure body. Switch to the AssocFlow editor by selecting the corresponding tab in the editor area.
  3. In the menu, select Mining Flow -> Generate SQL Code.
  4. Another editor opens containing the SQL representing the flow. Mark the whole editor content and copy it.
  5. Switch to the stored procedure editor ASSOC_PROC and pass the code between the cursor declaration and the cursor "OPEN" statement. Save the editor.
  6. Go to the top of the stored procedure editor and remove the "CURRENT" from the set-schema statement and replace the schema name with the "ASSOC" schema you created at the beginning:
    Listing 1. Replace the schema name with the ASSOC schema
    P1: BEGIN 
    	-- Declare cursor 
    	DECLARE cursor1 CURSOR WITH RETURN FOR 
    		SELECT * FROM IMINER.RETAIL_RULES;
    
    SET SCHEMA="ASSOC"; ... ...

    The ASSOC schema is used to deploy the procedure to.

  7. Go to the CALL IDMMX.BuildRuleModel statement and adapt it to set the minimum support and rule length dynamically via the support and length parameters, Just adapt the fourth and sixth parameter as Listing 2 shows.
    Listing 2. Adapt the fourth and sixth parameters
    CALL IDMMX.BuildRuleModel('AssocAnalysis.IM_ASSOC_03', 
    						  'INPUT_03', 
    						  'TRANSID', 
    						  support, 
    						  25.0,
    						  length,
    						  'DM_addNmp(''nameMap'',''NAMEMAP_03'',
    						  ''ITEMID'',''DESCRIPTION''),
    						  DM_setFldNmp(''ITEMID'',''nameMap''),
    						  DM_setFldUsageType(''ITEMID'',1),
    						  DM_setFldUsageType(''STOREID'',2),
    						  DM_setFldUsageType(''CUSTOMER_ID'',2),
    						  DM_setFldUsageType(''ITEMPRICE'',2),
    						  DM_setFldUsageType(''TRANSDATE'',2),
    						  DM_setMaxNumRules(10000),
    						  DM_setAlgorithm(''SIDE'',
    						  ''<NumBins>5</NumBins>'')');
  8. Finally, go to the statement
    ALTER TABLE "IMINER"."RETAIL_RULES" ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE;

    and replace it with the statement
    DELETE FROM "IMINER"."RETAIL_RULES";
  9. Save the editor.

Deploy the stored procedure to the database:

To deploy the previously created stored procedure do the following:

  1. In the Data Project Explorer, expand the Data Development project "Assoc Stored Procedure," expand the Stored Procedure folder and right click on the ASSOC_PROC procedure; select Deploy...
  2. In the Deploy Routines dialog, change the schema to "ASSOC" and press Finish.

Now, the stored procedure ASSOC_PROC is deployed to the ASSOC schema. You can verify the deployment by navigating to the Stored Procedure folder within the schema using the Database Explorer (see figure 7). The stored procedure enables you to invoke the previously designed mining flow dynamically with user-defined property settings.

Figure 7. Deployed ASSOC_PROC stored procedure within Design Studio
Deployed stored procedure

In the next section, see how to use Cognos to access the stored procedure and build up a report using dynamic association rule mining.

Create the meta data for later reporting using Cognos Framework Manager

Before you can create a report using mining results, you need to define which resources (databases, tables, stored procedures) should be used. Framework Manager enables you to define this meta data and also allows you to enrich data sources by defining relationships.

In the following paragraphs, see how to create a Cognos Framework Manager project, define the two input tables and the previously created stored procedure. Finally, learn how to publish the project and make it available for the report.

Open Framework Manager and create a new project:

  • Launch the Cognos Framework Manager.
  • After startup has finished, click on Create a new project…
  • In the upcoming dialog, specify the name as AssociationExample and click OK.
  • Select English as the language and press OK.
  • As you want to create meta data from a DB2 database, select Data Sources and press Next.
  • In the next step, select the database you have used throughout this example and press Next. If you do not have a connection to the database, you need to create one. For information on how to create this connection, refer to the Cognos documentation.
  • Now you are asked to select the data object you want to import as meta data. Expand your default schema where the RETAIL and RETAIL_NAMES table are located, expand the Tables folder and select the mentioned tables.
  • Then, expand the ASSOC schema and the Procedures folder, and select the ASSOC_PROC procedure you previously deployed. Press Next.
  • The settings on this page are ok, so press Import and then press Finish.
Figure 8. Cognos 8 Framework Manager
Cognos 8 Framework Manager

After creating the project, the Framework Manager displays on the left side of the Project Viewer (see Figure 8). In the Project Viewer you can explore the newly created project. Expand the folder with the name of your database (in this case "DWESAMP") to have a look on the imported tables and the stored procedure. Now, you define a new Relationship between the RETAIL and RETAIL_NAMES table. This makes the later report definition easier.

To create the relationship, perform the following steps:

  • Right click the database folder (here "DWESAMP"). Select Create -> Relationship.
  • The Relationship Definition dialog opens (see Figure 9).
  • You can define a query subject on the left side and one on the right side. On the left side, select the RETAIL table and on the right side the RETAIL_NAMES table.
  • Select the ITEMID column on both sides to associate them.
  • Set the cardinalities as shown in Figure 9. Press OK.
  • You can explore the created relationship by pressing the Explorer link in the middle part of Framework Manager (see Figure 9).
Figure 9. Relationship Definition dialog
Relationship Definition dialog

In the next step, you see how to define the parameters of the stored procedure. The parameters are defined as arguments that have to be passed by the report. You also have to invoke the stored procedure once. Not only to test it, but also to retrieve the columns that are provided by the stored procedure. This is necessary to use the stored procedure like a table in the report tooling.

To create the arguments, perform the following steps:

  • Double click the ASSOC_PROC stored procedure in the Project Viewer.
  • The Query Subject Definition dialog opens (see Figure 10). In the lower part, you see the two arguments of the stored procedure. Click on the ". . . " button at the right end of the SUPPORT argument line.
  • In the Value text box of the Edit Argument dialog, type # prompt('support')#. Press OK.
  • Do the same for the LENGTH argument (use "# prompt('length')#" as the argument value).
  • Press OK to confirm the settings in the Query Subject Definition dialog.
  • Now, a Prompt Values dialog opens. You have to pass two example parameter values to enable Framework Manager to execute the procedure and retrieve the columns of the result. Put into the length value field "2" and in the support value field "3.0". Note, now the association rule mining flow is executing and returning the extracted rules to Framework Manager.
Figure 10. Query Subject Definition dialog with opened Edit Argument dialog
Query Subject Definition dialog with opened Edit Argument dialog

After importing the tables and the stored procedure and adjusting the meta data for those database objects, you can create a package for the project and publish it to the Cognos Content Store. Then it is available to the Cognos Report Studio.

Create a package and publish it:

  • To make the just created meta data available, go to the Project Viewer and right click the Packages folder. Select Create -> Package.
  • In the name field, type AssociationRulePackage. Press Next.
  • Press Next again as nothing has to be done.
  • In the following dialog, make sure that DB2 is in the “Available function sets” list. Press Finish.
  • In the following dialog, press Yes to continue with the publishing process.
  • In the “Select publish location” part of the Publish Wizard, press Next as nothing must be changed.
  • You do not need to specify security settings. Thus, press Next.
  • Then press Publish.
  • A dialog pops up saying that you have successfully published your package; press Finish.

Create a Cognos report using results from dynamic association rule mining

To create a Cognos report, use the Cognos Report Studio which is a fully Web based application. You will create two reports -- one that shows a revenue list grouped by products and another one that shows the list of association rules on the products from the RETAIL table. See how to define a drill-through definition from report one to report two. By clicking a product in the revenue list, open those rules that contain the selected product in the head of the rule. This enables you to analyze which products increase the revenue of the selected one. Further, you can define the minimum support of the rules in the dataset and the length of the rule. The mining is done in real-time and the results are filtered and presented within one click.

You have to perform the following steps to create the revenue list report:

  • Start the Cognos Report Studio.
  • In the Welcome screen, click the Create professional reports link.
  • In the Select a package explorer, click the AssociationRulePackage package you published in the Framework Manager.
  • In the Welcome dialog, select Create a new report for template.
  • Choose the List template and press OK.
  • In the upper left pane of the Insertable Objects view, you can choose between different tabs. The source (Data sources) tab is selected (see Figure 11).
    Figure 11. Report studio
    Report Studio
  • In the Insertable Objects view, expand the RETAIL table and drag the ITEMID column to the list table on the editor page on the right part of Report Studio. Then drag the ITEMPRICE column to the right of th ITEMID column in the list (see Figure 11).
  • Expand the RETAIL_NAMES table and drag the DESCRIPTION column between the ITEMID and the ITEMPRICE columns.
  • Click on the ITEMPRICE column. Don't click on the header but on the three cells below the column header. In the toolbar, press the Sort button (see Figure 11) and select the Sort Descending option. You want the revenue list to show the product group with the biggest revenue at top.
  • Press the Run Report button to preview the current stage of the report.
  • Save the report as RevenueReport.

You have to perform the following steps to create the association rules report:

  • Create a new report from the List template for the same package. You can do this by selecting File->New in the Report Studio.
  • In the Insertable Objects view, expand the ASSOC_PROC object and drag the BODYTEXT, HEADNAME, SUPPORT and CONFIDENCE columns to the list on the editor page on the right part of Report Studio.
  • As you only want to display rules that contain a specific product in the rule's head, you have to define a filter. Select any column in the list and press the Filters button in the tool bar.
  • In the lower left part of the Filters dialog, press the Add button.
  • The Detail Filter Expression dialog opens. Expand the ASSOC_PROC procedure and double click HEAD. The head column is added to the right part of the dialog. Extend the right part as follows:
    [DWESAMP].[ASSOC_PROC].[HEAD] = ?product?

    The ?product? expression defines a parameter which will later be passed by the first report depending on which product you click.
  • Press OK twice.
  • For parameter passing and stored procedure invocation you have to make a final adjustment. Select the Query Explorer in the Explorer selection bar (see Figure 11) and then select Query 1.
  • In the Properties section in the lower left part of the Report Studio, select the Processing property and select Limited Local from the drop down box.
  • Press theRun Report button to preview the current stage of the report.
  • You will then be prompted for three parameters -- two given by the stored procedure and one given by the filter you just defined. Use 3.0 as support, 2 as length and 142 as the HEAD id. Press OK.
  • Finally, save the report as "AssocReport".

Now, you have to extend the first report to invoke the second one through a drill-through definition.

You have to perform the following steps to create a drill-through definition in the revenue report:

  • Open the first RevenueReport again by selecting File->Open.
  • In the list within the editor area, select the ITEMID column (note the header but the three cells below the header).
  • Right click on the selected cells and select Drill-Through Definitions...
  • In the lower left part of the Drill-Through Definition dialog, select the Add button.
  • Press the "..." next to the Report field, and select the AssocReport.
  • Check the box Open in new window.
  • Press the Edit button below the Parameters box.
  • In the Parameters dialog, you will define which value is passed to the product parameter of the previously defined AssocReport. In the Method column of the product row, select Pass data item value from the drop down box.
  • In the Value column of the product row, select the ITEMID column from this report. The itemid will be passed to the AssocReport and thus result in just presenting rules with this product in the rule's head.
  • Press OK twice.
  • Save the report.
  • Press the Run Report button to preview the report.
  • In the sorted revenue list, select, for example, "Toy car" with ID 191. Note, not all products have related association rules, as the support of the rules may have been too small during computation.
  • Now, you will be prompted for just two parameters because the third one - the product - is passed by the drill-through definition. Use 2.0 as support and 3 as length and press OK.

The following figure 12 shows the final report. The first report shows the revenue by product sorted in descending order. The second report shows the filtered, dynamically created association rules for the selected product.

Figure 12. Final retail association rule reports
Final retail association rule reports

Conclusion

In this article you learned about association rule mining and how to find association rules with InfoSphere Warehouse. Association rule mining is a highly interactive task, and users usually have to try many different parameter settings to achieve the desired result. Cognos is very well-suited to support the task of interactive rule analysis. In addition to the techniques used in the first two articles of this series, you learned how mining can actually be invoked dynamically from Cognos, allowing the user to set parameters and constraints without having to switch to DesignStudio. In the upcoming last article of this series, you will learn how to combine all methods presented so far in an interactive customer segmentation application.

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Big data and analytics on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Big data and analytics, Information Management
ArticleID=361897
ArticleTitle= Integration of InfoSphere Warehouse Data Mining with IBM Cognos Reporting, Part 3 : Invoke mining dynamically from Cognos using a market basket analysis example
publish-date=12302008