One common problem when you work with predictive analytics is finding ways to present predictions and recommendations for action in formats that business people can easily access.
As companies move more into the realm of predictive analytics, IBM® Cognos® needs to present this information in easily digestible formats. Many predictive models do not lend themselves to traditional reporting tools. Fortunately, association models do, and they are a common starting place for predictive analytics teams that are beginning to work on business problems.
Common applications of association models include cross-selling and up-selling products and services to customers—concepts that apply to most commercial enterprises.
Association mining for Cognos professionals
Imagine a situation where your company wants to sell more to its existing customers. Sometimes, this drive takes the form of selling to customers more of what they are already buying. But many times, the goal is to sell other, extra items or services. Here is where association models can make targeted recommendations, recommendations that salespeople and automated sales processes use to up-sell or cross-sell more "stuff" to customers.
In physical goods, associated products or add-ons to primary products make sense. Think of an ice cream shop: That store probably wants to sell whipped cream and fudge topping to the customers who buy the ice cream.
The well-known, large Internet retailers use association models to make recommendations to all their customers. Most people who shop on Internet sites see suggestions on the web page to purchase more items that are based on items in the online shopping cart. Smaller websites tend to build those recommendations from just a list of top products. Medium-sized and large web companies usually put a significant amount of time and effort into creating models of purchasing patterns that are based on products that customers order. Add a dash of demographic information about the customer, and the website automatically tailors its recommendations to a short list of products that people who bought the first product are also buying.
Likewise, in services, think of a bank that wants to sell loan products to customers who have savings accounts. Or a pest-control company that wants to sell termite monitoring services to customers who use their other pest services. The examples are endless.
Another potential use of association models in real-world business is warehouse planning. Association models can indicate which products are often retrieved from the warehouse together, so that layouts can be planned to minimize time and movement.
Aside from tactical sales uses, marketing activities benefit from association models, as well. Consider what happens when a product is made part of a marketing campaign. Association models aid marketing decision making by showing which other products will probably sell in conjunction with the targeted items.
Association model concepts
Association models are among the easiest predictive analytics methods to construct, understand, and implement for making these cross-sell recommendations where there are many products and many customers. The algorithms for constructing association models range from fairly simple statistical processes to complex machine learning-based methods. Regardless of methodology, the net output of models that are designed to increase sales is a recommendation to purchase an item, based on what the customer is purchasing or previously purchased.
A common business nickname for association models is market basket analysis. Whatever the name, the basic concept is the same: for every item or service that the company sells, identify the top items or services that other people buy with that primary product. Models can (and should) become more complex by segmenting customers by size, type, region, or other characteristics, and then running the association models for products in each customer category. Modeling these characteristics produces more targeted, more specific recommendations that the customer is more likely to act on. In this way, association models are light years beyond a "top-10 products" list and the classic "managers' hot products" list.
Without this article becoming a primer on statistics and data mining, a simple way to think of association models is as a running tally of which products tend to be ordered in pairs.
The algorithms usually assign probabilities by asking: Given that product A is in the order, what is the probability that product B will also be purchased? The algorithm also looks at it the other way: If product B is the first product, what is the probability of product A being in the order? Most association model algorithms work with those two probabilities to come up with the final numbers.
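The running tally and the two directional probabilities can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm that any particular modeling tool uses; the orders and item names are made-up sample data.

```python
from collections import Counter
from itertools import permutations

# Hypothetical orders: each order is the set of items it contains.
orders = [
    {"ice cream", "fudge"},
    {"ice cream", "fudge", "whipped cream"},
    {"ice cream"},
    {"ice cream", "whipped cream"},
]

item_counts = Counter()  # number of orders that contain each item
pair_counts = Counter()  # number of orders that contain both items of a pair

for order in orders:
    item_counts.update(order)
    # permutations yields both (A, B) and (B, A), so each direction is tallied
    pair_counts.update(permutations(sorted(order), 2))


def probability(a, b):
    """Estimate the probability that b is in an order, given that a is."""
    if item_counts[a] == 0:
        return 0.0  # item a never appears, so there is nothing to estimate
    return pair_counts[(a, b)] / item_counts[a]


print("fudge given ice cream:", probability("ice cream", "fudge"))
print("ice cream given fudge:", probability("fudge", "ice cream"))
```

In this sample, half of the ice cream orders contain fudge, but every fudge order contains ice cream, which is why the two directions are treated as separate rules.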
Incoming data formats
Many predictive analytics processes create models that are not easily documented or transferred from the predictive analytics environment. Fortunately, association model rules are easily put into human-readable format. Common ways to exchange the rules files are text files or spreadsheets.
Figure 1 shows an example of the basic rules that come from the predictive analytics modelers. This example is a text file with the item numbers and the most basic statistics regarding the relationship (or rule) between the two items.
Figure 1. Text file of incoming rules
The basic statistics are usually probability and lift for each rule. Probability is a number between 0 and 1. It answers the question: If product A is in the order, what is the likelihood that the same order contains product B? Most people recognize this value as the percentage chance.
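Lift is a more complex measure. The predictive analytics modelers are the ones who use and filter on lift. The short definition is that lift measures the performance of the rule by comparing how often the items in the rule are sold together with how often they would be sold together if their sales were unrelated. A lift greater than 1 means that the items are bought together more often than chance alone would suggest.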
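As a rough illustration, the following Python sketch computes both statistics for a rule "A recommends B" from simple order counts, using the textbook definition of lift (the probability of B given A, divided by the overall probability of B). The counts are invented, and your modeling tool might report a differently scaled or filtered version of these numbers.

```python
def rule_statistics(n_orders, n_a, n_b, n_both):
    """Return (probability, lift) for the rule A -> B from order counts.

    n_orders: total number of orders
    n_a:      orders that contain item A
    n_b:      orders that contain item B
    n_both:   orders that contain both A and B
    """
    probability = n_both / n_a   # chance of B, given that A is in the order
    baseline = n_b / n_orders    # chance of B in any order, ignoring A
    lift = probability / baseline
    return probability, lift


# Hypothetical counts: 1,000 orders, 200 with A, 100 with B, 50 with both.
prob, lift = rule_statistics(n_orders=1000, n_a=200, n_b=100, n_both=50)
print(f"probability={prob:.2f}, lift={lift:.2f}")
```

With these counts, B appears in 10 percent of all orders but in 25 percent of the orders that contain A, so the rule has a lift of about 2.5.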
Figure 2 shows similar data to Figure 1 but with more human-readable descriptive fields added in. This format is easier to work with: The report fields are easier for the report developer and business consumer to read and understand.
Figure 2. Example spreadsheet of association model rules
So, you have the data, but where do you store it for reporting?
The format and location depend on your environment and architecture. To Cognos, it is just another data source. Common options are flat files (for example, XML or comma-separated values), database tables, and spreadsheets. Your predictive analytics team might choose to keep the rules in a server environment that is specific to the predictive analytics process. IBM SPSS® Modeler even has specific functionality to deploy models to Cognos. However, most of the time, an organizational divide exists between the two groups that translates to an architectural divide. The different groups typically feel better about just exchanging files and handshakes.
One consideration for data storage is future access by programs other than Cognos. Eventually, the recommendations will be used the way that medium-sized and large web companies use them now: automatically, in the flow of normal operations. Several different systems, such as web carts, enterprise resource planning (ERP) systems, and mobile applications, access the recommendations and display them in their own programs. Consider that when you are making your decision.
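If you settle on a database table, loading an exchanged rules file can be as simple as the following Python sketch. It uses an in-memory SQLite database as a stand-in for your reporting database, and the table name `rules` and the column names are assumptions for illustration only; match them to the file that your modelers actually send.

```python
import csv
import io
import sqlite3

# Hypothetical rules file contents; the column names are assumptions.
RULES_CSV = """item,recommendation,probability,lift
1001,2001,0.42,3.1
1001,2002,0.18,1.7
"""


def load_rules(conn, csv_text):
    """Create the rules table if needed and load every CSV row into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rules ("
        "item TEXT, recommendation TEXT, probability REAL, lift REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["item"], r["recommendation"], float(r["probability"]), float(r["lift"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO rules VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the reporting database
loaded = load_rules(conn, RULES_CSV)
print(loaded, "rules loaded")
```

Keeping the rules in a plain table like this also leaves the door open for the other systems mentioned above to read the same recommendations later.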
How will the association rules be used?
Reporting is the first step, and automated processes come later. First, you need to provide a Cognos report that lists cross-sell recommendations by product. This report requires the descriptive fields for the business people. Eventually, the goal is to deploy these rules to the point of contact with the customer.
For web use, think of online retailers that insert recommendations onto almost every page of their website, including the web cart. A use that is less visible to the customer is including recommendations in the ERP order entry or customer relationship management application screens. Your customers do not view these recommendations, but the salespeople who speak with them do. These recommendations flow right into the conversation so that the salespeople can make intelligent suggestions to cross-sell and up-sell the customer.
The marketing department also uses the recommendations, mostly through these reports and through transfers to spreadsheet format. Typically, Cognos reports supply the information that marketing takes into spreadsheets, where it is combined with other information to support their analysis.
Simple report example
This example assumes that the predictive analytics team sent the rules file and that someone placed that file into a database table for reporting. Being an experienced Cognos developer, I created the connection to that data source and included it in my project in IBM Cognos Framework Manager.
Figure 2 showed the rules file with fields added to SPSS Modeler rules. This format gives the item more human-understandable characteristics. This file is the one that was loaded into my database in this example. Having the human-readable fields helps in developing and testing the report.
In Figure 3, you see that I created a project in Cognos Framework Manager and gave the project the name CognosAssociationModelReporting. Everything is organized in a single namespace, and the first query subject is the connection to the database table.
Figure 3. The initial data source in Cognos Framework Manager
The one incoming file, called CognosRecommendations, is a simple setup that takes its name from the database table.
The first step is to create a metadata layer over the incoming data in the CognosRecommendation table. I rename the data table layer to Incoming Data Layer, as shown in Figure 4. Next, I add a namespace under the DataConnection and name it Standardization Layer. Here, I rename the fields and pick out only those fields that people will want to see.
Figure 4. The Standardization Layer added to the project
In the Standardization Layer, the fields can be broken into four categories:
- Fields to describe the selected item in the rule
- Fields that describe the recommended item in the rule
- Relevant statistics and ranking for the rule
- A field for the group or type of recommendation
This last field does not come from the association model directly. It is included because several different association models are computed, and then their rules are merged into this single rules table. Table 1 shows how the fields map from the Incoming Data Layer to the Standardization Layer.
Table 1. Field mapping from the database table to reporting-friendly names
|Incoming Data Layer|Standardization Layer|
|ITEM|Selected item number|
|Recommendation|Recommended item number|
|Selected item class|Selected item class|
|Selected item description|Selected item description|
|Recommended item class|Recommended item class|
|Recommended item description|Recommended item description|
The Standardization Layer field names are essentially the same as the database table names in this example. The names might not always be the same because the predictive analytics modelers probably have slightly different terms. Hence, the Standardization Layer serves well to do the translation from the technical environment to the business users' nomenclature.
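Outside of Framework Manager, the same translation step amounts to a rename-and-select over each incoming row. The following Python sketch mirrors the mapping in Table 1; it is an analogy for what the Standardization Layer does, not how Cognos implements it, and the `InternalRuleId` field is a hypothetical technical field that is added only to show that unmapped fields are dropped.

```python
# Mapping taken from Table 1: incoming name -> reporting-friendly name.
FIELD_MAP = {
    "ITEM": "Selected item number",
    "Recommendation": "Recommended item number",
    "Selected item class": "Selected item class",
    "Selected item description": "Selected item description",
    "Recommended item class": "Recommended item class",
    "Recommended item description": "Recommended item description",
}


def standardize(row):
    """Keep only the mapped fields and rename them for business users."""
    return {FIELD_MAP[name]: value for name, value in row.items() if name in FIELD_MAP}


# Hypothetical incoming row; InternalRuleId is not part of the mapping.
incoming = {"ITEM": "1001", "Recommendation": "2001", "InternalRuleId": "r-17"}
print(standardize(incoming))
```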
Next, I create a package by using all of the fields in the Standardization Layer (see Figure 5). Then, I deploy that package.
Figure 5. Designating items into the package
Now, I build the first report. This report serves as the final report in a series of simple drill-downs. From the Cognos Workspace home page, I click Author Advanced Reports and use the package that I just created.
As shown in Figure 6, I place four fields onto the report that is displayed to the report consumers:
- Recommended item description
- Recommended item number
Figure 6. Initial report layout
I add a filter, as shown in Figure 7, that prompts the report consumer to enter a product. For ease of viewing, this example displays product descriptions. The list contains all products that have a recommended item to cross-sell. It displays a simple drop-down list box from which the report consumer can choose the primary item on the prompt screen (in the example data, it is called the selected item).
Figure 7. Filter on the report
Figure 8 shows that the title displays the primary (selected) item description. This information is important and reminds the businessperson which product this list of recommendations is for. It is valuable when this list is printed for future reference.
Figure 8. Report with title
The report is sorted on probability. The probability field is typically the easiest sort field of the statistics for businesspeople to understand because it is displayed as a percentage. The prompt page is shown in Figure 9.
Figure 9. The prompt page
The final report is provided in Figure 10.
Figure 10. The final report
This simple example is a good start: It allows quick access to recommendations. With just a bit of tweaking, the report can conform to the company's identity standards before deployment into the live production environment.
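For readers who think in queries, the logic behind this report is roughly "filter on the selected item, then sort by probability, highest first." The following Python sketch expresses that against an in-memory SQLite table. The table name, column names, and sample rows are assumptions for illustration; Cognos generates its own query from the package.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recommendations ("
    "selected_item_description TEXT, recommended_item_number TEXT, "
    "recommended_item_description TEXT, probability REAL)"
)
# Hypothetical sample rules.
conn.executemany(
    "INSERT INTO recommendations VALUES (?, ?, ?, ?)",
    [
        ("Ice cream", "2001", "Fudge topping", 0.42),
        ("Ice cream", "2002", "Whipped cream", 0.57),
        ("Waffle cone", "2001", "Fudge topping", 0.10),
    ],
)


def recommendations_for(conn, selected_description):
    """Return the rules for one selected item, most probable first."""
    cursor = conn.execute(
        "SELECT recommended_item_description, recommended_item_number, probability "
        "FROM recommendations "
        "WHERE selected_item_description = ? "
        "ORDER BY probability DESC",
        (selected_description,),  # plays the role of the prompt value
    )
    return cursor.fetchall()


for description, number, probability in recommendations_for(conn, "Ice cream"):
    print(f"{description} ({number}): {probability:.0%}")
```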
Cascading reports for business people
Rather than directly picking a product, I create a series of reports to drill down to the example report. This way, the business user does not scroll through a massive product list; instead, with just a few clicks, the user can target the primary product for which recommendations are needed.
Using the same metadata layer and package, I create two reports that drill through, the first to the second, and then to the example report in Figure 10. I followed this process:
- From the Cognos home page, click Author Advanced Reports.
I create the second report first because it is in the middle of the three-report stack.
- From the package, place the Selected Item Description field onto the report.
This field is the only field shown.
- Create a filter on Selected Item Class, as shown in Figure 11.
Figure 11. Filtering on Selected Item Class
Do not include this field on the report.
- Highlight the Selected Item Description column.
- In the properties box, select the Drill-Through Definition property.
- Edit to drill through to the first report.
As shown in Figure 12, associate the value of the click in this report with the input value for the filter in the first report.
Figure 12. Passing the Item Description value
Figure 13 shows the completed Drill-Through Definitions box.
Figure 13. The completed Drill-Through definition
- Save and test the report.
The list of item classes is shown on the prompt screen. I select one of these classes from the list, and then click Finish. A list of item descriptions that are in the selected class is displayed. When I click one of the items, the first report runs with the selected item as the primary item. The final screen shows recommendations to make.
As with the first report, I make it match the company's standards.
Now, I create a report that lists the item classes. This report is the entry report in the stack that the business user runs. Using this report as the front, the business user is not required to use any of the drop-down boxes on the prompt pages. (Prompts might still be necessary if your company has many categories or many items within categories. In that case, a search prompt might be a cleaner way for users to select classes and products.)
I followed this process:
- From the Cognos home page, click Author Advanced Reports.
- Select the same package as the other reports.
- Drag the Selected Item Class field onto the report, as shown in Figure 14.
Figure 14. Entry report layout
- Highlight the Selected Item Class column.
- Select the Drill-Through Definitions line in the column properties box.
- Select the second report in the stack—the one just finished.
- In the Parameters section, map the Item Class from this report to the prompt value in the second report, as shown in Figure 15.
Figure 15. Parameter mapping in the Drill-Through Definitions property
- Save and test the report.
Instead of a prompt page, you see a list of Item Classes. I click one of those classes: That value is now sent to the second report, which displays a list of item descriptions in the selected class.
Finally, when I select an item description, the final report in the stack displays items to cross-sell.
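The three-report stack boils down to three lookups, each of which passes the clicked value to the next. The following Python sketch models that flow over a small in-memory list of rules. The sample rows and the `Probability` field name are assumptions for illustration; in Cognos, the drill-through definitions do this parameter passing for you.

```python
# Hypothetical rules rows that use Standardization Layer style field names.
RULES = [
    {"Selected item class": "Frozen", "Selected item description": "Ice cream",
     "Recommended item description": "Fudge topping", "Probability": 0.42},
    {"Selected item class": "Frozen", "Selected item description": "Ice cream",
     "Recommended item description": "Whipped cream", "Probability": 0.57},
    {"Selected item class": "Bakery", "Selected item description": "Waffle cone",
     "Recommended item description": "Fudge topping", "Probability": 0.10},
]


def item_classes(rules):
    """Entry report: the distinct item classes."""
    return sorted({r["Selected item class"] for r in rules})


def items_in_class(rules, item_class):
    """Second report: the item descriptions in the clicked class."""
    return sorted(
        {r["Selected item description"] for r in rules
         if r["Selected item class"] == item_class}
    )


def recommendations(rules, item_description):
    """Final report: recommendations for the clicked item, best first."""
    matches = [r for r in rules if r["Selected item description"] == item_description]
    return sorted(matches, key=lambda r: r["Probability"], reverse=True)


# Simulate a user who clicks the first class and then the first item in it.
chosen_class = item_classes(RULES)[0]
chosen_item = items_in_class(RULES, chosen_class)[0]
for rule in recommendations(RULES, chosen_item):
    print(chosen_item, "->", rule["Recommended item description"])
```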
Using this report in real life
You can use this stack of reports as the basis for an active recommendation system. Imagine the situation of a salesperson who reviews orders with a customer on the phone. Running the first report, the salesperson concentrates on an item class that the customer is ordering. Then, by clicking the exact item that is already on the order, the salesperson is presented with an ordered list of recommendations to make.
When you design packages in Cognos Framework Manager, include the descriptive fields that users need to understand the hierarchy of products. Including more fields than are initially needed might save time in the future, when business users want a different path to drill into and get to the recommendations.
Moving to a big data future
Much of the world's data will probably reside in big data platforms in the future. Websites in particular will generate massive amounts of data about clicks and navigation that do not need to be brought into a relational database. Applying association rules in this environment opens up many potential uses.
Consider the situation of keeping all the website click and navigation data in IBM BigInsights™. Offline, your predictive analytics team can create models that predict which products users will click next based on the page they're viewing. This model is one that association rules can generate. Applying these rules requires a real-time connection with the data stream. Next-page or product recommendations can be shown to users in real time as they browse the website.
Cognos can report on the activity and serve to monitor the effectiveness of the association rules models as they are applied to big data within BigInsights. The base knowledge that you acquire as you work with reporting on the rules is useful in creating and reporting on metrics in big data.
As data analysis evolves, predictive analytics will become increasingly important. Not every predictive analytics model lends itself to reporting for business consumption. For those models that are human readable, Cognos is a great reporting platform for delivering information that you can act on. The skills that are required are similar to those for historical reporting. With some knowledge upgrades, a skilled Cognos user can create reporting packages and reports that effectively disseminate predictive analytics models to affect and improve business operations.