Skip to main content

Create Web services for real-time scoring using DB2 Warehouse V9.5

Drive operational business decisions using advanced analytics

Peter Bendel (peter_bendel@de.ibm.com), Senior Software Engineer, IBM
Author photo
For more than ten years, Peter has held development positions in different projects in the IBM Software Group, among them information retrieval, workflow software, pervasive computing, and portal solutions. He has in-depth experience with IBM data management products and object-oriented programming. Peter works in the IBM software lab in Böblingen, Germany.

At present, Peter is the architect for IBM's DB2 Warehouse data mining and text analysis components. To that end, he is working to simplify the integration of mining technology into business applications and custom analytical applications.

Author Photo: Steffen Lang
Steffen Lang, a student in computer science at University of Karlsruhe, is currently working in the IBM Boeblingen SW development lab as an intern.

Summary:  A major trend in business intelligence is to create actionable results and use them to drive operational business decisions in call center or back-office applications. The best way to take advantage of such results in the context of operational business processes is by using service oriented architecture (SOA). A business application can invoke information services to gain access to the aggregated, cleansed information in an enterprise data warehouse. This article describes how to build real-time scoring analytical Web services with DB2 Warehouse V9.5 Design Studio. These Web services apply a data mining model to current data to predict a value, a customer segment, or a cross-sell opportunity. These services can be used across several industries to compute, in real-time, cross-sell recommendations using association models, propensity to buy using prediction models, or risk scores in banking and insurance. The results of these computations can be used to drive operational business decisions.

Date:  06 Dec 2007
Level:  Intermediate
Activity:  888 views
Comments:  

What is a scoring service and why do you need it?

A real-time scoring service receives facts about a business entity — for example, a customer — and uses that information together with a data mining model to return a score. The score can be as simple as a value that describes the risk class associated with the customer or the number of a customer segment or group. The scoring service is invoked as a step in a business application that needs to make a business decision based on the value returned by the scoring service.

Examples:

  • An automobile insurance company wants to provide a self-service Web site where a customer can enter risk factors, such as the number of cars, age of the driver, and commute distance. The Web application can use a scoring service to calculate and display an insurance rate with a personalized discount based on the risk computed from the input values.
    risk = score_risk(#cars, driver_age,
    commute_distance)
    					

    The rate and discount are determined by a data mining prediction model that uses historical customer data stored in an Enterprise Data Warehouse to learn to recognize relationships between risk factors and customer claims.
  • A retail bank has a loans application business process where the credit risk associated with each loan applicant must be assessed. Based on the result, the loan application is routed either to an automated process (low risk and low amount) or to a human back-office worker (high risk or high amount).
  • A retailer wants to print personalized coupons at the point of sale. The coupon should offer discount and cross-sell recommendations based on the items that the customer is purchasing. A data mining associations or sequences model is created from the sales transaction table. This model describes product affinities, that is, it describes which products typically sell together. When the customer checks out at the cashier, a real-time scoring service is invoked that computes other items typically bought together with the items currently being purchased.
    cross-sell-items =
    score_product_affinities(item1, item2, item3)
    					

    A promotion is printed on the coupon ("Save 10 percent on your next purchase of X!") where X is one of the items recommended by the scoring service.

Similarly, whenever customers interact with your company and you want to react instantly with a personalized response based on a data mining model, you need a real-time scoring service.

Note that not all business scenarios require that the score be computed in real-time. In many cases, it is adequate to compute the score for a large set of customers on a regular basis — for example, once a day or once a month — and use the result in subsequent business operations. For example, you might use such scores to:

  • Provide improved or faster service for your high-value customers
  • Send outbound marketing campaigns to a selected customer segment

Offline batch scoring typically requires less time and fewer resources than real-time scoring.

The big picture — How do I create a service?

Software prerequisites

To follow the steps in this article, you need a workstation on which the DB2 Warehouse 9.5 Enterprise Edition is installed.


Figure 1. DB2 Warehouse architecture
Db2 Warehouse architecture
  • The DB2 Warehouse server contains the DB2 Enterprise data server. Your DB2 installation must include the Intelligent Miner data mining feature. This feature does the following:
    • Runs a database that is enabled for data mining (this database contains the data mining model)
    • Contains the data that you use to create data mining models
    • Executes the scoring SQL statement invoked by the Web service that computes the real-time score
  • The DB2 Warehouse client contains the DB2 Design Studio. This is an Eclipse-based graphical environment that supports the design and development steps in a warehouse environment. It shares common function with IBM Data Studio and Rational Data Architect. This article describes how to:
    • Use the Design Studio Database Explorer to connect to your database and enable it for data mining
    • Import the data mining models that are provided by this article into the database
    • Create an SQL scoring statement
    • Use the Data Web Services feature (common function with IBM Data Studio) to wrap the SQL statement into a Web service and generate a Web service application, which is stored in a WAR file
    • Use the Web services explorer to test your Web service
  • The application server contains an instance of WebSphere Application Server, where the DB2 Warehouse Administration Console and the WebSphere Application Server Administration Console are installed. This application server can also run your Web service. Use the application server to:
    • Define a data source that connects to the database on the data server
    • Deploy the Web service application (WAR file) and map the referenced database to the data source
    • Start or stop your Web service
    Note: If you do not want to install the DB2 Warehouse application server, you can deploy the Web services to a different servlet container, such as WebSphere Application Server Community Edition or Apache Tomcat.

Add the real-time scoring plug-in

The Download section of this article provides a ZIP file that contains the resources that you need, including the real-time scoring plug-in, which is an extension to DB2 Warehouse V9.5 Design Studio that can generate scoring SQL statements.

To add the real-time scoring plug-in to your Design Studio, copy the file com.ibm.datatools.datamining.realtimescoring_1.0.0.jar from the realtime_scoring_resources.zip file into <DWE_INSTALL_ROOT>\DesignStudio\dataminingtools\eclipse\plugins, where <DWE_INSTALL_ROOT> represents the root directory of your DB2 Warehouse client installation (the default is C:\Program Files\IBM\dwe).

Prepare a database

On the data server, you need to create a database and enable it for data mining. This article uses the sample database DWESAMP, which is created in the DB2 Warehouse tutorial. However, you can use any DB2 database. To enable a database for data mining, issue the idmenabledb command. For more information, refer to the Configuring Intelligent Miner information center topic.

On the client, open the DB2 Warehouse Design Studio and create a database connection for your database in Database Explorer. If both the client and server are installed on the same machine, the database connections are created automatically. For more information, refer to the Creating a database connection information center topic.

Overview of the development process

Creating a scoring Web service typically involves the following steps:

  • Preparing a data mining model: This typically involves collecting a history of transactions from the operational systems of your enterprise data warehouse and using a data mining algorithm (associations, classification, or regression) to learn the patterns hidden in the data. For example, in the automobile insurance industry the model could learn from existing customers how customer characteristics such as age, number of drivers, and type of car are related to the number and size of claims, and thus the probable risk associated with each customer. Creating a data mining model is beyond the scope of this article. The Download section contains a ZIP file that contains the data mining models used in this article.
  • Creating an SQL statement that invokes the model used to compute a score: DB2 data mining is implemented in the database as a set of user defined functions (UDFs) and stored procedures (SPs). These functions can be invoked using SQL SELECT or CALL statements. Figure 2 shows an example of a scoring SQL statement:

    Figure 2. Example of a scoring SQL statement
    Scoring SQL statement

  • Creating a Web service: The Data Web services feature that is available in DB2 Warehouse Design Studio or IBM Data Studio products can create a Web service for an existing SQL SELECT or CALL statement. This lets you invoke the SQL functions in a Web services environment. For example, it lets you embed the service in a business process implemented using WebSphere Process Server.
  • Deploying the Web service: Web services created with Data Web services tools can be deployed to a variety of application servers, among them WebSphere Application Server and Apache Tomcat. This is the runtime environment that serves as a container for the middle tier of your Web service. The application server receives the requests from the Web services client and then invokes the SQL statements in the configured data server.
  • Testing your Web service: DB2 Design Studio contains a Web services explorer that can be used to test any Web service based on the Web service definition language (WSDL) specification of that service.
  • Invoke the Web service from your business application: For example, in banking, a credit application process could invoke a Web service to compute the risk. Developing the Web services client is beyond the scope of this article.

This article describes how to create two Web services:

  • The first example describes how to create a Web service for a car insurer. The service computes the risk associated with information, such as driver age and number of drivers.
  • The second example describes how to create a Web service for a retailer. The service makes cross-sell recommendations based on the products purchased by a customer.

Import the model into the database

For the first example, you must import the insurance model into your database. Start DB2 Warehouse Design Studio on the client and open the Data perspective. In the Database Explorer, right-click Data Mining Models of the DWESAMP database, and click Import... to open the Import Model Wizard. Next, select Import from the file system and enter the path to the InsuranceModel.pmml file (included in the realtime_scoring_resources.zip file found in the Download section). Then import the model into the database by clicking Finish.


Figure 3. Importing the model
Importing the model

Create the SQL script

Before creating the Web service, create an SQL script for your model. This script represents the query invoked by the Web service. First, you need a new Data Development Project, which contains the SQL script and the Web service. To create this project, navigate to File > New > Data Development Project. In the wizard, enter the name RealTimeScoring and choose the DWESAMP connection. After clicking Finish, switch to the Data perspective if this has not already been done.

  1. Now you can create the SQL script. Right-click on the insurance model in the Database Explorer and click Create SQL Script... to open the SQL Script Wizard.

    Figure 4. Starting the SQL Script Wizard
    Starting the SQL Script Wizard

    Note: If the "Create SQL Script..." menu item is missing, install the real-time scoring plug-in provided in the Download section and restart the Design Studio.
  2. On the first page of the wizard, select the RealTimeScoring project and enter a name for the SQL script, for example ComputeInsuranceRate.

    Figure 5. Selecting the project and naming the SQL script
    Selecting the project and naming the SQL script

  3. After clicking Next, select the results of the SQL script and name them on the second page. In this example, you are interested in the predicted class (the risk group of the applicant) and the confidence.

    Figure 6. Selecting the results and naming them
    Selecting the results and naming them

  4. Now you can create the SQL script by clicking Finish. It appears in the SQL Scripts folder of the project in the Data Project Explorer.

Create the Web service

  1. To create a new Web service, right-click the Web Services folder of your project in the Data Project Explorer, and click New Web Service....
  2. In the wizard, choose the RealTimeScoring project, and enter the name InsuranceRateScorer and the namespace URI urn:insurance.

    Figure 7. Creating the Web service
    Creating the Web service

  3. After creating the Web service, add the SQL script to it. To do this, select the file ComputeInsuranceRate.sql in the Data Project Explorer and drag-and-drop it onto the InsuranceRateScorer Web service.

Setting up the JNDI data source

Before deploying the Web service, you must set up a Java™ Naming and Directory Interface (JNDI) data source at the application server. The data source resource reference in your Web service is assigned to it in the deployment step. To set up the JNDI data source, perform the following steps:

  1. Start the DB2 Warehouse Administration Console by navigating to Start > IBM DB2 Warehouse V9.5 > DB2WCOPY01.
  2. In the console, navigate to DB2 Warehouse > Common > Resources > Manage Data Sources, and click Create.
  3. Enter DWESAMP in the Database Display Name field, click Use New Data Source, and click Next.

    Figure 8. Creating a new data source
    Creating a new data source

  4. Provide the connection information. Specify jdbc/DWESAMP in the JNDI Name field, localhost in the Host Name field, and 50000 in the Port Number field.

    Figure 9. Providing the connection information
    Providing the connection information

    Note: Host name and port number depend on your architecture.
  5. Enter the user and password, then create the data source by clicking Finish.

Deploy the Web service

Currently, the auto-deployment of Web services to WebSphere Application Server is not supported and you must build a *.war file and use the WebSphere Application Server Administration Console to deploy it.

Generate the *.war file

First, click Build and Deploy... in the context menu of your InsuranceRateScorer Web service in the Data Project Explorer. The Deploy Web Service dialog opens. Apply the following settings:

  1. WebSphere Application Server v6.x as Web server type
  2. Build .war file only, do not deploy to a Web server
  3. JSR_109 as the artifact.soapEngine property value
  4. <Use .jar files installed on Web server> as the value for artifact.soapEngineDirectory

    Figure 10. Generating the *.war file
    Generating the *.war file

After clicking Finish, the deployable RealTimeScoringInsuranceRateScorer.war file is generated and saved in your project. You can find it in the Navigator view (navigate to Window > Show View > Navigator). It is in the InsuranceRateScorer folder under the DataServerWebServices folder of your project. Export it to the local file system by clicking Export... in its context menu and going to General > File System in the export wizard.


Figure 11. Exporting the *.war file
Exporting the *.war file

Deploy on WebSphere Application Server

To deploy the Web service, complete the following steps:

  1. Start the DB2 Warehouse Administration Console.
  2. Navigate to Applications > Enterprise Applications, and click Install.
  3. Provide the path to the RealTimeScoringInsuranceRateScorer.war file. Enter insuranceRate as the context root, and click Next.

    Figure 12. Entering the *.war file path and context root
    Entering the *.war file path and context root

  4. No changes are required in the Select installation options dialog, so click Next.
  5. Depending on your topology, it might be necessary to map the module to a server. However, in a single server environment you can just click Next in the Map modules to servers dialog without changing any of the settings.
  6. Use the Map resource references to resources dialog to map the data source reference of your Web service module to the JNDI data source you created earlier, as shown in Figure 13:

    Figure 13. Mapping the data source reference
    Mapping the data source reference

  7. Depending on your topology, it might be necessary to assign the Web module to a virtual host. However, for a single server deployment you can click Next in the Map virtual hosts for Web modules dialog without changing any of the settings.
  8. Click Finish in the Summary dialog to install the application.
  9. Select Save directly to the master configuration.

Start the Web service

Due to an incompatibility problem with the Apache Commons Logging library, before starting the Web service, map a shared library:

  1. Click RealTimeScoringInsuranceRateScorer_war to open the Web service configuration.

    Figure 14. Opening the Web service configuration
    Opening the Web service configuration

  2. Select the Shared library references.
  3. Select the RealTimeScoringInsuranceRateScorer_war, and click Reference shared libraries.

    Figure 15. Opening the shared library mapping page
    Opening the shared library mapping page

  4. Map dwecorelib, and click OK.

    Figure 16. Mapping dwecorelib
    Mapping dwecorelib

  5. Click OK, and click Save directly to the master configuration.

Now you can start the Web service in the Enterprise Applications view. The URL of the WSDL file (with the correct endpoints) is: http://<server>:<port>/<contextRoot>/wsdl


Figure 17. Starting the Web service
Starting the Web service

Test the Web service

To test your Web service in the Design Studio, perform the following steps:

  1. Start the Web Services Explorer by clicking Run > Launch the Web Services Explorer.
  2. Click on the WSDL Page button in the top right corner. Click WSDL Main in the Navigator, enter the URL of your WSDL file, and click Go.

    Figure 18. Entering the URL of the WSDL file
    Entering the URL of the WSDL file

  3. In the Web Services Explorer Navigator, select InsuranceRateScorer > InsuranceRateScorerPostBinding > ComputeInsuranceRate to test the post binding.

    Figure 19. Selecting the post binding
    Selecting the post binding

  4. Enter values for the parameters. Figure 20 shows an example of how to find the risk group of a 45-year-old married driver who has one child also driving the car, owns a house, and drives 10000 miles per year:

    Figure 20. Entering the example values
    Entering the example values

  5. To get the result click Go. The result shown in the Status section indicates that the driver belongs to the risk group 1 with a confidence of 0.93814433.

    Figure 21. The resulting risk group
    The resulting risk group


Rule model example

Example scenario

A retailer wants to print personalized coupons at the point of sale. A Web service is to receive a list of products that a customer is purchasing and suggest additional products that are typically bought in combination with these products. In this case, the data mining model that is employed is an associations model based on which products were bought together during previous sales transactions.

Importing UDFs

The SQL scripts for rule models require UDFs that first must be imported. In the context menu of the RealTimeScoring project, click Import... and choose General > File System. Then, select the files IDMMX.HLP_GETELEMENTS.ddl and IDMMX.HLP_GETELEMINDEXES.ddl (both are included in the realtime_scoring_resources.zip file found in the Download section) and import them. They are placed in the SQL Scripts folder of your project. Execute them by clicking Run SQL in their context menus.

Import the model

Import the product affinities model into the database as described for the insurance model, except use the file ProductAffinities.pmml instead of InsuranceModel.pmml.

Create the SQL script

To create the SQL script, open the SQL Script Wizard for the product affinities model in the Rules folder, as described for the insurance model. On the first page of the wizard, select the RealTimeScoring project and enter a name for the script, for example ComputeProductAffinities, and click Next. On the second page of the wizard, choose the results of the SQL script and name them. For example, to find the name of the item in the rule head, the support, the confidence, and the lift of the rule, select the check boxes as shown:


Figure 22. Selecting the results and naming them
Selecting the results and naming them

Click Finish to create the SQL script.

Create and deploy the Web service

Create and deploy the Web service similar to the way this was done in the insurance model example.

Test the Web service

Test your Web service by using the Web Services Explorer, as described for the insurance model. For example, to find out which products are usually bought together with cream and cottage cheese, enter Cream, Cottage cheese as the ITEMID_LIST, and click Go. The result in the Status section shows that, for example, lime juice is likely to be bought.


Figure 23. Getting the product affinities for cream and cottage cheese
Getting the product affinities for cream and cottage cheese

Conclusion

Real-time scoring services use input data and stored data mining models to calculate results instantly. They can be used for many purposes, such as risk prediction or cross-sell recommendations. With the real-time scoring plug-in, you can now easily create such Web services for your data mining models in the Design Studio. This article describes how to create an SQL script for a model, add it to a Web service, deploy the service on a server, and test it. Examples of possible next steps are building a Web page that employs the service, or integrating the service into a business process.



Download

DescriptionNameSizeDownload method
Resources referenced in this articlerealtime_scoring_resources.zip170KBHTTP

Information about download methods


Resources

Learn

Get products and technologies

About the authors

Author photo

For more than ten years, Peter has held development positions in different projects in the IBM Software Group, among them information retrieval, workflow software, pervasive computing, and portal solutions. He has in-depth experience with IBM data management products and object-oriented programming. Peter works in the IBM software lab in Böblingen, Germany.

At present, Peter is the architect for IBM's DB2 Warehouse data mining and text analysis components. To that end, he is working to simplify the integration of mining technology into business applications and custom analytical applications.

Author Photo: Steffen Lang

Steffen Lang, a student in computer science at University of Karlsruhe, is currently working in the IBM Boeblingen SW development lab as an intern.

Comments



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, SOA and Web services
ArticleID=273342
ArticleTitle= Create Web services for real-time scoring using DB2 Warehouse V9.5
publish-date=12062007
author1-email=peter_bendel@de.ibm.com
author1-email-cc=
author2-email=steffen.lang@stud.uni-karlsruhe.de
author2-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers