QS-AVI Address Cleansing as a Web Service for IBM InfoSphere Identity Insight

Product Documentation


Abstract

Address cleansing, sometimes referred to as address “hygiene” or “standardization”, is a process used with the Identity Insight pipeline to help you correct and standardize address information for optimal entity resolution processing. This new IBM® InfoSphere™ Identity Insight feature enables the use of an industry-standard address data standardization solution that includes:
* AddressDoctor®
* IBM InfoSphere Information Server
* IBM InfoSphere DataStage®
* IBM InfoSphere WebSphere QualityStage™

Content


Author: Bhaveshkumar R Patel (bhavesh.patel@in.ibm.com)

Enabling support for the address standardization module provided by AddressDoctor eliminates the dependencies and limitations often associated with other standardization databases, such as the Worldwide Address Verification and Enhancement System (WAVES). The AddressDoctor address standardization module can be used for Identity Insight entity resolution through the DataStage and QualityStage Address Verification Interface. This process is referred to as QS-AVI in this document.



This techdoc describes how to create and apply an address data-cleansing job that standardizes address data for use by IBM Identity Insight. The job is defined in DataStage and uses QS-AVI Data Quality stages. Note that the steps are described and illustrated in a Windows client environment.

The basic steps for implementing this address cleansing job as a Web service are:

STEP 1: Verify prerequisite software
STEP 2: Define a QS-AVI DataStage job to cleanse the address data
STEP 3: Enable the DataStage job for Information Services
STEP 4: Use the Information Server Console to define a DataStage job as a service
STEP 5: Use the Information Server Console to deploy this new job as a service
STEP 6: Examine the WSDL file
STEP 7: Test the service


STEP 1: Verify prerequisite software

You should have the following software installed:

· DataStage (IBM Information Server) Version 8.0.1
· QS-AVI Data Quality stages
· AddressDoctor database (with the required country database)

STEP 2: Define a QS-AVI DataStage job to cleanse the address data

1. Open DataStage Designer.

Start -> All Programs -> IBM Information Server -> IBM WebSphere DataStage and QualityStage Designer

a. In the Attach to Project window, enter ibmpassw0rd as the password to connect to the Project. Click OK.

Figure 1 - Attach to Project window


b. Close the New window.


2. In the Palette pane, open the Data Quality folder to browse through the available stages. Make sure that you can find the QS-AVI “Address Verification” stage, as shown in Figure 2.


Figure 2 - Address Verification in the Data Quality folder


3. Copy the AddressValidateWS.dsx file from the QS-AVI package to your local hard drive (C:\).

4. Import the AddressValidateWS.dsx file to DataStage. This is a predefined address cleansing job and has been designed for IBM Identity Insight and QS-AVI integration.

5. In the Repository pane, select the job AddressValidateWS in the Jobs folder, and open it by selecting Edit.

Figure 3 - DataStage Designer Repository pane



6. Open the Address_Verification_8 stage by selecting Properties.
In this window you can examine and modify the stage properties. (See Figure 4.)

7. Update the stage properties as follows:
a. Update Reference database path with the AddressDoctor database installation location.

b. Update Full Preload with the required country database.

Figure 4 - AddressVerification stage window.

STEP 3: Enable the DataStage job for Information Services


One more step must be performed before the new job is enabled for Information Services. You must change the properties of the job and specify that multiple instances of the job can be run, and that the job can be made available as a Web service.


1. In the Repository pane, open the job properties by selecting the Edit menu and then Job Properties.

2. On the General page of the job properties, select the following three check boxes:

· Enable hashed file cache sharing
· Allow Multiple Instances
· Enabled for Information Services

Figure 5 - Job Properties window


3. Click OK to save the job properties.

4. Save the job by selecting Save from the File menu.

5. Compile the job by selecting Compile from the File menu, or press F7.
 

STEP 4: Define a DataStage job as a service using the Information Server Console


The Console for IBM Information Server allows you to define a data transformation or cleansing job (DataStage job) as a service. The job must have been set in DataStage with the property “Enabled for Information Services”. The tool includes a wizard to guide you through the task.

The wizard walks you through the following task steps:

1. Name and describe the new service.
2. Choose one service interface binding (such as SOAP over HTTP or EJB).
3. Select the DataStage job to expose as the first operation of the new service.
4. Set up the request and response messages for the operation.
5. Set runtime parameters.

Before you begin
Before starting the IBM Information Server Console, you must have an application server up and running. The example configuration in this document uses WebSphere Application Server, which was installed as part of the IBM Information Server install.

Verify that the IBM WebSphere Application Server service is already started. In this example, the service name is IBM WebSphere Application Server V6 – bhapatelNode02; the node name on your system is likely to differ. If the service is not started, start it.
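
The service state can also be checked from a script. The following is a minimal sketch, assuming Python is available on the Windows machine and that the standard `sc query` command is used. Note that `sc query` expects the service key name, which may differ from the display name shown in the Services panel, so the name in the usage comment is a hypothetical placeholder.

```python
import subprocess
from typing import Optional


def parse_sc_state(sc_output: str) -> Optional[str]:
    """Return the state name (for example 'RUNNING') from `sc query` output."""
    for line in sc_output.splitlines():
        # The state line looks like: "STATE              : 4  RUNNING"
        if line.strip().startswith("STATE"):
            parts = line.split(":", 1)[1].split()
            if len(parts) >= 2:
                return parts[1]
    return None


def service_is_running(service_key_name: str) -> bool:
    """Run `sc query <name>` (Windows only) and report whether it is RUNNING."""
    result = subprocess.run(
        ["sc", "query", service_key_name],
        capture_output=True,
        text=True,
    )
    return parse_sc_state(result.stdout) == "RUNNING"


# Usage (the key name below is hypothetical; look up the real one first):
# service_is_running("MyWebSphereServiceKeyName")
```

If the function returns False, start the service from the Windows Services panel as described above.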

1. Open the Console for IBM Information Server. From the Windows Start button, select All Programs -> IBM Information Server -> IBM Information Server Console.

a. When you are prompted for user name and password, enter:

user name : IBM_XXXX
password : XXXXXX


2. Click New Project to create a new project.
a. Type (Select) : Information Services
b. Name : AddressValidateProject


Figure 6 - New Project window.


3. Open the AddressValidateProject by selecting File -> Open Project.
In the Open Project window, select AddressValidateProject, and click Open.


4. Open and customize the Information Services Application, AddressValidateApp.
a. Open the Information Services Application window.
b. Click the Develop icon, and then click Tasks -> New (in the right-side panel).
c. On the Overview page, for Application name, enter AddressValidateApp.






Figure 7 - Open AddressValidateApp.


d. You can use the wizard to help you create and deploy the new service. The wizard lets you fill in information about the general properties of the new service, the binding used by the service, and the operation that the service invokes.

e. On the Overview page, for Service Name, enter AddressValidateService. In the Description field, describe the service (for example, “QS-AVI-WISD Web Service”). This information is useful when users look at the Services Directory to find out what services exist, what function each service performs, and whom to contact for help.

Figure 8 - Select Information Services Application.


f. On the Bindings page (under “NewService1” in the Services folder), click the Attach Bindings menu button (bottom right), and select SOAP over HTTP as the binding.

Note that the system currently offers a choice between SOAP over HTTP and EJB as the binding.

g. Specify the operation performed by the service. At the bottom of the “Select a View” portion of the window, click New, and select Operation from the menu.

h. A new window is displayed, which lets you specify the operation to invoke.

Figure 9 - a new operation.


i. Change the name of the operation to addressValidateOps. The name of the operation must start with a lower case letter or you will not be able to successfully save your service definition.

j. Click Select to choose the information provider for this operation. In the Information Provider window, select DataStage and QualityStage as the type of information provider.








k. Navigate through the folders to find the job named AddressValidateWS, which was enabled for Information Services earlier when you set up the job in DataStage and QualityStage Designer. Select the job located in the Job folder: IaaS_Proj.

If the job name is not listed, it is likely that you did not compile the DataStage job. If so, go back to the DataStage and QualityStage Designer and compile the job.

Figure 10 - Select the IaaS_Proj job in the Job folder.


l. Click OK.


Figure 11 - new operation detail pane.


You can browse through the Inputs, Outputs, and Provider Properties tabs to review the input and output parameters for the service. Remember that this DataStage job is enabled for Information Services and includes a WISD_Input stage and a WISD_Output stage. The columns used as input and output were identified when these stages were defined.


m. The Provider Properties tab contains important runtime parameter settings. These parameters control the number of job instances allowed, the load balancing delay, and how requests will be handled in the pipeline.

n. Click Save Application to complete the definition of the service. You could now deploy the service. Note, however, that the service as defined will not return multiple rows of data (see the runtime settings described in step m). To return multiple rows, you must go back to DataStage Designer and slightly modify the original job.

o. Click Close Application. You are now returned to the Application window, which should look like this:

Figure 12 - a defined service.


You have now completed the registration of the service and can deploy the job as a service. The deployment is also performed using the Console for IBM Information Server.
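
The naming rule from step i (the operation name must start with a lowercase letter) is easy to check before you type the name into the wizard. The following is a minimal sketch under one assumption: "lowercase letter" is read as ASCII a to z. The document does not state any rule for the remaining characters, so only the first character is checked, and the function name is purely illustrative.

```python
import re


def starts_with_lowercase(operation_name: str) -> bool:
    """Check only the rule stated in step i: the first character is a-z."""
    # Assumption: "lowercase letter" means ASCII a-z.
    return re.match(r"[a-z]", operation_name) is not None


print(starts_with_lowercase("addressValidateOps"))  # True
print(starts_with_lowercase("AddressValidateOps"))  # False
```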

STEP 5: Use the Information Server Console to deploy this new job as a service


Deploying an application will install an Enterprise Application on an application server. This enables the services to be invoked by other applications or services.

1. In the Information Services Application window, select the AddressValidateApp.

2. Click Deploy.

3. The window with the Service Objects to deploy is displayed. Deploy the service object named AddressValidateService.

Figure 13 - Deploying the application.


4. You can browse the Manage Providers section. For this example, keep all of the default options.


5. Click Deploy (located at the bottom of the window). Note that deploying an application can take a very long time, especially if your system does not have 3GB or more of system memory.

6. The bottom of the screen has an activity status window, which you can expand by selecting Details.

7. Once the deployment completes, the deployment status window shows a change in status from “Executing” to “Completed”.


8. Close the Activity Status window. The application is now successfully deployed.
 

STEP 6: Examine the WSDL file


Verify the deployment by generating the Web Services Description Language (WSDL) document for the new service. The WSDL document contains all the descriptions (metadata) that a client application needs to invoke the service.

WISD generates the WSDL “on the fly”. If your application was not deployed successfully, you will not be able to generate the definition.

1. Open the “Deployed Information Services Application” window.
On the WISD Navigation bar, click the OPERATE icon and select Deployed Information Services Application from the menu.

Figure 14 - Deployed Information Services Application window.


2. In the Deployed Applications window, you should see the name of the application AddressValidateApp that you just deployed. Expand the AddressValidateApp folder.

3. The AddressValidateApp folder contains the name of the service(s) defined in the application; for each service, you also can display the operation being called by the service.

4. Select the name of the service: AddressValidateService.

Figure 15 - AddressValidateService details.


5. Select View Service in Catalog to open the Information Server Administrator Web Client with the Information Services Catalog view displayed.

Figure 16 - View Service in Catalog results.


6. The above window contains the general properties of the service. You can browse through the various pages to see the information related to bindings, attributes and operations. To find the WSDL document, open the Bindings page, and expand the SOAP over HTTP box.

Figure 17 - Bindings view.


7. Click the link Open WSDL Document to generate the WSDL file for the AddressValidateService service. The file is displayed in a new browser window.

Figure 18 - Generated WSDL file.



8. Save the WSDL file in the folder C:\SOADEMO\Results.

9. Make a note of the URL associated with the WSDL file:



10. Close the window displaying the WSDL file and the window labeled Header Microsoft Internet Explorer, and exit the Console for IBM Information Server.

11. You can now test the service.
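
As a quick check of the saved WSDL file, the operation names can be listed with a short script. The following is a minimal sketch, assuming the generated document follows WSDL 1.1 (namespace `http://schemas.xmlsoap.org/wsdl/`). The inline sample is a hypothetical, heavily trimmed stand-in for the real file, and its `targetNamespace` is a placeholder; pass the text of the file you saved in step 8 instead.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace (assumption: the generated document uses WSDL 1.1).
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, trimmed sample; a real WSDL file contains much more.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    name="AddressValidateService"
    targetNamespace="http://example.invalid/AddressValidateService">
  <portType name="AddressValidateService">
    <operation name="addressValidateOps"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each portType name to the list of operation names it declares."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [
            op.get("name")
            for op in port_type.findall("wsdl:operation", WSDL_NS)
        ]
        operations[port_type.get("name")] = names
    return operations


print(list_operations(SAMPLE_WSDL))
# {'AddressValidateService': ['addressValidateOps']}
```

If the operation you defined in Step 4 does not appear in the result for your own file, revisit the service definition before testing.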
 

STEP 7: Test the service


You can use the WebSphere Integration Developer environment to easily verify that a service is working properly, without having to write an application.

1. Open WebSphere Integration Developer:
Start -> All Programs -> IBM WebSphere -> Integration Developer v6 -> WebSphere Integration Developer v6

2. Accept the workspace as displayed: SOAiis.

3. Select Run and then Launch to open the Web Services Explorer.

4. In the Web Browser pane, select the icon representing WSDL Page (upper right-hand corner).




5. Click WSDL Main in the Navigator.

6. Enter the WSDL URL:



Figure 19 - Open the WSDL URL


This is the address that is associated with the service. You obtained that address in Step 6 by opening the WSDL document from View Service in Catalog in the Information Server Administrator Web Client.

a. Click Go to get the operation name associated with the service.

7. The next screen displays the operation name(s) associated with the service. Click the operation named addressValidateOps.

Figure 20 - displaying the operation names associated with a service.


8. You must specify the input values that this operation requires. Let’s assume that you want to standardize the name and address of this customer:

addr1 : 4100 bohanon Dr
city : Menlo Park
state : CA
country : USA


Figure 21 - Invoking a WSDL operation.


9. Click Go. The service is invoked.

In the Status window, you should see the result containing the standardized name and address for the customer entered as input. Switch from the Source view to the Form view to get a nicely formatted response document.

Figure 22 - a form view of a response document.


Figure 22 shows that the service has been successfully invoked and that you have successfully enabled a data-cleansing job as a service.
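
For readers who want to call the service outside the Web Services Explorer, the request for the sample address in step 8 can be sketched as a SOAP message. This is a minimal sketch under several assumptions: the binding uses SOAP 1.1, the request wraps the input fields in an element named after the operation, and the fields are namespace-qualified. The service namespace below is a hypothetical placeholder; take the real values (target namespace, element names, endpoint address) from your own WSDL file.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
# Hypothetical placeholder: use the targetNamespace from your WSDL file.
SERVICE_NS = "http://example.invalid/AddressValidateService"


def qualified(namespace, local_name):
    """Build an ElementTree tag in {namespace}local form."""
    return "{%s}%s" % (namespace, local_name)


def build_request(operation, fields, service_ns=SERVICE_NS):
    """Return a SOAP 1.1 envelope (as text) wrapping the input fields."""
    envelope = ET.Element(qualified(SOAP_ENV, "Envelope"))
    body = ET.SubElement(envelope, qualified(SOAP_ENV, "Body"))
    op = ET.SubElement(body, qualified(service_ns, operation))
    for name, value in fields.items():
        # Assumption: fields are namespace-qualified children of the operation.
        ET.SubElement(op, qualified(service_ns, name)).text = value
    return ET.tostring(envelope, encoding="unicode")


# Input values from step 8 of this document.
request_xml = build_request(
    "addressValidateOps",
    {
        "addr1": "4100 bohanon Dr",
        "city": "Menlo Park",
        "state": "CA",
        "country": "USA",
    },
)
print(request_xml)
```

The sketch only builds the message. Sending it requires an HTTP POST to the endpoint address listed in the WSDL file, which is not reproduced here.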

Original Publication Date

17 August 2013

EAS-II_V8_QS_AVI_WISD.pdf

Document Information

Modified date:
30 September 2021

UID

swg27018320