Introduction to real-time data integration and WebSphere DataStage
RTI (real-time integration), a component of IBM WebSphere DataStage (hereafter referred to as DataStage), lets you create sharable standard services, including Web services. These services expose the data-integration functionality of DataStage jobs, and you can invoke them without knowing the details of the underlying data-integration logic. Deploying DataStage jobs as sharable services yields a number of benefits, including:
- Providing single-point standard access to disparate data sources, internal and external.
- Reusing the logic from DataStage jobs in real time.
- Developing applications more quickly by providing unified services for every application, which reduces redundant code.
Figure 1 shows the architecture of RTI.
Figure 1. Architecture of RTI

This article explains how to deploy RTI jobs as Web services. Let's break that down into the following sections:
- An introduction to RTI job topologies
- Developing a sample data-integration component step by step with DataStage
- Publishing the data-integration component as a Web service
- Developing a Java client to call the Web service that you publish
The RTI server supports three job topologies:
- Topology I: Batch Jobs
- Topology II: Batch Jobs with an RTI output stage
- Topology III: Fully RTI-compliant jobs
Topology I: Batch jobs
Topology I uses new or existing batch jobs that are exposed as RTI services. (Note that this topology doesn't contain any RTI stages, such as an RTI input stage or an RTI output stage, which are explained in the following sections.) An RTI service that's based on a batch job can accept job parameters as input arguments. This type of service returns no output. When you configure the deployment, you can set values for job parameters. Figure 2 shows an example of this topology.
Figure 2. Batch job

Topology II: Batch jobs with an RTI output stage
The only difference between topology I and topology II is that there's an RTI output stage in topology II. The RTI output stage is the exit point from the job, and it returns one or more rows to the client application as a service response. The RTI output stage supports one input link. Its table definition maps to the output arguments of the RTI service. See an example of this topology in Figure 3.
Figure 3. Batch jobs with an RTI output stage

Topology III: Fully RTI-compliant jobs
In topology III, jobs use both an RTI input stage and an RTI output stage. The RTI input stage is the entry point to a job, accepting one or more rows during a service request. The RTI input stage supports one output link. Its table definition maps to the input arguments of the RTI service, such as the input arguments of a Web service operation. A job that conforms to topology III is always on. After you deploy this job as a Web service, you find an instance of this job running in the DataStage Director. Figure 4 shows an example of this topology.
Figure 4. Fully RTI-compliant jobs
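To keep the three topologies straight, it can help to write them down as a small piece of code. The following sketch is only an illustration of the descriptions above: the enum `JobTopology` and its methods are hypothetical names and are not part of any DataStage or RTI API. It also assumes that a job with an RTI input stage but no RTI output stage is not a supported combination, because the article describes no such topology.

```java
/** Illustrative model of the three RTI job topologies; not a DataStage API. */
enum JobTopology {
    BATCH(false, false),                  // Topology I
    BATCH_WITH_RTI_OUTPUT(false, true),   // Topology II
    FULLY_RTI_COMPLIANT(true, true);      // Topology III

    private final boolean hasRtiInputStage;
    private final boolean hasRtiOutputStage;

    JobTopology(boolean hasRtiInputStage, boolean hasRtiOutputStage) {
        this.hasRtiInputStage = hasRtiInputStage;
        this.hasRtiOutputStage = hasRtiOutputStage;
    }

    /** Only jobs with an RTI output stage return rows as a service response. */
    boolean returnsRows() {
        return hasRtiOutputStage;
    }

    /** Only topology III jobs are described as always on. */
    boolean isAlwaysOn() {
        return hasRtiInputStage;
    }

    /** Maps the stages present in a job to one of the three topologies. */
    static JobTopology classify(boolean hasRtiInputStage, boolean hasRtiOutputStage) {
        for (JobTopology topology : values()) {
            if (topology.hasRtiInputStage == hasRtiInputStage
                    && topology.hasRtiOutputStage == hasRtiOutputStage) {
                return topology;
            }
        }
        // Assumption: an input stage without an output stage is not covered.
        throw new IllegalArgumentException(
                "An RTI input stage without an RTI output stage is not one of the three topologies");
    }

    public static void main(String[] args) {
        JobTopology topology = classify(true, true);
        System.out.println(topology + " returns rows: " + topology.returnsRows());
    }
}
```

The sample job built in the rest of this article has both an RTI input stage and an RTI output stage, so it corresponds to `FULLY_RTI_COMPLIANT` in this sketch.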

Now let's create a sample RTI job that extracts location information from a database table. The job takes one input parameter, employeeid, which is structured as an array so that you can pass in several employee IDs in a single request. The job then returns the location information of the employees specified by the input parameter. This RTI job uses a table named RFIDLOCATION (see Table 1 for the table definition), which is stored in an IBM DB2® database called GBPMDB. The table definitions of the RTI input stage and RTI output stage are shown in Table 2 and Table 3, respectively. Notice that all the columns in Tables 2 and 3 also appear in Table 1, so after you import the RFIDLOCATION table definition using DataStage Designer, you can load the table definitions for Tables 2 and 3 from it.
Table 1. Table definition of table RFIDLOCATION
| Column name | Key | Type | Length | Nullable |
|---|---|---|---|---|
| RFIDRecordLocationID | Yes | Integer | 10 | No |
| EmployeeID | No | Varchar | 60 | No |
| LocationID | No | Integer | 10 | No |
| RecordTime | No | Timestamp | 26 | No |
Table 2. Table definition of RTI input stage
| Column name | Key | Type | Length | Nullable |
|---|---|---|---|---|
| EmployeeID | No | Varchar | 60 | No |
Table 3. Table definition of RTI output stage LocationInfo
| Column name | Key | Type | Length | Nullable |
|---|---|---|---|---|
| EmployeeID | No | Varchar | 60 | No |
| LocationID | No | Integer | 10 | No |
| RecordTime | No | Timestamp | 26 | No |
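The relationship between the three tables can be sketched in a few lines of Java. This is only an illustration of the idea that Tables 2 and 3 are column subsets of Table 1; the class `RfidLocationColumns` and its methods are hypothetical and have nothing to do with how DataStage stores table definitions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of Tables 1 to 3; not a DataStage API. */
final class RfidLocationColumns {

    /** One row of a table definition: type, length, and nullability. */
    static final class Column {
        final String name;
        final String type;
        final int length;
        final boolean nullable;

        Column(String name, String type, int length, boolean nullable) {
            this.name = name;
            this.type = type;
            this.length = length;
            this.nullable = nullable;
        }
    }

    /** Table 1: the full RFIDLOCATION definition, in column order. */
    static Map<String, Column> rfidLocation() {
        Map<String, Column> columns = new LinkedHashMap<>();
        columns.put("RFIDRecordLocationID", new Column("RFIDRecordLocationID", "Integer", 10, false));
        columns.put("EmployeeID", new Column("EmployeeID", "Varchar", 60, false));
        columns.put("LocationID", new Column("LocationID", "Integer", 10, false));
        columns.put("RecordTime", new Column("RecordTime", "Timestamp", 26, false));
        return columns;
    }

    /** Picks a subset of columns from a source definition, failing on unknown names. */
    static Map<String, Column> select(Map<String, Column> source, List<String> names) {
        Map<String, Column> subset = new LinkedHashMap<>();
        for (String name : names) {
            Column column = source.get(name);
            if (column == null) {
                throw new IllegalArgumentException("No such column: " + name);
            }
            subset.put(name, column);
        }
        return subset;
    }

    public static void main(String[] args) {
        Map<String, Column> table1 = rfidLocation();
        // Table 2: RTI input stage. Table 3: RTI output stage LocationInfo.
        Map<String, Column> input = select(table1, Arrays.asList("EmployeeID"));
        Map<String, Column> output = select(table1, Arrays.asList("EmployeeID", "LocationID", "RecordTime"));
        System.out.println("RTI input stage columns: " + input.keySet());
        System.out.println("RTI output stage columns: " + output.keySet());
    }
}
```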
Import the table definition using DataStage Designer
- In the repository of DataStage Designer, right-click Table Definitions, then select Import > Plug-in Meta Data Definitions.
- Select DSDB2 in the Name column, and click OK.
- In the next window (see Figure 5), select the server name from the drop-down list (in this case, GBPMDB). The server name is the name of the database you created, which contains the table you want to import.
- Type the user name, db2inst1, and password, passw0rd, to connect to the server.
- Select the Tables check box, and click Next to continue.
Figure 5. Select database
- Now select the RFIDLOCATION table from the Select Table(s) list, then click Import to import the RFIDLOCATION table definition.
You should now see the table definitions you just imported in the DSDB2 subcategory in the repository of DataStage Designer, as shown in Figure 6.
Figure 6. Table definition

- Now create a new parallel job in DataStage Designer (see the layout of this job in Figure 7). This job contains one DB2/UDB API stage, one RTI input stage, one RTI output stage, and one join stage, all four of which are connected by links.
Figure 7. Job layout
- Save the new job as sampleRTI.
- Click the Job Properties icon in the DataStage Designer to open the Job Properties window (see Figure 8).
- On the General tab, select Allow Multiple Instance and RTI Service Enabled.
- Click OK, then save the configuration.
Figure 8. Configuration of job properties
Configure the DB2/UDB stage
- Double-click the DB2/UDB API stage RFIDLOCATION to open the window
shown in Figure 9.
Figure 9. Configure database connection information
- In this window, specify the server name (the database name you want to connect to), in this case, GBPMDB.
- Enter the user ID, db2inst1, and password, passw0rd. Keep the other configurations as default.
- Click the Output tab at the top of the window.
- On the General tab of the Output page (see Figure 10), enter RFIDLOCATION in the Table names field, and select Generated SQL query from the Query type drop-down list.
- Leave the remaining default settings, and click the Columns tab.
Figure 10. Configure table information
- On the Columns tab, click the Load button to load the table definition. A window pops up from which you can select the table definition in the repository.
- Select RFIDLOCATION in the DSDB2 subcategory, and then click OK.
- Now select the columns shown in Figure 11, then click OK.
Figure 11. Select columns
- You'll see the results as shown in Figure 12. Click OK, and save the
job.
Figure 12. Import result
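With the Generated SQL query option, the stage builds its query from the table name and the columns you loaded, so you don't write SQL yourself. The sketch below shows roughly what such a query looks like. It rests on two assumptions that the article does not confirm: that the three columns of Table 3 are the ones selected in Figure 11, and that the query is a plain SELECT over those columns. The exact SQL text that DataStage generates may differ, and the class `GeneratedQuerySketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrates the kind of SELECT a "generated query" amounts to; not DataStage code. */
final class GeneratedQuerySketch {

    /** Builds a simple SELECT statement from a table name and a column list. */
    static String buildSelect(String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("At least one column is required");
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }

    public static void main(String[] args) {
        // Assumed column selection: the three columns of Table 3.
        List<String> columns = Arrays.asList("EMPLOYEEID", "LOCATIONID", "RECORDTIME");
        System.out.println(buildSelect("RFIDLOCATION", columns));
    }
}
```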
Configure the RTI input stage Employeeid
- Double-click the RTI input stage Employeeid, and select the Columns tab on the Output page.
- From here, load the table definition. (The steps for loading the table definition for the Employeeid RTI input stage are the same as those described in the Configure the DB2/UDB stage section.)
- After loading the table definition, click OK, and save the job.
Configure the RTI output stage LocationInfo
- Double-click the RTI output stage LocationInfo, and go to the Columns tab on the Input page.
- Load the table definition. (The steps for loading the table definition for the LocationInfo RTI output stage are the same as those described in the Configure the DB2/UDB stage section.)
- After loading the table definition, click OK, and save the job.
Configure the join stage JoinByEmployeeid
- Double-click JoinByEmployeeid to set it up.
- Go to the Properties tab on the Stage page, then select
EMPLOYEEID as the join key and Inner as the join type, as shown in
Figure 13.
Figure 13. Configure join stage
- Now go to the Output tab. You can see that DataStage has generated a mapping relationship between the source and the target. Leave these default settings, and click OK.
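If you think of the join stage in plain Java terms, an inner join on EMPLOYEEID keeps only the location rows whose employee ID appears among the IDs passed in through the RTI input stage. The following sketch shows that idea with an in-memory hash join. It is not how DataStage executes the join stage; the class `InnerJoinSketch`, the row type `LocationRow`, and the sample values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of an inner join on EMPLOYEEID; not DataStage code. */
final class InnerJoinSketch {

    /** Hypothetical row shape mirroring Table 3. */
    static final class LocationRow {
        final String employeeId;
        final int locationId;
        final String recordTime;

        LocationRow(String employeeId, int locationId, String recordTime) {
            this.employeeId = employeeId;
            this.locationId = locationId;
            this.recordTime = recordTime;
        }
    }

    /** Returns, for each requested ID in order, every location row with the same EMPLOYEEID. */
    static List<LocationRow> joinByEmployeeId(List<String> requestedIds, List<LocationRow> locations) {
        // Build phase: index the location rows by the join key.
        Map<String, List<LocationRow>> byEmployee = new HashMap<>();
        for (LocationRow row : locations) {
            byEmployee.computeIfAbsent(row.employeeId, key -> new ArrayList<>()).add(row);
        }
        // Probe phase: IDs without a match contribute nothing (inner join).
        List<LocationRow> joined = new ArrayList<>();
        for (String id : requestedIds) {
            joined.addAll(byEmployee.getOrDefault(id, Collections.emptyList()));
        }
        return joined;
    }

    public static void main(String[] args) {
        List<LocationRow> table = Arrays.asList(
                new LocationRow("001", 10, "2007-01-01 08:00:00"),
                new LocationRow("002", 20, "2007-01-01 08:05:00"),
                new LocationRow("001", 11, "2007-01-01 09:00:00"));
        for (LocationRow row : joinByEmployeeId(Arrays.asList("001", "003"), table)) {
            System.out.println(row.employeeId + " was at location " + row.locationId + " at " + row.recordTime);
        }
    }
}
```

In this example, the ID "003" has no matching row, so it simply produces no output rows, which is the behavior you select by choosing Inner as the join type.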
Click the Compile icon in the DataStage Designer. The Compile Job window opens with a Job successfully compiled message, assuming no errors have occurred during the process. The job is now ready for deployment.
Deploy the RTI job sampleRTI as a Web service
This section describes how to deploy the RTI job as a Web service using the RTI console.
- Open the RTI console, and in the Current Tasks pane, click Register an RTI Server. This opens the RTI Server Wizard, as shown in Figure 14.
- In the RTI Server Name field, enter the machine name of the RTI server. The port number depends on which application server your RTI server is running on. For example, if the application server is IBM WebSphere Application Server, the default port number is 9080.
- Keep the default settings of all the other fields, and click Finish.
Figure 14. Set RTI server
- The RTI server you just registered now appears as an icon in the right pane. Double-click the icon.
- In the Current Tasks pane, click Register a DataStage Machine to open the DataStage Machine Registration Wizard (see Figure 15).
- In the Machine name field, enter the name of the machine that's running a DataStage server or DataStage TX host.
- For DataStage server machines, enter your valid credentials in the User and
Password fields. The default listening port for the RTI Agent is 2000. If your
DataStage administrator has changed the port, do the following:
- Select the User-defined port button.
- Enter the new port number.
- Click Finish.
Figure 15. Set DataStage machine
- In the Current Tasks pane, click Add a New Service to the RTI Server to open the RTI Service Wizard.
- In the Service name field, enter the name of the new service (in this case, sampleRTI), then click Finish.
- The service you just created now appears as an icon in the right pane. Double-click this sampleRTI icon.
- In the Current Tasks pane, click Add Support for Service Bindings to open the Add Support for Service Bindings Wizard.
- Select Soap over HTTP in the right-pane list, then click Next.
- In the Additional binding-specific description field, optionally enter a description of the binding, which is added to the Web Services Description Language (WSDL).
- From the Style list, select an encoding style for SOAP messages. Your choice depends on what client applications accept.
- Click Finish. The binding icon appears in the Results pane.
- In the Current Tasks pane, select Add an Operation to open the New Operation Wizard, as shown in Figure 16. You'll see the list of registered DataStage Server and DataStage TX machines represented as nodes.
- Select the job sampleRTI, which you just created, then click
Next.
Figure 16. Select RTI job
- In the Operation Name field, change the name if necessary. The default is the name of the job or map.
- In the Queue Size field, specify the size of the operation queue in terms of service requests. If the queue size is exceeded, the request is rejected. The default is three requests.
- In the Wait delay field, specify the maximum wait time in milliseconds. If the wait time is exceeded, the request is rejected. The default value is 100 milliseconds.
- Click Next.
- From the Options list, select Array (see Figure 17). Jobs that accept or require multiple rows in a single request should have their input arguments organized as arrays. Because you deploy this job as a Web service that accepts multiple rows in a single request, you have to choose Array here.
Figure 17. Create new operation
- The RTI job also returns multiple rows in a single response, so select Array in the Options list again.
- Click Next.
- You should now be in the New Operation Wizard - Messages Summary window, as
shown in Figure 18. In the Service
Request and Response Messages field, review the input and output arguments for
the operation, then click Finish.
Figure 18. Review input and output arguments
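The Queue Size and Wait delay settings from the New Operation Wizard describe a form of admission control: requests wait in a bounded queue, and a request that can't be accommodated is rejected. The sketch below shows one plausible reading of that behavior using a standard Java bounded queue, with the article's defaults of three requests and 100 milliseconds. How the RTI server actually combines the two settings isn't described in this article, so treat the interaction shown here as an assumption. The class `OperationQueueSketch` is a hypothetical name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Illustrates a bounded request queue with a wait limit; not RTI server code. */
final class OperationQueueSketch {
    private final BlockingQueue<String> pending;
    private final long waitDelayMillis;

    OperationQueueSketch(int queueSize, long waitDelayMillis) {
        this.pending = new ArrayBlockingQueue<>(queueSize);
        this.waitDelayMillis = waitDelayMillis;
    }

    /**
     * Tries to queue a request. If the queue is full, waits up to the wait delay
     * for space. Returns true if queued, false if the request is rejected.
     */
    boolean submit(String request) {
        try {
            return pending.offer(request, waitDelayMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Removes the oldest queued request, or returns null if none is waiting. */
    String next() {
        return pending.poll();
    }

    public static void main(String[] args) {
        // Defaults named in the wizard: queue size 3, wait delay 100 ms.
        OperationQueueSketch operation = new OperationQueueSketch(3, 100);
        for (int i = 1; i <= 4; i++) {
            boolean accepted = operation.submit("request-" + i);
            System.out.println("request-" + i + (accepted ? " queued" : " rejected"));
        }
    }
}
```

Because nothing drains the queue in this small demo, the fourth request waits for the full wait delay and is then rejected. In practice, larger values trade memory and latency for fewer rejected requests.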
Now you set the runtime parameters.
- In the Minimum field, enter the minimum number of concurrent job instances that can run at any given time.
- In the Maximum field, enter the maximum number of concurrent job and map instances that can run at any given time. The default is five, and the maximum is 500.
- Keep the default settings in the other fields, and click Next.
- In the next window, replace the default user credentials if needed, then click Finish.
- When the Operation Created window pops up, click OK.
- Right-click the binding icon in the right pane, and select Activate, as
shown in Figure 19.
Figure 19. Activate the binding
- Double-click the operation you created. In the right pane, you'll find the job you just attached to the operation.
- In the Global Tasks list, click Browse the RTI Registry to open the RTI
Registry Web page. You'll find the RTI service you just registered, sampleRTI,
in the list (see Figure 20).
Figure 20. RTI service list
- Click sampleRTI to display the registry information for it.
- To display the WSDL for the service in a Web browser, click the WSDL link. You can invoke the RTI job you developed through this WSDL file.
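Before you generate a client, it can be useful to check which operations the WSDL actually declares. The following sketch parses a WSDL 1.1 document with the standard Java XML parser and lists the operation names found under its portType elements. The inline WSDL fragment is a hypothetical, heavily shortened stand-in for the document that the RTI Registry serves. The real WSDL for sampleRTI contains more elements, and its names may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Lists the operations declared in the portType elements of a WSDL 1.1 document. */
final class WsdlOperationLister {
    private static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    /** Hypothetical, shortened WSDL used only for this illustration. */
    static final String SAMPLE_WSDL =
            "<definitions name='sampleRTI' xmlns='http://schemas.xmlsoap.org/wsdl/'>"
            + "<portType name='samplePortType'><operation name='sampleRTI'/></portType>"
            + "<binding name='sampleBinding' type='samplePortType'><operation name='sampleRTI'/></binding>"
            + "</definitions>";

    static List<String> portTypeOperations(String wsdlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace-based lookups below
            Document document = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(wsdlXml.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            NodeList portTypes = document.getElementsByTagNameNS(WSDL_NS, "portType");
            for (int i = 0; i < portTypes.getLength(); i++) {
                // Look only inside portType so that binding operations are not counted twice.
                NodeList operations = ((Element) portTypes.item(i)).getElementsByTagNameNS(WSDL_NS, "operation");
                for (int j = 0; j < operations.getLength(); j++) {
                    names.add(((Element) operations.item(j)).getAttribute("name"));
                }
            }
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a parseable WSDL document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Operations: " + portTypeOperations(SAMPLE_WSDL));
    }
}
```

To inspect the real service, you could save the WSDL shown in your browser to a file, read it into a string, and pass it to `portTypeOperations`.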
Develop a Java client to call the Web service
This section explains how to call the Web service that you just deployed from a Java client, which you develop here. Before you start, prepare the following environment:
- Eclipse IDE Version 3.0 or later — You use Eclipse to develop the Java project; prepare an Eclipse IDE so you can easily follow the steps in this article.
- JDK 1.4 or 1.5 — This is essential to develop a Java project.
- Apache Axis — You use Axis to generate a local Java stub from a Web service, which makes it easy to invoke a Web Service.
Now you can begin developing your Java client.
- Create a Java project, and name it TestRTIJob.
- Right-click the project, and select Properties. A window like the one shown in Figure 21 opens.
- Click the Libraries tab, and add the Axis .jar files shown in Figure 21
to this project.
Figure 21. Add .jar files
- Select Run > Run from the Eclipse IDE to open a window like the
one shown in Figure 22.
Figure 22. Generate Java stub
- Create a new Java application instance in the left part of this window, then click the Search button on the right.
- From the Choose Main Type window that opens, select the WSDL2Java class. This class is provided by Axis to generate a local Java stub from a WSDL file.
- Click OK.
- Copy the URL of the WSDL file of the Web service you just published to the
Program arguments field, as shown in Figure 23.
Figure 23. Copy URL of the WSDL file
- Click Run. When the program finishes, you'll notice that some stub classes are generated in your project, as Figure 24 shows. You use these classes to invoke the Web service.
Figure 24. Generated Java stub classes
- Now you need to create a class named TestRTIJob to invoke the Web service via the stub classes that were just generated. The source code of this class is provided in Listing 1.
Listing 1. Call Web service
```java
package com.ascential.rti.sample;

import java.rmi.RemoteException;
import javax.xml.rpc.ServiceException;

public class TestRTIJob {
    public static void main(String[] args) {
        // SampleLocator and SampleDOCLIT are stub classes generated by WSDL2Java.
        SampleLocator locator = new SampleLocator();
        SampleDOCLIT service = null;
        try {
            // Get the SOAP port from the locator, then invoke the RTI job operation.
            service = locator.getsampleSoap();
            String name = service.RTIJob("001");
            System.out.println("The name is: " + name);
        } catch (ServiceException e) {
            // Thrown when the port cannot be obtained from the locator.
            e.printStackTrace();
        } catch (RemoteException e) {
            // Thrown when the remote call itself fails.
            e.printStackTrace();
        }
    }
}
```
- Run this class. The Java console prints the user's location information.
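Because the operation was deployed with the Array option for both input and output, a client typically passes an array of employee IDs and receives an array of rows back. The exact stub types depend on what WSDL2Java generates from your WSDL, so the following sketch uses stand-ins: the interface `LocationPort`, its `lookup` method, and the `LocationInfo` class are hypothetical and only mimic the shape of Table 3. A fake in-memory port takes the place of the real service so that the sketch runs without a server.

```java
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

/** Shows an array-in, array-out call pattern against a hypothetical stub interface. */
final class ArrayClientSketch {

    /** Hypothetical row type mirroring Table 3; a generated stub defines its own bean. */
    static final class LocationInfo {
        final String employeeId;
        final int locationId;
        final String recordTime;

        LocationInfo(String employeeId, int locationId, String recordTime) {
            this.employeeId = employeeId;
            this.locationId = locationId;
            this.recordTime = recordTime;
        }
    }

    /** Hypothetical stand-in for the generated port; real names come from the WSDL. */
    interface LocationPort {
        LocationInfo[] lookup(String[] employeeIds) throws RemoteException;
    }

    /** Sends all IDs in one request and formats each returned row as a line of text. */
    static List<String> describeLocations(LocationPort port, String[] employeeIds) {
        try {
            List<String> lines = new ArrayList<>();
            for (LocationInfo row : port.lookup(employeeIds)) {
                lines.add(row.employeeId + " -> location " + row.locationId + " at " + row.recordTime);
            }
            return lines;
        } catch (RemoteException e) {
            throw new IllegalStateException("The service call failed", e);
        }
    }

    public static void main(String[] args) {
        // Fake port for illustration only: returns one made-up row per requested ID.
        LocationPort fakePort = ids -> {
            LocationInfo[] rows = new LocationInfo[ids.length];
            for (int i = 0; i < ids.length; i++) {
                rows[i] = new LocationInfo(ids[i], 100 + i, "2007-01-01 08:00:00");
            }
            return rows;
        };
        for (String line : describeLocations(fakePort, new String[] {"001", "002"})) {
            System.out.println(line);
        }
    }
}
```

With real generated stubs, you would replace `fakePort` with the port returned by the locator class and adjust the method and bean names to match the generated code.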
You're done! You've finished the whole process from developing an RTI job to publishing it as a Web service to invoking the service from a Java client.
IBM WebSphere DataStage provides a convenient approach to deploying DataStage jobs as Web services. In this article, you learned about RTI and its three job topologies, then you developed a sample RTI job and published it as a Web service. You wrapped up by invoking the Web service with a Java client. Hopefully you've become more familiar with DataStage and how it fits into a service-oriented architecture (SOA).






