IBM InfoSphere Information Server Pack for Salesforce.com

The InfoSphere® Information Server Pack for Salesforce.com lets you use InfoSphere DataStage® products to extract data from and load data into your Salesforce.com organization. Salesforce.com is a Web-based customer relationship management (CRM) platform that provides database services, applications, and application programming interfaces (APIs).

The Pack for Salesforce.com connects IBM® InfoSphere Information Server to Salesforce.com, and provides a graphical interface you can use to design and run jobs that:
  • Extract all or some of the data from your Salesforce.com organization
  • Extract only the data that was changed or deleted since the last time you extracted the data, or within the time frame you specify
  • Load new data or change data in your Salesforce.com organization
You configure the Pack's single stage to connect to Salesforce.com and to either load data into or extract data from your Salesforce.com organization.
  • When designing a job to extract data, you can browse and select the metadata in your Salesforce.com organization to automatically generate the data selection statement used at run time to extract a Salesforce.com object. You can edit the selection statement to add conditions and sorting.
  • When designing a job to load data, you can browse and select the Salesforce.com object you want to create or change. You can design a load job with a reject link that captures any data rejected by Salesforce.com when you run the load job.
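The generated data selection statement is a SOQL-style query. As a rough illustration of the kind of statement the Pack builds, and of how an added condition and sort order fit into it, here is a small sketch; the helper function, object, and field names are hypothetical and are not part of the Pack's interface:

```python
# Hypothetical sketch: assembling a SOQL-style selection statement like the one
# the Pack generates from selected metadata. The function, object, and field
# names are illustrative assumptions, not part of the Pack's API.

def build_selection_statement(sobject, fields, condition=None, order_by=None):
    """Assemble a simple SOQL SELECT statement for one Salesforce.com object."""
    stmt = "SELECT {} FROM {}".format(", ".join(fields), sobject)
    if condition:
        stmt += " WHERE " + condition    # user-added condition
    if order_by:
        stmt += " ORDER BY " + order_by  # user-added sort order
    return stmt

soql = build_selection_statement(
    "Account",
    ["Id", "Name", "AnnualRevenue"],
    condition="AnnualRevenue > 1000000",
    order_by="Name ASC",
)
print(soql)
# SELECT Id, Name, AnnualRevenue FROM Account WHERE AnnualRevenue > 1000000 ORDER BY Name ASC
```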
When designing either a load job or an extraction job, you can:
  • Save and load connection properties, allowing you to reuse them across multiple jobs
  • Save variables as job parameters, allowing you to design reusable jobs
  • Adjust the batch size to ensure that job processing completes within the limits set by Salesforce.com
  • Load or extract Unicode data

The Pack for Salesforce.com stage supports a single output link when you use it in an extraction job. The Pack does not support running a single extraction job on parallel computing nodes; to avoid duplicate data, you must configure extraction jobs to run sequentially.

When you use the Pack to design a data load job, the Pack supports an input link and a reject link. A load job can run sequentially or in parallel. You can set up the Pack to load or update data by using either a real-time load or a bulk load.

Real-time load
This type of load uses the web service interfaces and is designed for jobs that load a small or moderate number of data records. With this approach, the Pack bundles the data records into multiple batches (up to 200 rows per batch). The Pack sends one batch at a time to Salesforce.com by using a web service call and then waits for the status of the operation before sending the next batch.
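The batching behavior described above can be sketched as follows; the send function here is a stand-in for the Pack's web service call, and the 200-row constant is the per-batch maximum from the description (the code is illustrative, not the Pack's implementation):

```python
# Sketch of real-time load batching, assuming a hypothetical send_batch function
# that stands in for the Pack's synchronous web service call to Salesforce.com.

REALTIME_BATCH_MAX = 200  # maximum rows per web service batch

def send_batch(batch):
    """Stand-in for the web service call; returns a per-batch status."""
    return {"rows": len(batch), "status": "Success"}

def realtime_load(rows):
    """Send rows in batches of up to 200, waiting for each status in turn."""
    statuses = []
    for start in range(0, len(rows), REALTIME_BATCH_MAX):
        batch = rows[start:start + REALTIME_BATCH_MAX]
        statuses.append(send_batch(batch))  # next batch is sent only after this returns
    return statuses

statuses = realtime_load([{"Name": "Acct %d" % i} for i in range(450)])
print([s["rows"] for s in statuses])  # [200, 200, 50]
```

Because each batch waits for the previous one's status, a real-time load of many records spends most of its time on round trips, which is why the bulk load below exists.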
Bulk load
This type of load uses the asynchronous bulk API and is suited for jobs that load or update a large number of data records. With this approach, the Pack organizes the data into a comma-separated values (.csv) file format, and then uses the HTTPS POST method to send the comma-separated values files to Salesforce.com. Each HTTPS POST can send a maximum of 10,000 rows or 10 MB. When you have hundreds of thousands or millions of input records, you can use a bulk load to improve load performance by reducing the number of round trips between InfoSphere DataStage and Salesforce.com.
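A simplified view of how input rows might be split into CSV payloads under those limits follows; the 10,000-row and 10 MB constants come from the description above, but the splitting code itself is an illustrative sketch, not the Pack's implementation:

```python
# Sketch of bulk-load CSV chunking under the per-POST limits described above.
# The splitting logic is a simplified assumption, not the Pack's actual code.
import csv
import io

BULK_MAX_ROWS = 10_000             # per-POST row limit
BULK_MAX_BYTES = 10 * 1024 * 1024  # per-POST size limit (10 MB)

def split_into_csv_payloads(fieldnames, rows):
    """Split rows into CSV documents that respect both bulk-load limits."""
    chunks, current, size = [], [], 0
    for row in rows:
        line = ",".join(str(row[f]) for f in fieldnames) + "\n"
        # Start a new payload if adding this row would exceed either limit.
        if current and (len(current) >= BULK_MAX_ROWS
                        or size + len(line) > BULK_MAX_BYTES):
            chunks.append(current)
            current, size = [], 0
        current.append(row)
        size += len(line)
    if current:
        chunks.append(current)
    # Render each chunk as a CSV document with a header row.
    docs = []
    for chunk in chunks:
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(chunk)
        docs.append(buf.getvalue())
    return docs

docs = split_into_csv_payloads(["Id", "Name"],
                               [{"Id": i, "Name": "Acct"} for i in range(25_000)])
print(len(docs))  # 3 payloads: 10,000 + 10,000 + 5,000 rows
```

Fewer, larger payloads mean fewer round trips to Salesforce.com, which is the performance advantage the bulk load offers over the real-time load.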