Salesforce crawler - configuration properties

The Salesforce crawler crawls Salesforce databases.

The Create crawler: Salesforce screen is where you enter the configuration parameters for this crawler. For more detailed information on using the Salesforce crawler, see Salesforce Connector Considerations.

Prerequisites

To run the Salesforce crawler, you need to generate Java™ SOAP binding libraries. The following links explain how to generate the libraries. Create two libraries: one from the Partner WSDL and one from the Metadata WSDL.

The following JAR files must be mounted, available in the local filesystem, and accessible from Watson™ Explorer. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.

Crawler Properties

Crawler name
The name of the crawler. Alphanumeric characters, hyphens, underscores, and spaces are allowed.
Crawler description
A description of the crawler.
Advanced options
Maximum document size
The maximum size is 131,071 bytes.
When the crawler session is started
Specifies which content to crawl.
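
The naming rule for Crawler name can be checked before you submit the form. The following is a minimal sketch, not part of the product: the function name is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits, which the product may interpret more broadly.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
# Hyphens, underscores, and spaces are the other characters the text allows.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9 _-]+")


def is_valid_crawler_name(name):
    """Return True if the name uses only the characters listed for Crawler name."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_crawler_name("Salesforce crawler_1") accepts the name, while a name containing a slash is rejected.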

Data Source Properties

User name
The user name used to call the Salesforce API.
Password
The password of the specified user.
Security Token
The security token of the user, used to call the Salesforce API.
JAR location
The path to the prerequisite JAR libraries. The folder that contains the JAR libraries must be mounted so it is available. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.
All Versions
Enable this option to crawl all versions of the object type ContentVersion.
Sandbox
Enable this option to crawl the Salesforce Sandbox.
Batch size
The number of records to fetch in a single Salesforce query. However, the Salesforce API does not always respect this value. For more information, see Change the Batch Size in Queries.
Object Types
Specifies the object types to crawl. The default behavior is to crawl all object types. For custom object names, append __c to match the Salesforce API convention for custom object names. For example, for MyCustomObject, specify MyCustomObject__c. Do not specify comment objects such as FeedComment, CaseComment, or IdeaComment without also specifying FeedItem, Case, or Idea, respectively. If you specify a tag object, you must also specify its parent. For example, do not specify AccountTag without Account.
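
The Object Types rules above can be checked with a short script before you enter the list. This is only a sketch under stated assumptions: the helper names are hypothetical, the comment-to-parent mapping covers only the three objects named above, and the tag rule assumes that a standard tag object is named after its parent plus the suffix Tag (as in AccountTag and Account).

```python
# Comment objects and the parents the text says must accompany them.
# Assumption: only these three are covered; other comment objects may exist.
COMMENT_PARENTS = {
    "FeedComment": "FeedItem",
    "CaseComment": "Case",
    "IdeaComment": "Idea",
}


def custom_object_api_name(name):
    """Append the __c suffix to a custom object name unless it is already present."""
    return name if name.endswith("__c") else name + "__c"


def missing_parents(object_types):
    """Return (object, parent) pairs for objects listed without their parent."""
    selected = set(object_types)
    missing = []
    for obj in object_types:
        if obj in COMMENT_PARENTS:
            parent = COMMENT_PARENTS[obj]
        elif obj.endswith("Tag") and len(obj) > len("Tag"):
            # Assumption: a tag object is named <Parent>Tag.
            parent = obj[: -len("Tag")]
        else:
            continue
        if parent not in selected:
            missing.append((obj, parent))
    return missing
```

For example, missing_parents(["FeedComment", "AccountTag", "Account"]) reports that FeedComment was listed without FeedItem, while AccountTag passes because Account is present.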

Crawl Space Properties

You can find and add multiple crawl spaces for a Salesforce crawler. For instructions, see Finding and adding crawl spaces in a Salesforce crawler.

Crawler plug-in

Data source crawler plug-ins are Java applications that can change the content or metadata of crawled documents. You can configure a data source crawler plug-in for all non-web crawler types. For more information, see Crawler plug-ins.

Enable the crawler plug-in
Enable this option when you use the crawler plug-in.
Plug-in class name
The class name for the crawler plug-in.
Plug-in class path
The JAR file location of the crawler plug-in. The folder that contains the JAR file must be mounted so it is available. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.
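
Both JAR location and Plug-in class path expect a mounted folder that contains JAR files. A quick way to confirm that a folder is visible and holds the expected files is sketched below. The function name is hypothetical, and the check runs wherever you execute it, so it only reflects what Watson Explorer sees if it is run in an environment that has the same mount.

```python
from pathlib import Path


def list_jars(folder):
    """Return the sorted names of the .jar files directly inside the folder.

    Raises FileNotFoundError if the folder does not exist, which usually
    means the path is wrong or the folder is not mounted.
    """
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(f"JAR folder not found or not mounted: {folder}")
    return sorted(
        p.name for p in path.iterdir()
        if p.is_file() and p.suffix.lower() == ".jar"
    )
```

An empty result means the folder is reachable but contains no JAR files, which is worth fixing before you start the crawler.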