Salesforce crawler - configuration properties
The Salesforce crawler crawls Salesforce databases.
The Create crawler: Salesforce screen is where you enter the configuration parameters for this crawler. For more detailed information on using the Salesforce crawler, see Salesforce Connector Considerations.
Prerequisites
To run the Salesforce crawler, you need to generate Java™ SOAP binding libraries. The following links explain how to generate the libraries. Create two libraries: one from the Partner WSDL and one from the Metadata WSDL.
The following JAR files must be mounted in the local filesystem and accessible from Watson™ Explorer. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.
- force-partner.jar (from Partner WSDL)
- force-metadata.jar (from Metadata WSDL)
- force-wsc.jar (from Force Wsc)
- commons-beanutils.jar (from Apache Commons BeanUtils)
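Before configuring the crawler, it can help to confirm that all four JAR files are present in the mounted folder. The following is a minimal sketch, not part of the product: the class name `JarLocationCheck` and the method `missingJars` are hypothetical, and the sketch assumes the files use exactly the names listed above and sit directly in one folder.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which prerequisite JAR files are missing from a folder. */
final class JarLocationCheck {

    // File names taken from the prerequisite list above.
    static final List<String> REQUIRED_JARS = List.of(
            "force-partner.jar",
            "force-metadata.jar",
            "force-wsc.jar",
            "commons-beanutils.jar");

    /** Returns the names of required JAR files that are not regular files in jarLocation. */
    static List<String> missingJars(Path jarLocation) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED_JARS) {
            if (!Files.isRegularFile(jarLocation.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the folder to check is passed as the first argument.
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        List<String> missing = missingJars(dir);
        System.out.println(missing.isEmpty()
                ? "All prerequisite JAR files found in " + dir
                : "Missing from " + dir + ": " + missing);
    }
}
```

Run the check against the same folder that you plan to enter as the JAR location in the data source properties.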
Crawler Properties
- Crawler name
- The name of the crawler. Alphanumeric characters, hyphens, underscores, and spaces are allowed.
- Crawler description
- A description of the crawler.
- Advanced options
- Maximum document size
- The maximum size is 131,071 bytes.
- When the crawler session is started
- Specifies which content to crawl.
Data Source Properties
- User name
- The user name used to call the Salesforce API.
- Password
- The password of the specified user.
- Security Token
- The security token of the user, used to call the Salesforce API.
- JAR location
- The path to the prerequisite JAR libraries. The folder that contains the JAR libraries must be mounted so it is available. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.
- All Versions
- Enable this option to crawl all versions of the object type ContentVersion.
- Sandbox
- Enable this option to crawl the Salesforce Sandbox.
- Batch size
- The number of records to fetch in a single Salesforce query. However, the Salesforce API does not always respect this value. For more information, see Change the Batch Size in Queries.
- Object Types
- Specifies the object types to crawl. The default behavior is to crawl all object types. For custom object names, append __c to match the Salesforce API convention for custom object names. For example, for MyCustomObject specify MyCustomObject__c. Do not specify a comment object such as FeedComment, CaseComment, or IdeaComment without its parent object: FeedItem, Case, or Idea, respectively. If you specify a tag object, you must also specify its parent. For example, do not specify AccountTag without Account.
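The naming and parent rules for Object Types can be checked before you save the configuration. The following is a minimal sketch, not part of the product: the class `ObjectTypeCheck` and its methods are hypothetical. It assumes that a tag object for a standard object is named after its parent plus the suffix Tag, as in AccountTag, and it does not handle tag objects for custom objects.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical pre-flight check for the Object Types field. */
final class ObjectTypeCheck {

    // Comment objects and the parent each one requires, taken from the rule above.
    private static final Map<String, String> COMMENT_PARENTS = new LinkedHashMap<>();
    static {
        COMMENT_PARENTS.put("FeedComment", "FeedItem");
        COMMENT_PARENTS.put("CaseComment", "Case");
        COMMENT_PARENTS.put("IdeaComment", "Idea");
    }

    /** Appends __c to a custom object name unless it already carries the suffix. */
    static String toCustomApiName(String name) {
        return name.endsWith("__c") ? name : name + "__c";
    }

    /** Returns one message per selected object type whose required parent is not selected. */
    static List<String> findMissingParents(List<String> objectTypes) {
        Set<String> selected = new HashSet<>(objectTypes);
        List<String> problems = new ArrayList<>();
        for (String type : objectTypes) {
            String parent = COMMENT_PARENTS.get(type);
            // Assumption: a tag object is named <Parent>Tag, as in AccountTag.
            if (parent == null && type.endsWith("Tag") && type.length() > 3) {
                parent = type.substring(0, type.length() - 3);
            }
            if (parent != null && !selected.contains(parent)) {
                problems.add(type + " requires " + parent);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> types = List.of("Account", "AccountTag", "CaseComment",
                toCustomApiName("MyCustomObject"));
        System.out.println(types);
        System.out.println(findMissingParents(types));
    }
}
```

In this example, AccountTag passes because Account is also listed, while CaseComment is reported because Case is missing from the list.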
Crawl Space Properties
You can find and add multiple crawl spaces for a Salesforce crawler. For instructions, see Finding and adding crawl spaces in a Salesforce crawler.
Crawler plug-in
Data source crawler plug-ins are Java applications that can change the content or metadata of crawled documents. You can configure a data source crawler plug-in for all non-web crawler types. For more information, see Crawler plug-ins.
- Enable the crawler plug-in
- Enable this option when you use the crawler plug-in.
- Plug-in class name
- The class name for the crawler plug-in.
- Plug-in class path
- The JAR file location of the crawler plug-in. The folder that contains the JAR file must be mounted so it is available. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.