Crawlers

You configure crawlers for the different types of data that you want to include in a collection. A single collection can contain any number of crawlers.

A crawler has two primary functions: discovery and crawling. When you configure a crawler, discovery processes determine which sources are available in the data source. After you start the crawler, it copies data from those sources to a converter pipeline.
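The two functions can be pictured with a small, generic sketch. It is not Watson Explorer code and does not use any Watson Explorer API. The names `discover`, `crawl`, and `run_demo` are hypothetical, a local directory stands in for the data source, and a plain callback stands in for the converter pipeline.

```python
import os
import tempfile
from typing import Callable, List


def discover(root: str) -> List[str]:
    """Discovery phase (illustrative): list the sources available under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


def crawl(sources: List[str], pipeline: Callable[[str, bytes], None]) -> int:
    """Crawl phase (illustrative): copy each source's data to the pipeline."""
    count = 0
    for path in sources:
        with open(path, "rb") as handle:
            pipeline(path, handle.read())
        count += 1
    return count


def run_demo() -> List[str]:
    """Build a throwaway data source, discover it, then crawl it."""
    received: List[str] = []

    def pipeline(path: str, data: bytes) -> None:
        # Hypothetical stand-in for a converter pipeline: record what arrives.
        received.append(f"{os.path.basename(path)}:{data.decode('utf-8')}")

    with tempfile.TemporaryDirectory() as root:
        for name, text in (("a.txt", "alpha"), ("b.txt", "beta")):
            with open(os.path.join(root, name), "w", encoding="utf-8") as handle:
                handle.write(text)
        sources = discover(root)   # happens at configuration time
        crawl(sources, pipeline)   # happens after the crawler is started
    return received


if __name__ == "__main__":
    print(run_demo())
```

Keeping discovery separate from crawling mirrors the description above: the list of available sources is known before any data is copied.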

The following crawlers are available in IBM Watson® Explorer.

Agent for Windows file systems crawler
The Agent for Windows file systems crawler crawls remote Microsoft Windows file systems.
BoardReader crawler
The BoardReader crawler crawls social media data that has been collected by the BoardReader web service. BoardReader is an application that aggregates data from multiple social media sources across the Internet.
Box crawler
The Box crawler crawls files stored on Box. Box (www.box.com) is a cloud-based data storage repository.
Case Manager crawler (applies to version 12.0.1 and subsequent versions unless specifically overridden)
The Case Manager crawler crawls an IBM® Case Manager server.
Exchange crawler (applies to version 12.0.1 and subsequent versions unless specifically overridden)
The Exchange crawler crawls public folders and user mailboxes that are managed by Microsoft Exchange Server.
File system crawler
The File system crawler crawls directories on the server where Watson™ Explorer is installed.
FileNet® P8 crawler (applies to version 12.0.1 and subsequent versions unless specifically overridden)
The FileNet P8 crawler crawls IBM FileNet P8 databases.
IBM Connections crawler
The IBM Connections crawler crawls documents on an IBM Connections server.
JDBC database crawler
The JDBC database crawler crawls JDBC databases.
Notes® crawler
The Notes crawler crawls IBM Notes databases.
Salesforce crawler
The Salesforce crawler crawls Salesforce databases.
SharePoint crawler
The SharePoint crawler crawls Microsoft SharePoint servers.
Web crawler
The Web crawler crawls web sites.
Web Content Manager crawler
The Web Content Manager crawler crawls documents on an IBM Web Content Manager server.
WebSphere® Portal crawler
The WebSphere Portal crawler crawls documents on an IBM WebSphere Portal server.