Crawlers

You configure crawlers for the different types of data that you want to include in a collection. A single collection can contain any number of crawlers.

A crawler has two primary functions: discovery and crawling. When you configure a crawler, its discovery processes determine which sources are available in a data source. After you start a crawler, the crawler copies data from those sources to a converter pipeline.
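The two functions can be pictured with a minimal sketch. The following Python code is illustrative only and is not part of any Watson Explorer API: the `ToyCrawler` class, its `discover` and `crawl` methods, and the in-memory dictionary that stands in for a data source are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyCrawler:
    """Hypothetical stand-in for a crawler; not a Watson Explorer class."""
    data_source: Dict[str, List[str]]  # source name -> documents it holds
    selected: List[str] = field(default_factory=list)

    def discover(self) -> List[str]:
        # Function 1 (configuration time): list the sources that the
        # data source exposes, so that some of them can be selected.
        return sorted(self.data_source)

    def crawl(self, pipeline: Callable[[str], None]) -> int:
        # Function 2 (after start): copy each document from the selected
        # sources to the converter pipeline and count what was copied.
        copied = 0
        for name in self.selected:
            for document in self.data_source[name]:
                pipeline(document)
                copied += 1
        return copied


def run_example() -> List[str]:
    crawler = ToyCrawler({"reports": ["q1.pdf", "q2.pdf"], "mail": ["inbox.msg"]})
    available = crawler.discover()       # discovery lists both sources
    crawler.selected = [available[1]]    # configure: select "reports"
    received: List[str] = []             # this list stands in for the pipeline
    crawler.crawl(received.append)
    return received
```

In this sketch, discovery only lists what is available, and nothing is copied until `crawl` is called, which mirrors the split between configuring a crawler and starting it.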

The following crawlers are available in IBM Watson® Explorer.

Agent for Windows file systems crawler: The Agent for Windows file systems crawler crawls remote Microsoft Windows file systems.
BoardReader crawler: The BoardReader crawler crawls social media data that has been collected by the BoardReader web service. BoardReader is an application that aggregates data from multiple social media sources across the Internet.
Box crawler: The Box crawler crawls files stored on Box. Box (www.box.com) is a cloud-based data storage repository.
Case Manager crawler: The Case Manager crawler crawls an IBM® Case Manager server.
Exchange crawler: The Exchange crawler crawls public folders and user mailboxes that are managed by Microsoft Exchange Server.
File system crawler: The File system crawler crawls directories on the server where Watson Explorer is installed.
FileNet® P8 crawler: The FileNet P8 crawler crawls IBM FileNet P8 databases.
IBM Connections crawler: The IBM Connections crawler crawls documents on an IBM Connections server.
JDBC database crawler: The JDBC database crawler crawls JDBC databases.
Notes® crawler: The Notes crawler crawls IBM Notes databases.
Salesforce crawler: The Salesforce crawler crawls Salesforce databases.
SharePoint crawler: The SharePoint crawler crawls Microsoft SharePoint servers.
Web crawler: The Web crawler crawls web sites.
Web Content Manager crawler: The Web Content Manager crawler crawls documents on an IBM Web Content Manager server.
WebSphere® Portal crawler: The WebSphere Portal crawler crawls documents on an IBM WebSphere Portal server.