Exchange Server crawlers
To collect content from public folders and user mailboxes that are managed by Microsoft Exchange Server, configure an Exchange Server crawler.
Crawler connection credentials
When you create the crawler, you can specify credentials that allow the crawler to connect to the sources to be crawled. You can also configure connection credentials when you specify general security settings for the system. If you use the latter approach, multiple crawlers and other system components can use the same credentials. For example, the search servers can use the credentials when determining whether a user is authorized to access content.
Crawler configuration
Before you configure an Exchange Server crawler, you must configure Exchange Web Services (EWS) on the Exchange Server server to allow the crawler to access content.
Configuring the crawler involves the following tasks:
- Specify properties that control how the crawler operates and uses system resources. The crawler properties control how the crawler collects content from all servers in the crawl space.
- Specify information about the Exchange Server server that you want to crawl. You must specify a user ID and password so that the crawler can access content on the server. The user ID can be in user principal name (UPN) format or domain format, such as Domain\ExampleAccountName.
- Select the public folders or personal folders to crawl. The crawler cannot crawl both types of folders in the same crawler session. To include public folders and personal folders in a collection, create separate crawlers.
- Specify options for making documents searchable. For example, you can exclude certain types of documents from the crawl space.
- Set up a schedule for crawling the Exchange Server server.
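Two of the constraints in the list above, the accepted user ID formats and the rule that one crawler handles only public folders or only personal folders, can be illustrated with a minimal sketch. The `CrawlerSettings` class, the `user_id_format` function, and both regular expressions are hypothetical and are not part of the product; the patterns only approximate the UPN and domain formats.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns that approximate the two formats named in the text.
UPN_PATTERN = re.compile(r"^[^@\\\s]+@[^@\\\s]+$")      # user@example.com
DOMAIN_PATTERN = re.compile(r"^[^@\\\s]+\\[^@\\\s]+$")  # Domain\ExampleAccountName


def user_id_format(user_id: str) -> str:
    """Classify a user ID as 'upn' or 'domain', or raise ValueError."""
    if UPN_PATTERN.match(user_id):
        return "upn"
    if DOMAIN_PATTERN.match(user_id):
        return "domain"
    raise ValueError(f"unrecognized user ID format: {user_id!r}")


@dataclass
class CrawlerSettings:
    """Illustrative settings for one crawler (names are assumptions)."""
    server: str
    user_id: str
    folder_type: str  # "public" or "personal"; one crawler cannot crawl both

    def __post_init__(self) -> None:
        user_id_format(self.user_id)  # raises if the format is not recognized
        if self.folder_type not in ("public", "personal"):
            raise ValueError("folder_type must be 'public' or 'personal'")


# A collection that needs both folder types uses two separate crawlers.
public_crawler = CrawlerSettings("exchange01", "crawler@example.com", "public")
mailbox_crawler = CrawlerSettings("exchange01", "Domain\\ExampleAccountName", "personal")
```

Modeling the folder type as a single value, rather than a pair of flags, makes the "separate crawlers" rule impossible to violate in this sketch.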
Compound documents
If a document contains multiple parts, and you want all parts of the document to be treated as a single document in the search results, you can configure the crawler to support compound documents. In this case, a parent document that contains child documents can be searched as a single document. If the search terms are found, all of the child documents are listed with the parent document in the search results. If support for compound documents is not enabled in the crawler configuration, the parent and child documents are searched separately and returned as separate documents in the search results. For more information, read about support for crawling compound documents.
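The difference between the two modes can be sketched as follows. This is an illustration of the behavior described above, not the product's implementation; the `search_results` function and the document IDs are hypothetical.

```python
def search_results(documents, matching_ids, compound):
    """Return search results as lists of document IDs.

    documents:    dict mapping doc ID -> parent ID (None for a parent document)
    matching_ids: set of doc IDs whose content matched the search terms
    compound:     whether compound document support is enabled
    """
    if not compound:
        # Parent and child documents are searched and returned separately.
        return [[doc_id] for doc_id in documents if doc_id in matching_ids]

    results = []
    for doc_id, parent_id in documents.items():
        if parent_id is not None:
            continue  # children are only listed together with their parent
        children = [c for c, p in documents.items() if p == doc_id]
        if doc_id in matching_ids or any(c in matching_ids for c in children):
            # A match anywhere in the compound document returns the parent
            # with all of its child documents.
            results.append([doc_id, *children])
    return results


docs = {
    "item-1": None,
    "item-1/part-a": "item-1",
    "item-1/part-b": "item-1",
    "item-2": None,
}
hits = {"item-1/part-b"}

print(search_results(docs, hits, compound=True))
# [['item-1', 'item-1/part-a', 'item-1/part-b']]
print(search_results(docs, hits, compound=False))
# [['item-1/part-b']]
```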
Public folders
The Exchange Server crawler can crawl any number of folders and subfolders on Exchange Server public folder servers. When you create a crawler, you select the content that you want to collect from one public folder server. Later you can edit the crawl space to add content from another server.
User mailboxes
The Exchange Server crawler can crawl any number of personal folders and items in Exchange Server user mailboxes. When you create a crawler, you select the content that you want to collect from one Mailbox server. Later you can edit the crawl space to add content from another server. The crawler can collect content only from user mailboxes, not other types of mailboxes on the Mailbox server.
If you plan to collect content from a Mailbox server, you must deploy a provided web service, ESExchangeServices, on the Mailbox server so that the crawler can access folders, items, and user permissions.
Personal folders | Item types |
---|---|
Inbox, Drafts, Sent items, Outbox, Calendar, Contacts, Tasks, Notes, User-created folders | Message (Mail), Calendar, Task, Contact, Notes, PostItem, MeetingMessage, MeetingRequest, MeetingResponse, MeetingCancellation |
Mailbox filters
When you create the crawler, you can specify filters that limit the mailboxes to crawl to the following:
- Specific domain controllers that organize the content on the server
- Specific mailbox servers
- Specific database servers
- Specific organizational units (OUs), which include all mailboxes of users who belong to the OU
- Specific users
After you create the crawler, you cannot change the filters. To crawl different personal folders or a different mailbox server, for example, you must create a separate crawler.
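As a rough model of this behavior, the filters can be thought of as an immutable value that is fixed when the crawler is created. The `MailboxFilter` class and the mailbox dictionary keys below are hypothetical, only three of the filter types are shown, and the rule that all non-empty filters must match is an assumption; the text does not say how multiple filters combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: filters cannot change after creation
class MailboxFilter:
    """Illustrative mailbox filter; an empty set means 'no restriction'."""
    mailbox_servers: frozenset = frozenset()
    organizational_units: frozenset = frozenset()
    users: frozenset = frozenset()

    def matches(self, mailbox: dict) -> bool:
        checks = (
            (self.mailbox_servers, mailbox["server"]),
            (self.organizational_units, mailbox["ou"]),
            (self.users, mailbox["user"]),
        )
        # Assumption: every non-empty filter must accept the mailbox.
        return all(not allowed or value in allowed for allowed, value in checks)


sales_only = MailboxFilter(organizational_units=frozenset({"Sales"}))

# Crawling a different set of mailboxes means building a new filter
# (and, in the product, a separate crawler), not editing this one.
hr_only = MailboxFilter(organizational_units=frozenset({"HR"}))
```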
Mailbox security
To obtain security data for searching user mailboxes, you must deploy a provided web service, ESCommonServices, on the Exchange Server Mailbox server. This service enables Watson Explorer Content Analytics to obtain group lists and permissions necessary for pre-filtering access controls.
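The general idea of pre-filtering can be sketched as follows: a document is kept as a search candidate only if the user, or one of the user's groups, appears in the permissions recorded for that document. This is a conceptual illustration only; the `prefilter` function and the data layout are hypothetical and do not reflect the ESCommonServices interface.

```python
def prefilter(documents, user, groups):
    """Return IDs of documents that the user may see.

    documents: list of dicts with 'id' and 'allowed' (users or groups with access)
    user:      the searching user's ID
    groups:    the group list obtained for that user
    """
    principals = {user, *groups}
    # Keep a document if any of the user's principals is in its access list.
    return [doc["id"] for doc in documents if principals & set(doc["allowed"])]


indexed = [
    {"id": "msg-1", "allowed": ["alice"]},
    {"id": "msg-2", "allowed": ["Sales"]},
    {"id": "msg-3", "allowed": ["bob", "HR"]},
]

print(prefilter(indexed, "alice", ["Sales"]))
# ['msg-1', 'msg-2']
```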