About this task
The following steps are the minimum required to use the new IO SharePoint connector:
Procedure
-
Create a new Watson™ Explorer Engine search collection.
-
In the Configuration tab of your Watson Explorer Engine search
collection, click Add a new seed.
-
A pop-up window opens. Select IO SharePoint from the scrollable list of
available connectors, and click Add.
-
In the Web Application or Site Collections URLs box, enter the URL(s) of
your SharePoint site(s).
Note: Use the same Web Application or Site Collection URL as it appears in
Central Administration. If you use a non-root Site Collection URL, referred to as a
Managed Path URL, see the section Filtering URLs At Crawl Time for more information.
-
In the Username field, enter the SharePoint user name of the account to use for
crawling.
-
In the Password field, enter the crawling account's SharePoint password. The
actual password that you enter is not displayed. Enter that password again in the box
below to confirm that you entered it correctly.
-
In the Authentication Type drop-down menu, select the authentication method for your
SharePoint deployment: BASIC, NTLM2, KERBEROS, or
CLAIMS_BASED_AUTHENTICATION. The default authentication type is NTLM2. If the
target is SharePoint Online, select CLAIMS_BASED_AUTHENTICATION.
Note: If you are using Windows Claims Based Authentication (CBA), select the
appropriate authentication type and set the ACLs contain Claims option in the
Crawling and ACLs section to on.
-
Check Use SharePoint Online Authentication if the target server is
SharePoint Online.
-
Set the Seed URL Type.
This setting identifies what type of SharePoint object the provided seed URLs point to: site
collections or web applications (also known as virtual servers). If the seed URL type is set to Site
Collections, then only the children of the site collection referenced by the URL are crawled. If the
Seed URL type is set to Web Applications, then all of the site collections (and their children)
belonging to the web applications referenced by each URL are crawled.
Note: SharePoint Online does not have Web Applications. Therefore, to crawl multiple site
collections on SharePoint Online, you must specify the URL of every site collection as a seed
URL. Site collections do not have a parent-child relationship: even if you specify
https://<server1>.sharepoint.com as the seed URL, the
connector will not crawl another site collection such as
https://<server1>.sharepoint.com/sites/<anothersitecollection>,
although the URLs might suggest a relationship between them.
-
Click OK/Apply to save the updated configuration information for the IO
SharePoint connector.
After you save your changes, you can use the commands on the
Overview tab of the Watson Explorer Engine Administration tool to
initiate a crawl of the SharePoint sites on the specified server(s) using the IO SharePoint
connector. See IO SharePoint Advanced Configuration and Performance Tuning for more information.
Warning: Be sure that you have selected the appropriate authentication method. The IO
SharePoint connector will not function correctly if the wrong Authentication Type is
selected. NTLM2 is selected by default.
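The Seed URL Type note in the procedure can be illustrated with a short sketch. Because site collections have no parent-child relationship, a URL is crawled only if the site collection that contains it is itself listed as a seed. The function names, the `server1` host, and the list of known site collection roots are hypothetical and exist only for illustration; the connector determines site collection membership from SharePoint itself, not from URL prefixes.

```python
from typing import List, Optional
from urllib.parse import urlsplit


def _normalize(url: str) -> str:
    """Lower-case the scheme and host and drop any trailing slash."""
    parts = urlsplit(url)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}"


def owning_site_collection(url: str, known_roots: List[str]) -> Optional[str]:
    """Return the longest known site collection root that contains url.

    known_roots is an assumed, externally supplied list of site collection
    URLs (for example, from a SharePoint administrator).
    """
    target = _normalize(url)
    best = None
    for root in known_roots:
        candidate = _normalize(root)
        if target == candidate or target.startswith(candidate + "/"):
            if best is None or len(candidate) > len(best):
                best = candidate
    return best


def is_crawled(url: str, seeds: List[str], known_roots: List[str]) -> bool:
    """With Seed URL Type set to Site Collections, a URL is covered only
    when its owning site collection is one of the seed URLs."""
    owner = owning_site_collection(url, known_roots)
    return owner is not None and owner in {_normalize(s) for s in seeds}


roots = [
    "https://server1.sharepoint.com",
    "https://server1.sharepoint.com/sites/another",
]
seeds = ["https://server1.sharepoint.com"]

# A page in the root site collection is covered by the single seed.
print(is_crawled("https://server1.sharepoint.com/docs/a.docx", seeds, roots))
# A page in the other site collection is not, despite the shared URL prefix.
print(is_crawled("https://server1.sharepoint.com/sites/another/b.docx", seeds, roots))
```

Adding the second site collection URL to `seeds` is what brings its pages into scope, which mirrors the instruction to list every site collection as a seed URL.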
Example
Crawling SharePoint Server - NTLM Authentication
Table 1. Seed Component Section
Field | Value |
Web Application or Site Collection URLs | https://localsharepoint.example.com |
Username | Administrator |
Password | <password> |
Seed URL Type | default (Site Collections) |
Authentication Type | default (NTLM2) |
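As a rough aid for reviewing seed settings before saving them, the fields in the table and the rules stated in the procedure can be modeled as a small record with a checker. This is a minimal sketch: `SeedSettings` and `check_seed` are hypothetical names, not part of the connector or of Watson Explorer Engine, and the checks cover only the rules that the procedure states explicitly.

```python
from dataclasses import dataclass
from typing import List

# Authentication types listed in the procedure; NTLM2 is the default.
AUTH_TYPES = ("BASIC", "NTLM2", "KERBEROS", "CLAIMS_BASED_AUTHENTICATION")
SEED_URL_TYPES = ("Site Collections", "Web Applications")


@dataclass
class SeedSettings:
    """Hypothetical record mirroring the seed form fields (not a connector API)."""
    urls: List[str]
    username: str
    password: str
    seed_url_type: str = "Site Collections"
    authentication_type: str = "NTLM2"
    use_sharepoint_online_authentication: bool = False


def check_seed(settings: SeedSettings) -> List[str]:
    """Return the problems found; an empty list means no rule was violated."""
    problems = []
    if not settings.urls:
        problems.append("at least one URL is required")
    if not settings.username or not settings.password:
        problems.append("username and password are required")
    if settings.authentication_type not in AUTH_TYPES:
        problems.append(f"unknown authentication type: {settings.authentication_type}")
    if settings.seed_url_type not in SEED_URL_TYPES:
        problems.append(f"unknown seed URL type: {settings.seed_url_type}")
    if settings.use_sharepoint_online_authentication:
        # The procedure says to select CLAIMS_BASED_AUTHENTICATION for SharePoint Online.
        if settings.authentication_type != "CLAIMS_BASED_AUTHENTICATION":
            problems.append("SharePoint Online requires CLAIMS_BASED_AUTHENTICATION")
        # SharePoint Online does not have Web Applications.
        if settings.seed_url_type == "Web Applications":
            problems.append("SharePoint Online has no Web Applications")
    return problems


on_premises = SeedSettings(
    urls=["https://localsharepoint.example.com"],
    username="Administrator",
    password="example-password",  # placeholder value
)
print(check_seed(on_premises))
```

Returning a list of problems rather than raising on the first one lets all issues be reported in a single pass.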
Crawling SharePoint Server - ADFS Authentication
Table 2. Seed Component Section
Field | Value |
Web Application or Site Collection URLs | https://yoursharepoint.example.com |
Username | Administrator@example.com |
Password | <password> |
Seed URL Type | default (Site Collections) |
Authentication Type | CLAIMS_BASED_AUTHENTICATION |
Table 3. Claims Based Authentication Section
Field | Value |
Security Token Service Endpoint | https://youradfs.example.com/adfs/services/trust/2005/UsernameMixed |
Relaying Party Trust Identifier (AppliesTo, Realm) | urn:sharepoint:yourrelayingpartytrustid |
Crawling SharePoint Online - Native Authentication
If your crawler username contains a domain name ending with "onmicrosoft.com", native
authentication is likely used. Please check with your SharePoint administrator for details on
your SharePoint configuration.
Table 4. Seed Component Section
Field | Value |
Web Application or Site Collection URLs | https://AAAA.sharepoint.com |
Username | azureuser@AAAA.onmicrosoft.com |
Password | <password> |
Seed URL Type | default (Site Collections) |
Authentication Type | CLAIMS_BASED_AUTHENTICATION |
Use SharePoint Online Authentication | Checked |
Table 5. Claims Based Authentication Section
Field | Value |
Security Token Service Endpoint | https://login.microsoftonline.com/extSTS.srf |
Crawling SharePoint Online - ADFS Authentication (v12.0.2.2 or later)
If your crawler username contains your own domain name such as "example.com", ADFS
authentication will be used. Please check with your SharePoint administrator for details on your
SharePoint configuration.
Table 6. Seed Component Section
Field | Value |
Web Application or Site Collection URLs | https://BBBB.sharepoint.com |
Username | username@example.com |
Password | <password> |
Seed URL Type | default (Site Collections) |
Authentication Type | CLAIMS_BASED_AUTHENTICATION |
Use SharePoint Online Authentication | Checked |
Table 7. Claims Based Authentication Section
Field | Value |
Security Token Service Endpoint | https://adfs.example.com/adfs/services/trust/2005/usernamemixed |
Relaying Party Trust Identifier (AppliesTo, Realm) | urn:federation:MicrosoftOnline |
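The rule of thumb for telling the two SharePoint Online examples apart (the domain of the crawler username) can be written as a small helper. This is only a sketch of that heuristic: `guess_online_auth` is a hypothetical function, and its result is a first guess that should still be confirmed with the SharePoint administrator.

```python
# Security Token Service endpoint shown in the native authentication example.
NATIVE_STS_ENDPOINT = "https://login.microsoftonline.com/extSTS.srf"


def guess_online_auth(username: str) -> str:
    """Guess 'native' or 'adfs' from the domain part of a crawler username.

    Domains ending in onmicrosoft.com suggest native authentication;
    any other domain suggests ADFS. This is a heuristic, not a guarantee.
    """
    _, separator, domain = username.rpartition("@")
    if not separator or not domain:
        raise ValueError("expected a username of the form user@domain")
    domain = domain.lower()
    if domain == "onmicrosoft.com" or domain.endswith(".onmicrosoft.com"):
        return "native"
    return "adfs"


for name in ("azureuser@AAAA.onmicrosoft.com", "username@example.com"):
    print(name, "->", guess_online_auth(name))
```

For a `native` result, the Claims Based Authentication section needs only the Security Token Service Endpoint, as in the native authentication example. For an `adfs` result, it needs both the organization's ADFS endpoint and the trust identifier, as in the ADFS example.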