Finding and adding crawl spaces in a file system

You can define multiple crawl spaces in a file system crawler.

Before you begin

Before you define crawl spaces, you must define mount points in the local filesystem that are accessible from Watson™ Explorer. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.

Procedure

  1. Click Find and Add under Crawl space Properties.
  2. In the Find and Add crawl spaces window, click Fetch children to populate the drop-down list with a list of possible crawl spaces.
  3. Click an item in the drop-down list, and then click Add and Close to select it.

    You are now returned to the page for your crawler.

  4. Click the three vertical dots icon, and then click Edit to enter the crawl space properties.

    The following properties can be set. All of the properties are optional.

    Crawl space name
    The crawl space name as shown in the crawler status. This value is used only as a label, so modifying it does not change the actual crawl space. To change crawl spaces, select different nodes in the Find and Add crawl spaces panel.
    Level of subdirectories to crawl
    The subdirectory level to crawl. If you want to crawl all documents, select All subdirectory levels. If you want to crawl the specified directory only, select Current subdirectory only.
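    Field to use as the document date
    The date that is stored in the native field __$Date$__. Select one of the following options.
        Document modification date
        Stores the modification dates of the crawled documents in the native field __$Date$__. This is the default.
        Document crawl date
        Stores the dates on which the documents were crawled in the native field __$Date$__.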
    Extension filter
    Select Included filter or Excluded filter. Depending on your selection, the File extensions to include or File extensions to exclude property is displayed, and the file extensions that you specify are included or excluded accordingly. If the content type filter and the extension filter conflict, so that one is an included filter and the other is an excluded filter, only the excluded filter takes effect.
    Automatic code page detection
    When this property is disabled, the encoding converter detects the code pages of crawled documents. To provide a code page hint, enable this property and select the code page from the list.
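
The conflict rule described under Extension filter can be easier to follow as code. The following Python sketch is only an illustration of that rule, not part of the product: the names Filter, passes, and is_crawled are hypothetical, and the behavior when both filters use the same mode (both must accept the document) is an assumption, because this topic describes only the conflicting case.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


# Hypothetical model for illustration only; this is not a oneWEX API.
@dataclass(frozen=True)
class Filter:
    mode: str          # "include" or "exclude"
    values: frozenset  # lowercase entries, e.g. {"pdf"} or {"text/plain"}


def passes(f: Filter, value: str) -> bool:
    """Apply one filter to one value (an extension or a content type)."""
    hit = value.lower() in f.values
    return hit if f.mode == "include" else not hit


def is_crawled(path: str, content_type: str,
               ext_filter: Filter, type_filter: Filter) -> bool:
    """Decide whether a document is crawled under both filters."""
    ext = PurePosixPath(path).suffix.lstrip(".")
    if ext_filter.mode != type_filter.mode:
        # Conflict: one filter includes and the other excludes,
        # so only the excluded filter takes effect.
        if ext_filter.mode == "exclude":
            return passes(ext_filter, ext)
        return passes(type_filter, content_type)
    # Assumption (not stated in this topic): with matching modes,
    # a document must pass both filters.
    return passes(ext_filter, ext) and passes(type_filter, content_type)


# Example: an excluded extension filter combined with an included
# content type filter. Only the extension filter is evaluated.
ext_filter = Filter("exclude", frozenset({"tmp"}))
type_filter = Filter("include", frozenset({"application/pdf"}))
```

With these example filters, is_crawled("/mnt/docs/a.txt", "text/plain", ext_filter, type_filter) returns True even though text/plain is not in the included content types, because the included filter is ignored when the two filters conflict. A file with the tmp extension is rejected regardless of its content type.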

What to do next

You can repeat this procedure to define any number of crawl spaces.