Finding and adding crawl spaces in a file system
You can define multiple crawl spaces in a file system crawler.
Before you begin
Before you define crawl spaces, you must define mount points in the local file system that are accessible from Watson™ Explorer. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX.
Procedure
- Click Find and Add under Crawl space Properties.
- In the Find and Add crawl spaces window, click Fetch children to populate the drop-down list with a list of possible crawl spaces.
- Click an item in the drop-down list, and then click Add and Close to select it.
You are now returned to the page for your crawler.
Click , and then click Edit to enter the crawl space properties.
The following properties can be set. All of the properties are optional.
- Crawl space name
- The crawl space name as shown in the crawler status. This value is used only as a label, so modifying it does not change the actual crawl space. To change crawl spaces, select different nodes in the Find and Add crawl spaces panel.
- Level of subdirectories to crawl
- The subdirectory level to crawl. If you want to crawl all documents, select All subdirectory levels. If you want to crawl the specified directory only, select Current subdirectory only.
- Field to use as the document date
- Document modification date
- By default, the modification dates of the crawled documents are stored as the native field
- Document crawl date
- Assigns the crawl dates to the native field
- Extension filter
- Select Included filter or Excluded filter. Based on this selection, the File extensions to include or File extensions to exclude property is displayed, and files with the selected extensions are included or excluded accordingly. If the content type filter and the extension filter conflict, so that one is an included filter and the other is an excluded filter, only the excluded filter takes effect.
- Automatic code page detection
- When this property is enabled, the encoding converter detects the code pages of crawled documents. If you want to provide a hint instead, disable this property and select the code page from the list.
You can repeat the above procedure to define any number of crawl spaces.
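To preview which files a given combination of settings would select, the subdirectory level and extension filter properties can be approximated with a short script. This is only an illustrative sketch and not a Watson Explorer API: the `CrawlSpace` class, its field names, and the `list_candidates` function are hypothetical, and the assumption that an empty include list selects nothing is ours rather than documented behavior. The content type filter is not modeled.

```python
import os
from dataclasses import dataclass, field


@dataclass
class CrawlSpace:
    # Hypothetical model of the properties described above;
    # the names do not correspond to any Watson Explorer API.
    root: str                      # mount point or directory to crawl
    name: str = ""                 # label only, has no effect on selection
    all_levels: bool = True        # False models "Current subdirectory only"
    filter_mode: str = "exclude"   # "include" or "exclude"
    extensions: set = field(default_factory=set)  # lowercase, without the dot


def extension_allowed(path, space):
    """Apply the extension filter to a single file path."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    listed = ext in space.extensions
    # Assumption: an include filter keeps only listed extensions,
    # an exclude filter keeps everything except listed extensions.
    return listed if space.filter_mode == "include" else not listed


def list_candidates(space):
    """Return the sorted file paths that the settings would select."""
    selected = []
    for dirpath, dirnames, filenames in os.walk(space.root):
        if not space.all_levels:
            dirnames.clear()  # stop os.walk from descending into subdirectories
        for fname in filenames:
            full = os.path.join(dirpath, fname)
            if extension_allowed(full, space):
                selected.append(full)
    return sorted(selected)
```

For example, `list_candidates(CrawlSpace(root="/mnt/docs", all_levels=False, filter_mode="include", extensions={"pdf"}))` would list only the PDF files directly under the hypothetical mount point `/mnt/docs`.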