Windows Remote File System Crawler

This document describes the Windows Remote File System Crawler for Watson Explorer Foundational Components. Watson Explorer provides the SMB Fileshares Connector to crawl files over the SMB protocol. However, the SMB Fileshares Connector supports only SMB v1, so it cannot crawl files that are shared via SMB v2 or SMB v3. The Windows Remote File System Crawler was introduced to crawl files shared via any version of SMB.

Before you begin

This crawler runs on Microsoft Windows only.

To retrieve file ACLs correctly, the Windows system on which the Watson Explorer Foundational Components engine runs must be joined to the Windows domain. This requirement is the same as for the existing SMB Fileshares Connector.

Note: The Windows Remote File System Crawler internally calls the Windows API to connect to the remote file system, and then crawls the files as if they were local files. This is very similar to performing “Map network drive” and then crawling the files on the mapped drive. As a result, the SMB version and other SMB settings depend on the Windows configuration. For example, you can specify which SMB version to use through the Windows settings.
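The "connect, then crawl like local files" idea in the note can be illustrated with a minimal Python sketch. This is not the crawler's actual implementation, which is not shown in this document; the function name crawl_files is hypothetical. It only shows that once Windows has established the connection to a share, an ordinary directory walk works on the remote path as it would on a local directory.

```python
import os
from typing import Iterator


def crawl_files(root: str) -> Iterator[str]:
    """Yield the full path of every file found under root.

    root can be a local directory. On Windows it can also be the path of
    a share that the operating system has already connected to, which is
    the assumption this sketch relies on.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

For example, `for path in crawl_files(r"\\fileserver\share"): print(path)` would list the files on a share, where fileserver and share are placeholder names. The sketch contains no SMB-specific code, which matches the point of the note: the SMB version is negotiated by Windows, not by the code that walks the files.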

Procedure

  1. In the Add a new seed dialog, choose Windows Remote Filesystem.
  2. Fill in the file path, username, and password. The account must have privileges to mount the file system. Specify the username together with the domain that the user belongs to, such as domain\user1 instead of just user1. Specify the file path in the form \\<File Server>\<Shared_Directory>.
  3. By default, filesystem metadata is not crawled. Set Crawl Filesystem Metadata to true to have the crawler retrieve the available filesystem metadata (creation date, last modified date, file attributes, etc.) for each file. The filesystem metadata must then be added to the virtual document by the Windows Filesystem Metadata converter.
  4. Start crawling.
    Note: If the username is changed after crawling, reboot the server. Otherwise, crawling continues to use the old username.
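The input rules in step 2 and the kind of metadata named in step 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not part of Watson Explorer: the names check_seed and filesystem_metadata are hypothetical, the patterns accept only the two forms shown in the procedure (the product may accept others), and the metadata is read with the standard os.stat call rather than by the crawler or its converter.

```python
import os
import re
from datetime import datetime, timezone

# Two leading backslashes, a server name, a share name, optional subdirectories.
UNC_PATH = re.compile(r"^\\\\[^\\/]+\\[^\\/]+(\\[^\\/]+)*\\?$")
# A domain (or machine) name, one backslash, then the user name.
QUALIFIED_USER = re.compile(r"^[^\\/]+\\[^\\/]+$")


def check_seed(file_path: str, username: str) -> list:
    """Return a list of problems found in the seed values (empty if none)."""
    problems = []
    if not UNC_PATH.match(file_path):
        problems.append("file path should look like \\\\server\\share")
    if not QUALIFIED_USER.match(username):
        problems.append("username should look like domain\\user1")
    return problems


def _iso(timestamp: float) -> str:
    """Format a POSIX timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def filesystem_metadata(path: str) -> dict:
    """Collect basic filesystem metadata for one file using os.stat."""
    st = os.stat(path)
    return {
        "size": st.st_size,
        "last_modified": _iso(st.st_mtime),
        # On Windows st_ctime has traditionally been the creation time;
        # on Unix it is the time of the last metadata change.
        "created_or_changed": _iso(st.st_ctime),
        # st_file_attributes exists only on Windows, so fall back to None.
        "file_attributes": getattr(st, "st_file_attributes", None),
    }
```

With these helpers, `check_seed(r"\\fileserver\share", r"domain\user1")` returns an empty list, while a path with a single leading backslash or a username without a domain is reported as a problem.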