Files
This seed can be used to crawl files local to the Watson™ Explorer Engine installation.
The following values can be configured for this type of seed:
- Files - Newline-separated list of files to crawl. UNIX users can use a path such as
/usr/local/, while Windows users can use a path such as C:/Program Files/.
This component simply prepends file:/// to the given file path(s) to create a valid URL.
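The prefixing described above can be sketched in Python as follows. This is an illustration only, not the product's own code: the function name seed_urls is hypothetical, and trimming whitespace and skipping blank lines are assumptions that the text above does not state.

```python
def seed_urls(files_field):
    """Turn a newline-separated list of paths into URLs by literal prefixing."""
    urls = []
    for line in files_field.splitlines():
        path = line.strip()  # assumption: surrounding whitespace is ignored
        if path:             # assumption: blank lines are skipped
            urls.append("file:///" + path)
    return urls


if __name__ == "__main__":
    # Example value for the Files field: one path per line.
    for url in seed_urls("C:/Program Files/\n/usr/local/"):
        print(url)
```

Note that a purely literal prepend turns an absolute UNIX path such as /usr/local/ into a URL with four slashes after "file:". Whether the component collapses the extra slash is not stated here, so the sketch leaves it as is.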
Tip: The filesystems used by Linux, Unix, and Unix-like computer systems
can contain special types of files, such as block and character device nodes and files that
represent named pipes, which cannot be crawled because they do not contain data, but serve as
device or I/O access points. Attempting to crawl such files will generate crawler and
converter errors during the crawl. To avoid such errors, you should exclude the
/dev directory in any top-level crawl on a Linux, Unix, or Unix-like
filesystem. If present on the system that you are crawling, you should also exclude temporary
system directories such as /proc, /sys, and
/tmp that contain transient files and system information.
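
To check a path list before adding it to the seed, the two conditions in this tip can be sketched in Python. This is a hedged illustration that runs outside the product: the names EXCLUDED_DIRS, is_excluded, and is_special are hypothetical, and the sketch does not show how exclusions are configured in the crawler itself.

```python
import os
import posixpath
import stat

# Directories that the tip recommends excluding on Linux, Unix, or Unix-like systems.
EXCLUDED_DIRS = ("/dev", "/proc", "/sys", "/tmp")


def is_excluded(path, excluded=EXCLUDED_DIRS):
    """Return True if path is an excluded directory or lies beneath one."""
    norm = posixpath.normpath(path)
    return any(norm == d or norm.startswith(d + "/") for d in excluded)


def is_special(path):
    """Return True for block devices, character devices, and named pipes."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symbolic links
    return stat.S_ISBLK(mode) or stat.S_ISCHR(mode) or stat.S_ISFIFO(mode)
```

For example, is_excluded("/dev/null") is true while is_excluded("/devices/disk0") is false, because the comparison is made on whole path components rather than on a bare string prefix.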