Abstract
When the Web crawler connects through a Microsoft ISA proxy server that does not require authentication, the attempt to obtain the robots.txt file fails. A message similar to the following example is logged:
col_44157.WEB_34166" WPS_SearchDev" BaseException.java"-1"3 java.net.SocketTimeoutException: Read timed out"3 java.net.SocketTimeoutException: Read timed out.
Content
This problem occurs because the proxy server does not close the connection, which means that the Web crawler continues to wait for more data until the connection eventually times out.
To solve this problem, configure the crawler to use a plug-in that adds a "Connection: close" header field to the HTTP request:
1. Download the attached files, CloseSessionPerRequest.java and closerequestplugin.jar.
2. Stop the Web crawler.
3. In the enterprise search administration console, update the Web crawler and configure the crawler to use a custom plug-in.
- In the Plug-in class name field, enter: wc.pi.CloseSessionPerRequest
- In the Plug-in class path field, enter the full path to the closerequestplugin.jar file. For example: "C:\test\closerequestplugin.jar"
4. Restart the Web crawler.
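For illustration only, the following Java sketch shows what a robots.txt request might look like on the wire once a "Connection: close" header field is present. It is not the source of the attached plug-in and does not use the crawler plug-in API; the class name ConnectionCloseRequest, the method buildRobotsRequest, and the host example.com are hypothetical names chosen for this example. The header asks the other end to close the connection after the response, so a client does not have to wait for a read timeout.

```java
// Hypothetical illustration; not the attached CloseSessionPerRequest plug-in.
final class ConnectionCloseRequest {

    // Builds the text of an HTTP/1.1 GET request for /robots.txt as it would be
    // sent to a proxy (absolute-form request target), including the
    // "Connection: close" header field described in this document.
    static String buildRobotsRequest(String host) {
        return "GET http://" + host + "/robots.txt HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"   // ask for the connection to be closed after the response
             + "\r\n";                   // blank line ends the header section
    }

    public static void main(String[] args) {
        // example.com is a placeholder host used only for this sketch.
        System.out.print(buildRobotsRequest("example.com"));
    }
}
```

Comparing this text with a network trace of the crawler's request, taken after the plug-in is configured, is one way to check that the header field is being sent.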
Document Information
More support for:
OmniFind Enterprise Edition
Software version:
8.5, 8.4
Operating system(s):
Windows
Document number:
322315
Modified date:
17 June 2018
UID
swg27015937