I'm facing a difficult problem in my OF_9.1+WCM_7.0 setup. WCM exports a seedlist that contains many inaccessible URLs (404 errors). These inaccessible URLs should not appear in search results, so I would like OmniFind to simply ignore them. However, the seedlist crawler stops after it encounters 100 inaccessible URLs. Is there a way to configure MaxErrorCount for this crawler?
I found a solution for OF_8.5, shown below, but it doesn't seem to work for 9.1 in my tests.
IC65924: Web Content Management crawler stops when it encounters errors caused by links to inaccessible documents more than 6 times

Configuration parameters to change the maximum number of consecutive error documents that can be skipped and the maximum number of retries per one error document were added. To configure this support, create a file named ES_NODE_ROOT/master_config/<Collection ID>.<Crawler ID>/wcmcrawler_ext.xml with the following content and restart the crawler:

<?xml version="1.0" encoding="UTF-8"?>
<ExtendedProperties>
  <AppendChild XPath="/Crawler/DataSources/Server" Name="MaxErrorCount">100</AppendChild>
  <AppendChild XPath="/Crawler/DataSources/Server" Name="MaxRetryPerDoc">2</AppendChild>
</ExtendedProperties>
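
For anyone who wants to try different values quickly, here is a rough Python sketch that generates the wcmcrawler_ext.xml file described in the APAR above. It only automates the 8.5 procedure; it is not a confirmed fix for 9.1. The function names, the example collection and crawler IDs, and the assumption that ES_NODE_ROOT is available as an environment variable are mine, not from IBM documentation.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path

# XPath and parameter names are copied from the APAR IC65924 text.
SERVER_XPATH = "/Crawler/DataSources/Server"


def build_ext_xml(max_error_count, max_retry_per_doc):
    """Return the wcmcrawler_ext.xml content as a string."""
    root = ET.Element("ExtendedProperties")
    settings = (
        ("MaxErrorCount", max_error_count),
        ("MaxRetryPerDoc", max_retry_per_doc),
    )
    for name, value in settings:
        child = ET.SubElement(
            root, "AppendChild", {"XPath": SERVER_XPATH, "Name": name}
        )
        child.text = str(value)
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


def ext_file_path(node_root, collection_id, crawler_id):
    """Build ES_NODE_ROOT/master_config/<Collection ID>.<Crawler ID>/wcmcrawler_ext.xml."""
    config_dir = Path(node_root) / "master_config" / f"{collection_id}.{crawler_id}"
    return config_dir / "wcmcrawler_ext.xml"


def write_ext_xml(node_root, collection_id, crawler_id,
                  max_error_count=100, max_retry_per_doc=2):
    """Write the file into an existing crawler config directory."""
    target = ext_file_path(node_root, collection_id, crawler_id)
    if not target.parent.is_dir():
        # A missing directory most likely means the IDs are wrong,
        # so fail instead of creating it.
        raise FileNotFoundError(f"crawler config directory not found: {target.parent}")
    target.write_text(build_ext_xml(max_error_count, max_retry_per_doc),
                      encoding="utf-8")
    return target


# Example call (the IDs are placeholders; use the ones from your own system):
# write_ext_xml(os.environ["ES_NODE_ROOT"], "col_12345", "WCM_67890",
#               max_error_count=1000, max_retry_per_doc=2)
```

The crawler still has to be restarted after the file is written, as the APAR says.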