• 1 reply
  • Latest Post - 2013-12-12T16:04:29Z by dschoppmann

Pinned topic Deleting Web Documents from Index

2013-08-07T19:55:34Z | crawler web

I accidentally made the crawl space for my web crawler too large, so many URLs were crawled and parsed that I did not want. I have not found a way to remove all of these documents from the search results. Is there an easy way to remove all the documents crawled by a specific crawler from the index? Or, failing that, to remove all web sources from a collection that also contains other sources?

  • dschoppmann

    Re: Deleting Web Documents from Index


    You can remove documents from the index based on their URL. Open the ESAdmin web console and navigate to Parser/Indexer for the collection. Switch to edit mode and click "Remove URIs from the index". Be careful with the wildcard star (*), since an overly broad pattern removes more documents than intended. Alternatively, you can delete the crawler, which deletes all documents crawled by that crawler. In both cases the parser needs to be running.
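
    Since the wildcard is easy to get wrong, it can help to preview a pattern against a list of crawled URLs before entering it in the console. The following is only a rough sketch in Python and is not part of the product: it assumes the star matches any run of characters (the console's actual matching rules may differ), and the function names and example.com URLs are made up for illustration.

```python
import re

def star_pattern_to_regex(pattern):
    # Assumption: only '*' is special and it matches any run of characters.
    # Everything else in the pattern is treated literally.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))

def preview_matches(pattern, urls):
    # Return the URLs that the pattern would select for removal.
    regex = star_pattern_to_regex(pattern)
    return [url for url in urls if regex.fullmatch(url)]

# Hypothetical URLs standing in for what the crawler picked up.
crawled = [
    "http://example.com/docs/guide.html",
    "http://example.com/docs/archive/old.html",
    "http://example.com/blog/post1.html",
    "http://example.org/docs/guide.html",
]

# A broad pattern selects every URL, including ones you may want to keep.
too_broad = preview_matches("http://example.*", crawled)

# A narrower pattern selects only the unwanted blog pages.
narrow = preview_matches("http://example.com/blog/*", crawled)

print(len(too_broad), "URLs matched by the broad pattern")
print(len(narrow), "URLs matched by the narrow pattern")
```

    Comparing the two counts before submitting a pattern makes it obvious when the star reaches further than expected.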