I'd like to be able to identify which documents in a set of search results have duplicates, so that we can find the duplicates in the source system and eventually cleanse them as required.
I can see the percentage of documents that are duplicates and can painstakingly click each "show similar documents" icon in the document results view, but there does not appear to be a way to create a report on the details of just the duplicate documents.
Any help or thoughts on this would be much appreciated.
Topic: Duplicate Documents
bfoyle 060001WDQ3 60 Posts
Re: Duplicate Documents
2012-09-04T04:31:16Z
This is the accepted answer.
There is an internal field, "$dup", that is used to eliminate duplicates. Turn duplicate detection on and you should be able to use the following queries. You can then use flagging to flag either the masters or their duplicates, and then perhaps export the XML to take further action or validate.
<original query> \$dup:yes => this query returns only the duplicate documents that match the original query.
<original query> -\$dup:yes => this query returns only the master documents that match the original query.
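If you need to run both variants for many queries, the two patterns above can be generated with a small helper. This is only a sketch: the function name `duplicate_queries` and the sample query text are made up for illustration, and the `\$dup:yes` clause is copied exactly as written above, backslash included. It builds the query strings only; it does not talk to the search server.

```python
def duplicate_queries(original_query):
    """Return the duplicates-only and masters-only variants of a query.

    Hypothetical helper: it only appends the $dup clauses quoted in
    this thread to the query text you pass in.
    """
    base = original_query.strip()
    return {
        # Matches of the original query that are duplicates
        "duplicates": base + " \\$dup:yes",
        # Matches of the original query that are master documents
        "masters": base + " -\\$dup:yes",
    }


# Example with a made-up original query
queries = duplicate_queries("annual report")
print(queries["duplicates"])
print(queries["masters"])
```

Paste the resulting strings into the search box with duplicate detection turned on, then flag and export the results as described above.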
VD7T_John_Sutton 270003VD7T 2 Posts
Re: Duplicate Documents
2012-09-05T09:26:55Z
That is perfect - many thanks for your help.