2 replies Latest Post - ‏2012-09-05T09:26:55Z by VD7T_John_Sutton
VD7T_John_Sutton
2 Posts
ACCEPTED ANSWER

Pinned topic Duplicate Documents

‏2012-09-03T16:27:13Z |
Hi

I'd like to identify which documents in a set of search results have duplicates, so that we can find the duplicates in the source system and eventually cleanse them as required.

I can see the percentage of documents that are duplicates, and I can painstakingly click the "show similar documents" icon for each document in the results view, but there does not appear to be a way to create a report on the details of just the duplicate documents.

Any help or thoughts on this would be much appreciated.

Regards
Updated on 2012-09-05T09:26:55Z by VD7T_John_Sutton
  • bfoyle
    60 Posts
    ACCEPTED ANSWER

    Re: Duplicate Documents

    ‏2012-09-04T04:31:16Z  in response to VD7T_John_Sutton
    There is an internal field, "$dup", that is used to eliminate duplicates. Turn duplicate detection on and you should be able to use the following queries. You can then use flagging to mark either the masters or their duplicates, and perhaps export the XML to take further action or to validate the results.

    <original query> \$dup:yes => returns only the duplicates among the documents that match the original query.
    <original query> -\$dup:yes => returns only the master documents among those that match the original query.
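If you need to run both variants for many searches, the two query patterns above can be generated from the original query text. The following is only a minimal sketch: the function name `duplicate_queries` is hypothetical, and the backslash before `$dup` is reproduced exactly as written in the reply (the thread does not say whether the query syntax requires it or whether it is an escaping artefact of the forum).

```python
def duplicate_queries(original_query):
    """Return the 'duplicates only' and 'masters only' variants of a query.

    Assumption: the "\\$dup:yes" suffix is copied verbatim from the reply
    above; adjust it if your query syntax does not need the backslash.
    """
    base = original_query.strip()
    return {
        # Matches of the original query that are duplicates
        "duplicates": base + " \\$dup:yes",
        # Matches of the original query that are master documents
        "masters": base + " -\\$dup:yes",
    }


if __name__ == "__main__":
    for label, query in duplicate_queries("invoice 2012").items():
        print(label + ": " + query)
```

Each generated string can then be pasted into the search box with duplicate detection turned on, and the results flagged and exported as described above.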
  • VD7T_John_Sutton
    2 Posts
    ACCEPTED ANSWER

    Re: Duplicate Documents

    ‏2012-09-05T09:26:55Z  in response to VD7T_John_Sutton
    That is perfect - many thanks for your help