Collecting documents for life cycle processing

For life cycle processing of documents, you must set up a task route that contains a stubbing collector (EC Process Email Stubbing Life Cycle). With such a stubbing task route, you reduce the amount of content in the email, its attachments, or both. When this content is reduced, IBM® Content Collector leaves a stub document in the source location. You define the content of the stub document in the collector settings. For example, the stub might contain only the header of the original note and a link to the archived content.

Before you begin

Prerequisites: Before IBM Content Collector can collect documents, you must configure a source connector. Otherwise, Content Collector cannot access the source system.

About this task

A stubbing collector (EC Process Email Stubbing Life Cycle) collects email at regular intervals to remove:

  • Attachments
  • The body text
  • The remaining email documents

The content, the attachments, or the entire email is removed based on the date when the email was archived, modified, received, or restored, and other status information in the email.

For example, suppose that a stubbing collector is configured so that attachments are removed from the source document three months after archiving and the entire email is deleted six months after archiving. An email that is not yet archived or stubbed contains no status information, so the stubbing collector does not collect it for processing. After the email is archived in an archiving task route, the status of the email is archived. The stubbing collector collects the email when the specified criteria are fulfilled, that is, when the document was archived at least three months ago. The collector then removes the attachments, which leaves an email with body text and links to the archived attachments. The status of the email is now archived and attachments removed.

The next time the collector processes the email, it checks the status of the email and how much time has passed since the email was archived. As soon as the specified interval of six months after archiving has passed, Content Collector deletes the remaining stub document from the source location.
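The decision that the collector makes on each run in this example can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Content Collector code: the names Status, next_action, REMOVE_ATTACHMENTS_AFTER, and DELETE_AFTER are hypothetical, and the intervals of 90 and 180 days are assumed approximations of three and six months.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Status(Enum):
    """Hypothetical status values that mirror the states in the example."""
    NONE = "no status information"  # not yet archived or stubbed
    ARCHIVED = "archived"
    ATTACHMENTS_REMOVED = "archived and attachments removed"


# Assumed intervals: roughly three and six months after archiving.
REMOVE_ATTACHMENTS_AFTER = timedelta(days=90)
DELETE_AFTER = timedelta(days=180)


def next_action(status: Status, archived_on: Optional[date], today: date) -> str:
    """Return the action that one collector run would take for one email."""
    if status is Status.NONE or archived_on is None:
        return "skip"  # email without status information is not collected
    age = today - archived_on
    if age >= DELETE_AFTER:
        return "delete entire email"
    if status is Status.ARCHIVED and age >= REMOVE_ATTACHMENTS_AFTER:
        return "remove attachments"
    return "skip"  # criteria for the next stubbing step are not yet fulfilled


archived = date(2024, 1, 1)
print(next_action(Status.ARCHIVED, archived, date(2024, 4, 15)))
print(next_action(Status.ATTACHMENTS_REMOVED, archived, date(2024, 8, 1)))
```

The first call falls between the two intervals, so the attachments are removed. The second call is past the assumed six-month interval, so the remaining stub is deleted.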

You can also configure the stubbing collector to collect, for restubbing, documents that were archived by using CommonStore. To do so, the stubbing functions for CommonStore documents must be enabled in the Email Connector configuration.

Procedure

To configure a stubbing collector:

  1. Open the Configuration Manager and click Task Routes.
  2. Create or select the task route to which you want to add the collector.
  3. In the Toolbox, click Email > EC Process Email Stubbing Life Cycle and add it to the task route diagram.
  4. On the General page in the right pane, define general settings, such as whether to create hash keys for the collected files or to collect content type information.
  5. On the Schedule page, set a collector schedule by specifying when and how often the collector checks the mailboxes for email to collect for stubbing.
  6. On the Collection Sources page, configure one or more collection sources, depending on your source system.
    Typically, this is a mailbox collection source. However, you can also select a journaling collection source if you do not want to remove journal email right after archiving it.

    The collector can retrieve statistical information about the mailboxes and stores that are specified as collection sources. Select Create statistics for the collector and choose the counters to include. Content Collector then creates a statistics file in the statistics subdirectory of the Content Collector log directory. See the topic about collecting mailbox statistics for detailed information. Note that collecting statistical information can affect the performance of the collector.

  7. On the Life Cycle page, define a stubbing life cycle by selecting stubbing options.
    Select stubbing options for email that is in one of the following states:
    • Email that was archived
    • Email that users marked for stubbing
    • Email that was restored, but was not restored from a search result list
    • Email that mobile users copied to an offline repository after it was archived. When the email is copied, its status is set to mobility done. This status means that IBM Content Collector does not wait for the delayed stubbing interval to pass, but stubs the original email the next time that the stubbing collector runs.
    Restriction: A stubbing sequence as defined in a document life cycle cannot be applied to documents that were restored from a repository that was fed from IBM CommonStore for Exchange Server or from IBM CommonStore for Lotus® Domino®. The information about the document state in these documents cannot be interpreted by IBM Content Collector because it is in an incompatible format. These documents are stubbed according to the settings on the CommonStore page.
    Important: The address information for the IBM Content Collector server becomes an unchanging part of the link in the stub document. To avoid problems with the generated links, specify the fully qualified host name (for example, ICCServer.example.com) of the machine that runs the web application server, or the respective alias, in the Web Application configuration under General settings. Ensure that this host name or alias is resolved properly by the DNS. If the host name of the server changes, stub links will not work.
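Before you enter the host name, you might want to confirm that it is fully qualified and that DNS resolves it. The following Python sketch shows one way to do that from any machine that must follow the stub links. The function names is_fully_qualified and resolves are hypothetical, and the qualification check is a simple heuristic (at least two non-empty labels) that also accepts dotted IP addresses.

```python
import socket


def is_fully_qualified(host: str) -> bool:
    """Heuristic: a fully qualified name has at least two non-empty labels."""
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def resolves(host: str) -> bool:
    """Return True if the local resolver can map the name to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


# ICCServer.example.com is the placeholder name used in this topic.
candidate = "ICCServer.example.com"
print(candidate, "fully qualified:", is_fully_qualified(candidate))
print(candidate, "resolves:", resolves(candidate))
```

Run the check with the actual host name or alias of your web application server in place of the placeholder.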

    For each selected stubbing option, specify when the stubbing action is to take place and select whether this time is calculated relative to the date on which the email was received, archived, or modified.

    Choose from these options:
    Remove nothing and add text
      Add text to the original email to indicate that the content was archived by Content Collector.
    Remove attachments
      Remove the attachments of the email after archiving.
    Remove attachments and cut body
      Remove the attachments and shorten the body text of the email after archiving. Content Collector replaces the original formatted text with a plain text representation that is cut off at the specified length. Line breaks are preserved, but other formatting is not.
    Remove attachments and body
      Remove the attachments and all of the body text in the email after archiving.
    Delete entire email
      Delete the email after archiving.
    Select documents to re-create stubs
      Stub email again after users restored the content.
      Important: This option does not apply to email that was archived with CommonStore for Lotus Domino or CommonStore for Exchange Server and is restored with Content Collector. These documents are restubbed according to the settings on the CommonStore page.
    Select restored documents for deletion
      Delete documents that were restored from a search result list.
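To illustrate what the Remove attachments and cut body option does to the body text, the following Python sketch reduces a formatted body to plain text, keeps the line breaks, and cuts the result at a given length. It is a simplified illustration, not the conversion that Content Collector performs: it assumes an HTML body, and the function name cut_body and its parameters are hypothetical.

```python
import re
from html import unescape


def cut_body(html_body: str, max_chars: int) -> str:
    """Reduce an HTML body to plain text and cut it at max_chars characters."""
    # Turn <br> and closing block tags into newlines so that line breaks survive.
    text = re.sub(r"(?i)<br\s*/?>|</p>|</div>", "\n", html_body)
    # Drop every remaining tag; bold, links, and other formatting are lost.
    text = re.sub(r"<[^>]+>", "", text)
    # Decode character references such as &amp; into plain characters.
    text = unescape(text)
    return text[:max_chars]


body = "<p>Hello <b>world</b></p><p>Second line</p>"
print(repr(cut_body(body, 14)))  # 'Hello world\nSe'
```

The bold formatting disappears, the paragraph boundary survives as a line break, and the text ends at the specified length.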

    When email is stubbed, a preview link to the archived document is added to the stub document. If attachments are removed, an attachment link to the archived attachment is also added.

    You specify the text that indicates archiving or stubbing operations in the postprocessing task for stubbing (EC Create Email Stub task). You must include a postprocessing task in your task route even if you use a stubbing collector because stubbing operations must occur after archiving. Through its position in the task route, the postprocessing task for stubbing ensures that the stubbing actions that are selected in the stubbing collector are done at the right time.

  8. If you enabled the stubbing functions for CommonStore documents in the Email Connector configuration, configure stubbing on the CommonStore page.
    1. If your Email Connector is configured for Microsoft Exchange, you can choose to stub archived CommonStore for Exchange Server documents.
      In this case, the stubbing type depends on the deletion type that was selected when the document was archived with CommonStore for Exchange Server. If the selected deletion type was ATTACHMENT, Content Collector removes attachments from the original document, so that the stub document contains the email body and a list of the attachments that were removed. If the selected deletion type was BODY, Content Collector removes the message body and the attachments from the original document. However, this is relevant only if CommonStore for Exchange Server was configured to use delayed stubbing.

      Content Collector stubs the documents as soon as the specified time interval after the documents were archived has elapsed.
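The relationship between the CommonStore deletion type and the resulting stub can be summarized as a small lookup. The following Python sketch is for illustration only. The function name parts_removed is hypothetical, and only the two deletion types that are named in this step are covered; any other value is rejected.

```python
def parts_removed(deletion_type: str) -> frozenset:
    """Return the parts that are removed from the original document."""
    removed = {
        # The stub keeps the email body and a list of the removed attachments.
        "ATTACHMENT": frozenset({"attachments"}),
        # The stub loses both the message body and the attachments.
        "BODY": frozenset({"body", "attachments"}),
    }
    try:
        return removed[deletion_type.upper()]
    except KeyError:
        raise ValueError(f"deletion type not covered by this sketch: {deletion_type}")


print(sorted(parts_removed("ATTACHMENT")))  # ['attachments']
print(sorted(parts_removed("BODY")))        # ['attachments', 'body']
```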

    2. To have Content Collector re-create the stubs for CommonStore documents that were restored with Content Collector, select the option Select restored CommonStore documents to re-create stubs.
      For CommonStore for Exchange Server documents, the stub is created in the same way as the original stub. For CommonStore for Lotus Domino documents, the entire document is stubbed. Content Collector re-creates the stubs as soon as the specified time interval after the documents were restored has elapsed.
  9. Save your settings.