Configuring export formats

Users need to be able to export content for legal review. Before they do, review the export format options to ensure that the plug-ins are configured appropriately for your users. eDiscovery Manager can export content with the default options, but your site might benefit from optional features of some formats that are not enabled by default. In addition to the default export formats, you can create your own custom export format by using the eDiscovery Manager export API, or you can use export format plug-ins from other vendors. In the latter case, work with your plug-in vendors to gather the information that you need to configure the export formats and their associated plug-in parameters.

Important: Your ability to perform tasks depends on the roles that were assigned to you. Only icons and menu options for tasks that are associated with your roles are visible to you. Be aware that this topic might include information that is not relevant to your roles.

To configure export formats:

  1. On the Administration page, click Export Formats in the Navigation pane.
  2. Select a format to which your users will want to export content.
    EDRM XML
    eDiscovery Manager provides this format for exporting content to EDRM (Electronic Discovery Reference Model) XML files.
    HTML
    eDiscovery Manager provides this format for exporting content to HTML files.
    Native
    eDiscovery Manager provides this format for exporting content to its native form. For example, Lotus Notes® email is exported to NSF files and Microsoft Exchange email is exported to PST files.
    PDF
    eDiscovery Manager provides this format for exporting content to PDF files.
    Custom export format
    You can also use export formats from other vendors or create your own. See Developing with the eDiscovery Manager Export APIs in the IBM® Archive and eDiscovery Solution Information Center.
  3. Configure the plug-in parameters of the export format, if necessary. (Not all export formats have both Extract and BatchComplete plug-in points, and not all plug-in points are configurable.) At a minimum, review the default plug-in parameter values of the export format to familiarize yourself with the configurable parameters.

    EDRM XML
    Extract plug-in point
    The Extract plug-in point occurs after each piece of content is extracted from the content server. The configurable parameters of the Extract plug-in include:
    Format.of.inline.content
    Enter the format type of the inline content (the email body) in the EDRM XML export file. Valid values include:
    HTML
    An HTML representation of the email body, complete with line breaks, links, font type and size, table support, and so on. The HTML representation is very close to what the email looks like when displayed by its native email client.
    TEXT
    A plain text representation of the email body with no rich text information. The TEXT representation cannot mimic what the email looks like when displayed by its native email client, but exporting a plain text representation is faster than exporting an HTML representation. TEXT is the default value.
    Maximum.size.of.EDRM.XML.file
    Enter the maximum size, in MB, of an EDRM XML export file. When an export file reaches the specified threshold, eDiscovery Manager creates another export file.
    BatchComplete plug-in point
    The BatchComplete plug-in point occurs after each batch of 1000 documents or files is extracted from the content server. The configurable parameters of the BatchComplete plug-in include:
    Create.ZIP.file
    Indicate whether (TRUE) or not (FALSE) to compress all of the extracted files into a ZIP file. The default is TRUE. The naming convention for ZIP files is:
    exportFileNamePrefix_batchN_X.zip
    where:
    exportFileNamePrefix
    is the export file prefix that a user enters when exporting content.
    N
    is the document batch number, for example, 0 (zero), 1, 2, and so on. Content is exported in batches of 1000 documents or files.
    X
    is the relative order of the ZIP file within the document batch. Depending on the value of the plug-in parameter Maximum.size.of.ZIP.file, there can be multiple ZIP files for each batch of exported documents.

    For example, the following ZIP files might be created by an export task:

    test_batch0_0.zip - The first ZIP file in the first document batch

    test_batch0_1.zip - The second ZIP file in the first document batch

    test_batch1_0.zip - The first ZIP file in the second document batch

    test_batch1_1.zip - The second ZIP file in the second document batch

    test_batch1_2.zip - The third ZIP file in the second document batch

    test_batch2_0.zip - The first and only ZIP file in the third document batch
    Maximum.size.of.ZIP.file
    Enter the maximum size, in MB, of a ZIP file that contains a batch of documents or contains a portion of a batch of documents. The default is 512 MB.
    HTML
    Extract plug-in point
    The Extract plug-in point occurs after each piece of content is extracted from the content server. The configurable parameters of the Extract plug-in include:
    Document.conversion.timeout.minutes
    The maximum time, in minutes, to allow for the conversion of each email. If timeout errors occur, consider increasing this value.
    Email.XSLT.file.name
    Optionally specify the name of an Extensible Stylesheet Language Transformation (XSLT) file that controls how the exported content is converted to HTML. The default XSLT file provided with eDiscovery Manager is HTMLExportTemplate.xsl.
    Important: If you specify your own XSLT file, add the file to the class path so that eDiscovery Manager can access it.
    Retain.original.documents
    Indicate whether (TRUE) or not (FALSE) to retain exported content in its original form as well as in HTML form. An item's original form is the format in which it is stored on the content server.

    The default is FALSE. Exported content, in its original form, is not retained in the export directory. After an export task finishes, only the HTML version of the content remains. For email, both the email content and its attachments are converted to HTML. The only exception is for pictures that are embedded in email; they are always retained in the export directory in their original formats.

    Specify TRUE to retain exported content, in its original form, in the export directory along with the HTML version of that content. Examples of content that can be controlled by this parameter include CSN, MSG, and DXL files for archived email and documents archived with IBM Content Collector for File Systems.

    Lotus Notes email: Lotus Notes email is stored on content servers in CSN format, not NSF format. For this reason, it is retained in the export directory as CSN files if the value of Retain.original.documents is TRUE when exporting to HTML.
    Exchange email: Exchange email is stored on content servers in MSG format, not PST format. For this reason, it is retained in the export directory as MSG files if the value of Retain.original.documents is TRUE when it is exported to HTML.
    BatchComplete plug-in point
    The BatchComplete plug-in point occurs after each batch of 1000 documents or files is extracted from the content server. The BatchComplete plug-in for HTML export format is disabled by default. You must enable this plug-in to make an HTML export task create a ZIP archive file. If this plug-in is enabled, its configurable parameter is:
    Maximum.size.of.ZIP.file
    Enter the maximum size, in MB, of a ZIP file that contains a batch of documents or contains a portion of a batch of documents. The default is 512 MB.
    Native
    Extract plug-in point
    The Extract plug-in point occurs after each piece of content is extracted from the content server. The Native export format has no Extract plug-in and, therefore, no Extract plug-in parameters.
    NativeBatchComplete plug-in point
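    The BatchComplete plug-in point occurs after each batch of 1000 documents or files is extracted from the content server. The configurable parameters of the NativeBatchComplete plug-in whose names begin with Lotus are relevant only for Lotus Notes content; the parameters whose names begin with PST control the PST files that are created for Microsoft Exchange content. The parameters include: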
    Lotus.Export.log.file.absolute.path
    Enter the absolute path of the export log file, dominoExport.log. The default path of the export log file is C:\Program Files\IBM\eDM\logs\dominoExport.log on Windows systems and /opt/IBM/eDM/logs/dominoExport.log on AIX® systems.
    Restriction: Use only English and system language characters in the path. For example, if the system language is Japanese, use only English and Japanese characters.
    Lotus.Logging.mode
    Enable error or debug logging for exporting. By default, the value of this parameter is DEBUG.
    Recommendation: After you determine that export is working successfully, set the value of this parameter to ERROR. Doing so prevents the export log file from becoming too large and consuming more disk space than needed.
    Important: Check the size of the export log file regularly. If it becomes too large, delete it.
    Lotus.Domino.server
    By default, this parameter's value is intentionally left blank. Only enter a value for this parameter if you want to override the Lotus Domino® server setting on the Lotus Domino Settings panel. To override, enter the Lotus Domino server name, for example, D01MC084/01/M/ACME.
    Lotus.Domino.server.export.directory
    Enter the name of a directory to contain exported content. If it does not already exist, this directory is created in the Lotus Domino server data directory. The default value is exportedDocs. For example, enter exportedContent.
    Restriction: Use only English and system language characters in the path. For example, if the system language is Japanese, use only English and Japanese characters.

    The path name that eDiscovery Manager uses for the Lotus Domino export database is limited to 109 bytes. This path name consists of the Lotus Domino server name, the export directory name, the export subdirectory name, and the export file name prefix. Use a short value for the Lotus.Domino.server.export.directory plug-in parameter so that users have enough bytes left when they specify the export subdirectory and export file name prefix.

    Tip: To export to a directory that is outside of the Lotus Domino data directory:
    Lotus.Mail.database.template
    By default, this parameter's value is intentionally left blank. Only enter a value for this parameter if you would like to override the Mail database template setting on the Lotus Domino Settings panel. To override, enter a mail database template for creating export databases. The template must exist in the data directory on the Lotus Domino server.
    Restriction: The file name of the mail database template can contain only English and system language characters. For example, if the system language is Japanese, the file name can contain only English and Japanese characters.
    Lotus.Maximum.size.of.export.database
    Enter the maximum size, in MB, of an export database file. When an export database file reaches the specified threshold, eDiscovery Manager creates another one. Each NSF might contain a batch of documents or a portion of a batch of documents. The value here is approximate because checking of the export database is only done periodically for performance reasons, and individual items cannot span multiple databases. In actuality, the size of the exported NSF file might be smaller or larger than the maximum size that is specified here. The default is 512 MB.
    PST.Maximum.filesize.in.megabytes
    Enter the maximum size in MB that the PST export file can grow to until a new PST file is started. Each PST might contain a batch of documents or a portion of a batch of documents. The value here is approximate because checking of the export file is only done periodically for performance reasons, and individual items cannot span multiple PST files. In actuality, the size of the exported PST file might be smaller or larger than the maximum size that is specified here. The default is 50 MB.
    PST.Package.Msg.Files.To.PST
    Indicate whether (TRUE) or not (FALSE) to package all of the MSG files into a PST file. By default, MSG files are packaged into a PST file (TRUE).
    PDF
    Extract plug-in point
    The Extract plug-in point occurs after each piece of content is extracted from the content server. The configurable parameters of the Extract plug-in include:
    Document.conversion.timeout.minutes
    The maximum time, in minutes, to allow for the conversion of each email. If timeout errors occur, consider increasing this value.
    Email.XSLT.file.name
    Optionally specify the name of an Extensible Stylesheet Language Transformation (XSLT) file that controls how the exported content is converted to PDF. The default XSLT file provided with eDiscovery Manager is PDFExportTemplate.xsl.
    Important: If you specify your own XSLT file, add the file to the class path so that eDiscovery Manager can access it.
    Retain.original.documents
    Indicate whether (TRUE) or not (FALSE) to retain exported content in its original form as well as in PDF form. An item's original form is the format in which it is stored on the content server.

    The default is FALSE. Exported content, in its original form, is not retained in the export directory. After an export task finishes, only the PDF version of the content remains. Exceptions to this are email attachments and embedded pictures in email; they are always retained in the export directory in their original formats.

    Specify TRUE to retain exported content, in its original form, in the export directory along with the PDF version of that content. Examples of content that can be controlled by this parameter include CSN, MSG, and DXL files for archived email, and documents archived with IBM Content Collector for File Systems.

    Lotus Notes email: Lotus Notes email is stored on content servers in CSN format, not NSF format. For this reason, it is retained in the export directory as CSN files if the value of Retain.original.documents is TRUE when exporting to PDF.
    Exchange email: Exchange email is stored on content servers in MSG format, not PST format. For this reason, it is retained in the export directory as MSG files if the value of Retain.original.documents is TRUE when it is exported to PDF.
    BatchComplete plug-in point
    The BatchComplete plug-in point occurs after each batch of 1000 documents or files is extracted from the content server. The BatchComplete plug-in for PDF export format is disabled by default. You must enable this plug-in to make a PDF export task create a ZIP archive file. If this plug-in is enabled, its configurable parameter is:
    Maximum.size.of.ZIP.file
    Enter the maximum size, in MB, of a ZIP file that contains a batch of documents or contains a portion of a batch of documents. The default is 512 MB.
    Custom export formats
    Enter the names and values of any plug-in parameters for the Extract and BatchComplete plug-in points of custom export formats.
  4. Optional: If you want to change which export format is the default, click the check mark in the same row as the export format that you want to be the default. The “(Default)” visual indicator moves from the row of the old default format to the row of the new default format.
  5. Save your configuration changes by clicking the Save the export formats icon.
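
The ZIP file naming convention described in step 3 (exportFileNamePrefix_batchN_X.zip, with batches of 1000 documents or files) can be illustrated with a short Python sketch. This is not part of eDiscovery Manager: the function names are hypothetical, and the sketch assumes that batch numbers and ZIP order numbers are plain non-negative integers, as in the examples above. It might be useful when you post-process an export directory with your own scripts.

```python
import re

BATCH_SIZE = 1000  # documents or files per batch, as stated in this topic

# Pattern for names of the form exportFileNamePrefix_batchN_X.zip
_ZIP_NAME = re.compile(r"^(?P<prefix>.+)_batch(?P<batch>\d+)_(?P<order>\d+)\.zip$")


def zip_file_name(prefix, batch, order):
    """Build a ZIP file name from a prefix, batch number, and order number."""
    return f"{prefix}_batch{batch}_{order}.zip"


def parse_zip_file_name(name):
    """Return (prefix, batch, order), or None if the name does not match."""
    match = _ZIP_NAME.match(name)
    if match is None:
        return None
    return match.group("prefix"), int(match.group("batch")), int(match.group("order"))


def batch_number(item_index):
    """Zero-based batch number for a zero-based item index.

    Assumption: items fill batches in order, 1000 items per batch.
    """
    return item_index // BATCH_SIZE


if __name__ == "__main__":
    print(zip_file_name("test", 1, 2))
    print(parse_zip_file_name("test_batch0_1.zip"))
    print(batch_number(1000))
```

Sorting the parsed (batch, order) pairs restores the order in which the ZIP files were written, which a plain alphabetical sort does not guarantee once batch numbers reach two digits.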

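The 109-byte limit on the Lotus Domino export database path name, described under Lotus.Domino.server.export.directory, can be budgeted before you choose a directory name. The following Python sketch is only an approximation under stated assumptions: this topic does not say how the four components are joined or which character encoding is used to count bytes, so the sketch simply adds the UTF-8 byte lengths of the components and ignores any separators. The function names are hypothetical.

```python
PATH_LIMIT_BYTES = 109  # limit stated in this topic


def path_bytes_used(server, export_dir, subdir="", prefix="", encoding="utf-8"):
    """Sum the encoded byte lengths of the four path components.

    Assumptions: UTF-8 stands in for the real encoding, and separators
    between components are not counted.
    """
    parts = (server, export_dir, subdir, prefix)
    return sum(len(part.encode(encoding)) for part in parts)


def bytes_remaining(server, export_dir, subdir="", prefix="", encoding="utf-8"):
    """Bytes left for the components that are still empty (negative if over)."""
    used = path_bytes_used(server, export_dir, subdir, prefix, encoding)
    return PATH_LIMIT_BYTES - used


if __name__ == "__main__":
    # Server name and directory name taken from the examples in this topic.
    left = bytes_remaining("D01MC084/01/M/ACME", "exportedDocs")
    print(f"Bytes left for subdirectory and file name prefix: {left}")
```

Because the limit is in bytes rather than characters, names that contain non-English characters use up the budget faster than their character count suggests. Treat the result as an upper bound on the space that is available to users.
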
Adding a new export format

This section describes how you can add a new export format.

Add a content export format by clicking the Add a new content export format icon. Alternatively, import a new content export format by clicking the Import a new content export format icon and specifying the XML file that contains the definition of the content export format.
Tips:
  • The names of content export formats must be unique. You cannot import a new content export format that has the same name as an existing content export format.
  • Before you import the definition of a new content export format, use an XML validator to verify that the syntax is correct in the XML file that you plan to import. Validate the XML file by using the XML Schema Definition (XSD), export_config_format.xsd. The export_config_format.xsd file resides in the resources subdirectory of the eDiscovery Manager installation directory.
  • The Browse button in the Import window is controlled by the operating system locale, not by the language preference of the browser. If the operating system locale is different from the language preference of your browser, the Browse button is displayed in a different language from the rest of the Import window.
  • If you import a new content export format that is defined to be the default, an existing content export format that was also defined to be the default is no longer the default.
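
As a first step before the XSD validation recommended in the tips above, you can confirm that the definition file is well-formed XML. The following Python sketch uses only the standard library, so it checks syntax only; it does not validate against export_config_format.xsd, which requires a schema-aware validator. The function names and the sample element names are hypothetical and do not reflect the actual structure of an export format definition.

```python
import xml.etree.ElementTree as ET


def xml_syntax_error(xml_text):
    """Return None if xml_text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


def file_syntax_error(path):
    """Run the same well-formedness check on a file."""
    with open(path, "rb") as handle:
        return xml_syntax_error(handle.read())


if __name__ == "__main__":
    # Placeholder documents; the element names are illustrative only.
    print(xml_syntax_error("<root><item/></root>"))
    print(xml_syntax_error("<root><item></root>"))
```

A file that passes this check can still be rejected on import if it does not conform to the schema, so treat it as a quick filter rather than a substitute for XSD validation.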