P8 Create Document

Email and files are stored as documents in the FileNet® P8 repository. To create a document in the repository, you need to specify where to create the document, and how to index it. You specify where by selecting a repository from a list of those configured. You index the document by choosing a document class for the item and by specifying what values should be assigned to each property of that class.

Task summary

Table 1. P8 Create Document task summary
Characteristic | Value
Task name | P8 Create Document
Main purpose | Creates a document in the FileNet P8 repository
Usable with which source connectors? | Email Connector, File System Source Connector, IBM® Connections Connector, SharePoint Connector, SMTP Connector
Usable with which target connectors? | IBM FileNet P8 Connector
Usable with which content search engine? | IBM Legacy Content Search Engine, IBM Content Search Services. If email is to be indexed with IBM Content Search Services, prefer the P8 Archive Email task, which is optimized for use with IBM Content Search Services.
When needed? | Required in archiving task routes
Placement in task route | Usually appears before postprocessing tasks and after metadata and version tasks
Produces which metadata? | P8 Create Document, P8 Published Connection, Task Status

Configuration options

Connection

Prerequisites: A connection set with at least one connection to a FileNet P8 object store must exist.
Select the set of IBM FileNet P8 connections that you want to use. Which connection is actually used for this task depends on the configuration of the connection set:
Configuration of the connection set | How the repository connection is selected
No partitioning, one repository connection | The defined repository connection is used.
Partitioning type Static, more than one repository connection | At run time, the current date is compared to the date ranges of all defined repository connections. The repository connection whose assigned date range contains the current date is used.
Partitioning type Dynamic, more than one repository connection | You must configure a DateTime expression in the task to determine the date value that, at run time, is compared to the date ranges of the defined repository connections. The repository connection whose assigned date range contains that date is used. Make sure to configure the same DateTime expression in all FileNet P8 tasks in the task route.
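
The selection rule for partitioned connection sets can be sketched as follows. This is a minimal illustration, not Content Collector code: the RepositoryConnection class, the connection names, and the assumption that date ranges are inclusive on both ends are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RepositoryConnection:
    # Hypothetical stand-in for one connection in a connection set.
    name: str
    range_start: date  # first day covered (assumed inclusive)
    range_end: date    # last day covered (assumed inclusive)


def select_connection(
    connections: List[RepositoryConnection], compare_date: date
) -> Optional[RepositoryConnection]:
    """Return the first connection whose date range contains compare_date."""
    for conn in connections:
        if conn.range_start <= compare_date <= conn.range_end:
            return conn
    return None  # no range matches the date


connections = [
    RepositoryConnection("OS_2023", date(2023, 1, 1), date(2023, 12, 31)),
    RepositoryConnection("OS_2024", date(2024, 1, 1), date(2024, 12, 31)),
]

# Static partitioning: the comparison date is the current date at run time.
static_choice = select_connection(connections, date.today())

# Dynamic partitioning: the comparison date is the result of the DateTime
# expression, for example the received date of an email (illustrative value).
dynamic_choice = select_connection(connections, date(2023, 6, 15))
print(dynamic_choice.name)
```

If each FileNet P8 task in a route evaluated a different DateTime expression, the tasks could resolve to different connections for the same item, which is why the same expression must be configured in all of them.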

Checkin Options

With these options, you determine the way a document is stored in IBM FileNet P8. For more detailed information about each option, see the IBM FileNet P8 Enterprise Manager documentation:

  • Select Auto classify to enable a script you create, usually in Visual Basic or XML, to manage the processing of the file for check in.
  • Select Defer checkin if you are configuring a task route with the P8 Create Content Elements task. This option creates a reservation object in FileNet P8 that is checked in later in the task route, when the object has received the files that it requires to be complete. Documents that are still in a reservation state at the end of the task route are checked in automatically.
  • Under Version, select an option:
    • Select Major. This is a FileNet P8 term for a document that has been "released". Typically, a major version's security makes the document available to a wide range of users.
    • Select Minor. This is a FileNet P8 term for a document version that has not been released. Typically, a minor version's security makes the document available only to the authors and reviewers.
  • Under Content capture options, select an option:
    • Select Transfer content (default) to save the document in FileNet P8.
      Important: In email archiving task routes, select this option only if an EC Prepare Email for Archiving or SC Prepare Email for Archiving task appears before this P8 Create Document task in the task route. The Prepare Email for Archiving tasks extract the email from the mail system, and content can only be transferred if it has been extracted.
    • Select Reference external content to create a reference to the document in FileNet P8 while the physical document remains on the source system. This option is valid only if this task route is using a file system collector.

      The collection source must be a shared directory with the appropriate permissions on the share to allow access for the FileNet P8 Content Engine and Application Engine (Workplace) servers. When you configure the path for collection from a shared directory, the collection-source directory must be in the format \\servername\shared_folder\, so that the FileNet P8 content reference is in a recognizable format for the FileNet P8 system.

    • Select Do not transfer content (contentless) to save only the document metadata in FileNet P8. You must select this option if the P8 Create Document task appears before the EC Prepare Email for Archiving or SC Prepare Email for Archiving task in the task route.
  • Select Set content retrieval name to set a metadata property to use for the Retrieval Name on the content element. This option is enabled only if you have selected the Transfer content (default) option.
  • In the Shortcut link text box, enter the URL to be used when adding a shortcut to a document in the object store.
    Important: The Email Connector uses its own link format and ignores the contents of the Shortcut link field. This configuration is required for the File System Create shortcut postprocessing option or the SharePoint Replace with link postprocessing option.
    Enter the URL in the same format as the sample URL in the text box. Modify the sample URL provided:
    https://HOST:PORT/AFUWeb/RetrieveDocument.do?
    r=<%DOCID_ENCRYPTED%>&repositoryID=<%REPOSITORY_ID_ENCRYPTED%>&sum=<%URL_CHECKSUM%>
    &filename=<%FILENAME%>
    For secure links to archived documents that require users to log on to the repository before they can access the content, provide the URL in this format:
    https://HOST:PORT/AFUWeb/SecureRetrieveDocument.do?
    r=<%DOCID_ENCRYPTED%>&repositoryID=<%REPOSITORY_ID_ENCRYPTED%>&sum=<%URL_CHECKSUM%>
    &am=<%CHALLENGE_MODE%>&filename=<%FILENAME%>
    
    In this case, the repository connection is established with the user's credentials and access to the item in the repository is granted based on the user's access rights.
    • Replace HOST:PORT with the name and port number of the IBM Content Collector Web Application service.
    • Do not alter any of the tokens <%token_name%> and adhere to the order of the parameters except for the parameter &sum. This parameter can appear anywhere in the parameter list.
    • The ENCRYPTED tokens are encrypted with an algorithm that is compatible with the IBM Content Collector Web Application service. This means that you cannot use %PID_ENCRYPTED%, %ITEMTYPE_ENCRYPTED%, or %URL_CHECKSUM% with applications that do not use the IBM Content Collector Web Application service.
    Tip: Users of previous versions of Content Collector can retain URLs in the previous format:
    http://server_name/Workplace/getContent?
    ObjectStoreName=<%OBJSTORE%>&id=<%DOCID%>&objectType=<%OBJTYPE%>
    • Replace server_name with the name and port of the FileNet P8 Application Server that is running Workplace.
    • Do not alter any of the tokens <%token_name%> and adhere to the order of the parameters.
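
As an illustration of the HOST:PORT replacement described for the sample URL, the following sketch fills in the placeholder and checks that the tokens are left untouched and in their original order. It assumes that the three displayed lines of the sample URL form one URL; the host name, the port number, and both function names are hypothetical.

```python
import re

# Sample URL from the Shortcut link text box, joined into a single string
# (assumption: the line breaks in the sample are for display only).
SAMPLE_URL = (
    "https://HOST:PORT/AFUWeb/RetrieveDocument.do?"
    "r=<%DOCID_ENCRYPTED%>&repositoryID=<%REPOSITORY_ID_ENCRYPTED%>"
    "&sum=<%URL_CHECKSUM%>&filename=<%FILENAME%>"
)


def tokens(url: str) -> list:
    """Return the <%token_name%> tokens in the order in which they appear."""
    return re.findall(r"<%[A-Z_]+%>", url)


def build_shortcut_link(template: str, host: str, port: int) -> str:
    """Replace only the HOST:PORT placeholder; every token stays as it is."""
    if template.count("HOST:PORT") != 1:
        raise ValueError("template must contain exactly one HOST:PORT placeholder")
    link = template.replace("HOST:PORT", f"{host}:{port}")
    # Guard against accidental changes to the tokens or their order.
    if tokens(link) != tokens(template):
        raise ValueError("tokens were altered")
    return link


# Hypothetical host and port of the Web Application service.
link = build_shortcut_link(SAMPLE_URL, "icc.example.com", 11443)
print(link)
```

The resulting string is what you would enter in the Shortcut link text box; the tokens are resolved later by Content Collector, not by this sketch.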

Document Deduplication

Select Detect duplicates to work with Content Collector deduplication in addition to repository-based deduplication. If you do not select this option, all deduplication is handled by FileNet P8 internally, provided that suppression of duplicate content elements is enabled on the respective file storage area, or on the device layer. See the topic about deduplication for details.

From the Hash key metadata mapping lists, select a metadata class and property to use to detect duplicates. In task routes using email collectors, a hash key is automatically generated and available in these lists for duplicate detection. For task routes using other collectors, you must enable hash key generation as follows so that the hash key is available in these lists:
Table 2. Hash key generation
Type of collector | Where to enable hash key generation
File system collector | Collector configuration
SharePoint collector | SP Create File task
IBM Connections collector | CX Pre-processing task
Tip: Hash key based deduplication should be used only for IBM Connections items that contain one part, like files. Hash keys for items that consist of several parts are likely to differ even if the content of the items is identical.

Do not use hash key based deduplication with the SharePoint collector if you collect version series documents or Microsoft Office documents (Microsoft SharePoint changes the metadata that the collector uses to identify identical documents). Doing so slows your system and results in no deduplication.

Select one of the following algorithms to detect duplicates:
Always create document
This is the default. Content Collector tries to create a document in FileNet P8 without checking for duplicates first. If an object with the same ID already exists in the repository, creating another one would violate a uniqueness constraint, so the underlying database produces an error. The errors are recorded in the log file of the Web Application container. This results in a processing overhead.

Select this option if you expect only a small number of duplicate documents. It avoids the additional database load of running a database query for every document that is processed.

Check before creating document
Content Collector searches the repository for an object with the same ID before trying to create the document. In this case, no errors are produced when a duplicate is found, so nothing is written to the log. However, a database query is run for every document that is processed, which increases database load.

Select this option if you expect a large number of duplicate documents.

In either case, when a duplicate is found, the metadata of the ingested document is updated to show that it is a duplicate. With FileNet P8 deduplication, further document objects are added that all point to the same content within one storage area. With Content Collector deduplication, no further document objects are added to the repository. Therefore, when you browse or search for documents, each duplicate appears as a unique document if FileNet P8 managed the deduplication, whereas with Content Collector deduplication a single document object represents all instances of a document. Note also that with Content Collector deduplication, no FileNet P8 workflow is triggered for duplicates.
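
The trade-off between the two algorithms can be sketched with a small in-memory model. FakeRepository and DuplicateKeyError are hypothetical stand-ins for the repository and for the uniqueness-constraint error; they are not FileNet P8 or Content Collector APIs.

```python
class DuplicateKeyError(Exception):
    """Stands in for the uniqueness-constraint error from the database."""


class FakeRepository:
    # Hypothetical in-memory repository that counts queries and errors.
    def __init__(self):
        self._objects = {}
        self.queries = 0
        self.errors = 0

    def exists(self, object_id):
        self.queries += 1  # one database query per lookup
        return object_id in self._objects

    def create(self, object_id, metadata):
        if object_id in self._objects:
            self.errors += 1  # each error would also be logged
            raise DuplicateKeyError(object_id)
        self._objects[object_id] = metadata


def always_create(repo, object_id, metadata):
    """Create without checking; a duplicate surfaces as an error."""
    try:
        repo.create(object_id, metadata)
        return False  # not a duplicate
    except DuplicateKeyError:
        return True  # duplicate detected through the error


def check_before_create(repo, object_id, metadata):
    """Query first; create only if no object with this ID exists."""
    if repo.exists(object_id):
        return True
    repo.create(object_id, metadata)
    return False


def run(strategy, ids):
    repo = FakeRepository()
    duplicates = [strategy(repo, object_id, {}) for object_id in ids]
    return duplicates, repo.queries, repo.errors


ids = ["a", "b", "a", "c", "a"]
print(run(always_create, ids))        # 0 queries, 2 errors
print(run(check_before_create, ids))  # 5 queries, 0 errors
```

Both strategies flag the same items as duplicates. They differ only in where the cost falls: errors and log entries per duplicate in the first case, one query per document in the second.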

Tip: Set a decision point and rules immediately after the P8 Create Document task to specify what to do with duplicates and non-duplicates.

Property Mappings

The Document class list box lists all document classes in the selected object store. Select the document class that you want to use as the base document class when creating the object in the repository.
Important:
  • If document classes are added while IBM Content Collector Configuration Manager is running, you will need to restart the application to see the new classes in the list.
  • The default document class for email archiving in the P8 Create Document task is ICCMail2.

The property mappings table is populated with properties and values of the selected class, including custom properties. Edit the property mappings as required.

Mandatory properties are marked with an asterisk (*) next to the name. You have to configure mappings for these properties. Note that required FileNet P8 properties can have default values assigned in FileNet P8. Those FileNet P8 properties are not marked as being mandatory, and mapping such properties in IBM Content Collector is optional.
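
The rule for which properties need a mapping can be sketched as a simple check. This is an illustration under assumptions: the PropertyDef class and the property names are hypothetical, and the mapping values only imitate the <class, property> notation used in this topic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PropertyDef:
    # Hypothetical description of one property of a document class.
    name: str
    required: bool
    default: Optional[str] = None

    @property
    def marked_mandatory(self) -> bool:
        # A required property with a default value is not marked with an
        # asterisk, so mapping it is optional.
        return self.required and self.default is None


def unmapped_mandatory(
    properties: List[PropertyDef], mappings: Dict[str, str]
) -> List[str]:
    """Return the names of mandatory properties that have no mapping."""
    return [
        prop.name
        for prop in properties
        if prop.marked_mandatory and prop.name not in mappings
    ]


# Hypothetical class definition and mappings.
properties = [
    PropertyDef("Title", required=True),
    PropertyDef("Department", required=True, default="Unassigned"),
    PropertyDef("Comment", required=False),
    PropertyDef("Owner", required=True),
]
mappings = {"Title": "<File, File Name>"}

print(unmapped_mandatory(properties, mappings))  # ['Owner']
```

Department is required but has a default value, so it is not reported; only Owner still needs a mapping.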

Click Show "Hidden Properties" to display configurable properties that have been marked as hidden in the repository. Hidden properties are usually reserved for system-related or non-public information. To prevent accidentally adding a metadata value for a hidden property, these properties are, by default, not shown in the table.

To be able to map certain Content Engine system properties, your object store security must permit modification of these properties. See the topic about changing object store access rights for details. If the account that is used to start the IBM Content Collector FileNet P8 Connector service has the permission to modify system properties on the FileNet P8 Content Engine, the button Show "System Properties" is available in the configuration pane. Click this button to display and map settable FileNet P8 system properties such as:
  • Date Created, the date when the document is added to the repository
  • Creator, the login name of the user adding the document to the repository
  • Last Modifier, the name of the user who last modified the document
  • Date Last Modified, the date the document was last modified

The permission to modify system properties is required, for example, to set the file system document owner as the FileNet P8 document owner when archiving documents from a file system. This mapping is useful if you want to use the owner information for referencing documents or if your records management strategy uses owner-based metadata. To set the owner accordingly, map the FileNet P8 system property Creator <system> to the file system metadata property <File, Owner>.

Lotus Notes only: When Lotus Notes email is captured, the Folder metadata field is only filled with a name if the corresponding email was obtained through a folder, that is, if a collector for manual archiving is configured to monitor drag-and-drop folders or if a collector for automatic archiving is configured to include folders. For email that is collected in another way, this field will always be empty.

If you want Content Collector to determine the class dynamically, click Advanced and select Use an expression to determine the class in the Advanced Options window. Configure the expression by using the Expression Editor. Content Collector can also dynamically create property mappings for these classes based on user-defined metadata sources. You can select the properties for dynamic mapping in the Advanced Options window. For more information, see the topic about assigning FileNet P8 classes or property values dynamically.

Data correction

To adjust the data that Content Collector uses for property mappings, select one or both of these options:
Truncate strings
Any string metadata values that are set in the property mappings are truncated to fit the maximum length of the string property in FileNet P8, and a warning message is logged. If you do not select this option, the task fails if a string metadata value does not fit the FileNet P8 property to which it is mapped.

Ignore choice list properties on error
If an incoming metadata value does not match any of the values in the choice list that is associated with the FileNet P8 property to which the value is mapped, no value is set on the FileNet P8 property, and a warning message is logged. If you do not select this option, the task fails if a metadata value is not an acceptable value for the FileNet P8 property.
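
The behavior of the two data correction options can be sketched as follows. This is a simplified model, not Content Collector code: the function names and the MappingError exception are hypothetical, and measuring the maximum length in characters is an assumption.

```python
import logging
from typing import Optional, Set

log = logging.getLogger("data_correction_sketch")


class MappingError(Exception):
    """Stands in for a task failure caused by an unacceptable value."""


def apply_string(value: str, max_length: int, truncate: bool) -> str:
    """Model of the Truncate strings option."""
    if len(value) <= max_length:
        return value
    if not truncate:
        raise MappingError(f"value exceeds maximum length {max_length}")
    log.warning("Truncating value to %d characters", max_length)
    return value[:max_length]


def apply_choice(
    value: str, choices: Set[str], ignore_on_error: bool
) -> Optional[str]:
    """Model of the Ignore choice list properties on error option."""
    if value in choices:
        return value
    if not ignore_on_error:
        raise MappingError(f"{value!r} is not in the choice list")
    log.warning("Value %r is not in the choice list; property not set", value)
    return None  # no value is set on the property


print(apply_string("Quarterly report", 9, truncate=True))
print(apply_choice("Legal", {"Sales", "Finance"}, ignore_on_error=True))
```

With both options cleared, the same inputs raise MappingError, which corresponds to the task failing.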