What are local and external documents?
Let's start by clarifying the terms local document and external document.
An external document is a document that is stored in an external Enterprise Content Management (ECM) system. This can be FileNet Content Manager, IBM Content Manager, or a non-IBM product such as Microsoft SharePoint or Alfresco. These products (and more) support Content Management Interoperability Services (CMIS), an OASIS standard that IBM BPM uses to access the content (folders and documents) of the ECM system.
A process as defined in IBM BPM V8.5.7 always has a root process folder. Within this folder there can be local documents and folders as well as references to external documents and folders. The root process folder itself is a local folder.
How to add documents to local folders?
Within the Process editor in the web Process Designer you can define the initial folder structure of a process instance, which is established when the process starts. You cannot define an initial set of documents to be added. Instead, documents are either contained in an external folder that you reference in the initial folder structure, or they are added to the instance later. When configuring a local folder in the Process editor, you can define which artifacts can be added at runtime:
By default, no artifacts can be added at runtime. You can decide to allow local documents and external document references. If you enable this, end users working with the Document Explorer coach view (for example, as part of the Details UI for the process instance) will find the corresponding options in the (+) drop-down:
Choosing Add Local Document allows the end user to add either a document with file content, or to point to a URL:
Creating such a document stores it in the IBM BPM document store as a document attached to the current process instance. Optionally, generic properties can be defined in the Document Explorer configuration to classify the document. Look for
Choosing Add reference opens a common dialog that lets you add references to external documents or folders, depending on the settings that were made for the folder in the web Process Designer. In the dialog you choose from the available servers and browse through the folder hierarchy to select a document. Optionally, you can fill in the Reference As input to provide a name for the reference. If no name is provided, the document name is used:
Creating such a reference leaves the external document untouched. The IBM BPM system creates a reference (comparable to a link in a file system) and stores it in its database, so that the Document Explorer then shows the referenced documents among the contents of the local folder:
End users can now work with both kinds of entries in the same way: they can view or download, update, and delete the document, depending on their permissions.
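The relationship between a local folder, an external document, and a reference can be pictured with a small model. This is only an illustrative sketch in Python, not IBM BPM's API: the class names `ExternalDocument`, `Reference`, and `LocalFolder` and the `add_reference` method are hypothetical. It shows the two points described above: the reference name falls back to the document name, and the external document itself is not copied or changed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExternalDocument:
    """Hypothetical stand-in for a document that lives in an ECM system."""
    server_name: str
    document_id: str
    name: str


@dataclass
class Reference:
    """A link kept on the BPM side; it points at the document, it is not a copy."""
    name: str
    target: ExternalDocument


@dataclass
class LocalFolder:
    """Hypothetical stand-in for a local folder of a process instance."""
    name: str
    references: List[Reference] = field(default_factory=list)

    def add_reference(
        self, doc: ExternalDocument, reference_as: Optional[str] = None
    ) -> Reference:
        # If no "Reference As" name is given, fall back to the document name.
        ref = Reference(name=reference_as or doc.name, target=doc)
        self.references.append(ref)
        return ref
```

Adding the same document twice, once with and once without a reference name, yields two entries in the folder that both point at the one unchanged external document.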
In any case, you now have a document identifier and its server name, stored in the documentId and documentServerName variables. Next, add a Content Integration step to your service. Use the predefined IBM BPM managed store server and the Add document to folder operation.
Optionally, you can also specify a reference name here to give the document reference a defined name. If you do not provide this optional parameter, IBM BPM uses the name of the document instead:
The document that you specify here may also be a local document that you previously created using the Create document operation against the IBM BPM document store. As Document server name for those local documents, you can use the following constant in the input mapping for the Add document to folder operation:
When to use local, when external documents?
We have now discussed how to work with local and external documents. The question remains: when should you use local documents, and when external ones? To answer it, let's compare the two:
Let's look at the life-cycle of the documents first. Local documents are attached to a process instance. When that process instance is deleted, all its local documents and folders are deleted as well. So, for any document that you need to keep longer than your process instance lives, either create it directly in the external ECM system, or store it temporarily with the process instance and make sure to move it before the instance is deleted. We'll look more closely at how to move those documents in a later article. Also, if a document is relevant to more than a single process instance, it is better to create it in an external system. Technically, it is possible to create a local document associated with one process instance and add it to local folders of other process instances, but you then run into trouble with the next criterion: authorization.
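The life-cycle difference can be made concrete with a toy model. The sketch below is not how IBM BPM or any ECM system is implemented; `ProcessInstance`, `ToyStores`, and all method names are hypothetical, and two dictionaries stand in for the IBM BPM document store and the external ECM system. It only illustrates the rule described above: local documents disappear with their instance, while documents moved to the ECM side beforehand survive.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessInstance:
    """Hypothetical stand-in for a process instance and its root folder."""
    instance_id: str
    local_doc_ids: List[str] = field(default_factory=list)
    external_doc_ids: List[str] = field(default_factory=list)  # references only


class ToyStores:
    """Two dictionaries standing in for the BPM document store and an ECM system."""

    def __init__(self) -> None:
        self.bpm_store: Dict[str, bytes] = {}
        self.ecm_store: Dict[str, bytes] = {}

    def add_local(self, inst: ProcessInstance, doc_id: str, content: bytes) -> None:
        self.bpm_store[doc_id] = content
        inst.local_doc_ids.append(doc_id)

    def move_to_ecm(self, inst: ProcessInstance, doc_id: str) -> None:
        # Move the content to the ECM stand-in and keep only a reference,
        # so the document no longer depends on the instance's lifetime.
        self.ecm_store[doc_id] = self.bpm_store.pop(doc_id)
        inst.local_doc_ids.remove(doc_id)
        inst.external_doc_ids.append(doc_id)

    def delete_instance(self, inst: ProcessInstance) -> None:
        # Local documents are deleted with the instance; the references go
        # away too, but the documents in the ECM stand-in are left untouched.
        for doc_id in inst.local_doc_ids:
            self.bpm_store.pop(doc_id, None)
        inst.local_doc_ids.clear()
        inst.external_doc_ids.clear()
```

In this model, a draft that stays local is gone after `delete_instance`, whereas a final version passed through `move_to_ecm` first remains available in `ecm_store`.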
For local documents you have no control over authorization. Instead, the document inherits its permissions from the process instance. Everybody can see the document metadata, but only users who are allowed to see the process instance (admins, instance owners, task workers) can view the document content. This can cause trouble if you use local documents across process instances, because somebody who is allowed to see one process instance will be unable to open a document that comes from another instance that he or she is not allowed to view. Also, there is no way to restrict access further: every user who is allowed to see a process instance is allowed to see, update, and delete the local documents associated with that instance. If you think of privacy, you might not want everybody to see all documents. If you think of legal aspects, it may be critical if everybody is allowed to delete everything. ECM systems, on the other hand, let you define fine-grained access control for documents, limiting access to certain groups and restricting certain actions.
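The inherited-permission rule, and the cross-instance problem it causes, can be sketched as follows. This is a simplified model of the behavior described above, not IBM BPM code: `LocalDocument`, `User`, and the two check functions are hypothetical names, and the set `visible_instances` stands in for whatever makes a user an admin, instance owner, or task worker.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class LocalDocument:
    """Hypothetical local document; it belongs to exactly one instance."""
    name: str
    owning_instance: str


@dataclass
class User:
    name: str
    # Instances this user may see (assumed to be derived elsewhere).
    visible_instances: Set[str] = field(default_factory=set)


def can_see_metadata(user: User, doc: LocalDocument) -> bool:
    # Metadata of local documents is visible to everybody.
    return True


def can_access_content(user: User, doc: LocalDocument) -> bool:
    # View, update, and delete all follow the owning instance's visibility;
    # there is no per-document or per-action setting to refine this.
    return doc.owning_instance in user.visible_instances
```

If a document owned by instance A is added to a folder of instance B, a user who can see only B passes `can_see_metadata` but fails `can_access_content`, which is the trouble described above.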
In addition, ECM systems have other capabilities that you do not find in the IBM BPM document store. Some examples:
- To further prevent document deletion for legal reasons, retention policies can be defined that strictly enforce that certain types of documents are never deleted, or not deleted for a defined period.
- For local documents you can only define generic metadata as key-value pairs; ECM systems let you define different document types for different use cases. For example, an InsuranceContract type can be defined for storing the contracts for an insurance policy. Metadata properties such as contract number and customer number can be added to this document type to define a strongly typed schema.
- In the IBM BPM document store you have limited capabilities to search for documents because strongly typed metadata is missing. Also, full-text search within the document content is not enabled.
- ECM systems have auditing capabilities.
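The difference between generic key-value metadata and a strongly typed document type can be illustrated with a short Python sketch. It does not use any ECM or IBM BPM API; the property keys `contractNumber` and `customerNumber`, the field types, and the helper functions are assumptions chosen for the InsuranceContract example above.

```python
from dataclasses import dataclass
from typing import Dict, List

# Local document: generic key-value properties. Nothing enforces which keys
# exist or what their values mean (key names here are hypothetical).
local_properties: Dict[str, str] = {
    "contractNumber": "C-1001",
    "customerNumber": "42",
}


@dataclass(frozen=True)
class InsuranceContract:
    """A typed schema, as an ECM document type might define it."""
    contract_number: str
    customer_number: int


def from_generic(properties: Dict[str, str]) -> InsuranceContract:
    # Raises KeyError or ValueError if a property is missing or malformed,
    # which is exactly the check that untyped key-value pairs never get.
    return InsuranceContract(
        contract_number=properties["contractNumber"],
        customer_number=int(properties["customerNumber"]),
    )


def contracts_for_customer(
    contracts: List[InsuranceContract], customer_number: int
) -> List[InsuranceContract]:
    # With a typed schema, filtering by a business property is unambiguous.
    return [c for c in contracts if c.customer_number == customer_number]
```

With the typed form, a search such as "all contracts of customer 42" can rely on the property existing and being a number, which is what makes type-based filtering in an ECM system practical.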
Local documents also have the limitation that they cannot be used for document start events; we will look at those in one of the next articles.
Here is a summary comparing local and external documents:
| | Local documents | External documents |
|---|---|---|
| Life-cycle | Deleted when the process instance they are attached to is deleted | Independent lifetime |
| Authorization | Content can be viewed by anybody who is allowed to see the process instance; those users are also allowed to update and delete the document | Fine-grained access control model, depending on the capabilities of the ECM system |
| Classification | Only one available document type (IBM_BPM_Document), which has only generic properties as key-value pairs | Most ECM systems allow you to define a hierarchy of document types, which can have business data properties |
| Search | Only based on system-defined properties, for example the creator, or on the key-value pair properties | Filtering based on document type and its properties. Full-text search within the document content is also available, depending on the capabilities of the ECM system. |
| Events | Can be used as activity precondition, but not for document start events | Can be used as activity precondition and for document start events |
Conclusion: use local documents when you have temporary content that is relevant only for the associated process instance and has no value after the instance has ended. In all other cases, store the documents in the external ECM system. It is also fine to use local documents temporarily for process-instance-related content that is moved to the external ECM system once it is finalized. An example is a document that goes through a set of reviews and potential modifications during a process, where only the approved final version is moved to the external ECM system.
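The criteria from the comparison can be condensed into a small decision helper. This is only one possible way to encode the recommendation above, written as a Python sketch; the `DocumentNeeds` fields, the `recommend_storage` function, and the returned labels are hypothetical and not part of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentNeeds:
    """Requirements of a document; every field defaults to 'not needed'."""
    outlives_instance: bool = False
    shared_across_instances: bool = False
    needs_fine_grained_access: bool = False
    needs_retention_or_audit: bool = False
    needs_typed_metadata_or_full_text_search: bool = False
    triggers_document_start_event: bool = False


def recommend_storage(needs: DocumentNeeds, still_under_review: bool = False) -> str:
    # Any of these requirements cannot be met by a local document at all.
    if (
        needs.shared_across_instances
        or needs.needs_fine_grained_access
        or needs.needs_retention_or_audit
        or needs.needs_typed_metadata_or_full_text_search
        or needs.triggers_document_start_event
    ):
        return "external"
    if needs.outlives_instance:
        # Long-lived content may start out local while it is being reviewed,
        # as long as it is moved before the instance is deleted.
        return "local, then move to ECM" if still_under_review else "external"
    # Temporary content that matters only to this instance.
    return "local"
```

A scratch document with no special requirements comes out as "local", while a contract draft that must be kept after the process ends comes out as "local, then move to ECM" during review and "external" otherwise.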
Anti-pattern: local documents visible to all processes
For some of the aspects where I compared local and external documents, you might want to object: "Wait a minute, IBM BPM documents are not required to be associated with a process instance." That's correct. However, I generally do not recommend that usage pattern. If a document is not related to a process instance, you either misuse IBM BPM as a content management system, or you try to build something where a document can be used by multiple process instances, with all the garbage collection problems of getting rid of the documents once they are no longer needed.