Compound Documents

A Content Cortex compound document is a group of hierarchically organized component documents that can be assembled to form a single document. The root document at the top of the hierarchy is called the parent component; this component can have zero or more subcomponents, called child components. A child component can itself be the parent component for another compound document. Also, compound documents can share child components.

The Content Engine Java™ and .NET APIs expose compound document functions through the ComponentRelationship class and through Document class properties. ComponentRelationship objects provide the means for constructing compound documents. Each ComponentRelationship object establishes a relationship between one document as the designated parent component (the ParentComponent property) and another document as the designated child component (the ChildComponent or URIValue properties). Using multiple ComponentRelationship objects, you can network documents together in a number of different ways.

The ComponentRelationship class includes properties that provide support for:

  • Specifying the type of relationship between child and parent (see Component Relationship Types below).
  • Specifying whether all, some, or none of the child component relationship structure for an existing parent document version is copied to the next version of the parent. For more information and a code example, see Versioning Parents.
  • Specifying whether the latest version (major or minor) or only the latest major version of a child document is a candidate for binding (see Creating a Dynamic Component Relationship for a code example that uses the VersionBindType property).
  • Specifying a label that supports a dynamic label-based binding mechanism. For more information, see Label-based Binding.
  • Specifying the sort order of child documents (see Creating a Static Component Relationship for an example of using the ComponentSortOrder property). Specifying the sort order allows, for instance, a book to be assembled with the chapters in the right order.
  • Retrieving the version series of the child component document (the ChildVersionSeries property).
  • Deleting child components when a parent document is deleted, or preventing the deletion of a parent document, of a child document if the parent still exists, or of both parent and child documents. For more information and a code example, see Deleting Components.

The Document class includes compound document-related properties for:

  • Determining the child Document objects that are bound to a parent document (ChildDocuments property).
  • Determining all parent documents for a child component (ParentDocuments property).
  • Retrieving all ComponentRelationship objects that reference a document as the parent component document (ChildRelationships property).
  • Retrieving all ComponentRelationship objects that reference a document as the child component document (ParentRelationships property).
  • Classifying a document as a compound document parent (CompoundDocumentState property). A document must be classified as a compound document before a ComponentRelationship object can reference it as a parent component.

In addition, an add-on Document property, ComponentBindingLabel, specifies the value that is matched against the value of the LabelBindValue property on a ComponentRelationship object. For more information, see Label-based Binding.

Component Relationship Types

The APIs include three built-in mechanisms for binding documents together: STATIC_CR, DYNAMIC_CR, and DYNAMIC_LABEL_CR. These mechanisms are component relationship types that can be specified for a ComponentRelationship object (by using the ComponentRelationshipType property). Each component relationship can therefore be of a different type; there need not be one type for the entire compound document. Document binding automatically adds the bound version of the child component document (specified by the ChildComponent property) to the child documents collection on the parent document.

  • For a STATIC_CR relationship, the explicitly specified child document version always gets bound.
  • For a DYNAMIC_CR relationship, the latest version or the latest major version of the child document gets bound depending on the version bind rule in effect (as specified by the VersionBindType property).
  • For a DYNAMIC_LABEL_CR relationship, the version bind rule with label value matching determines the child document version bound (as determined by the value of the LabelBindValue property on the ComponentRelationship object and the value of the ComponentBindingLabel property on the Document object).

There is also a fourth type of component relationship: URI_CR. This type permits a component relationship to exist between a parent Document object and a child URI document (specified by the URIValue property). Strictly speaking, no document binding can occur in this case, because the child document is not a Document object and cannot be placed in the child documents collection on the parent.
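The binding rules above can be sketched as a small, self-contained Java model. It does not use the Content Engine API: DocVersion, RelType, VersionBind, and BindingModel are hypothetical names introduced only for illustration. The model also assumes that a version with a minor number of 0 is a major version, and that a DYNAMIC_LABEL_CR relationship binds the newest version that satisfies both the version bind rule and the label match. Actual server behavior may differ in these details.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for one version of a child document; not a Content Engine class.
final class DocVersion {
    final int major;
    final int minor;            // assumption of this model: minor == 0 marks a major version
    final String bindingLabel;  // models ComponentBindingLabel; null when unset

    DocVersion(int major, int minor, String bindingLabel) {
        this.major = major;
        this.minor = minor;
        this.bindingLabel = bindingLabel;
    }

    boolean isMajor() {
        return minor == 0;
    }
}

// Names mirror the relationship types and version bind rules described above.
enum RelType { STATIC_CR, DYNAMIC_CR, DYNAMIC_LABEL_CR, URI_CR }

enum VersionBind { LATEST_VERSION, LATEST_MAJOR_VERSION }

final class BindingModel {
    // Returns the version this model would bind, or empty if nothing qualifies.
    // The versions list must be ordered oldest to newest.
    static Optional<DocVersion> resolve(RelType type, VersionBind rule, String labelBindValue,
                                        DocVersion explicitChild, List<DocVersion> versions) {
        switch (type) {
            case STATIC_CR:
                // The explicitly specified version is always the one bound.
                return Optional.ofNullable(explicitChild);
            case DYNAMIC_CR:
                return newestMatch(versions, rule, null);
            case DYNAMIC_LABEL_CR:
                // Without a label bind value there is nothing to match against.
                return labelBindValue == null
                        ? Optional.<DocVersion>empty()
                        : newestMatch(versions, rule, labelBindValue);
            default:
                // URI_CR: the child is a URI, not a document, so nothing is bound.
                return Optional.empty();
        }
    }

    // Walks from newest to oldest and returns the first version that passes
    // the version bind rule and, when a label is given, the label match.
    private static Optional<DocVersion> newestMatch(List<DocVersion> versions,
                                                    VersionBind rule, String label) {
        for (int i = versions.size() - 1; i >= 0; i--) {
            DocVersion v = versions.get(i);
            boolean versionOk = rule == VersionBind.LATEST_VERSION || v.isMajor();
            boolean labelOk = label == null || label.equals(v.bindingLabel);
            if (versionOk && labelOk) {
                return Optional.of(v);
            }
        }
        return Optional.empty();
    }
}
```

In this model, a child with versions 1.0, 1.1, 2.0, and 2.1 binds 2.1 under DYNAMIC_CR with LATEST_VERSION and 2.0 under DYNAMIC_CR with LATEST_MAJOR_VERSION.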

Label-based Binding

Component relationships support a dynamic label-based binding mechanism (DYNAMIC_LABEL_CR). For this kind of component relationship, the version of the child component that gets bound is determined by the version binding rule (LATEST_VERSION or LATEST_MAJOR_VERSION) together with label matching: the value of the child document's ComponentBindingLabel property is matched against the value of the LabelBindValue property on the ComponentRelationship object.

The ComponentBindingLabel property is not defined on the base Content Engine Document class. It is added to the Document class by the Base Content Engine Extensions add-on, which is installed when an object store is created. As a result, the Content Engine Java API does not expose accessor methods (that is, set_ComponentBindingLabel and get_ComponentBindingLabel) for this property, and the Content Engine .NET API does not expose the property. To set and retrieve its value, use the methods on the Properties collection object of the child document.

Circular References

The top-down orientation of the compound document definition does not preclude circular component relationships. One ComponentRelationship object, for instance, could designate document X as the parent and document Y as the child, and another ComponentRelationship object might designate the opposite. X would then be both a parent and child with respect to Y (and vice versa). Or you might designate X as the parent for Y, Y as the parent for another document Z, and Z as the parent for X. The compound documents that exist in a circular chain of reference are still considered compound documents, even though any attempt to recursively assemble them would never end.
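Because a circular chain would make naive recursive assembly loop forever, an assembling application has to guard against it. The following is a minimal sketch of one common guard, assuming the relationships have already been read into a plain map from parent name to child names. CycleSafeAssembly and its methods are hypothetical and are not part of the Content Engine API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CycleSafeAssembly {
    // Returns document names in depth-first order starting at root.
    // A child that is already an ancestor on the current path is skipped,
    // so the walk terminates even when relationships form a cycle.
    static List<String> assemble(String root, Map<String, List<String>> children) {
        List<String> out = new ArrayList<String>();
        walk(root, children, new HashSet<String>(), out);
        return out;
    }

    private static void walk(String doc, Map<String, List<String>> children,
                             Set<String> path, List<String> out) {
        if (!path.add(doc)) {
            return; // doc is already on the current path: circular reference
        }
        out.add(doc);
        List<String> kids = children.get(doc);
        if (kids == null) {
            kids = Collections.<String>emptyList();
        }
        for (String child : kids) {
            walk(child, children, path, out);
        }
        path.remove(doc); // leaving doc, so a sibling branch may legitimately reuse it
    }
}
```

Tracking only the current path, rather than every document visited, means a child shared by two parents is still included under each parent, while a true cycle such as X to Y to Z and back to X is cut at the point where it would repeat.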

Example Use Cases

A book, for example, might be implemented as a compound document, where each child component contains the text for a chapter. The root parent component, representing the book as a whole, would have no associated text. The child chapter components might themselves be compound documents, and have logos as child subcomponents for inclusion into the chapter heading (during compound document assembly).

For example, the book's chapter 1 and chapter 2 components might share a logo child component. In that case, one ComponentRelationship object designates the chapter 1 document as the parent component (specified by the ParentComponent property) and the logo document as the child component (specified by the ChildComponent or the URIValue property), and another ComponentRelationship object designates the chapter 2 document as the parent and the same logo document as the child. The chapter 1 document is therefore the parent component for one compound document, and the chapter 2 document is the parent component for another; the two compound documents share the logo document as a child component. Keeping the logo as a separate, shared child component simplifies the task of incorporating logos into each chapter's heading.
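The book structure can be modeled in a few lines of plain Java to show how two relationships share one child and how a sort order yields the chapter sequence. This is only an illustrative sketch: Rel and BookModel are hypothetical classes, and the sortOrder field merely stands in for the ComponentSortOrder property.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of one component relationship: parent, child, and sort order.
final class Rel {
    final String parent;
    final String child;
    final int sortOrder; // stands in for ComponentSortOrder

    Rel(String parent, String child, int sortOrder) {
        this.parent = parent;
        this.child = child;
        this.sortOrder = sortOrder;
    }
}

final class BookModel {
    // Children of the given parent, ordered by ascending sort order.
    static List<String> childrenInOrder(String parent, List<Rel> rels) {
        List<Rel> mine = new ArrayList<Rel>();
        for (Rel r : rels) {
            if (r.parent.equals(parent)) {
                mine.add(r);
            }
        }
        mine.sort(Comparator.comparingInt((Rel r) -> r.sortOrder));
        List<String> names = new ArrayList<String>();
        for (Rel r : mine) {
            names.add(r.child);
        }
        return names;
    }

    // Every parent that references the given child, in relationship order.
    static List<String> parentsOf(String child, List<Rel> rels) {
        List<String> names = new ArrayList<String>();
        for (Rel r : rels) {
            if (r.child.equals(child)) {
                names.add(r.parent);
            }
        }
        return names;
    }
}
```

With relationships book to chapter1 (order 1), book to chapter2 (order 2), chapter1 to logo, and chapter2 to logo, childrenInOrder for the book lists the chapters in sequence, and parentsOf for the logo returns both chapters.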

As another example, consider a life insurance policy that has a legal paragraph that needs to be altered for each US state. The primary compound document for such a policy might consist of four components:

  • The root parent component; it has no associated text, and would represent the policy as a whole.
  • The first child component; it has all the text before the legal paragraph.
  • The second child component; the parent component for another compound document that represents the legal paragraph.
  • The third child component; it has all the text after the legal paragraph.

With the compound document structured in this way, you do not have to parse any text to find the legal paragraph. Assembling the policy document for a particular state would consist of substituting the appropriate legal text for the second child component and then merging the three child components.

In this example, the second child component in the primary compound document is the parent of the compound document that represents the legal text. This parent component might have 50 child documents, with a DYNAMIC_LABEL_CR relationship to each of them. Each child document would hold the text for a specific state and carry a matching label value, such as "AZ" for Arizona. To generate a policy document for Arizona, for example, set the LabelBindValue property on the component relationship objects to "AZ". Doing so causes only the expected document, the one for Arizona, to appear in the child documents collection on the parent document. Substituting the correct text is therefore a matter of setting label values and iterating through the child documents collection.
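The selection and merge steps for the policy can be sketched with a self-contained Java model. StateParagraph and PolicyAssembly are hypothetical names, not Content Engine classes, and the label filter here only imitates the effect that label-based binding has on the child documents collection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a state-specific legal paragraph document.
final class StateParagraph {
    final String bindingLabel; // models ComponentBindingLabel, for example "AZ"
    final String text;

    StateParagraph(String bindingLabel, String text) {
        this.bindingLabel = bindingLabel;
        this.text = text;
    }
}

final class PolicyAssembly {
    // Imitates label-based binding: only candidates whose label equals
    // labelBindValue end up in the returned "child documents" list.
    static List<StateParagraph> boundChildren(List<StateParagraph> candidates,
                                              String labelBindValue) {
        List<StateParagraph> bound = new ArrayList<StateParagraph>();
        for (StateParagraph p : candidates) {
            if (labelBindValue != null && labelBindValue.equals(p.bindingLabel)) {
                bound.add(p);
            }
        }
        return bound;
    }

    // Merges the text before the legal paragraph, the single bound paragraph,
    // and the text after it, separated by newlines.
    static String assemble(String before, List<StateParagraph> candidates,
                           String labelBindValue, String after) {
        List<StateParagraph> bound = boundChildren(candidates, labelBindValue);
        if (bound.size() != 1) {
            throw new IllegalStateException(
                    "expected exactly one bound paragraph, found " + bound.size());
        }
        return before + "\n" + bound.get(0).text + "\n" + after;
    }
}
```

The sketch fails loudly when zero or several paragraphs match, on the assumption that a policy with a missing or ambiguous legal paragraph should never be produced silently.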