Content Reuse

Best Practices

Introduction

Content reuse as a best practice takes various forms and can occur at various points in the development and delivery chain.

Content Reuse and Repurposing

Content reuse is a highly valuable best practice for several reasons, including, but not limited to:

Reduction of duplicate content
Reduced cost of development, review, and maintenance
Improved consistency and quality
Reduced translation and translation review costs

Content reuse takes various forms and can occur at various points in the development and delivery chain. The most common forms include:

Reuse by reference
- By simple linking
- Using property-based retrieval (include-like)
Reuse by substitution
- Statically during content build
- Dynamically during content delivery
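
The difference between static and dynamic substitution can be sketched as follows. This is a minimal illustration that assumes placeholders written in the syntax of Python's `string.Template`; the `$product` and `$edition` names and the sample values are hypothetical and do not come from any IBM tool.

```python
from string import Template

# Hypothetical source text with placeholders for reusable values.
SOURCE = Template("Install $product ($edition) before you begin.")


def build_static(values: dict[str, str]) -> str:
    """Resolve every placeholder once, at content build time."""
    return SOURCE.substitute(values)


def render_dynamic(build_values: dict[str, str], request_values: dict[str, str]) -> str:
    """Resolve some placeholders at build time and the rest at delivery time."""
    # safe_substitute leaves unknown placeholders in place for a later pass.
    partially_built = Template(SOURCE.safe_substitute(build_values))
    return partially_built.substitute(request_values)


static_page = build_static({"product": "Example Product", "edition": "Enterprise"})
dynamic_page = render_dynamic({"product": "Example Product"}, {"edition": "Small Business"})
print(static_page)
print(dynamic_page)
```

In the static case the output is fixed when the content is built; in the dynamic case the same built content can still vary for each reader or request.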

Types of content reuse

Content reuse can be simple or highly sophisticated. Not all reuse is the same, and the reuse mechanisms available to the content creator depend on the system, tools, and source encoding used. The most common flavors of reuse include:

Reuse by direct link
Reuse by copy
Derivative reuse
Content referencing
Content re-purposing

Reuse by linking occurs when a content creator creates a direct link to existing content. This is the most basic form of content reuse. It has the benefit of reusing the original content, but the person reusing the content cannot modify it. Risks of direct linking include changes to the linked content of which the person reusing it is unaware, and removal of the target content, which results in a dead link.
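
The dead-link risk described above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `[[topic-id]]` link syntax and a known set of published topic IDs; neither is taken from a specific authoring tool.

```python
import re

# Hypothetical link syntax: [[topic-id]] marks a direct link to another topic.
LINK_PATTERN = re.compile(r"\[\[([A-Za-z0-9_-]+)\]\]")


def find_dead_links(text: str, published_ids: set[str]) -> list[str]:
    """Return link targets in text that are not in the published set."""
    targets = LINK_PATTERN.findall(text)
    return [target for target in targets if target not in published_ids]


published = {"install-overview", "glossary"}
page = "See [[install-overview]] and [[legacy-setup]] for details."
print(find_dead_links(page, published))
```

A check like this catches removed targets, but it cannot detect that the linked content has changed in meaning.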

Reuse by copy occurs when a content creator makes a direct copy of existing information and reuses that content in an entirely different deliverable without linkage back to the original. The person or system reusing content by direct copy might or might not make minor or substantial changes to the original, but the copy does not stay in sync with the original. While the copy is immune from changes to the original, it risks becoming inconsistent with the original if any dependencies exist that require the two to stay aligned.

Derivative reuse occurs when a content creator makes changes to existing information based on the context in which the information is being reused, but where any updates to the original information should be communicated to the author of the derivative. Although the author has made changes, the substance is the same and any changes to the original information should be considered for the derivative. Currently, many authors copy technical information from one information unit to another, then make changes to the copy. Over time, the information in the copy becomes inaccurate and stale because the original is maintained better. Examples of this include conceptual topics and definitions that have been copied. The originals are maintained through the normal product update process, but the copies are not, resulting in differing versions of the same topic. This causes confusion for the reader and, in the worst case, the reader is misled by inaccurate information.
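
One way to notice that an original has changed since a derivative was made is to record a fingerprint of the original at the time of copying. The sketch below assumes that approach; the function names and sample text are hypothetical, and the comparison only signals that a review is due, not what changed.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a stable digest of a topic's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def original_changed(recorded_fingerprint: str, current_original: str) -> bool:
    """True when the original no longer matches what the derivative was based on."""
    return fingerprint(current_original) != recorded_fingerprint


original_v1 = "A topic is a self-contained unit of information."
recorded = fingerprint(original_v1)  # stored when the derivative is created

original_v2 = "A topic is a self-contained, reusable unit of information."
print(original_changed(recorded, original_v1))
print(original_changed(recorded, original_v2))
```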

Content referencing is a form of source content reuse in which the person reusing content uses a programmatic mechanism to include source content by reference. The included content is resolved at the time of content transformation (static build) or during content rendering (static or dynamic). This form of reuse resembles the INCLUDE mechanism found in many programming languages. Content referencing is a form of direct reuse, and is one of the most sophisticated, automated, and reliable methods of reuse. Direct reuse can be done inline in the content source or be machine-managed, depending on the technology employed. Direct source referencing models, such as DITA or Markdown, each provide a mechanism to include or substitute reusable content, from units as small as a string of characters to entire topics and topic collections. The main benefit of content referencing is the certainty that the reused content stays synchronized with the original. Content referencing can also be combined with conditional inclusion (often called conditional text) to create variations of the original that permit derivative changes to select portions while maintaining reuse of the common sections.
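
The combination of content referencing and conditional inclusion can be sketched as follows. This is an illustration only: the `{{include:...}}` syntax, the fragment ID, and the `enterprise` condition are hypothetical and are not the syntax of DITA or of any Markdown flavor.

```python
import re

# Hypothetical include syntax: {{include:fragment-id}} pulls in a shared fragment.
INCLUDE_PATTERN = re.compile(r"\{\{include:([A-Za-z0-9_-]+)\}\}")

# Shared fragments. Each fragment is a list of (condition, text) pairs;
# a condition of None means the text applies to every variant.
FRAGMENTS = {
    "backup-warning": [
        (None, "Back up your data before you upgrade."),
        ("enterprise", " Include all cluster nodes in the backup."),
    ],
}


def resolve(source: str, active_conditions: set[str]) -> str:
    """Replace each include with the fragment text that matches the active conditions."""

    def expand(match: re.Match) -> str:
        fragment = FRAGMENTS[match.group(1)]  # KeyError signals a broken reference
        return "".join(
            text
            for condition, text in fragment
            if condition is None or condition in active_conditions
        )

    return INCLUDE_PATTERN.sub(expand, source)


topic = "Upgrade steps. {{include:backup-warning}}"
print(resolve(topic, set()))
print(resolve(topic, {"enterprise"}))
```

Because the fragment is stored once and resolved at build or render time, every topic that references it picks up a correction to the fragment without further editing.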

Content re-purposing is a higher form of content referencing and reuse. It is most common among topic-oriented content practitioners, where entire topics are reused in different collections. In this type of reuse, a collection mechanism includes and sequences whole topics, an approach that is most common in structured content systems. For example, the writers of one IBM library that required the equivalent of 80 thousand topics to generate eight different editions of an IBM product offering (such as small business, enterprise edition, and so on) were able to reduce their total source content to as little as eight thousand topics by creating eight collection 'maps' that each included only the topics that applied to a particular edition. When re-purposing is also combined with content referencing and conditional text, content professionals can maximize reuse at all levels.
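
The collection 'map' idea can be sketched with a small shared topic pool and one map per edition. The topic IDs, titles, and edition names below are hypothetical and far smaller than the IBM library described above; the sketch only shows why the number of source topics can be much lower than the number of topics delivered.

```python
# Hypothetical topic pool and edition maps; IDs and editions are illustrative only.
TOPICS = {
    "install": "Installing the product",
    "configure": "Configuring the product",
    "cluster": "Setting up a cluster",
    "billing": "Managing billing",
}

MAPS = {
    "small-business": ["install", "configure", "billing"],
    "enterprise": ["install", "configure", "cluster"],
}


def build_edition(edition: str) -> list[str]:
    """Return the topic titles for one edition, in map order."""
    return [TOPICS[topic_id] for topic_id in MAPS[edition]]


def reuse_summary() -> tuple[int, int]:
    """Return (topics delivered across all editions, unique source topics used)."""
    delivered = sum(len(ids) for ids in MAPS.values())
    unique = len({topic_id for ids in MAPS.values() for topic_id in ids})
    return delivered, unique


print(build_edition("enterprise"))
print(reuse_summary())
```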

Tools and systems that enable reuse

The type and degree of reuse available to content creators and owners vary depending on the tools and formats used. These tools fall into the following categories:

Structured authoring (direct source control)
Reuse discovery
CMS-managed reuse

Each option provides different capabilities that suit some document types better than others. The presentation AuthoringStrategies_v10.pdf provides a cogent overview of the trade-offs between DITA and Markdown.

Structured authoring

Structured authoring markup languages excel at advanced reuse and re-purposing at scale, and typically offer the most efficient approach to author, manage, and maintain long document types such as user guides and online help. Markup languages are typically used with direct source authoring systems, and offer powerful reuse and re-purposing capabilities such as reuse by reference, content substitution, conditional inclusion/exclusion, content containment, sophisticated collection management, and more. The most common structured formats used across IBM include:

XML
DITA
Markdown

Reuse discovery

Reuse discovery is typically enabled through some type of content management system or content reuse registry. In IBM Information Development, production systems that enable reuse discovery include the IBM Asset Reuse Manager (ARM) tool.

CMS-managed reuse

ID Content Management System (IDCMS)

Built upon IBM FileNet, IDCMS is XML-aware, meaning that it reads and automatically understands the structure and relationships between documents written in XML. This extremely powerful capability enables a degree of discovery and processing automation on a large scale. IDCMS alone contains more than two million source topics, making virtually all of them discoverable and reusable by anyone across the enterprise. IDCMS also makes use of highly automated Digital Content Services (DCS) that can generate many different types of output from the same single source. Because IDCMS is XML (DITA) and collection-aware, it automates many time-consuming processes, especially for large collections, such as automated trademarking, automated accessibility checking, automatic packaging and sending for translation, and automated build and publish, among others. IDCMS is content-reference and content-collection aware. IDCMS can be used to manage collection reuse in the CMS itself. For example, if a new release of an IBM offering requires a new edition of a content collection, the content creators need only create the delta topics for the new release and let the CMS reuse all of the unmodified topics from one or more prior collections that reside in IDCMS. To learn more about IDCMS, see https://ibmdocs-test.dcs.ibm.com/docs/en/dcs-doc
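
The delta-topic approach can be sketched as an overlay of new or changed topics on a prior collection. This is a conceptual illustration under simplifying assumptions (collections as dictionaries keyed by topic ID); it does not represent how IDCMS is implemented, and all names and values are hypothetical.

```python
def build_release(prior: dict[str, str], delta: dict[str, str]) -> dict[str, str]:
    """Overlay new or changed topics on a prior collection, reusing the rest."""
    release = dict(prior)   # start from every topic in the prior collection
    release.update(delta)   # delta topics replace or extend the prior ones
    return release


def reused_ids(prior: dict[str, str], delta: dict[str, str]) -> list[str]:
    """Return the IDs of prior topics carried forward unmodified."""
    return sorted(topic_id for topic_id in prior if topic_id not in delta)


release_1 = {"install": "Install v1", "configure": "Configure v1", "faq": "FAQ v1"}
delta_2 = {"install": "Install v2", "migrate": "Migrate from v1 to v2"}

release_2 = build_release(release_1, delta_2)
print(sorted(release_2))
print(reused_ids(release_1, delta_2))
```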

Drupal

[to be added]

Salesforce

Salesforce in IBM Support has implemented the Knowledge Bar to surface and capture content reuse. An overview blog on the topic is here.

Review the KCS in Support community for more information on Knowledge-Centered Service in IBM Support.

Resources

The specific systems and instrumentation employed are typically associated with the specific content domain (IBM Support, IBM ID, Marketing, and so on) or with the content type. The systems and instrumentation available and used in each are described in the associated best practices section of this site.