From the IBM WebSphere Developer Technical Journal.
The IBM Content Model and API provide a starting point for collaborative application development. While not officially part of the public API at this time, it is available for early adopters who need to build custom content applications.
Built around a services-oriented architecture (SOA), the programming model provides applications with access to collaborative content using Service Data Objects (SDO).
It provides Create, Read, Update and Delete (CRUD) operations, and it supports additional services to enable applications to manage advanced content management features such as versioning and workflow.
A common persistence model is also supported by the API to promote collaboration between applications on content. This data model is used internally by WebSphere Portal for many of its content-based applications.
The purpose of having a content model is to specify the elements that content applications have in common:
- Common metadata
- Common aggregation patterns
Therefore, we introduced a standard model for content applications, which standardizes the properties and relationships between content items such as "document" and "folder" and defines their intended usage pattern.
Part of the process of defining the model was to standardize on a set of metadata properties (such as title, description, authors, and so on) that are common across all content types. Within the content model, the items are primarily organized into three distinct groups: base, metadata, and content.
- Base types define the core types from which all other types extend, and provide all common properties that the content will support.
- Metadata types are aggregated by content types, alongside the base types, to provide well-known properties that apply to a particular kind of content. The metadata types have properties that further describe the state of the content beyond the base set of properties. These metadata properties can include owner information, information about when the content was published, categorization information, and so on.
- Content types are the model types that represent and contain the "interesting" application data. Some of the content types include CollaborativeDocument, Rendition, and Draft.
Figure 1 shows a UML diagram which is a subset of the storage model used to persist document content. It illustrates the properties and relationships among the content, metadata, and base types.
Figure 1. The document model
Table 1 shows the foundation classes upon which the content model is based.
Table 1. Base types which are part of the base IBM Content Model
| Type | Description |
|---|---|
| Content | Base content node. All extensions must subclass below this node. |
| Item | Content node allowing versioning, referencing, and control of text search |
| Element | Content node which must be contained by an Item. Used to build part-whole structures. Element nodes cannot stand by themselves and must exist under some parent Item. |
| Component | Item type which allows storage of Elements |
| Collection | Allows hierarchical Item structures |
| CompositeComponent | Allows hierarchical Component structures |
| CompositeElement | Allows hierarchical Element structures |
| Category | Provides hierarchical taxonomy structures |
| DynamicCollection | Provides a collection of content that is driven by a query |
| Folder | Collection that provides storage of Documents and (sub) Folders |
| DocumentLibrary | Special type of Folder. Encapsulates capabilities (workflow, locking, etc) for all content in the collection |
| Document | Special CompoundItem for document storage |
| BinaryElement | Storage of simple content |
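The containment rules in Table 1 can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written for this illustration, not the types shipped with the Content API. They assume only what the table states: an Element must exist under a parent Item, a Document stores Elements, and a Folder stores Documents and sub-Folders.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the Item, Element, Document, and Folder types in Table 1.
// These are NOT the classes shipped with the Content API.
final class ContainmentSketch {

    /** An Item can stand by itself in the repository. */
    static class Item {
        final String label;
        Item(String label) { this.label = label; }
    }

    /** An Element cannot stand by itself, so a parent Item is required at construction. */
    static final class Element {
        final Item parent;
        final String name;
        Element(Item parent, String name) {
            if (parent == null) {
                throw new IllegalArgumentException("An Element must exist under a parent Item");
            }
            this.parent = parent;
            this.name = name;
        }
    }

    /** A Document is an Item that stores Elements (a part-whole structure). */
    static final class Document extends Item {
        private final List<Element> elements = new ArrayList<>();
        Document(String label) { super(label); }
        Element addElement(String name) {
            Element element = new Element(this, name);
            elements.add(element);
            return element;
        }
        List<Element> getElements() { return Collections.unmodifiableList(elements); }
    }

    /** A Folder is a collection that stores Documents and sub-Folders. */
    static final class Folder extends Item {
        private final List<Item> children = new ArrayList<>();
        Folder(String label) { super(label); }
        void add(Document document) { children.add(document); }
        void add(Folder folder) { children.add(folder); }
        List<Item> getChildren() { return Collections.unmodifiableList(children); }
    }

    /** Counts the Documents in a folder and in all of its sub-folders. */
    static int countDocuments(Folder folder) {
        int count = 0;
        for (Item child : folder.getChildren()) {
            if (child instanceof Document) {
                count++;
            } else if (child instanceof Folder) {
                count += countDocuments((Folder) child);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Folder root = new Folder("root");
        Folder drafts = new Folder("drafts");
        Document report = new Document("Status Report.doc");
        report.addElement("body");
        drafts.add(report);
        root.add(drafts);
        root.add(new Document("Readme.txt"));
        System.out.println(countDocuments(root)); // prints 2
    }
}
```

Requiring the parent in the Element constructor is one way to enforce the "cannot stand by itself" rule at compile time; the real model may enforce it differently.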
Table 2 defines structured groups of properties used by multiple types.
Table 2. Metadata types
| Type | Description |
|---|---|
| MetaData | Parent node for all MetaData nodetypes |
| AuditData | Provides support for tracking changes. Includes information like who created or last modified a document. |
| DynamicData | Support for application defined properties |
| PeopleData | Provides content owners and authors |
| PublishData | Meta-data related to publishing content. Includes information like the effective date or the expire date of a document. |
| CategoryData | Content categorization support. Holds a list of categories to which the content belongs |
| DescriptionData | Description meta-data. Includes a title and a description for the content. |
| StandardData | Standard meta-data for most content. Collection of the category, description, publish, audit and people meta-data |
Table 3 shows extensions to the base content model to facilitate development of collaborative applications using content repositories.
Table 3. Enhanced content types (extensions to the base IBM content model)
| Type | Description |
|---|---|
| CollaborativeDocument | Document extension with Portal Document Manager specific draft and rendition support |
| Rendition | Copy of Collaborative Document translated for viewing as a particular mime type |
| Draft | Document drafts |
| SearchTemplate | User Interface view control data. Primarily used by Portal Document Manager to hold settings used to select content to view |
The Content API architecture comprises several layers, as shown in Figure 2. These layers can be categorized into three main areas: Service, Transport, and Mediators. Custom applications can also leverage the Binary Property Service. Each of these layers and components is discussed in detail below.
Figure 2. Content API service layers
We'll refer to the first two layers as the high-level and low-level service layers. Custom applications you write using the Content API will primarily interact with these two layers. The high-level layer currently consists of the Document Management (DM) Service and the Library Service. These services do not interact with the transport layer, but do interact with the low-level service layer. They encapsulate typical actions that document-centric custom applications might take. Custom applications can accomplish the same functionality by interacting directly with the low-level services; both high-level services are provided to speed the development of document-centric applications.
The DM Service provides functionality that is specific to CollaborativeDocument and Draft objects. Document-centric applications, such as the WebSphere Portal Document Management Portlet, leverage this service heavily. The DM Service allows applications to create, delete, and publish drafts. It also contains methods to retrieve CollaborativeDocument and Draft objects, and to create document libraries. Applications can use the LibraryService to manipulate document libraries, including copying, deleting, and retrieving libraries. It also enables applications to manipulate the categories associated with a library.
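The relationship between the two service layers can be sketched as follows. This is an illustration only: `LowLevelContentService`, `InMemoryContentService`, and `DraftService` are hypothetical names, and the draft semantics (a draft stored under the document path, copied over the document on publish) are assumptions made for the example rather than the behavior of the real DM Service. The point is that the high-level service is built entirely from low-level calls, so an application could perform the same steps itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a high-level service layered on a low-level one.
// None of these types belong to the actual Content API.
final class ServiceLayerSketch {

    /** Hypothetical low-level service: generic operations on repository paths. */
    interface LowLevelContentService {
        void save(String path, String content);
        String get(String path);
        void delete(String path);
    }

    /** In-memory stand-in for the repository behind the low-level service. */
    static final class InMemoryContentService implements LowLevelContentService {
        private final Map<String, String> store = new HashMap<>();
        public void save(String path, String content) { store.put(path, content); }
        public String get(String path) { return store.get(path); }
        public void delete(String path) { store.remove(path); }
    }

    /** Hypothetical high-level service: draft operations composed only of low-level calls. */
    static final class DraftService {
        private final LowLevelContentService content;

        DraftService(LowLevelContentService content) { this.content = content; }

        /** Assumption: a draft is stored under the document's path. */
        void createDraft(String documentPath, String text) {
            content.save(documentPath + "/draft", text);
        }

        /** Assumption: publishing copies the draft over the document, then removes the draft. */
        void publishDraft(String documentPath) {
            String draft = content.get(documentPath + "/draft");
            if (draft == null) {
                throw new IllegalStateException("No draft for " + documentPath);
            }
            content.save(documentPath, draft);
            content.delete(documentPath + "/draft");
        }
    }

    public static void main(String[] args) {
        LowLevelContentService content = new InMemoryContentService();
        DraftService drafts = new DraftService(content);
        drafts.createDraft("/library/report", "v1 text");
        drafts.publishDraft("/library/report");
        System.out.println(content.get("/library/report")); // prints "v1 text"
    }
}
```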
You can create custom service layers and then easily plug them into the Content API.
The low-level services are:
- LoginService. Applications must be authenticated prior to using the Content API, and results retrieved from the Content API are based on the user's authorization. Content stored in the repository can have various access limits set within the portal, and the Content API respects those access limits. Upon successful login, the LoginService will return a LoginContext to the application. Most Content API calls require a LoginContext to be passed in.
- ContentService. Includes functionality such as copying, moving, deleting, saving, and performing text searches for content within the repository. Applications can also retrieve a single item or a tree (graph) of items from the repository using this service. Lastly, applications can use this service to lock and unlock content within the repository.
- VersionService. Allows applications to version nodes within the repository, restore a versioned node, add labels to versioned nodes, and retrieve a version history of a specified node.
- WorkflowService. The content model provides basic workflow functionality composed of processes and defined tasks within a process. WorkflowService enables applications to start and cancel workflow processes. Applications can use it to claim, un-claim, retrieve, approve, and complete tasks within a process.
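As a rough illustration of the operations the VersionService description lists (versioning a node, labeling a version, restoring a version, and retrieving the history), here is a small in-memory sketch. The class and its method names are hypothetical and do not reflect the VersionService interface; version numbering that starts at 1 is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a versioned node; not the VersionService API.
final class VersionHistorySketch {
    private final List<String> versions = new ArrayList<>();
    private final Map<String, Integer> labels = new HashMap<>();
    private String current;

    VersionHistorySketch(String initialContent) { this.current = initialContent; }

    void setContent(String content) { current = content; }
    String getContent() { return current; }

    /** Snapshots the current content and returns its version number (assumed to start at 1). */
    int checkpoint() {
        versions.add(current);
        return versions.size();
    }

    /** Attaches a label to an existing version. */
    void addLabel(int version, String label) {
        requireVersion(version);
        labels.put(label, version);
    }

    /** Replaces the current content with the content of the given version. */
    void restore(int version) {
        requireVersion(version);
        current = versions.get(version - 1);
    }

    /** Restores the version that carries the given label. */
    void restoreLabel(String label) {
        Integer version = labels.get(label);
        if (version == null) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        restore(version.intValue());
    }

    /** Returns a copy of the version history, oldest first. */
    List<String> history() { return new ArrayList<>(versions); }

    private void requireVersion(int version) {
        if (version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("No such version: " + version);
        }
    }

    public static void main(String[] args) {
        VersionHistorySketch node = new VersionHistorySketch("first draft");
        int v1 = node.checkpoint();
        node.addLabel(v1, "approved");
        node.setContent("second draft");
        node.checkpoint();
        node.restoreLabel("approved");
        System.out.println(node.getContent()); // prints "first draft"
        System.out.println(node.history().size()); // prints 2
    }
}
```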
Below the service layers are the transport layers. Currently, the Content API only supports a local transport layer. The code calling the service layer must reside in the same Java™ VM as the back-end mediator code that accesses the repository.
The lowest layer in the architecture is the mediator layer which performs the data access with the repository. The mediator layer is also defined as part of the Service Data Object (SDO) specification (see Resources).
The mediator layer accepts and returns SDO data graphs. The data graph represents the data being sent to, or retrieved from, the repository. Upon persisting changes to the repository, mediators process changes within the data graph and send those changes to the repository. When retrieving data, mediators convert data from the repository into a data graph. The data graph is then returned and processed by the calling client.
A common use-case in applications that interact with a content repository is to display or download content directly from a URL. Given a unique identifier, the Binary Property Service can generate a URL which, when invoked, retrieves the content from the repository and streams the content to the user in the response to the request.
The Content Model is an Eclipse Modeling Framework (EMF, http://www.eclipse.org/emf) implementation of SDO v1.0. We encourage developers wanting to build on top of the Content API to familiarize themselves with these specifications. This article does not attempt to explain these technologies in detail. However, a brief overview is necessary in order to discuss the Content Model implementation. Among other things, the EMF provides code generation tooling. Given a UML model, the EMF can generate code that is an implementation of the model. The code that EMF produces is also an SDO implementation. The Content Model was first defined in UML and EMF was used to produce an SDO implementation.
The SDO specification was defined by IBM, BEA, and others to provide a standardized way to represent data and for applications to interact with that data. It defines a programming paradigm as well as an API. SDO provides a generic way for applications to interact with any object model and the data contained within the objects. Any object model can be thought of as a set of objects and the attributes that are defined on each of the objects. The attributes of an object are either primitive types or references to other objects. SDO represents data at this level of abstraction.
At the core, SDO can be broken down into three main components:
- A DataObject represents any object in a model.
- An attribute of an object is represented by a Property, which can be a primitive type or a reference to another object.
- A DataGraph is a container for a group of DataObjects.
For example, a DocumentLibrary can contain folders, which in turn can contain Documents. A Document has attributes such as title, author, and description. If an application queries the repository using the Content API for a particular set of documents, they are returned in the form of a DataGraph. The DataGraph would contain a tree or graph of DataObjects that satisfied the query.
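To make the library, folder, and document example concrete, the sketch below models the three SDO concepts with plain Java collections. `SimpleDataObject` and `SimpleDataGraph` are simplified, hypothetical stand-ins and not the `commonj.sdo` interfaces; the property names `folders`, `documents`, and `title` are assumptions chosen to match the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for the SDO concepts; not the commonj.sdo interfaces.
final class DataGraphSketch {

    /** Stand-in for a DataObject: a type name plus a set of named properties. */
    static final class SimpleDataObject {
        final String typeName;
        private final Map<String, Object> properties = new LinkedHashMap<>();

        SimpleDataObject(String typeName) { this.typeName = typeName; }

        /** A property value is either a simple value or a reference to other objects. */
        SimpleDataObject set(String property, Object value) {
            properties.put(property, value);
            return this;
        }

        String getString(String property) { return (String) properties.get(property); }

        /** Returns the referenced objects, or an empty list if the property is not set. */
        @SuppressWarnings("unchecked")
        List<SimpleDataObject> getList(String property) {
            Object value = properties.get(property);
            if (value == null) {
                return new ArrayList<>();
            }
            return (List<SimpleDataObject>) value;
        }
    }

    /** Stand-in for a DataGraph: a container that holds the root of the object tree. */
    static final class SimpleDataGraph {
        final SimpleDataObject root;
        SimpleDataGraph(SimpleDataObject root) { this.root = root; }
    }

    /** Walks the tree and collects the title of every Document it finds. */
    static List<String> collectTitles(SimpleDataObject node) {
        List<String> titles = new ArrayList<>();
        if ("Document".equals(node.typeName)) {
            titles.add(node.getString("title"));
        }
        for (SimpleDataObject folder : node.getList("folders")) {
            titles.addAll(collectTitles(folder));
        }
        for (SimpleDataObject document : node.getList("documents")) {
            titles.addAll(collectTitles(document));
        }
        return titles;
    }

    /** Builds a library that contains one folder with two documents. */
    static SimpleDataGraph sampleGraph() {
        SimpleDataObject report = new SimpleDataObject("Document").set("title", "Status Report");
        SimpleDataObject budget = new SimpleDataObject("Document").set("title", "Budget");
        SimpleDataObject drafts = new SimpleDataObject("Folder")
                .set("label", "drafts")
                .set("documents", Arrays.asList(report, budget));
        SimpleDataObject library = new SimpleDataObject("DocumentLibrary")
                .set("folders", Arrays.asList(drafts));
        return new SimpleDataGraph(library);
    }

    public static void main(String[] args) {
        System.out.println(collectTitles(sampleGraph().root)); // prints [Status Report, Budget]
    }
}
```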
Figure 3. Sample DataGraph
Once an application has a DataGraph returned from a Content API service call, it can process the data as needed. The data can be retrieved from the DataGraph and displayed in an application, or the DataGraph can be modified and returned to the Content API service.
One important SDO paradigm to point out is that applications perform all modifications to the DataGraph locally, before returning the DataGraph to be processed. One of the responsibilities of the SDO DataGraph is to keep track of any changes within the object graph. Applications can add new objects to the graph, delete objects from the graph, and modify attributes of objects. As all of these changes are happening, the DataGraph is keeping a summary of all changes within the graph. When applications are finished making changes, the DataGraph is passed back to one of the Content API services, and the mediators process the change summary. Mediators must persist the changes into the content repository.
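The change-tracking idea can be sketched in a few lines. This is a simplified illustration, not the SDO ChangeSummary implementation: `TrackedObject` and `saveChanges` are hypothetical, properties are plain strings, and the "repository" is a map. It shows the pattern described above: changes are made locally, recorded once logging begins, and only the recorded changes are written back.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of local change tracking; not the SDO ChangeSummary API.
final class ChangeSummarySketch {

    /** Client-side copy of an object that records property changes once logging begins. */
    static final class TrackedObject {
        private final Map<String, String> properties = new HashMap<>();
        private final Map<String, String> changed = new LinkedHashMap<>();
        private boolean logging;

        TrackedObject(Map<String, String> initial) { properties.putAll(initial); }

        /** Clears any earlier record and starts recording changes. */
        void beginLogging() {
            changed.clear();
            logging = true;
        }

        /** Updates the local copy and, if logging, records the new value. */
        void set(String property, String value) {
            properties.put(property, value);
            if (logging) {
                changed.put(property, value);
            }
        }

        String get(String property) { return properties.get(property); }

        /** Returns a copy of the recorded changes (property name to new value). */
        Map<String, String> getChangeSummary() { return new LinkedHashMap<>(changed); }
    }

    /** Stand-in for a mediator: writes only the recorded changes and returns how many were applied. */
    static int saveChanges(TrackedObject object, Map<String, String> repository) {
        Map<String, String> summary = object.getChangeSummary();
        repository.putAll(summary);
        return summary.size();
    }

    public static void main(String[] args) {
        Map<String, String> repository = new HashMap<>();
        repository.put("title", "Weekly Status Report");

        TrackedObject doc = new TrackedObject(repository);
        doc.beginLogging();
        doc.set("description", "Report for the current week");

        System.out.println(saveChanges(doc, repository)); // prints 1
        System.out.println(repository.get("description")); // prints "Report for the current week"
    }
}
```

A real change summary also records created and deleted objects and old values; this sketch tracks only property updates.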
One of the powerful features of SDO is the flexibility applications have in how they access data within a graph. Applications can traverse the graph using the generic SDO API, they can access data using the object model defined by the content model, or they can use XPath to extract data from a DataGraph. Below are examples of these methods.
Listing 1. Using the SDO API:
DataObject draftsFolder = contentService.getTree(...);
DataObject statusReport = (DataObject)draftsFolder.getList("documents").get(1);
String title = statusReport.getString("title");
Listing 2. Using the CM Classes:
Folder draftsFolder = (Folder)contentService.getTree(...);
Document statusReport = (Document)draftsFolder.getDocuments().get(1);
String title = statusReport.getTitle();
Listing 3. Using XPath:
DataObject draftsFolder = contentService.getTree(...);
DataObject statusReport =
(DataObject)draftsFolder.get("documents[title='Status Report']");
String title = statusReport.getString("title");
This section shows some examples of creating a document, retrieving a document, updating some properties, and saving those changes in the repository. These examples do not provide a complete working sample; however, they do illustrate the Content API programming paradigm.
Listing 4 shows some common steps, such as obtaining a LoginContext and a reference to the ContentService. Listing 5 shows how to create a document at the root of the default library. Listing 6 shows how to retrieve the document just created and update some of the properties.
Listing 4. Obtaining a LoginContext
import java.util.Locale;
import com.ibm.content.provider.Provider;
import com.ibm.content.provider.ProviderRegistry;
import com.ibm.content.service.environment.Environment;
import com.ibm.content.service.environment.EnvironmentFactory;
import com.ibm.content.model.dataTypes.LoginContext;
import com.ibm.content.service.LoginService;
import com.ibm.content.service.ContentService;
...
...
// Obtain the default provider from the registry
Provider provider = ProviderRegistry.getInstance().getDefaultProvider();
// Create an environment object using the default locale
Environment env =
EnvironmentFactory.getInstance().createEnvironment(Locale.getDefault(), null);
// Obtain the LoginService from the provider
LoginService loginService =
(LoginService) provider.locateService(LoginService.class.getName());
// Obtain a LoginContext from the LoginService. A reference to a LoginContext is
// needed for all Content API calls.
LoginContext loginContext =
loginService.loginLocal(env, "wpsadmin", "wpsadmin", null, null);
// Obtain a reference to the ContentService
ContentService contentService =
(ContentService) provider.locateService(ContentService.class.getName());
Listing 5. Creating a document
import java.io.InputStream;
import commonj.sdo.Type;
import commonj.sdo.Property;
import com.ibm.content.model.Folder;
import com.ibm.content.model.Document;
import com.ibm.content.service.Service;
import com.ibm.content.util.TypeRegistry;
import com.ibm.content.model.dataTypes.DataStream;
import com.ibm.content.model.dataTypes.DataStreamFactory;
...
...
// Get a reference to the root folder of the default library
String path = "/contentRoot/icm:libraries";
Folder folder = (Folder) contentService.getItem(loginContext, path, Service.Options.NONE);
// Get the containing property to create the data object in
Property property = folder.getType().getProperty("documents");
// Get the type of the data object
Class typeClass = Document.class;
Type type = TypeRegistry.getInstance().getType(typeClass);
// Start logging changes in the DataGraph
folder.getDataGraph().getChangeSummary().beginLogging();
// Create the Document object
String label = "Status Report.doc";
String title = "Weekly Status Report";
Document doc = (Document) folder.createDataObject(property, type);
doc.setLabel(label);
doc.setTitle(title);
// Create a DataStream for the contents using a file loaded by the ClassLoader
InputStream fileStream = this.getClass().getResourceAsStream("/status-report.doc");
DataStream dataStream =
DataStreamFactory.getInstance().createDataStream(loginContext, fileStream);
doc.setData(dataStream);
// Apply the changes back to the repository
contentService.saveChanges(loginContext, folder.getDataGraph());
Listing 6. Retrieve the document we just created and update some properties
String path = "/contentRoot/icm:libraries/Status Report.doc";
Document doc =
(Document) contentService.getItem(loginContext, path, Service.Options.NONE);
// Start logging changes in the DataGraph
doc.getDataGraph().getChangeSummary().beginLogging();
// Update the description
String description = "Weekly status report for week ending 05/19/2006";
doc.setDescription(description);
// Apply the changes back to the repository
contentService.saveChanges(loginContext, doc.getDataGraph());
The documentation for the programming interface is available as Javadoc that describes the interfaces and classes that define the API. This Javadoc is available in the download section.
This article covered the basics of the Content Model and the programming interfaces provided in WebSphere Portal V6.0 that work with the model. It provided a brief overview of SDOs and discussed how they are used with the API. It introduced the key object types in the model, and provided some example code to show how to get started using the programming model. You can get the full Javadoc as a downloadable ZIP file. With this information, you can start exploring the capabilities introduced with the Content Model and learn how to use its facilities in your custom applications running on WebSphere Portal 6.0.
If you are interested in pursuing the use of the Content Model and API, please contact Greg Melahn (melahn@us.ibm.com).
| Description | Name | Size | Download method |
|---|---|---|---|
| Content API JavaDoc | contentapi_javadoc.zip | 670KB | HTTP |
- Participate in the discussion forum.
- IBM's reference implementation of SDO 1.0 is packaged with the Eclipse Modeling Framework (EMF). You'll find articles, FAQs, and a newsgroup on the EMF home page.
- Read an overview of the SDO 1.0 specification.
- Read the Service Data Object (SDO) specification.
- Follow the standardization of SDO with JSR-235 on the JCP Web site.
- "Using Service Data Objects with Enterprise Information Integration technology" (developerWorks, July 2004) shows an example of using SDO.
- Read about JSR 170: Content Repository for Java™ technology API.
- What's new in WebSphere Portal V6? Walks through many of the features in this new version.
- WebSphere Portal zone. Provides an ever-growing, wide variety of technical resources for WebSphere Portal.
- WebSphere Portal product documentation. Provides fast access to the most current level of documentation for all releases and editions of WebSphere Portal, including the WebSphere Portal V6 InfoCenter.

Joe Kubik is a senior developer in the WebSphere Portal organization. His current focus is on extending access to portal content from non-browser environments including desktop applications and embedded devices. Joe holds degrees in Electrical Engineering from MIT and Computer Science from RPI. He has held numerous positions during his tenure with IBM including enterprise hardware and software development and more recently client software development. Joe currently works at IBM's Research Triangle Park location.

Mike Slavin's career has focused on developing distributed applications using CORBA, J2EE and web services, as well as designing and building complex web applications for customers across the United States. Mike spent several years in IBM Global Services as an IT Architect and most recently has moved to the IBM Software Group and is doing development work for WebSphere Portal. Mike holds an Engineering degree and graduated with honors from the University of Houston. Mike currently works at IBM's Research Triangle Park location. Mike can be contacted at mslavin@us.ibm.com.

Bill Trautman has 23 years of experience in Software Development. Bill has led teams that have delivered Management Applications, Application Development Tooling for Pervasive Devices and Simulation Environments for Service Processor Microcode development. Bill holds an Honors degree in Engineering Science from Penn State. Bill currently works at IBM's Research Triangle Park location. Contact Bill at btrautma@us.ibm.com.




