Introduction and background
Over the past several decades, tremendous progress has been made in enabling users to connect with each other. Thanks to Internet protocols, high-speed networks, low-cost devices, and the standardization of document formats, it’s now simpler than ever to exchange information. We still struggle, though, to collaborate on business content. Why?
Part of the answer is that business information continues to be stored in content silos, where software vendors independently decide many of the details about how the information is organized and accessed. A document stored in two different content management systems might be created by the same application and, indeed, might be identical bit for bit (a PDF file, for example).
The content management system that stores the document, though, decides on important details such as what API is used to access the document, how it can be searched, what additional properties are stored with the document, how it is versioned, what policies can be assigned, and more.
After a document is stored in a content management system, it is to some extent trapped there, and the information is difficult to access by applications not designed for that specific repository. This problem occurs not only among different software companies but also with software from the same vendor.
In the case of enterprise content management (ECM) systems, this sharing problem becomes particularly acute because the volume of information in ECM systems is extremely large and because the information itself is mission critical. The degree to which employees are able to use information can determine the success of a company; the fact that information is hard to share directly affects the competitiveness of the company.
Content management standards began to emerge in the late 1990s. Standards such as WebDAV and Java™ Content Repository (JCR) were important steps, but they still lacked many features needed to share enterprise content.
A few years ago, the Association for Information and Image Management (AIIM) sponsored an initiative called the Interoperable Enterprise Content Management (IECM) Project.
A team representing ECM vendors started discussions about the idea of a standard to make it easier to share and manage content. Over time, more ECM vendors became involved in the discussions, and an Organization for the Advancement of Structured Information Standards (OASIS) technical committee was formed to write a formal specification.
After years of design, in 2009 the committee published the first version of the proposed specification for public comment. Several ECM vendors already have prototypes that implement the proposed standard.
The proposed Content Management Interoperability Services (CMIS) standard is simple and defines an extensible domain model consisting of four base content types: document, folder, relationship, and policy. The two most important CMIS types are the folder and document types because that is where most of the content is.
Let’s discuss these four content types:
- CMIS document. Similar to a file, it has properties
to hold document metadata, such as the document author and
modification date, and custom properties that can be specific to a
particular company (see figure 1). The CMIS document can also contain
a content stream and renditions, such as a thumbnail view of a document.
Figure 1. Schematic of a CMIS document
- CMIS folder. This content type is a container of CMIS
documents and other CMIS folders (see figure 2). Folders can decide
what types of documents they can contain. It’s possible (as a
CMIS repository option) to allow documents to be contained in multiple
folders or even in no folders at all.
Folders in a CMIS repository form a strict hierarchy, in which a folder can be contained only within a single parent folder.
Figure 2. Schematic of CMIS folders
- CMIS relationship. This content type is a way to
define relationships between CMIS objects (see figure 3). An object
can have multiple relationships with other objects. CMIS repositories
are not required to enforce any semantics of relationships or enforce
referential integrity between the two objects of a relationship.
Figure 3. Schematic of a CMIS relationship
- CMIS policy. This content type is a way of defining
administrative policies to manage objects in ways that are, by
themselves, outside the scope of the CMIS specification. For example,
you can use a CMIS policy to define which documents are subject to
The content of a CMIS policy is opaque to the CMIS repository, so you can use it quite flexibly.
Figure 4. Schematic of CMIS policy
These basic types can themselves be subtyped to define new kinds of content, and it’s typical to define subtypes of a document for a specific kind of business. For example, you can define a document subtype called Customer Proposal with additional properties, such as Customer Name and Customer Contact.
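The idea of subtyping can be sketched with plain JavaScript objects. This is only an illustration of the model, not code from the plug-in or from the CMIS specification: the type ID customerProposal and the property names author, modificationDate, customerName, and customerContact are hypothetical.

```javascript
// Hypothetical in-memory sketch of CMIS subtyping.
// A type definition records its parent type and the properties it adds.
const typeDefinitions = {
  // Base type with two illustrative metadata properties.
  document: { parent: null, properties: ["author", "modificationDate"] },
  // Hypothetical subtype for the "Customer Proposal" example.
  customerProposal: {
    parent: "document",
    properties: ["customerName", "customerContact"],
  },
};

// Collect the properties of a type, including those inherited from ancestors.
function allProperties(typeId) {
  const names = [];
  for (let id = typeId; id !== null; id = typeDefinitions[id].parent) {
    names.unshift(...typeDefinitions[id].properties);
  }
  return names;
}

// Walk up the parent chain to find the base type that a type derives from.
function baseTypeOf(typeId) {
  let id = typeId;
  while (typeDefinitions[id].parent !== null) {
    id = typeDefinitions[id].parent;
  }
  return id;
}

console.log(baseTypeOf("customerProposal"));
console.log(allProperties("customerProposal"));
```

A subtype therefore carries everything that its base type defines, plus its own additions, which is why a client that understands only the base document type can still work with a Customer Proposal.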
The CMIS specification also defines a set of a few dozen content services to access this content (see figure 5). Each of the services has two bindings, both based on HTTP. One binding uses the Atom Publishing Protocol, and the other uses Web services based on SOAP.
Figure 5. CMIS services
Some of these services are required, but many are optional, which makes it simpler for vendors to progressively disclose the content of their repositories.
An example of a required service is getRepositoryInfo. It’s a required service because it’s the service that you use to get the root collection in the repository so that you can start to navigate content. It’s also how a repository describes its other capabilities.
An example of an optional service is the policy service because not all repositories can (or need to) express their administration policies in a general way.
An important part of the CMIS specification is the query service, which is one of the services that distinguish CMIS from earlier specifications. The principle adopted for query in CMIS was to use a query language that was already widely understood, which naturally led to the choice of Structured Query Language (SQL). The query language of CMIS is a subset of SQL-92 (ISO/IEC 9075: 1992 – Database Language SQL) with a few extensions. If you already know SQL, then it will be easy for you to query a CMIS repository.
For example, for each CMIS document type in the repository, there is a corresponding logical table name that you can use in a SQL statement to select documents of that type, using the document properties as the column names. That means it would be as simple as forming a SQL query, such as SELECT * FROM DOCUMENT, to get a list of documents in the repository.
Notice that we use the term logical table because it is not necessarily true that there would be a real relational database table for that type. Nor is it true that the implementation of a CMIS repository is based on a relational database (though many are).
Included in the query language are familiar concepts, such as inner and outer joins; full text search with a scoring function, to allow the repository to signify relevance; predicates, to let you scope the query to part of the document hierarchy; and collation.
Listing 1 shows a couple of simple examples of CMIS queries, illustrating how the query language can be used.
Listing 1. Examples of CMIS queries
SELECT Y.EMPNO, X.ANNUALSALARY
FROM EMPLOYEE AS X JOIN SALARY AS Y ON ( X.EMPNO = Y.EMPNO )
WHERE ( 50000 >= X.ANNUALSALARY ) AND ( Y.MANAGER = 'Y' )

SELECT * FROM document
WHERE IN_FOLDER('/email/customers/ourbestcustomer')
AND CONTAINS('invoice')
AND title LIKE '%project a%'
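A client that assembles a query like the second one in listing 1 from user input has to quote the string literals. The following is a minimal sketch of one way to do that. The helper names quoteLiteral and buildFolderQuery are hypothetical and are not part of the plug-in or of CMIS. Because the escaping rules for quotation marks can differ between SQL-92 and a given version of the CMIS query grammar, this sketch simply rejects values that would need escaping instead of guessing at the rule.

```javascript
// Hypothetical helper: wrap a value in single quotation marks.
// Values containing a quotation mark or backslash are rejected rather than
// escaped, because the escaping rule depends on the CMIS version in use.
function quoteLiteral(value) {
  if (/['\\]/.test(value)) {
    throw new Error("Unsupported character in literal: " + value);
  }
  return "'" + value + "'";
}

// Hypothetical helper: build a query of the same shape as the second
// example in listing 1.
function buildFolderQuery(folderPath, searchText, titlePattern) {
  return (
    "SELECT * FROM document" +
    " WHERE IN_FOLDER(" + quoteLiteral(folderPath) + ")" +
    " AND CONTAINS(" + quoteLiteral(searchText) + ")" +
    " AND title LIKE " + quoteLiteral(titlePattern)
  );
}

const query = buildFolderQuery(
  "/email/customers/ourbestcustomer",
  "invoice",
  "%project a%"
);
console.log(query);
```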
CMIS and IBM Lotus Quickr
IBM® Lotus® Quickr™ is software that makes it easy for teams to collaborate. Sharing content is important for collaboration, and, like CMIS, Lotus Quickr introduced a simple content model and set of services based on Atom and Web services.
Because Lotus Quickr was developed before the CMIS standardization effort, the details of the data model and the services that it introduced were different from CMIS. Because the basic programming pattern in Lotus Quickr (based on Atom and Web services) is similar to that of CMIS, a developer who understands Lotus Quickr content services can easily learn to use CMIS as well. In fact, the CMIS Firefox extension discussed in this article was derived from the Lotus Quickr Firefox extension described in the developerWorks® white paper, “Developing an IBM Lotus Quickr Firefox connector using Quickr services.”
One example that shows how Lotus Quickr content services evolved into CMIS is to compare the Atom service document in Lotus Quickr with the service document for CMIS. In both cases, the Atom service document lists the top-most Atom collections where you can find content. In Lotus Quickr, these collections are called document libraries, and you can see a list of these libraries in the service document. In the case of CMIS, the containers are called repositories, which is what you see listed in the service document. In either case, within a library or a repository collection you can find folders and documents represented as Atom feeds and entries.
When you get the content of one of the feeds mentioned in the service document, you can also note that Lotus Quickr and CMIS extend Atom using different XML namespaces. For example, in the case of a feed from Lotus Quickr, you see a namespace td (short for teamspace documents) as a prefix for some Lotus Quickr document properties, such as the document identifier.
In CMIS, however, you see that identifiers are described with CMIS namespaces with names such as (as you would expect) cmis. For example, the document identifier looks like this example in a CMIS feed:
<propertyId localName="cmis:objectId" propertyDefinitionId="cmis:objectId">
The concept is the same; it’s just that they are described differently in the XML document.
If you had an Atom-based Lotus Quickr document client (such as the Lotus Quickr Firefox connector) that you want to adapt to work with CMIS documents, you can reuse the fact that your client understands the syntax of Atom service documents and how to use Atom feeds.
Also, you can modify your client so that it understands the different XML namespaces. That modification most likely means that you preserve the basic structure of your program and much of the user experience but modify the part where it parses the XML.
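One way to keep that change small is to isolate the server-specific names in a single lookup table. The sketch below illustrates the idea only and is not taken from either connector. The name td:uuid is a hypothetical placeholder for the Lotus Quickr identifier element, and the entryProperties object is an assumed stand-in for whatever the XML parser produces.

```javascript
// Sketch: keep the server-specific property names in one lookup table so
// that the rest of the client can stay the same for both kinds of server.
// "td:uuid" is a HYPOTHETICAL placeholder for the Lotus Quickr identifier
// element; "cmis:objectId" is the CMIS property named in the article.
const idPropertyByDialect = {
  quickr: "td:uuid",
  cmis: "cmis:objectId",
};

// entryProperties is assumed to be a plain object that an XML parser has
// already filled with qualified property names and their values.
function getDocumentId(dialect, entryProperties) {
  const name = idPropertyByDialect[dialect];
  if (name === undefined) {
    throw new Error("Unknown dialect: " + dialect);
  }
  return entryProperties[name];
}

console.log(getDocumentId("cmis", { "cmis:objectId": "example-id" }));
```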
The Firefox CMIS plug-in
Now that you’ve learned a little about CMIS, let’s look at an example of how to use CMIS to access a document repository. The example uses the Mozilla Firefox plug-in architecture, based on XML User Interface Language (XUL) and the Dojo Toolkit.
Architecture of the plug-in
XUL is a framework within which you can develop applications consisting of a user experience built using the following:
- A simple markup language with images and CSS stylesheets
- Locale-sensitive strings
An XUL runtime included in Mozilla applications like Firefox runs these kinds of applications. You can extend the behavior of Mozilla by writing your own XUL applications that can be installed into Firefox.
In addition, the plug-in uses the Dojo toolkit to manage the conversation between it and the CMIS repositories. For example, it uses the Atom extensions included in Dojo to fetch the Atom service document from the CMIS repository, and it uses the information in that document to prepare the HTTP requests that perform CMIS operations, such as creating a document (see figure 6).
Figure 6. Components of the CMIS and Lotus Quickr plug-ins
Accessing a CMIS repository
To use the plug-in to manage documents in a CMIS repository, you must first make sure that the CMIS services have been deployed to your content management system.
For example, IBM has developed CMIS services that provide access to IBM FileNet® and IBM Content Manager. See the white paper, “IBM FileNet and IBM Content Manager Technology Preview for OASIS CMIS,” for more information.
Next, you need to get the URL for that server’s CMIS service document and the user ID and password if the server is secured, which usually means contacting the content management administrator.
The service document is an XML document whose format is defined by the Atom publishing protocol and that has been extended by the CMIS standard. It lists the CMIS repositories managed by that server. Think of it as your CMIS directory. The service document also describes the capabilities of the repository and has useful links to document collections, such as the documents that have been checked out. The format of the service document is part of the CMIS specification’s REST binding.
Right-click the tree view of the CMIS plug-in to display the Add Repository window (see figure 7) in which you can enter the URL of this service document and the credentials if the server is secured.
Figure 7. Add Repository window
Click the Next button in the window to invoke the doNext() function in AddRepositoryDialog.js to retrieve and parse the CMIS service document, using the following Dojo function:
dojox.atom.io.getService: function(url, callback, errorCallback, scope)
The wiring of the button that calls the doNext() function is part of the XUL dialog definition. See the line ondialogextra1="doNext()" in addRepositoryDialog.xul.
Listing 2 shows an example of a CMIS service document from an IBM FileNet server that was fetched using that XUL dialog.
You can see that the service document contains useful information, including the names of the repositories. In the example in listing 2, there is just one CMIS repository, named FNSPStore (that name displays in the XML elements <title> and <repositoryId>).
When Dojo fetches and parses that CMIS service document, it stores the information from that service document into an object of type dojox.atom.io.model.Service. Using that object, the plug-in gets the needed information about the repository and then saves the information in an array named gData.
The plug-in then sorts the list so that it can be presented to the user. You can see all that action happening in the function handleService() in addRepository.js. After the sorted list of repositories displays in the next Add Repository window (see figure 8), you can select the repositories in which you are interested.
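The collect-and-sort step can be sketched as follows. This is not the code of handleService(): the real plug-in reads a dojox.atom.io.model.Service object, whereas this sketch assumes a plain array of hypothetical { title, rootCollectionUrl } records, and the example URLs are made up.

```javascript
// Sketch of the collect-and-sort step. A plain array of hypothetical
// { title, rootCollectionUrl } records stands in for the parsed workspaces
// of the service document.
function collectRepositories(workspaces) {
  const gData = workspaces.map((ws) => ({
    title: ws.title,
    rootCollectionUrl: ws.rootCollectionUrl,
  }));
  // Sort by title, ignoring case, so the list reads naturally in the dialog.
  gData.sort((a, b) =>
    a.title.localeCompare(b.title, undefined, { sensitivity: "base" })
  );
  return gData;
}

const repositories = collectRepositories([
  { title: "zeta", rootCollectionUrl: "http://example.com/zeta/root" },
  { title: "FNSPStore", rootCollectionUrl: "http://example.com/fnsp/root" },
  { title: "alpha", rootCollectionUrl: "http://example.com/alpha/root" },
]);
console.log(repositories.map((r) => r.title));
```

Keeping the root collection URL alongside each title means that nothing else has to be fetched from the service document when the user later opens a repository.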
Figure 8. List of repositories
Figure 9. Repository list after adding the FNSPStore repository
The list of repositories with their associated data is saved to the file
system in the active profile for the current Firefox user, which is why
you see the list of repositories the next time you open a Firefox window.
The location of the Firefox profiles directory is system dependent; for
example, on Microsoft® Windows® XP, the location is here:
C:\Documents and Settings\<user>\Application Data\mozilla\firefox\profiles
Of course, the whole point of CMIS is to access content in different repositories, not just content from one vendor. Because the plug-in was built according to the proposed CMIS standard specification, it can be used to manage documents stored in any system, as long as that system also implements the proposed standard.
If you search the Internet for CMIS repositories, you can find several that you can download to try out free of charge. Also, a list of CMIS repository providers and some interesting CMIS clients can be found in the Lotus Quickr wiki article, “Implementations of CMIS.”
Exploring a CMIS repository
After you have the location of the CMIS repository saved in the plug-in, you can double-click the repository to begin to explore its content. This action causes the root collection from the service document to be used to fetch the names of the folders and documents from the repository. This root collection was discovered earlier when the repository was added in the handleService() function in addRepository.js.
The fetching of the content of this root collection is another case in which the Dojo toolkit is used. The function getChildrenData(/*cmiscon.TreeNode*/ node) in cmisconnector.js uses the function dojo.xhrGet to issue an XML HTTP request to get an Atom feed document listing the folders and documents in the root collection. Such a feed looks like that shown in listing 3.
Notice that this example feed contains both a CMIS folder and a CMIS document, each represented by an Atom entry. In the case of a folder entry, the entry contains the object type property with the value of cmis:folder. In the case of documents, the value is cmis:document.
When the plug-in parses a CMIS feed and encounters a cmis:document entry, it creates a node in the plug-in tree, using a document mini-icon to represent the document. The plug-in remembers the data associated with that node: the document ID and the link to the content stream of the document (the bits of the document).
This link is found in the edit-media link of the document entry, following Atom conventions. The link is fetched by use of HTTP GET, and the contents are shown in the browser window on the right side of the plug-in.
When the plug-in encounters a cmis:folder entry, it creates a node in the plug-in tree, using a folder mini-icon to represent the node. It also remembers in the data associated with that node the value of the down link from that folder's Atom entry. This down link is itself a CMIS feed of folders and documents. You can see that the pattern of navigating folders and documents is recursive. Starting from the root collection and working down through the folder hierarchy, the plug-in discovers the content as the user clicks through the tree. Each level in the hierarchy is represented as an Atom feed.
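The recursive pattern can be sketched like this. It is an illustration under stated assumptions, not the plug-in's getChildrenData function: the entry shape ({ title, objectTypeId, objectId, links }) and the fetchFeed callback are hypothetical stand-ins for the Atom parsing and HTTP requests that Dojo performs. Only the cmis:folder and cmis:document values and the down and edit-media links come from the description above.

```javascript
// Sketch: turn already-parsed feed entries into tree nodes.
function toTreeNode(entry) {
  if (entry.objectTypeId === "cmis:folder") {
    // A folder remembers its down link, which is itself a feed of children.
    return { kind: "folder", title: entry.title, childrenUrl: entry.links.down };
  }
  if (entry.objectTypeId === "cmis:document") {
    // A document remembers its ID and the link to its content stream.
    return {
      kind: "document",
      title: entry.title,
      objectId: entry.objectId,
      contentUrl: entry.links["edit-media"],
    };
  }
  return null; // Entries of other types are ignored in this sketch.
}

// Expanding a folder means fetching its feed and converting each entry.
// fetchFeed is an assumed callback that returns the parsed entries for a URL.
function expandFolder(folderNode, fetchFeed) {
  return fetchFeed(folderNode.childrenUrl)
    .map(toTreeNode)
    .filter((node) => node !== null);
}
```

Because every folder node produced by expandFolder can itself be passed back to expandFolder, the same two functions serve every level of the hierarchy.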
Adding documents to a CMIS repository
The canDrop function in cmisconnector.js is called when a dragged object passes over a node in the CMIS repository tree. It answers true when the object below the mouse pointer is a node representing a CMIS repository or a CMIS folder; otherwise, it answers false.
The drop function in cmisconnector.js is called when the user drops a Web page, another CMIS document, or a local file onto a node in the sidebar document tree representing a CMIS repository or CMIS folder. The drop function determines to which feed URL to publish the document and then calls the uploadUrl function.
The uploadUrl function first downloads the document, using the downloadFile function, saves it as a local file, and then calls the uploadFile function. The uploadFile function determines the name of the CMIS document based on the user option described in the next section, “CMIS Connector options.” It then calls the function createEntry to publish the document to the CMIS repository.
Here the plug-in uses dojo.rawXhrPost, a Dojo function that simplifies making asynchronous XML HTTP requests. Dojo also provides functions in dojox.atom.io.model to extract the edit-media link from the returned CMIS document entry, so that it can be used by the uploadFile function to upload the file content to the CMIS document’s content stream.
You might have noticed that the CMIS HTTP parameter, versioningState=checkedout, is used in the HTTP POST operation. This parameter has the effect of checking out the newly created document in the CMIS repository.
This parameter was needed to allow the file content to be uploaded to the new CMIS document. Otherwise, we would first need to check out the document to get a private working copy that we can update. An alternative is to encode the file content using Base64 encoding and post that encoded content with the original HTTP POST that created the CMIS document.
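The two-step upload, and the Base64 alternative, can be sketched as follows. The HTTP layer is abstracted away: postEntry and putContent are hypothetical callbacks standing in for the plug-in's Dojo-based requests, and the editMediaUrl field is an assumed name for the link extracted from the returned entry. The toBase64 function uses the Node.js Buffer API so that the sketch runs outside a browser.

```javascript
// Append the versioningState parameter to the feed URL used for the POST.
function addVersioningState(feedUrl, state) {
  const separator = feedUrl.includes("?") ? "&" : "?";
  return feedUrl + separator + "versioningState=" + encodeURIComponent(state);
}

// Sketch of the two-step upload. postEntry and putContent are hypothetical
// callbacks that perform the actual HTTP requests.
function uploadDocument(feedUrl, entryXml, fileBytes, postEntry, putContent) {
  // Step 1: create the document entry, checked out so that it can be updated.
  const created = postEntry(addVersioningState(feedUrl, "checkedout"), entryXml);
  // Step 2: send the file content to the edit-media link of the new entry.
  putContent(created.editMediaUrl, fileBytes);
  return created;
}

// Alternative: encode the content so that it can travel inside the first POST.
// Buffer is a Node.js API; browser code would need a different Base64 routine.
function toBase64(fileBytes) {
  return Buffer.from(fileBytes).toString("base64");
}

console.log(addVersioningState("http://example.com/feed", "checkedout"));
```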
Creating a CMIS folder
A folder can be created in the root collection of the repository or in a subfolder of the repository. You can add folders to a CMIS repository by right-clicking a folder and entering the name of the new folder, as in the window shown in figure 10.
Figure 10. Folder title window
<menuitem id="miNewFolder" label="&newFolder.label;
CMIS connector options
The plug-in provides some points of customization for both the user and the developer. The first point of customization is an option that lets the user specify what to use for the CMIS document name. The other options, what CMIS document and folder types to use and how to handle CMIS properties, can be changed by a developer.
The plug-in lets the user decide how to create the name of the CMIS document (the property named cmis:name) when a file is dragged and dropped onto a folder in the plug-in. The user can specify these options by selecting Tools - Add-ons from the Firefox menu.
In the Add-ons window, you can specify whether the name of the CMIS document (cmis:name) is to be derived from the URL of the dropped document or Web page or derived from the link text of the dropped document or Web page (see figure 11). Note that, in the case of dragging and dropping local files, the values are the same.
Figure 11. Add-ons window
Regardless of which option you choose, if you select the Prompt document name when drag and drop check box, then you are prompted every time that you drag and drop a file or Web page into a CMIS folder.
Also, when you select this option, a Rename window displays when you drop a document, allowing you to specify a different name than the link text or URL computed by the plug-in (see figure 12).
Figure 12. Rename window
The Options window is specified in the XUL file, options.xul. When a document is dropped, these options are read and placed into an object named gPrefs. Then the onDrop function in cmisconnector.js is used to select which value in the dropped object to use for the CMIS document name.
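The selection logic might look something like the following sketch. The shapes of prefs ({ nameSource, promptForName }) and dropped ({ url, linkText }) and the promptUser callback are hypothetical; the plug-in's gPrefs object and its onDrop function may be organized differently.

```javascript
// Derive a name from a URL: use the last non-empty path segment, without
// the query string or fragment.
function nameFromUrl(url) {
  const path = url.split(/[?#]/)[0];
  const segments = path.split("/").filter((s) => s.length > 0);
  return decodeURIComponent(segments[segments.length - 1]);
}

// Sketch of choosing the cmis:name value for a dropped item.
function chooseDocumentName(prefs, dropped, promptUser) {
  const computed =
    prefs.nameSource === "linkText" && dropped.linkText
      ? dropped.linkText
      : nameFromUrl(dropped.url);
  // When prompting is enabled, the user may override the computed name.
  return prefs.promptForName ? promptUser(computed) : computed;
}

const droppedLink = {
  url: "http://example.com/docs/report%20a.pdf?x=1",
  linkText: "Quarterly report",
};
console.log(chooseDocumentName({ nameSource: "url", promptForName: false }, droppedLink));
```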
Base folder and document types
When the plug-in creates a CMIS folder or a CMIS document, a template for the Atom entry is used to control what is posted to the CMIS repository. These templates are contained in the files NewFolderForPost.xml and NewDocForPost.xml. In these files, you can set the type used to create the CMIS folder or document. This ability is important because some CMIS repositories don’t allow you to create instances of the base type cmis:folder or cmis:document.
For example, if you want the plug-in to use the type customerfolder, you specify that name in the NewFolderForPost.xml file, as in the example in listing 4.
Listing 4. Specify customerfolder type
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<atom:entry xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/"
            xmlns:atom="http://www.w3.org/2005/Atom"
            xmlns:app="http://www.w3.org/2007/app"
            xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
  <atom:title type="text">FolderTitle</atom:title>
  <cmisra:object>
    <cmis:properties>
      <cmis:propertyId propertyDefinitionId="cmis:objectTypeId">
        <cmis:value>customerfolder</cmis:value>
      </cmis:propertyId>
    </cmis:properties>
  </cmisra:object>
  <cmis:terminator/>
</atom:entry>
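Before such a template is posted, the folder title has to be substituted into it. The sketch below shows one way that might be done. It assumes that the template contains the literal placeholders FolderTitle and customerfolder, as listing 4 does; the function names are hypothetical, and the plug-in may fill its templates differently.

```javascript
// Escape the characters that are significant in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Sketch: substitute a title and a type ID into a template like listing 4.
// A single pass with a replacer function means that inserted text is never
// rescanned, so a title that happens to contain "customerfolder" is safe.
function fillFolderTemplate(template, title, typeId) {
  return template.replace(/FolderTitle|customerfolder/g, (match) =>
    match === "FolderTitle" ? escapeXml(title) : escapeXml(typeId)
  );
}

const snippet =
  "<atom:title>FolderTitle</atom:title><cmis:value>customerfolder</cmis:value>";
console.log(fillFolderTemplate(snippet, "Q&A", "cmis:folder"));
```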
You can modify the property.cfg file to add the names of any properties not defined by CMIS. For example, if a repository defines a property named myrepository:tags, and you want that property to be modifiable by the user, you can add a line to the property.cfg file like this:
myrepository:tags = true
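A file in this form is straightforward to read. The following sketch parses lines of the shape name = true into a lookup object. The syntax rules are assumptions made for the sketch and are not documented behavior of property.cfg: blank lines and lines starting with # are skipped, and only the value true marks a property as modifiable.

```javascript
// Sketch of reading "name = value" lines. Property names may contain a
// colon, so the line is split at the first "=" only.
function parsePropertyConfig(text) {
  const modifiable = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // Not a "name = value" line.
    const name = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    modifiable[name] = value === "true";
  }
  return modifiable;
}

console.log(parsePropertyConfig("# custom properties\nexample:tags = true\n"));
```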
More about the CMIS plug-in
An easy way to explore the plug-in source code is to deploy the plug-in into an application development environment such as Eclipse. (You can download the Eclipse development environment from the Eclipse Ganymede Web site.)
We’ve made this exploration a bit easier by organizing the code as an Eclipse project and packaging that into a ZIP file. To deploy the plug-in into Eclipse, download the cmisconnector.zip file in the Downloads section of this article, extract the file, and then create a Java project based on the source code in that file.
After the source code has been imported into Eclipse, you can view the source code and build the plug-in by selecting the project and then selecting Project - Build project from the menu at the top. The build operation creates the plug-in installation file in the root folder of the project.
To install the plug-in that you build in this way, double-click the installation file that the build created in the root folder of the project. This action launches Mozilla Firefox, and the Software Installation window displays, prompting you to install the plug-in (see figure 13).
Figure 13. Install the plug-in
Click the Install Now button in the window, restart Firefox, and now you can use the plug-in that you have just built for yourself.
Tips on debugging
Having set that breakpoint, when you expand the elements of a folder in the plug-in, you stop at that line in the code, and you can inspect variables and start stepping through the code using the F5, F12, and F11 keys.
Figure 15. Breakpoint set in GetChildren() function
The CMIS specification will continue to be improved as a result of comments from the public; similarly, our CMIS plug-in will continue to be revised as new versions of the proposed standard are published.
When the specification is approved and becomes an official OASIS standard, a final version of this plug-in will be published. In addition, optional parts of the CMIS specification, such as access control and renditions, might be included in later versions of the plug-in.
Using the service definitions from CMIS with the Mozilla plug-in framework allows you to develop new kinds of applications that extend the capabilities of the browser to work with content from a variety of sources.
Although we hope the code example provided with the article is a useful way to get started, we encourage you to learn more about the proposed CMIS standard and the Mozilla plug-in framework, and to develop your own kinds of content applications. The Resources section below includes some useful links to help you learn more.
- Participate in the discussion forum.
- Read the developerWorks Lotus white paper, "Developing an IBM Lotus Quickr Firefox connector using Quickr services."
- Read the wiki article, "Content Management Interoperability Services (CMIS)."
- Refer to the Lotus Quickr product page.
- Refer to the Lotus Quickr documentation page.
- Read the developerWorks Information Management white paper, "IBM FileNet and IBM Content Manager Technology Preview for OASIS CMIS."
- Learn more about the OASIS CMIS Technical Committee.
- Refer to Mozilla Development Central.
- Learn more about Dojo Toolkit.