Collections

Content Engine collections are groups of related elements and can be one of two types: list or set. The name of a collection identifies its type. For example, a DocumentSet object is a collection of Document objects and an IdList object is a collection of Id objects.

All Content Engine API collection objects are strongly typed. When returned by the API, these type-aware collection objects contain only elements whose object type is directly related to the object type of the collection. For example, a DocumentSet collection object that is returned by the Java™ Folder.get_ContainedDocuments method or returned from the .NET IFolder.ContainedDocuments property contains only those Document objects (and any subclassed objects of those documents) contained within that folder.

Note: Although it is possible in many cases for applications to intentionally circumvent the type safety by putting objects of inappropriate types into a collection or casting one collection type to another collection type, it is recommended that applications do not rely on such behavior.

The EngineCollection interface is the base interface for all collection types in the collection class hierarchy and provides functions common to all collection objects. Its subinterface, EngineSet, provides functions common only to sets. In other words, a list-type collection has all of the functions that are provided by EngineCollection but none of those provided by EngineSet, whereas a set-type collection has all of the functions that are provided by both EngineCollection and EngineSet.

A list-type collection is a group of dependent objects or a list of primitive data items. A list-type collection has a parent object to which it is scoped. A list-type collection object is instantiated with the createList method on the type-specific Factory class. For example, to create an instance of IdList, call Factory.IdList.createList. The elements of a list are ordered and need not be unique. List-type collections are iterated one element at a time. You can directly update a list-type collection by using type-safe methods, for example, to add or remove elements.

Note: Adding or removing elements from a list while there is an open iterator might affect further progress through that iterator. Doing so can result in errors or in skipped or repeated items, or it might work as you expect. Because this behavior is unpredictable, it is recommended that you close the open iterator before you add or remove elements.
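The recommendation in the preceding note can be sketched with standard Java collections. This is a minimal illustration only: java.util.ArrayList stands in for a list-type collection, and the class name SafeListUpdate and the method removeAfterIterating are hypothetical names chosen for the example, not part of the Content Engine API. The pattern is to iterate read-only, remember what needs to change, and update the list only after the iterator is finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SafeListUpdate {

    /**
     * Removes every element equal to unwanted, but only after the iterator
     * has been exhausted, so no iterator is open while the list is updated.
     * (Hypothetical helper; ArrayList stands in for a list-type collection.)
     */
    public static List<String> removeAfterIterating(List<String> source, String unwanted) {
        List<String> working = new ArrayList<>(source);
        List<String> toRemove = new ArrayList<>();

        // Pass 1: read-only iteration; only remember what should be removed.
        Iterator<String> it = working.iterator();
        while (it.hasNext()) {
            String element = it.next();
            if (element.equals(unwanted)) {
                toRemove.add(element);
            }
        }

        // Pass 2: the iterator is finished, so the list can be updated.
        working.removeAll(toRemove);
        return working;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("A", "B", "A", "C");
        System.out.println(removeAfterIterating(ids, "A")); // prints [B, C]
    }
}
```

With java.util.ArrayList, modifying the list inside the first loop would typically fail with a ConcurrentModificationException; a Content Engine list might instead skip or repeat items, which is why deferring the update is the safer habit.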

A set-type collection is a group of independent objects. With the following exceptions, the elements of a set are unordered and unique:

A ComponentRelationship object has a ComponentSortOrder property that allows items in the set to be sorted.
The server ensures that for returned sets of ChildDocument and ChildRelationship objects, items are returned in sort order. A ChildDocument object can exist in more than one location in the collection if the document is reused.

You cannot directly update a set-type collection. Set-type collections can be paged, that is, they can be iterated a page at a time. For more information about paging, see Collection Paging Support and the Java PageIterator and PageMark interfaces or the .NET IPageEnumerator and IPageMark interfaces. A row set is a collection of rows that are returned from a query and has the characteristics of a set-type collection. For more information, see the RepositoryRowSet interface.

Note: When an application is iterating a set, there is no shortcut for discovering the total number of items in the collection. If your application needs to know the total, it must count the individual items in the set (or in each page, if collection paging is being used) and add them up.
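The counting described in the preceding note might look like the following sketch. It is written against plain java.util.Iterator and arrays so that it runs without the Content Engine API; the class SetCounter and both method names are hypothetical. In a real application the iterator would come from the set, and each page would be the array returned by getCurrentPage.

```java
import java.util.Iterator;

public class SetCounter {

    /** Counts elements one at a time by walking the iterator to its end. */
    public static int countItems(Iterator<?> iterator) {
        int total = 0;
        while (iterator.hasNext()) {
            iterator.next();
            total++;
        }
        return total;
    }

    /** Counts elements a page at a time by summing the length of each page. */
    public static int countPagedItems(Iterable<Object[]> pages) {
        int total = 0;
        for (Object[] page : pages) {
            total += page.length;
        }
        return total;
    }
}
```

Either way the cost is a full pass over the collection, so an application that only needs an approximate figure may prefer to avoid counting.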

Collection Paging Support

In addition to the traditional "item at a time" iteration, the Content Engine API supports paging of set-type collections. Paging is automatically employed for enumeration results that are sent from the server to the client. You can also use paging for your custom applications. As an example, you might write your application to specify a page size that is approximately the number of items that are displayed in a user interface. Each page that is retrieved is rendered for presentation while the next page of results is being retrieved.

How Paging Works

Sets of independent objects and repository rows are divided into pages when they are being physically retrieved from the server; each page is a number of collection elements (objects or rows) that represent a subset of the collection elements. You can iterate a page at a time instead of one object or row at a time. For example, if a page is defined as 10 elements, and the collection has a total of 22 elements, the first paging operation returns a page that contains 10 elements, the second page returns the next 10 elements, and the third page returns the last two elements. Page iteration is exposed by the API and is especially useful for interactive applications that display a page of information at a time.
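The arithmetic in the example above can be written out as a short sketch. The class PageMath and its method pageSizes are hypothetical names used only for illustration, and the calculation is idealized: it assumes every page except the last is exactly the requested size, whereas pages actually returned by the server can be smaller or larger than requested.

```java
import java.util.ArrayList;
import java.util.List;

public class PageMath {

    /**
     * Returns the number of elements on each page when totalElements are
     * split into pages of at most pageSize elements (idealized model).
     */
    public static List<Integer> pageSizes(int totalElements, int pageSize) {
        if (totalElements < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalElements must be >= 0 and pageSize > 0");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = totalElements; remaining > 0; remaining -= pageSize) {
            // Every page is full except possibly the last one.
            sizes.add(Math.min(pageSize, remaining));
        }
        return sizes;
    }
}
```

For the case in the text, PageMath.pageSizes(22, 10) yields [10, 10, 2].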

Each page iterator is initially positioned before the first page of the set. The first call to the PageIterator.nextPage method moves the iterator to the first page. The second call to nextPage moves the iterator to the second page, and so on. The nextPage method returns true until the end of the set is reached. When the iterator reaches the end of the set, it is positioned after the last page and nextPage returns false.

The getCurrentPage and getElementCount methods throw an exception if the iterator is positioned before the first page or after the last page, or between pages after a reset(mark) operation. For proper positioning, you must call nextPage on a new iterator and after a reset operation. You can call the getPageMark method at any time. Of these methods, only nextPage moves the position of the iterator.
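The positioning rules above can be modeled with a small in-memory stand-in. This sketch does not use the Content Engine API: InMemoryPageIterator is a hypothetical class that borrows the method names nextPage, getCurrentPage, and getElementCount, serves fixed-size pages from an array, and throws IllegalStateException where the real iterator throws its own exception type.

```java
import java.util.Arrays;

public class InMemoryPageIterator {
    private final Object[] elements;
    private final int pageSize;
    private int nextStart = 0;      // index of the first element of the next page
    private Object[] currentPage;   // null when not positioned on a page

    public InMemoryPageIterator(Object[] elements, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.elements = elements.clone();
        this.pageSize = pageSize;
    }

    /** Moves to the next page; returns false once the end of the set is reached. */
    public boolean nextPage() {
        if (nextStart >= elements.length) {
            currentPage = null;     // now positioned after the last page
            return false;
        }
        int end = Math.min(nextStart + pageSize, elements.length);
        currentPage = Arrays.copyOfRange(elements, nextStart, end);
        nextStart = end;
        return true;
    }

    /** Returns a copy of the current page; fails if not positioned on a page. */
    public Object[] getCurrentPage() {
        requirePositioned();
        return currentPage.clone();
    }

    /** Returns the size of the current page without copying it. */
    public int getElementCount() {
        requirePositioned();
        return currentPage.length;
    }

    private void requirePositioned() {
        if (currentPage == null) {
            throw new IllegalStateException("call nextPage() before reading a page");
        }
    }
}
```

A typical loop over this model is while (it.nextPage()) { process(it.getCurrentPage()); }, where process is whatever the application does with a page. Reading the page before the first nextPage call, or after nextPage has returned false, fails.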

The returned value of getElementCount is always equal to getCurrentPage().length. Use getElementCount to avoid copying the potentially large internal array just to get its length.

You can also get the current page continuation state (that is, the page on which the iterator will continue with the next call) and reset the iterator back to a previous page of results. The saved position of the iterator is called a page mark, represented by a PageMark object. The PageIterator.getPageMark method retrieves the current mark, and the reset(mark) method resets the state of the iterator to a previously saved mark. The reset method positions the iterator before the marked page; the nextPage method must be called to position the iterator to the marked page. It is also possible to mark and reset to the position before the first page and the position after the last page.
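Marking and resetting can be illustrated with another in-memory model. Everything in this sketch is hypothetical: MarkablePageIterator and its nested Mark class only imitate PageIterator and PageMark, pages are fixed-size slices of an array, and the sketch assumes that a mark taken while the iterator is on a page identifies that page, so that reset(mark) followed by nextPage returns to it. The real PageMark is opaque and may capture more state.

```java
import java.util.Arrays;

public class MarkablePageIterator {

    /** Opaque saved position (hypothetical stand-in for PageMark). */
    public static final class Mark {
        private final int start;
        private Mark(int start) { this.start = start; }
    }

    private final Object[] elements;
    private final int pageSize;
    private int nextStart = 0;      // where the next call to nextPage() begins
    private int currentStart = 0;   // where the current page begins
    private Object[] currentPage;   // null when between pages

    public MarkablePageIterator(Object[] elements, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.elements = elements.clone();
        this.pageSize = pageSize;
    }

    public boolean nextPage() {
        if (nextStart >= elements.length) {
            currentPage = null;     // positioned after the last page
            return false;
        }
        int end = Math.min(nextStart + pageSize, elements.length);
        currentStart = nextStart;
        currentPage = Arrays.copyOfRange(elements, nextStart, end);
        nextStart = end;
        return true;
    }

    public Object[] getCurrentPage() {
        if (currentPage == null) {
            throw new IllegalStateException("call nextPage() first");
        }
        return currentPage.clone();
    }

    /** Can be called at any time, including before the first or after the last page. */
    public Mark getPageMark() {
        return new Mark(currentPage != null ? currentStart : nextStart);
    }

    /** Positions the iterator before the marked page; nextPage() then moves onto it. */
    public void reset(Mark mark) {
        currentPage = null;
        nextStart = mark.start;
    }

    /** Positions the iterator before the first page. */
    public void reset() {
        currentPage = null;
        nextStart = 0;
    }
}
```

In this model, an application that marks the second page, reads on to the end, and then calls reset(mark) followed by nextPage is positioned on the second page again.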

Calling the reset method (with no parameters) positions the iterator before the first page of the collection, which is essentially the same as getting a new iterator from the collection. You must then call the nextPage method to position the iterator to the first page.

The getPageSize and setPageSize methods allow you to query and adjust the internal paging size of the iterator. The new size takes effect on the next fetch of a page from the server, which is typically on the next call to nextPage. The actual size of each returned page can be smaller (including zero) or larger than the requested page size. If you do not specify a page size on these calls, the configured defaults are used. The ServerCacheConfiguration.QueryPageDefaultSize property specifies the default page size for query results and paged property values; if this property is not explicitly set to some other value, its default is 500. If you specify a page size, it must be less than the configured maximum page size set in the QueryPageMaxSize property, which has a system default value of 1000.

For stateless mid-tier software that handles client paging requests, the PageIterator interface includes the getCurrentPageCheckpoint and getNextPageCheckpoint methods. These methods return an opaque representation of the PageIterator state that can act as a checkpoint for later resumption with Factory.PageIterator.resumeInstance.

The first page of a set can be prefetched from the server and cached in the client. All iterators of a set with a prefetched first page might return the same first page. All iterators fetch subsequent pages, if any, directly from the server.

For information on how to page through a collection, see Working with Collections.