com.ibm.mm.viewer
Class CMBStreamingDocServices
- java.lang.Object
com.ibm.mm.viewer.CMBStreamingDocServices
- public class CMBStreamingDocServices
- extends java.lang.Object
- implements CMBViewerConstants, java.io.Serializable
- render - Java Image objects for document pages can be obtained. These images, including their graphical annotations, can be used to provide viewing of documents in AWT and Swing-based applications.
- convert - a document or pages from the document may be converted to other forms. The form is typically gif, jpeg, or html -- browser displayable forms. This conversion is useful in web-based thin client applications, where applets or plugins for viewing are not an acceptable solution.
- reconstitute - documents composed of multiple parts on the server can be combined to form the original document. This is useful in exporting and emailing.
- manipulate - pages of documents can be permanently rotated, copied, moved, and deleted. New documents can be created from the pages of other documents.
CMBStreamingDocServices is very similar to the CMBDocumentServices bean (and is, in fact, used by CMBDocumentServices). However, its methods operate on streams rather than on references to CMBItem objects or other non-visual beans. This allows this class (and other classes in the com.ibm.mm.viewer and com.ibm.mm.viewer.annotation packages) to be used without a direct connection to a content server.
Supported Document Types and Conversions
The following tables summarize the document types and conversions supported by default (using the supplied conversion engines). Additional conversion capabilities can be plugged in by creating classes that extend CMBDocumentEngine and adding them to the list of engines in the engine properties passed to the constructor of CMBStreamingDocServices.
On Windows:
| MIME Type | Engine | Paginated | With Annotations | Page Manipulation Supported | Converted To |
|---|---|---|---|---|---|
| application/afp | CMBODDocumentEngine | Yes (1) | No | No | text/html |
| application/afp | CMBODDocumentEngine | No | No | No | application/afp (2) |
| application/lin application/ondemand line | CMBODDocumentEngine | No | No | No | text/plain (3) |
| image/tiff, image/tif, image/gif, image/jpeg, image/bmp, text/plain, application/vnd.ibm.modcap | CMBMSTechDocumentEngine | Yes | Yes | Yes | image/gif, image/jpeg, image/tiff, application/vnd.ibm.modcap, text/plain (5) |
| image/pcx, image/dcx | CMBMSTechDocumentEngine | Yes | Yes | No | image/gif, image/jpeg |
| text/enriched, text/richtext, text/rtf, text/html, application/vnd.lotus-1-2-3, application/vnd.ms-excel, application/wordperfect5.1, application/msword, application/vnd.lotus-wordpro | CMBMSTechInsoEngine | Yes | Yes | No | image/gif, image/jpeg |
| text/url | CMBJavaDocumentEngine | No | No | No | text/html (4) |
On AIX and Solaris:
| MIME Type | Engine | Paginated | With Annotations | Page Manipulation Supported | Converted To |
|---|---|---|---|---|---|
| application/afp | CMBODDocumentEngine | No | No | No | application/afp (2) |
| image/tiff, image/tif, image/gif, image/jpeg, image/bmp, text/plain, application/vnd.ibm.modcap | CMBMSTechDocumentEngine | Yes | Yes | Yes | image/gif, image/jpeg, image/tiff, application/vnd.ibm.modcap, text/plain (5) |
| image/pcx, image/dcx | CMBMSTechDocumentEngine | Yes | Yes | No | image/gif, image/jpeg |
| text/url | CMBJavaDocumentEngine | No | No | No | text/html (4) |
On Linux:
| MIME Type | Engine | Paginated | With Annotations | Page Manipulation Supported | Converted To |
|---|---|---|---|---|---|
| application/afp | CMBODDocumentEngine | No | No | No | application/afp (2) |
| image/tiff, image/tif, image/gif, image/jpeg, image/bmp, text/plain, application/vnd.ibm.modcap | CMBMSTechDocumentEngine | Yes | Yes | Yes | image/gif, image/jpeg, image/tiff, application/vnd.ibm.modcap, text/plain (5) |
| image/pcx, image/dcx | CMBMSTechDocumentEngine | Yes | Yes | No | image/gif, image/jpeg |
| text/url | CMBJavaDocumentEngine | No | No | No | text/html (4) |
Notes:
1. The AFP2Web toolkit is used to perform paginated conversion to text/html. This toolkit is not packaged with DB2 Information Integrator for Content; it is available separately. Also, the bridge to AFP2Web contained within CMBODDocumentEngine is Windows only, so AFP to HTML conversion is only available on Windows using this engine.
2. AFP to AFP conversion also concatenates segments of large object documents.
3. Line data to plain text conversion also converts EBCDIC to ASCII.
4. text/url is a special MIME type used to indicate that a plain text document contains a URL. CMBStreamingDocServices will convert the URL into an HTML document with an auto-forwarding link.
5. A document of any source type can be written to any of these target types only if it is possible for the target type. A multi-page document cannot be written to the image/jpeg and image/bmp types. An image document cannot be written to the text/plain type.
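The two restrictions in the last note can be read as a simple feasibility check. The following is a minimal sketch under that reading, not the engine's actual logic: the class and method names are hypothetical, and the plain-text target is taken to be text/plain.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class TargetTypeCheck {
    // Target types that can hold only a single page, per the note above.
    private static final Set<String> SINGLE_PAGE_TARGETS =
            new HashSet<>(Arrays.asList("image/jpeg", "image/bmp"));

    /**
     * Hypothetical helper: returns false when one of the two documented
     * restrictions rules out the target type, true otherwise.
     */
    static boolean isFeasible(int pageCount, boolean isImageDocument, String targetMimeType) {
        if (pageCount > 1 && SINGLE_PAGE_TARGETS.contains(targetMimeType)) {
            return false; // multi-page documents cannot be written to image/jpeg or image/bmp
        }
        if (isImageDocument && "text/plain".equals(targetMimeType)) {
            return false; // image documents cannot be written as plain text
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isFeasible(3, true, "image/tiff")); // prints true
        System.out.println(isFeasible(3, true, "image/jpeg")); // prints false
        System.out.println(isFeasible(1, true, "text/plain")); // prints false
    }
}
```

A true result here only means the documented restrictions do not apply; whether a given engine supports the conversion is still governed by the tables above.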
Specifying Conversion Preferences
There are three properties on CMBStreamingDocServices that are used together to determine the type of conversion that will occur for each document type:
- ConversionPreferences -- This is a Properties object mapping MIME type to the style of conversion desired: no conversion, pagination and conversion of a single page, or conversion of the entire document.
- PreferredFormats -- This is an array of MIME types of the preferred formats when conversion of the entire document is being performed. The order of the MIME types is from most preferred to least preferred.
- PreferredPageFormats -- This is an array of MIME types of the preferred formats for converted pages when page conversion is being performed. The order of the MIME types is from most preferred to least preferred.
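The "most preferred to least preferred" ordering can be illustrated with a small selection routine. This is only a sketch of the idea: the class name, method name, and the list of engine-supported formats are hypothetical stand-ins, and the fallback to the original format follows the description of setPreferredFormats.

```java
import java.util.Arrays;
import java.util.List;

class PreferredFormatChooser {
    /**
     * Returns the first entry of preferredFormats that the engine supports,
     * or originalFormat when none of them is supported.
     */
    static String choose(String[] preferredFormats, List<String> supportedByEngine,
                         String originalFormat) {
        for (String candidate : preferredFormats) {
            if (supportedByEngine.contains(candidate)) {
                return candidate;
            }
        }
        return originalFormat;
    }

    public static void main(String[] args) {
        // Ordered from most preferred to least preferred.
        String[] preferred = {"image/tiff", "image/gif", "image/jpeg"};
        // Conversions listed for image/pcx in the tables above.
        List<String> supported = Arrays.asList("image/gif", "image/jpeg");
        System.out.println(choose(preferred, supported, "image/pcx")); // prints image/gif
    }
}
```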
Engine Properties
Individual document conversion engines may have special properties. These can be specified using the EngineProperties property. See the reference for each of the document engine classes for details on the specific properties supported by each engine.
The default value for EngineProperties:
- Windows, AIX, Solaris, and Linux:
  ENGINES = 2
  ENGINE1_CLASSNAME = com.ibm.mm.viewer.CMBODDocumentEngine
  ENGINE2_CLASSNAME = com.ibm.mm.viewer.mstech.CMBMSTechDocumentEngine
To change engine properties, use getEngineProperties to get the default properties, add values to the Properties object returned, and then use setEngineProperties to update the properties. Otherwise, your settings will replace the defaults.
Engine properties must be updated before using any other methods in CMBDocumentServices for the properties to take effect.
Warning: The default configuration of engines may change from release to release. Therefore, if you modify engine properties, be sure to do so in a way that will not cause problems when the default set of engines is modified. This means that, if you add an engine, you should decide whether to place it before or after the default engines and adjust the engine properties so that all of the default engines still appear in the engine properties, in their current order.
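One way to follow this advice is to derive the new engine slot from the current ENGINES count instead of hard-coding it. The sketch below works on a plain java.util.Properties object; in an application the starting values would come from the defaults rather than being typed in, and the class name com.example.MyDocumentEngine and the helper itself are hypothetical.

```java
import java.util.Properties;

class EnginePropertiesExample {
    /** Returns a copy of defaults with one more engine registered after the existing ones. */
    static Properties appendEngine(Properties defaults, String engineClassName) {
        Properties updated = new Properties();
        updated.putAll(defaults); // keep every default engine, in its current slot
        int count = Integer.parseInt(updated.getProperty("ENGINES", "0").trim());
        int next = count + 1;
        updated.setProperty("ENGINE" + next + "_CLASSNAME", engineClassName);
        updated.setProperty("ENGINES", Integer.toString(next));
        return updated;
    }

    public static void main(String[] args) {
        // Stand-in for the default properties listed above.
        Properties defaults = new Properties();
        defaults.setProperty("ENGINES", "2");
        defaults.setProperty("ENGINE1_CLASSNAME", "com.ibm.mm.viewer.CMBODDocumentEngine");
        defaults.setProperty("ENGINE2_CLASSNAME",
                "com.ibm.mm.viewer.mstech.CMBMSTechDocumentEngine");

        // com.example.MyDocumentEngine is a hypothetical custom engine class.
        Properties updated = appendEngine(defaults, "com.example.MyDocumentEngine");
        System.out.println(updated.getProperty("ENGINES"));           // prints 3
        System.out.println(updated.getProperty("ENGINE3_CLASSNAME")); // prints com.example.MyDocumentEngine
    }
}
```

Because the slot number is computed, the same code keeps working if a later release ships a different number of default engines.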
The default engine properties are set by a file called "cmbviewerengine.properties", which is included as part of cmbview81.jar and loaded from the classpath. You can therefore also copy this file, modify it to suit your application, and place it ahead of cmbview81.jar in the classpath.
The DELAYINIT engine property:
You can choose to delay the initialization of any specific engine by setting ENGINE<n>_DELAYINIT=true. This delays the initialization until the engine is used for document conversion. By default, initialization is not delayed and is done during the first call to loadDocument().
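As an illustration, the sketch below parses engine properties from text and reports which engines have delayed initialization requested. It assumes the properties use the standard java.util.Properties text format; the helper names are hypothetical, and the sample text is a made-up variation of the defaults rather than the shipped file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class DelayInitExample {
    /** Parses text in java.util.Properties format. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Lists the class names of engines whose ENGINE<n>_DELAYINIT property is "true". */
    static List<String> delayedEngines(Properties engineProperties) {
        List<String> delayed = new ArrayList<>();
        int count = Integer.parseInt(engineProperties.getProperty("ENGINES", "0").trim());
        for (int n = 1; n <= count; n++) {
            String flag = engineProperties.getProperty("ENGINE" + n + "_DELAYINIT", "false");
            if (Boolean.parseBoolean(flag.trim())) {
                delayed.add(engineProperties.getProperty("ENGINE" + n + "_CLASSNAME"));
            }
        }
        return delayed;
    }

    public static void main(String[] args) {
        String text = "ENGINES = 2\n"
                + "ENGINE1_CLASSNAME = com.ibm.mm.viewer.CMBODDocumentEngine\n"
                + "ENGINE2_CLASSNAME = com.ibm.mm.viewer.mstech.CMBMSTechDocumentEngine\n"
                + "ENGINE2_DELAYINIT = true\n";
        // prints [com.ibm.mm.viewer.mstech.CMBMSTechDocumentEngine]
        System.out.println(delayedEngines(parse(text)));
    }
}
```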
Engine properties specific to the com.ibm.mm.viewer.mstech.CMBMSTechDocumentEngine are:
- PLAIN_TEXT_PAGE_WIDTH
- The page width that will be used while creating a page image of a plain text file. The default value is 800, which gives 80 characters per line.
- PLAIN_TEXT_PAGE_HEIGHT
- The page height that will be used while creating a page image of a plain text file. The default value used is 1100.
- MAX_GDI_BMP_WIDTH
- The maximum width in pixels of the memory GDI bitmap that will be used for converting a page. The default value here is 2400.
- MAX_GDI_BMP_HEIGHT
- The maximum height in pixels of the memory GDI bitmap that will be used for converting a page. The default value here is 2400.
- IWPM_STOPAT_EDT
- This option is for IWPM compatibility. The Modca format allows a document to contain more than one set of BDT-EDT pairs, and the MSTech engine therefore parses such a document and shows all pages from those sets. The IWPM viewer, however, stops at the first EDT tag. Allowed values are true and false; false is the default value.
- TIFF_1BIT_COMPRESSION
- The compression for changed pages when the document is saved as a 1-bit TIFF. The default is G3_COMPRESSION and the allowed values are:
  - G3_COMPRESSION
  - G4_COMPRESSION
  - HUFFMAN_COMPRESSION
  - PACKEDBITS
  - UNCOMPRESSED
- TIFF_4BIT_COMPRESSION
- The compression for changed pages when the document is saved as a 4-bit TIFF. The default is PACKEDBITS and the allowed values are:
  - PACKEDBITS
  - UNCOMPRESSED
- TIFF_8BIT_COMPRESSION
- The compression for changed pages when the document is saved as an 8-bit TIFF. The default is PACKEDBITS and the allowed values are:
  - PACKEDBITS
  - UNCOMPRESSED
- TIFF_24BIT_COMPRESSION
- With a 24-bit TIFF, JPEG_COMPRESSION will be used as the compression for changed pages when the document is saved. The only value available for this option is JPEG_COMPRESSION.
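Because each TIFF compression property accepts only a fixed set of values, an application may want to check its settings before passing them on. The following is a minimal sketch of such a check; the class and method are hypothetical, and it assumes the properties are set under exactly the names listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

class TiffCompressionCheck {
    // Allowed values per property, copied from the list above.
    private static final Map<String, List<String>> ALLOWED = new LinkedHashMap<>();
    static {
        ALLOWED.put("TIFF_1BIT_COMPRESSION", Arrays.asList("G3_COMPRESSION", "G4_COMPRESSION",
                "HUFFMAN_COMPRESSION", "PACKEDBITS", "UNCOMPRESSED"));
        ALLOWED.put("TIFF_4BIT_COMPRESSION", Arrays.asList("PACKEDBITS", "UNCOMPRESSED"));
        ALLOWED.put("TIFF_8BIT_COMPRESSION", Arrays.asList("PACKEDBITS", "UNCOMPRESSED"));
        ALLOWED.put("TIFF_24BIT_COMPRESSION", Arrays.asList("JPEG_COMPRESSION"));
    }

    /** Returns the names of TIFF compression properties whose value is not in the documented list. */
    static List<String> invalidSettings(Properties props) {
        List<String> invalid = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : ALLOWED.entrySet()) {
            String value = props.getProperty(entry.getKey());
            // Unset properties are skipped: the documented default applies.
            if (value != null && !entry.getValue().contains(value.trim())) {
                invalid.add(entry.getKey());
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("TIFF_1BIT_COMPRESSION", "G4_COMPRESSION");
        props.setProperty("TIFF_8BIT_COMPRESSION", "G4_COMPRESSION"); // not allowed for 8-bit
        System.out.println(invalidSettings(props)); // prints [TIFF_8BIT_COMPRESSION]
    }
}
```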
Field Summary
Constructor Summary
| Constructor and Description |
|---|
| CMBStreamingDocServices() |
| CMBStreamingDocServices(CMBStreamingDocServicesCallbacks callbacks, java.util.Properties engineProperties) - Constructs CMBStreamingDocServices. |
Method Summary
| Modifier and Type | Method and Description |
|---|---|
| void | cancelOCR(CMBDocument document) - Using the OCR engine, cancel OCR of the document. |
| boolean | canOcr(CMBDocument document) - Check if there is an OCR engine that will support OCR action on the specified document. |
| void | copyPages(CMBDocument sourceDocument, int firstSourcePage, int lastSourcePage, CMBDocument destDocument, int destPage) - Copies a range of pages from one document to another document. |
| CMBDocument | createDocument(java.lang.String mimeType) - Creates a new document initially with no content. |
| void | dropAllDocuments() - Terminates processing of all documents. |
| void | dropDocument(CMBDocument document) - Terminates processing of a document. |
| void | dropOcrPage(CMBPage ocrPage, CMBGenericDocOCRStatusListener listener) - Release OCR page resources. |
| java.util.Properties | getConversionProperties() - Returns the conversion properties. |
| CMBDocument | getDocumentFromHandle(java.lang.Object hItem) - Determines the instance of CMBDocument from the associated item handle. |
| int | getDocumentPageCount(CMBDocument doc) - Returns the number of pages in the document. |
| CMBDocument[] | getDocuments() - Returns the documents being processed. |
| CMBDocument | getDocuments(int index) - Returns a particular document being processed. |
| java.lang.String | getEngineNameForMimeType(java.lang.String docMimeType) - Return the CMBDocumentEngine class name that should handle the given document mime type. |
| java.lang.Object | getItemHandle(CMBDocument document) - Returns the handle related to a document. |
| java.lang.String[] | getPreferredFormats() - Returns the preferred formats for converted documents. |
| java.lang.String[] | getPreferredPageFormats() - Returns the preferred page formats for converted pages of documents. |
| CMBDocument | loadDocument(java.io.InputStream firstPart, int firstPartSize, int numberOfParts, java.lang.String docMimeType, java.lang.String firstPartMimeType, java.io.InputStream annotations, java.io.InputStream resources) - Loads a document into document services. |
| CMBDocument | loadDocument(java.io.InputStream firstPart, int firstPartSize, int numberOfParts, java.lang.String docMimeType, java.lang.String firstPartMimeType, java.io.InputStream annotations, java.io.InputStream resources, java.lang.String firstPartEncoding) - Loads a document into document services. |
| CMBDocument | loadDocument(java.io.InputStream firstPart, int numberOfParts, java.lang.String docMimeType, java.lang.String firstPartMimeType, java.io.InputStream annotations, java.io.InputStream resources) - Loads a document into document services. |
| CMBDocument | loadDocument(java.io.InputStream firstPart, int numberOfParts, java.lang.String docMimeType, java.lang.String firstPartMimeType, java.io.InputStream annotations, java.io.InputStream resources, java.lang.String firstPartEncoding) - Loads a document into document services. |
| CMBDocument | loadDocument(java.net.URL firstPart, int firstPartSize, int numberOfParts, java.lang.String docMimeType, java.lang.String firstPartMimeType, java.io.InputStream annotations, java.io.InputStream resources, java.lang.String firstPartEncoding) - Loads a document into document services. |
| void | movePages(CMBDocument sourceDocument, int firstSourcePage, int lastSourcePage, CMBDocument destDocument, int destPage) - Moves a range of pages from one document to another document. |
| CMBOCRLetters[] | ocrDocument(CMBDocument document, CMBGenericDocOCRStatusListener listener) - Using the OCR Engine, perform OCR on each page of the currently selected document. |
| CMBOCRLetters | ocrPage(CMBPage page, CMBGenericDocOCRStatusListener listener) - Using the OCR engine, perform OCR on the page number of the document. |
| void | setConversionProperties(java.util.Properties properties) - Sets the conversion properties. |
| void | setItemHandle(CMBDocument document, java.lang.Object itemHandle) - Sets a handle to a document. |
| void | setPreferredFormats(java.lang.String[] formats) - Sets the preferred formats for converted documents, in the order of most preferred to least preferred. |
| void | setPreferredPageFormats(java.lang.String[] pageFormats) - Sets the preferred page formats for converted pages of documents, in the order of most preferred to least preferred. |
| void | terminate() - Terminates all the engines. |

| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail
CMBStreamingDocServices
- public CMBStreamingDocServices( )
CMBStreamingDocServices
- public CMBStreamingDocServices( CMBStreamingDocServicesCallbacks callbacks,
- java.util.Properties engineProperties)
Parameters:
callbacks - implementation of CMBStreamingDocServicesCallbacks with methods that are called to retrieve additional document parts.
engineProperties - properties object defining the engine classes and their properties. If this parameter is null, the default engine properties, as specified by the cmbviewerengine.properties file, will be used.
Method Detail
getDocuments
- public CMBDocument[] getDocuments( )
getDocuments
- public CMBDocument getDocuments( int index)
- throws java.lang.ArrayIndexOutOfBoundsException
Throws:
java.lang.ArrayIndexOutOfBoundsException
getPreferredFormats
- public java.lang.String[] getPreferredFormats( )
setPreferredFormats
- public void setPreferredFormats( java.lang.String[] formats)
When a document is written using CMBDocument.write, the format of the document will be one of the preferred formats if possible (i.e. supported by the document engine handling the document). If not, the document will be written in its original format.
Parameters:
formats - the MIME types for the preferred formats.
getPreferredPageFormats
- public java.lang.String[] getPreferredPageFormats( )
setPreferredPageFormats
- public void setPreferredPageFormats( java.lang.String[] pageFormats)
When a page is written using CMBPage.write, the format of the page will be one of the preferred page formats.
getConversionProperties
- public java.util.Properties getConversionProperties( )
See Also:
setConversionProperties
createDocument
- public CMBDocument createDocument( java.lang.String mimeType)
- throws CMBDocumentEngineException
Note: The document created exists only in memory. The content can be obtained using CMBDocument.write() and written to a permanent content server.
Parameters:
mimeType - the MIME content type of the document being created
Throws:
CMBDocumentEngineException - if a document engine to handle documents of this type could not be initialized, or the engine cannot create documents of the specified mimeType. The former error is likely caused by a problem in the specification of the document engine properties. The latter is likely due to lack of support for page manipulation by the document engine.
setConversionProperties
- public void setConversionProperties( java.util.Properties properties)
- <mimetype>=document|page|none
- mimetype is the MIME content type for the documents. The value is defined:
- document
- Convert the entire document to one of the preferred formats. Separate document pages will not be available. If conversion to a preferred format is not possible, CMBDocument.write() will return the document unconverted.
- page
- Convert pages of the document to one of the preferred page formats. If the document can be parsed and individual pages can be converted, they will be made available (through CMBDocument.getPages()). If pages cannot be converted, document conversion will still be available. If even that is not possible, CMBDocument.write() will return the document unconverted.
- none
- Do not provide any conversion of the document type. No document pages are available, and CMBDocument.write() returns the document unconverted.
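The fallback order described for the three values can be sketched as follows. This is an illustration only: the enum, both helper methods, and the choice to treat a missing or unrecognized value as none are assumptions of the sketch, not documented behavior.

```java
import java.util.Locale;
import java.util.Properties;

class ConversionStyleExample {
    enum Style { DOCUMENT, PAGE, NONE }

    /** Looks up the style for a MIME type; missing or unknown values map to NONE in this sketch. */
    static Style styleFor(Properties conversionProperties, String mimeType) {
        String value = conversionProperties.getProperty(mimeType);
        if (value == null) {
            return Style.NONE;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "document": return Style.DOCUMENT;
            case "page":     return Style.PAGE;
            default:         return Style.NONE;
        }
    }

    /** Resolves what is delivered: pages, then whole-document conversion, then unconverted. */
    static String outcome(Style requested, boolean pagesConvertible, boolean documentConvertible) {
        if (requested == Style.PAGE && pagesConvertible) {
            return "pages";
        }
        if (requested != Style.NONE && documentConvertible) {
            return "converted document";
        }
        return "unconverted document";
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("image/tiff", "page");
        props.setProperty("application/afp", "document");
        props.setProperty("text/url", "none");

        System.out.println(outcome(styleFor(props, "image/tiff"), true, true));       // prints pages
        System.out.println(outcome(styleFor(props, "image/tiff"), false, true));      // prints converted document
        System.out.println(outcome(styleFor(props, "application/afp"), true, false)); // prints unconverted document
        System.out.println(outcome(styleFor(props, "text/url"), true, true));         // prints unconverted document
    }
}
```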
loadDocument
- public CMBDocument loadDocument( java.io.InputStream firstPart,
- int numberOfParts,
- java.lang.String docMimeType,
- java.lang.String firstPartMimeType,
- java.io.InputStream annotations,
- java.io.InputStream resources)
- throws java.io.IOException
- java.lang.ClassNotFoundException
- java.lang.IllegalAccessException
- java.lang.InstantiationException
Parameters:
firstPart - a stream containing the first part of the document.
numberOfParts - the number of parts of the document.
docMimeType - the MIME content type of the document.
firstPartMimeType - the MIME content type of the first part of the document.
annotations - a stream containing annotations for the document pages.
resources - a stream containing resources for the document.
Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.IllegalAccessException
java.lang.InstantiationException
loadDocument
- public CMBDocument loadDocument( java.io.InputStream firstPart,
- int firstPartSize,
- int numberOfParts,
- java.lang.String docMimeType,
- java.lang.String firstPartMimeType,
- java.io.InputStream annotations,
- java.io.InputStream resources)
- throws java.io.IOException
- java.lang.ClassNotFoundException
- java.lang.IllegalAccessException
- java.lang.InstantiationException
Parameters:
firstPart - a stream containing the first part of the document.
firstPartSize - the size of the firstPart stream.
numberOfParts - the number of parts of the document.
docMimeType - the MIME content type of the document.
firstPartMimeType - the MIME content type of the first part of the document.
annotations - a stream containing annotations for the document pages.
resources - a stream containing resources for the document.
Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.IllegalAccessException
java.lang.InstantiationException
loadDocument
- public CMBDocument loadDocument( java.io.InputStream firstPart,
- int numberOfParts,
- java.lang.String docMimeType,
- java.lang.String firstPartMimeType,
- java.io.InputStream annotations,
- java.io.InputStream resources,
- java.lang.String firstPartEncoding)
- throws java.io.IOException
- java.lang.ClassNotFoundException
- java.lang.IllegalAccessException
- java.lang.InstantiationException
Parameters:
firstPart - a stream containing the first part of the document.
numberOfParts - the number of parts of the document.
docMimeType - the MIME content type of the document.
firstPartMimeType - the MIME content type of the first part of the document.
annotations - a stream containing annotations for the document pages.
resources - a stream containing resources for the document.
firstPartEncoding - the character encoding of the first part.
Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.IllegalAccessException
java.lang.InstantiationException
loadDocument
- public CMBDocument loadDocument( java.io.InputStream firstPart,
- int firstPartSize,
- int numberOfParts,
- java.lang.String docMimeType,
- java.lang.String firstPartMimeType,
- java.io.InputStream annotations,
- java.io.InputStream resources,
- java.lang.String firstPartEncoding)
- throws java.io.IOException
- java.lang.ClassNotFoundException
- java.lang.IllegalAccessException
- java.lang.InstantiationException
Parameters:
firstPart - a stream containing the first part of the document.
firstPartSize - the size of the firstPart stream.
numberOfParts - the number of parts of the document.
docMimeType - the MIME content type of the document.
firstPartMimeType - the MIME content type of the first part of the document.
annotations - a stream containing annotations for the document pages.
resources - a stream containing resources for the document.
firstPartEncoding - the character encoding of the first part.
Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.IllegalAccessException
java.lang.InstantiationException
loadDocument
- public CMBDocument loadDocument( java.net.URL firstPart,
- int firstPartSize,
- int numberOfParts,
- java.lang.String docMimeType,
- java.lang.String firstPartMimeType,
- java.io.InputStream annotations,
- java.io.InputStream resources,
- java.lang.String firstPartEncoding)
- throws java.io.IOException
- java.lang.ClassNotFoundException
- java.lang.IllegalAccessException
- java.lang.InstantiationException
Parameters:
firstPart - an HTTP URL to the first part of the document.
firstPartSize - the size of the firstPart stream.
numberOfParts - the number of parts of the document.
docMimeType - the MIME content type of the document.
firstPartMimeType - the MIME content type of the first part of the document.
annotations - a stream containing annotations for the document pages.
resources - a stream containing resources for the document.
firstPartEncoding - the character encoding of the first part.
Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.IllegalAccessException
java.lang.InstantiationException
movePages
- public void movePages(CMBDocument sourceDocument,
- int firstSourcePage,
- int lastSourcePage,
- CMBDocument destDocument,
- int destPage)
- throws java.io.IOException
Parameters:
sourceDocument - the source document to move pages from.
firstSourcePage - the page number of the first page to move.
lastSourcePage - the page number of the last page to move. If this is the same value as firstSourcePage, one page is moved.
destDocument - the destination document to move pages to. This can be the same document as sourceDocument.
destPage - the page in destDocument after which the pages are to be moved. If this number is zero, the pages are moved in front of all existing pages in destDocument.
Throws:
java.lang.IndexOutOfBoundsException - if the page numbers are invalid.
java.lang.IllegalArgumentException - if the destPage is within the range of pages to move, or if the source and destination documents are not handled by the same document engine.
java.io.IOException - if an IO error occurs reading the document content.
copyPages
- public void copyPages(CMBDocument sourceDocument,
- int firstSourcePage,
- int lastSourcePage,
- CMBDocument destDocument,
- int destPage)
- throws java.io.IOException
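The page-range and destPage conventions used by copyPages and movePages can be modeled on plain lists. The sketch below is only such a model: it assumes 1-based page numbers, the class and method are hypothetical, and it does not reproduce the IllegalArgumentException cases or any engine behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PageCopyModel {
    /**
     * Copies pages firstPage..lastPage (1-based, inclusive) of source into dest,
     * inserting them after page destPage of dest; destPage 0 inserts at the front.
     */
    static <T> void copyPages(List<T> source, int firstPage, int lastPage,
                              List<T> dest, int destPage) {
        if (firstPage < 1 || lastPage < firstPage || lastPage > source.size()
                || destPage < 0 || destPage > dest.size()) {
            throw new IndexOutOfBoundsException("invalid page numbers");
        }
        // Snapshot the range first so that copying within one list is safe.
        List<T> range = new ArrayList<>(source.subList(firstPage - 1, lastPage));
        dest.addAll(destPage, range);
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("A1", "A2", "A3", "A4");
        List<String> dest = new ArrayList<>(Arrays.asList("B1", "B2"));

        copyPages(source, 2, 3, dest, 1); // copy pages 2-3 after page 1 of dest
        System.out.println(dest);         // prints [B1, A2, A3, B2]

        List<String> front = new ArrayList<>(Arrays.asList("B1", "B2"));
        copyPages(source, 1, 1, front, 0); // destPage 0: in front of all pages
        System.out.println(front);         // prints [A1, B1, B2]
    }
}
```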
Parameters:
sourceDocument - the source document to copy pages from.
firstSourcePage - the page number of the first page to copy.
lastSourcePage - the page number of the last page to copy. If this is the same value as firstSourcePage, one page is copied.
destDocument - the destination document to copy pages to. This can be the same document as sourceDocument.
destPage - the page in destDocument after which the pages are to be copied. If this number is zero, the pages are copied in front of all existing pages in destDocument.
Throws:
java.lang.IndexOutOfBoundsException - if the page numbers are invalid.
java.lang.IllegalArgumentException - if the destPage is within the range of pages to copy, or if the source and destination documents are not handled by the same document engine.
java.io.IOException - if an IO error occurs reading the document content.
getDocumentPageCount
- public int getDocumentPageCount( CMBDocument doc)
- throws java.io.IOException
- CMBDocumentEngineException
Note: For large documents, calling this method may require that all parts of the document be retrieved from the server and processed, which could be time consuming.
Throws:
java.io.IOException
CMBDocumentEngineException - when the CMBSnowboundEngine is unable to get the page count.
dropDocument
- public void dropDocument(CMBDocument document)
Parameters:
document - the document being terminated.
dropAllDocuments
- public void dropAllDocuments()
setItemHandle
- public void setItemHandle(CMBDocument document,
- java.lang.Object itemHandle)
Parameters:
document - the document whose handle is being set.
itemHandle - the value for the handle.
getItemHandle
- public java.lang.Object getItemHandle( CMBDocument document)
Parameters:
document - the document whose handle is being retrieved.
getDocumentFromHandle
- public CMBDocument getDocumentFromHandle( java.lang.Object hItem)
Parameters:
hItem - an object acting as a handle to the associated document. This object would have been associated with the document using setItemHandle(document).
See Also:
setItemHandle(document)
terminate
- public void terminate()
getEngineNameForMimeType
- public java.lang.String getEngineNameForMimeType( java.lang.String docMimeType)
Parameters:
docMimeType - the MIME type of the document.
canOcr
- public boolean canOcr(CMBDocument document)
Parameters:
document - The document requesting OCR support.
ocrPage
- public CMBOCRLetters ocrPage(CMBPage page,
- CMBGenericDocOCRStatusListener listener)
Parameters:
page - The page that is requesting OCR.
listener - listener for OCR status events from the OCR engine.
Throws:
CMBDocumentEngineException - when canOcrDocument returns false or there is no OCR engine.
dropOcrPage
- public void dropOcrPage(CMBPage ocrPage,
- CMBGenericDocOCRStatusListener listener)
ocrDocument
- public CMBOCRLetters[] ocrDocument( CMBDocument document,
- CMBGenericDocOCRStatusListener listener)
Parameters:
document - The document that is requesting OCR.
listener - listener for OCR status events from the OCR engine.
Throws:
CMBDocumentEngineException - when canOcrDocument returns false or there is no OCR engine.
cancelOCR
- public void cancelOCR(CMBDocument document)
Parameters:
document - The document whose OCR process is to be canceled.
Throws:
CMBDocumentEngineException - when canOcrDocument returns false or there is no OCR engine.