Using the PDF indexer
The Content Manager OnDemand PDF indexer is a program that you can use to extract index data from or generate index data about Adobe PDF files.
The index data can enhance your ability to store, retrieve, and view PDF documents with Content Manager OnDemand. The PDF indexer processes PDF input files. A PDF file is a distilled version of a PostScript file that adds structure and efficiency. A PDF file can be created by Acrobat Distiller or by a special printer driver program called PDFWriter. The PDF indexer supports PDF Version 1.9 and lower input and output files. See the documentation provided with Acrobat Distiller for more information about preparing input data for the Distiller.
The PDF indexer can logically divide reports into individual items, such as statements, policies, and bills. You can define up to 128 index fields for each item in a report.
The PDF indexer uses a coordinate system to locate the text strings that determine the beginning of a group and the index values. The coordinate system uses x and y pairs imposed on a page. The origin is in the upper left corner of the page; the x coordinate increases to the right and the y coordinate increases down the page. For each text string, you identify its upper left and lower right positions on the page. These two corners form a string box, which is the smallest rectangle that completely encloses the text string. You also identify the page on which the text string appears.
Content Manager OnDemand provides the Report Wizard in the Administrator to help you create indexing parameters for the PDF indexer. It also provides the ARSPDUMP program to help you identify the locations of text strings on the page.
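The string box and the upper-left origin described above can be illustrated with a short Python sketch. It is not part of Content Manager OnDemand: the names StringBox, enclosing_box, and from_pdf_points are hypothetical, and expressing positions in inches is an assumption made for the example. The only outside fact it relies on is that default PDF user space places the origin at the lower left of the page and measures in units of 1/72 inch, so the y values must be flipped to match a top-down coordinate system.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch


@dataclass(frozen=True)
class StringBox:
    """Hypothetical string box; origin at the upper left of the page,
    x grows to the right, y grows down the page."""
    ul_x: float
    ul_y: float
    lr_x: float
    lr_y: float


def enclosing_box(char_boxes):
    """Return the smallest StringBox that encloses every
    (ul_x, ul_y, lr_x, lr_y) character box in char_boxes."""
    if not char_boxes:
        raise ValueError("at least one character box is required")
    return StringBox(
        ul_x=min(b[0] for b in char_boxes),
        ul_y=min(b[1] for b in char_boxes),
        lr_x=max(b[2] for b in char_boxes),
        lr_y=max(b[3] for b in char_boxes),
    )


def from_pdf_points(x0, y0, x1, y1, page_height_pt):
    """Convert a box given in PDF user space (origin lower left, points)
    to an upper-left-origin StringBox. Inches are assumed as the unit."""
    left, right = min(x0, x1), max(x0, x1)
    bottom, top = min(y0, y1), max(y0, y1)
    return StringBox(
        ul_x=left / POINTS_PER_INCH,
        ul_y=(page_height_pt - top) / POINTS_PER_INCH,   # flip the y axis
        lr_x=right / POINTS_PER_INCH,
        lr_y=(page_height_pt - bottom) / POINTS_PER_INCH,
    )


# A string occupying 72..144 pt horizontally and 720..756 pt vertically
# on a US Letter page (792 pt tall).
box = from_pdf_points(72, 720, 144, 756, page_height_pt=792)
print(box)
```

In this sketch, the upper left corner always has the smaller x and y values and the lower right corner the larger ones, which matches the way the two corners of a string box are identified on the page.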
The IBM Content Manager OnDemand for Multiplatforms: Administration Guide provides details about the Report Wizard and gives examples of how to use the Report Wizard to process line data input files. Using the Report Wizard to process PDF input files is similar to processing line data input files.
The Content Manager OnDemand Indexing Reference provides details about the PDF indexer and shows examples of how to use it to process PDF input files.