Technical Blog Post
How indexing is misused and Italian wine
Paula Muir is a Software Developer with IBM Content Manager OnDemand for Multiplatforms in Boulder, Colorado. She has 20 years of experience with Content Manager OnDemand and 15 years of experience in the data indexing field. Her areas of expertise include indexing and loading data, and AFP and PDF architecture.
In one of the blog posts I wrote earlier while writing IBM Content Manager OnDemand Guide (an IBM Redbooks publication), I talked about how to successfully index your data for IBM Content Manager OnDemand (CMOD). Today, let’s talk about a more interesting topic: How indexing is misused.
Let's say we have some kind of financial statement that contains lists of names, account numbers, and balances. Some customers decide that they want to collect every name, account number, and balance from the statement as an index value. Then, when the results of their document search appear in the Search Results screen, they can see the information that they are looking for.
The thing is, they don't want to look at the actual document at all. They only want to look at the Search Results screen.
Here's a document – let's collect every value as an index!
It sounds great but CMOD is not really designed to do this. It is meant to be a document archive system, not a system to generate a subset of document information.
Since CMOD is not designed to do this, inevitably, customers who try this run into problems. The first problem is that the performance of the indexing and loading of the data is really bad. It's bad because the indexer is collecting gazillions of index values from their documents.
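To get a feel for the scale, here is a rough sketch in Python. It is not CMOD code and does not use any CMOD API. The `Statement` class, the helper functions, and the statement counts are all made up for illustration. It only counts how many index rows you end up with when you index once per document versus once per value in the document.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """Hypothetical stand-in for one financial statement (not a CMOD type)."""
    statement_id: str
    customer: str
    lines: list  # each entry is (name, account_number, balance)


def make_statements(count, lines_per_statement):
    """Build made-up statements so the two indexing styles can be compared."""
    return [
        Statement(
            statement_id=f"S{i:05d}",
            customer=f"Customer {i}",
            lines=[
                (f"Name {i}-{j}", f"ACCT{i:05d}{j:04d}", 100.0 + j)
                for j in range(lines_per_statement)
            ],
        )
        for i in range(count)
    ]


def index_per_document(statements):
    """One index row per statement: just enough to find the document."""
    return [(s.statement_id, s.customer) for s in statements]


def index_every_value(statements):
    """One index row per line item: the misuse described in this post."""
    return [
        (s.statement_id, name, account, balance)
        for s in statements
        for name, account, balance in s.lines
    ]


if __name__ == "__main__":
    statements = make_statements(count=1000, lines_per_statement=200)
    print("rows, one per document:", len(index_per_document(statements)))
    print("rows, one per value:   ", len(index_every_value(statements)))
```

With these made-up numbers, 1,000 statements produce 1,000 index rows in the first case and 200,000 in the second. Every one of those extra rows has to be collected by the indexer and loaded into the database, so the row count is a reasonable stand-in for the extra work.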
The next problem is that the index values usually don't display on the Search Results screen in the way that customers think they should. They may not be in the exact order that they appear in the document. This occurs because of how the indexer collects the values, or because of how the values are returned to the client by the database.
In the end, after much trouble, customers find that it is better not to do this.
I will spend the rest of my life exploring Italian wine. They use hundreds of different grapes, some going back to Roman times. So with many different kinds of wine, we are way beyond Cabernet Sauvignon and Chardonnay here. If you like seafood, try a Soave, made from the famous Garganega grape. Everyone's heard of that, right? If you like pasta with red sauce, try a Valpolicella Ripasso, made from three grapes you've never heard of. If you'd like to try a real stunner, shell out the money for an Amarone, Valpolicella's big brother. It is made from the same grapes, but they are left out to dry and shrivel until they become incredibly concentrated. Although it is a dry wine, it has a searing richness of flavor. I've tried serving it with pasta, meat, even pizza, but at this point I think it works best with cheese. Just cheese and crackers on a board. It's just too big and rich for food. By the way, all these wines are from the Veneto region of Italy, near Venice.
Three wines down, 200 more Italian wines to go.
For Content Manager OnDemand related blog posts, see:
- 5 Things To Know about IBM Content Manager OnDemand
- Enhanced Retention Management feature in IBM Content Manager OnDemand
- to successfully index your documents
- PDF Floating Triggers – If you just have 5 minutes to learn it – Here it is
- How indexing is misused and Italian wine
- kind of storage do I use for CMOD?
- al world view of Content Manager OnDemand concepts
- rting information to a local server to send to Content Manager OnDemand Support
- Need to do tracing? How to turn on the Content Manager OnDemand trace facility
- The CMOD adoption process and why people love it
For more information on Content Manager OnDemand, see IBM Redbooks publications: