Index Storage and Optimization
Watson™ Explorer Engine indices are read and written as blocks of data. Each index uses a single block size, which can range from 1K to 64K. The block size defaults to 8K, but can be configured in the Indices section of the Indexing tab on a search collection's Configuration tab. To see and modify the Block Size option, click edit and click the Indices header to expand that section.
Using smaller blocks improves search performance. However, each index block uses 32 bytes of RAM in addition to the RAM required for reading the block itself. For very large indices, you may want to increase the block size in order to reduce memory requirements.
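To make this trade-off concrete, the following Python sketch estimates the per-block RAM overhead for an index at several block sizes. It relies only on the 32-bytes-per-block figure stated above. The function name block_overhead_bytes and the 100 GiB index size are hypothetical, chosen for illustration; the sketch assumes that 1K means 1024 bytes and ignores the RAM needed to read the blocks themselves.

```python
BYTES_PER_BLOCK_OVERHEAD = 32  # per-block RAM cost stated in the text


def block_overhead_bytes(index_size_bytes: int, block_size_bytes: int) -> int:
    """Estimate RAM used for per-block bookkeeping (hypothetical helper)."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    # Round up: a partially filled final block still counts as one block.
    num_blocks = (index_size_bytes + block_size_bytes - 1) // block_size_bytes
    return num_blocks * BYTES_PER_BLOCK_OVERHEAD


if __name__ == "__main__":
    index_size = 100 * 1024**3  # assumed 100 GiB index, for illustration only
    for block_kib in (1, 8, 64):
        overhead = block_overhead_bytes(index_size, block_kib * 1024)
        print(f"{block_kib:>2}K blocks: {overhead / 1024**2:,.0f} MiB overhead")
```

Under these assumptions, 1K blocks cost about 3,200 MiB of overhead, the default 8K blocks cost 400 MiB, and 64K blocks cost 50 MiB.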
A new block size does not take effect until you perform a full merge or completely regenerate the index. You can initiate a full merge by clicking the full merge button, located near the bottom of the Live Status tab's Indexing sub-tab. A full merge creates a single index file from your original index file and all temporary index segments, purging any pending deletes in the process. A full merge can increase performance, because it is faster to consult a single index file and because fewer items in the index have to be checked to determine whether they are active or deleted. A full merge can also reduce disk space consumption by producing a smaller index file after processing all pending deletes.
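The effect of a full merge can be pictured with a toy model. The Python sketch below is not how Watson Explorer Engine implements merging; it only illustrates the idea of combining several segments into one and dropping deleted entries. The function name full_merge, the dictionary representation of a segment, and the rule that later segments override earlier ones are all assumptions made for this example.

```python
def full_merge(segments, pending_deletes):
    """Toy model: combine segments into one and purge pending deletes.

    segments: list of dicts mapping document id -> indexed data,
              ordered oldest first (assumed representation)
    pending_deletes: set of document ids marked as deleted
    """
    merged = {}
    for segment in segments:
        # Assumption: an entry in a later segment replaces an earlier one.
        merged.update(segment)
    # Deleted documents are dropped instead of being checked at query time.
    return {doc_id: data for doc_id, data in merged.items()
            if doc_id not in pending_deletes}


if __name__ == "__main__":
    original = {"doc1": "alpha", "doc2": "beta"}
    temporary = [{"doc3": "gamma"}, {"doc4": "delta"}]
    result = full_merge([original] + temporary, pending_deletes={"doc2"})
    print(sorted(result))
```

After the merge in this model, a query consults one structure instead of three and never encounters the deleted entry doc2.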
Watson Explorer Engine also supports index compression, which means that blocks are automatically decompressed when they are read from an index and compressed when they are written to it. Using a compressed index can substantially reduce the amount of disk space used by Watson Explorer Engine, especially for large search collections. Compressed indices can also improve the performance of Watson Explorer Engine search applications because fewer disk accesses are required to read and search relevant index data. However, using a compressed index can increase system memory requirements at query time because buffers are allocated to hold the decompressed index data.
Index compression is disabled by default. To enable index compression, select the search collection's Configuration > Indexing tab, and click edit. In the Indices section, set the Maximum Compressed Size variable to the maximum amount of data that you want a compressed block to contain. Specifying any non-zero value for this variable enables index compression. This value is also used as the size of the buffer that holds index blocks after they are decompressed, so it is important to strike a balance between the amount of data that you try to compress into an index block and the block size that you are using. Specifying a Maximum Compressed Size value that is too high wastes memory by allocating overly large buffers to hold the decompressed index blocks. If you want to experiment with using a compressed index, a good initial value to try is a Maximum Compressed Size of 32,768, using a block size of 2048.
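The balance between the two settings can be checked with simple arithmetic. The Python sketch below computes the compression ratio that a block would have to achieve to fill its decompression buffer completely, and the memory consumed by a given number of buffers. The function names and the count of 100 simultaneously allocated buffers are hypothetical; how many buffers Watson Explorer Engine actually allocates is not stated here.

```python
def implied_compression_ratio(max_compressed_size: int, block_size: int) -> float:
    """Ratio needed for a block to fill its buffer (hypothetical helper)."""
    if max_compressed_size <= 0 or block_size <= 0:
        raise ValueError("sizes must be positive")
    return max_compressed_size / block_size


def buffer_memory_bytes(max_compressed_size: int, buffers: int) -> int:
    """Memory used if each buffer is Maximum Compressed Size bytes."""
    if max_compressed_size < 0 or buffers < 0:
        raise ValueError("arguments must be non-negative")
    return max_compressed_size * buffers


if __name__ == "__main__":
    max_size, block_size = 32768, 2048  # suggested starting values from the text
    ratio = implied_compression_ratio(max_size, block_size)
    print(f"ratio needed to fill a buffer: {ratio:.0f}:1")
    # Assumed buffer count, for illustration only.
    print(f"memory for 100 buffers: {buffer_memory_bytes(max_size, 100)} bytes")
```

With the suggested starting values, a block fills its buffer only if the data compresses 16:1. If your data compresses less well than that, part of each buffer goes unused, which is the memory waste described above.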