FAQ: Commerce Search Buildindex (di-buildindex)
This post covers the common questions we get about the manual search index build command (di-buildindex). If you have a question about buildindex that this FAQ doesn't answer, feel free to comment on this post so we can use it to expand the FAQ further.
1. What are the mandatory parameters for buildindex?
You can perform indexing by going to the WC_installdir/bin directory and running the di-buildindex script with just these parameters:
./di-buildindex.sh -instance <instance_name> -masterCatalogId <master_catalog_id> -dbuser <dbuser> -dbuserpwd <dbuserpwd>
./di-buildindex.sh -instance demo -masterCatalogId 10051 -dbuser db2inst1 -dbuserpwd wcsr0cks
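If you script this call, it can help to check that all four mandatory parameters are present before the command is launched. The following is only a minimal sketch: buildindex_cmd is a hypothetical helper name (not part of the product), and it prints the command line instead of running it, so you can review it first.

```shell
#!/bin/sh
# Hypothetical helper: print the di-buildindex command line for the four
# mandatory parameters. Returns 1 if a parameter is missing or empty.
buildindex_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: buildindex_cmd <instance_name> <master_catalog_id> <dbuser> <dbuserpwd>" >&2
        return 1
    fi
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            echo "error: empty parameter" >&2
            return 1
        fi
    done
    printf './di-buildindex.sh -instance %s -masterCatalogId %s -dbuser %s -dbuserpwd %s\n' \
        "$1" "$2" "$3" "$4"
}
```

For example, buildindex_cmd demo 10051 db2inst1 wcsr0cks prints the same command line as the demo example above, which you can then run from the WC_installdir/bin directory.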
2. How do I perform delta indexing?
To perform delta indexing, you need to run delta index preprocessing followed by a delta buildindex. For delta index preprocessing, set the fullbuild parameter to false when running the di-preprocess script. The delta buildindex takes the same parameter:
./di-buildindex.sh -instance <instance_name> -masterCatalogId <master_catalog_id> -dbuser <dbuser> -dbuserpwd <dbuserpwd> -fullbuild false
./di-buildindex.sh -instance demo -masterCatalogId 10051 -dbuser db2inst1 -dbuserpwd wcsr0cks -fullbuild false
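Since the only difference between the full and delta buildindex commands shown in this post is the -fullbuild false flag, a small wrapper can switch between the two. This is a sketch under assumptions: buildindex_mode_cmd and its "full"/"delta" mode names are hypothetical, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the di-buildindex command for a full or delta build.
# $1 is "full" or "delta"; $2..$5 are the four mandatory parameters.
buildindex_mode_cmd() {
    if [ "$#" -ne 5 ]; then
        echo "usage: buildindex_mode_cmd <full|delta> <instance_name> <master_catalog_id> <dbuser> <dbuserpwd>" >&2
        return 1
    fi
    mode=$1
    shift
    cmd="./di-buildindex.sh -instance $1 -masterCatalogId $2 -dbuser $3 -dbuserpwd $4"
    case "$mode" in
        full)  printf '%s\n' "$cmd" ;;
        delta) printf '%s -fullbuild false\n' "$cmd" ;;   # delta build adds the flag
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

Remember that the delta preprocessing step still has to run before the delta buildindex; this sketch only covers the buildindex side.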
3. How do I update/add a field in the index?
We store the files for configuring the index fields in the conf directory of the particular core the field is in:
For example, to update a field in the CatalogEntry core like shortDescription, the files would be located in:
The main configuration files for updating a field in the index are schema.xml and wc-data-config.xml.
In schema.xml, we store the definitions of the fields in the index, and their type:
<field name="shortDescription" type="wc_text" indexed="true" stored="true" multiValued="false"/>
In wc-data-config.xml, we have the queries that retrieve the data from the temporary tables created during preprocessing and assign the values to the corresponding fields in the index:
<field column="SHORTDESCRIPTION" name="shortDescription" />
You can update these files to add a new field to the index. We have a walk-through on the Knowledge Center for adding a new field to the search index (longDescription in this example): http://www-01.ibm.com/support/knowledgecenter/SSZLC2_7.0.0/com.ibm.commerce.developer.doc/tasks/tsdsearchcustguideex2.htm?lang=en
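After editing the two files, a quick sanity check is to confirm that the field name appears in both of them. The sketch below assumes the field appears as a name="..." attribute on a <field> element, as in the two snippets above; field_is_wired is a hypothetical helper name, and the file paths are whatever your core's conf directory contains.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if index field $1 is declared in schema file $2
# and mapped in data-config file $3; otherwise report the gap and return 1.
field_is_wired() {
    field=$1
    schema=$2
    dataconfig=$3
    # Match a <field ...> element carrying name="<field>" anywhere in its attributes.
    pattern="<field [^>]*name=\"$field\""
    if ! grep -q "$pattern" "$schema"; then
        echo "$field: not defined in $schema" >&2
        return 1
    fi
    if ! grep -q "$pattern" "$dataconfig"; then
        echo "$field: not mapped in $dataconfig" >&2
        return 1
    fi
}
```

For example, field_is_wired shortDescription schema.xml wc-data-config.xml succeeds only when both files mention the field. It is a text match, not an XML parse, so it will not notice a field that is commented out.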
You should collect the data in the following MustGather: http://www-01.ibm.com/support/docview.wss?uid=swg21675890
Once you have collected this data, you should first look into wc-dataimport-buildindex.log to verify the type of issue you are having with buildindex. For example, if it is a connection/timeout related issue, then you will have to adjust your timeout parameters or verify that the correct hostname is being used for the search server (stored in SRCHCONF and SRCHCONFEXT). Otherwise, if there is an error thrown when processing a certain core, then you will need to investigate the issue from the search server side by reviewing the search server's trace.log.
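As a rough first pass over wc-dataimport-buildindex.log, you can sort the log into the two cases described above. This is only a sketch: the search patterns (timed out, connection refused, SEVERE, Exception) are our assumptions about what such entries typically look like, not an official list, and triage_buildindex_log is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: classify a buildindex log as a connection/timeout issue,
# some other error, or no error found. The patterns below are assumptions.
triage_buildindex_log() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    if grep -Eiq 'timed out|timeout|connection refused|unknownhost' "$log"; then
        echo "connection"       # check timeouts and the host in SRCHCONF/SRCHCONFEXT
    elif grep -Eq 'SEVERE|Exception' "$log"; then
        echo "other-error"      # continue in the search server's trace.log
    else
        echo "no-error-found"
    fi
}
```

Treat the result as a hint for where to look next rather than a diagnosis, since an unrelated log line containing the word "timeout" would also be classified as a connection issue.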
Aug 12, 2014 6:15:17 AM com.ibm.commerce.foundation.dataimport.process.DataImportProcessorMain fullDataImport
From the Solr side, you can enable the trace string *=info: org.apache.*=all: com.ibm.commerce.foundation.*=all to see exactly what processing is being done during the slow period.
6. What types of changes can we cover using delta indexing? What types of changes require full indexing?
This depends on the feature pack that your environment is on, as well as the type of update you are making to the index. The higher the feature pack, the more types of changes can be handled with delta indexing alone. You can review the following Knowledge Center page for a list of update types for the search index, and whether each requires full or delta indexing: http://www-01.ibm.com/support/knowledgecenter/SSZLC2_7.0.0/com.ibm.commerce.developer.doc/refs/rsdsearchindexhints.htm?lang=en
7. What are index subtypes? How can I index only these index subtypes?
Index subtypes refer to the division made between the cores under an index. We split the cores into index subtypes to group related cores together. For example, here are some index subtypes with their corresponding cores:
If you only plan to update data relevant to a specific index subtype, you can update just that data in the index by adding the -indexSubType <subtype> parameter to the preprocess and buildindex scripts. Note that you must specify this parameter for both the preprocess and buildindex scripts in this scenario, so that you don't get data inconsistency issues.
8. How can I index web content if I have a remote search server?
The indexer acts as a service to the web content crawler. After each crawl completes, the web content crawler directly invokes a request to the WebSphere Commerce search server with the specific URL. The indexing process then starts asynchronously. The typical URL resembles the following sample URL:
However, if you have a remote search server, this won't work, as the files for the manifest, as well as the web content itself, are on the Commerce machine. To be able to index web content with a remote search server, follow the instructions on this Knowledge Center page: http://www-01.ibm.com/support/knowledgecenter/SSZLC2_7.0.0/com.ibm.commerce.developer.doc/concepts/csdmanagesearchremotecrawl.htm?lang=en
9. Do I need to run preprocess/buildindex if I use the UpdateSearchIndex scheduled job?
No, that is not necessary, since the UpdateSearchIndex scheduled job automates the indexing process by scheduling indexing to run at specific times. However, you can run preprocess/buildindex manually after making changes, so you don't need to wait until UpdateSearchIndex runs again to have those changes added to the index. Behind the scenes, UpdateSearchIndex essentially performs preprocess/buildindex in a single process. In the end, an index update from UpdateSearchIndex is equivalent to one from preprocess/buildindex, so you can use either approach for updating the index, or a mix of both. For example, you can schedule hourly UpdateSearchIndex runs, while running preprocess/buildindex manually to trigger immediate updates after making a change. For more information about configuring UpdateSearchIndex, you can review the following Knowledge Center page: http://www-01.ibm.com/support/knowledgecenter/SSZLC2_7.0.0/com.ibm.commerce.admin.doc/tasks/tsdschedsearchupdateindex.htm?lang=en