Learning to Speak Commerce Search
The terminology in Commerce Search is vast and can be counterintuitive, because the terms used to describe certain parts are regularly mixed up. I will explain many of the common and regularly misunderstood terms so that you can become a native Commerce Search speaker in no time!
Row/Document: A set of data that describes a particular catalog object. For example, each row or document in the CatalogEntry core corresponds to a specific catalog entry.
Field: Rows or documents in the core are made up of fields, which hold specific information about the catalog object. For example, the name field holds the category name in a row (or document) from the CatalogGroup core.
Core: A Solr index that contains Solr documents for a specific purpose. For example, here are some of the out-of-the-box cores that you may see used in your environment:
Index: The index is made up of all of the search cores associated with a master catalog. Here are some example indexes that you could have in your environment:
Full indexing: Rebuilding the entire index from scratch using the manual indexing utilities (di-preprocess/di-buildindex), UpdateSearchIndex scheduled job, etc.
Delta indexing: Updating the current index with the changes captured in TI_DELTA_CATENTRY using the manual indexing utilities (di-preprocess/di-buildindex), UpdateSearchIndex scheduled job, etc.
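To make the difference between full and delta indexing concrete, here is a minimal sketch that models an index as a plain Python dictionary. The function names, the ("U" or "D", id) change format, and the sample data are all assumptions for illustration; they do not reflect the actual layout of TI_DELTA_CATENTRY or how the indexing utilities work internally.

```python
def full_index(catalog):
    """Rebuild the whole index from the catalog, discarding any old index."""
    return {entry_id: dict(fields) for entry_id, fields in catalog.items()}


def delta_index(index, catalog, changes):
    """Apply only the recorded changes to an existing index.

    `changes` is a hypothetical list of (action, entry_id) pairs, where
    action is "U" (added or updated) or "D" (deleted).
    """
    for action, entry_id in changes:
        if action == "D":
            index.pop(entry_id, None)
        else:
            index[entry_id] = dict(catalog[entry_id])
    return index


catalog = {10001: {"name": "Shirt"}, 10002: {"name": "Coffee"}}
index = full_index(catalog)

# Later, one entry is renamed and one is removed; only those changes are applied.
catalog[10001]["name"] = "T-shirt"
del catalog[10002]
index = delta_index(index, catalog, [("U", 10001), ("D", 10002)])
print(index)  # {10001: {'name': 'T-shirt'}}
```

The point of the sketch is that the delta path touches only the entries named in the change list, while the full path walks the entire catalog.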
Parallel preprocessing and distributed indexing (sharding): A process for parallelizing indexing by breaking up the index into shards and indexing each shard at the same time (in separate threads). For example, if you had a core containing 2 million catalog entries, rather than sequentially indexing all of the catalog entries in a single thread, we can use parallel preprocessing and distributed indexing to split indexing across multiple shards. If we have 10 shards, each shard can index and store 200,000 catalog entries at the same time.
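The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the idea: the round-robin split and the helper names split_into_shards and index_shard are my own assumptions, and the real utilities decide how catalog entries are assigned to shards.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(entry_ids, shard_count):
    """Assign entries to shards round-robin so shard sizes stay balanced."""
    return [entry_ids[i::shard_count] for i in range(shard_count)]


def index_shard(shard):
    """Stand-in for indexing one shard; returns how many entries it handled."""
    return len(shard)


entry_ids = range(2_000_000)           # 2 million catalog entries
shards = split_into_shards(entry_ids, 10)

# Each shard is handed to its own worker thread.
with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    indexed_counts = list(pool.map(index_shard, shards))

print(indexed_counts[0], sum(indexed_counts))  # 200000 2000000
```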
Crawler: Commerce utility for crawling unmanaged content (e.g. HTML files) so that it can be indexed into the unstructured index.
Extension index: A core that extends the CatalogEntry core to store specific data for the catalog entries. For example, the inventory index extends the CatalogEntry core to store inventory information for each catalog entry. Since this information is separated into a different core, you can rebuild this small core often and quickly to keep inventory counts up to date, while indexing your potentially large CatalogEntry core only once a day.
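As a rough picture of why an extension index helps, the sketch below keeps catalog data and inventory data in two separate dictionaries keyed by the same catalog entry ID. The dictionary names, the quantity field, and both helper functions are hypothetical; this is not how Solr joins the cores, just the shape of the idea.

```python
# Large core, rebuilt rarely (for example once a day).
catalog_entry_core = {
    10001: {"name": "Shirt"},
    10002: {"name": "Mug"},
}

# Small extension core, cheap to rebuild often.
inventory_core = {10001: {"quantity": 5}, 10002: {"quantity": 0}}


def rebuild_inventory_core(stock_levels):
    """Rebuild only the small inventory core from fresh stock levels."""
    return {entry_id: {"quantity": qty} for entry_id, qty in stock_levels.items()}


def lookup(entry_id):
    """Combine the catalog entry with its inventory extension data."""
    return {**catalog_entry_core[entry_id], **inventory_core.get(entry_id, {})}


# New stock arrives: only the inventory core is rebuilt.
inventory_core = rebuild_inventory_core({10001: 4, 10002: 12})
print(lookup(10002))  # {'name': 'Mug', 'quantity': 12}
```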
Search Runtime Terminology
Deep search sequencing: Sorting products for category navigation using the product's sequence value as well as the sequence value of its parent category.
Shallow sequencing: Sorting products for category navigation using the product's sequence value.
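Here is a small sketch of how the two sequencing modes could produce different orderings. How the two sequence values are actually combined is an assumption on my part: I sort by the parent category's sequence first and the product's own sequence second. The field names and sample values are made up.

```python
products = [
    {"name": "Mug", "sequence": 2.0, "category_sequence": 1.0},
    {"name": "Shirt", "sequence": 1.0, "category_sequence": 2.0},
    {"name": "Hat", "sequence": 3.0, "category_sequence": 1.0},
]


def shallow_order(items):
    """Sort by the product's own sequence value only."""
    return sorted(items, key=lambda p: p["sequence"])


def deep_order(items):
    """Sort by parent category sequence, then product sequence (assumed rule)."""
    return sorted(items, key=lambda p: (p["category_sequence"], p["sequence"]))


print([p["name"] for p in shallow_order(products)])  # ['Shirt', 'Mug', 'Hat']
print([p["name"] for p in deep_order(products)])     # ['Mug', 'Hat', 'Shirt']
```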
Search profile: Abstraction of a specific search scenario, defined in wc-search.xml. The search profile will contain the fields being searched on, expression providers and query preprocessors/postprocessors to use, and other relevant information. For example, performing a search for products and retrieving a specific category will return different information and require searching for different data, so we should be using different search profiles for these scenarios. As a result, IBM_findCategoryByIdentifier is a search profile that can be used for retrieving category information based on a specific catgroup_id while IBM_findProductsBySearchTerm is a search profile that can be used to retrieve product information based on a search term.
Expression Provider: Used to modify the control parameters available for the search request. For example, if you want to override the sort being used for the search request, you can use an expression provider to modify the _wcf.search.sort control parameter. Expression providers allow modifications to control parameter values before being read by query preprocessors and added to the query.
Query Preprocessor: Used to modify the query before it is processed by Commerce Search. For example, if you want to filter on catalog entries that have a manufacturer name, you can use a query preprocessor to add a query parameter like fq=mfName:*. A query preprocessor can also use the control parameters provided for the search request to add data to the query (e.g. add a sort parameter based on the value in the _wcf.search.sort control parameter).
Query Postprocessor: Used to modify the query results before they are returned as the search response. For example, a query postprocessor can be used to add additional products to the search response based on a particular condition (e.g. if a specific manufacturer exists in the search results).
Autosuggest: Type-ahead functionality used in the search bar to complete your currently typed phrase with possible matches. For example, "shir" can match on "shirt".
Spellcheck: Used when a search returns 0 results (or only a few, depending on your configuration) to figure out what the intended search was. For example, searching for "cofe" returns 0 results, but the spellcheck functionality suspects that you meant to search for "coffee" (which has many more matches), so this will be returned in the "Did you mean..." section of the page.
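The order in which expression providers, query preprocessors, and query postprocessors run is easier to see in code. The sketch below uses plain Python functions and dictionaries as stand-ins; the real components are classes registered against a search profile in wc-search.xml, and every function name, the "name asc" sort value, and the sample result data here are assumptions for illustration only.

```python
def sort_expression_provider(control_params):
    """Override the sort control parameter before preprocessors read it."""
    updated = dict(control_params)
    updated["_wcf.search.sort"] = "name asc"  # hypothetical sort value
    return updated


def manufacturer_preprocessor(query, control_params):
    """Add a filter query and copy the sort from the control parameters."""
    updated = dict(query)
    updated["fq"] = updated.get("fq", []) + ["mfName:*"]
    if "_wcf.search.sort" in control_params:
        updated["sort"] = control_params["_wcf.search.sort"]
    return updated


def featured_postprocessor(results, featured):
    """Append an extra product when a given manufacturer is in the results."""
    if any(r["mfName"] == "Coffee King" for r in results):
        return results + [featured]
    return results


# 1. Expression provider adjusts the control parameters.
control = sort_expression_provider({"_wcf.search.sort": "original sort"})
# 2. Query preprocessor builds on them to change the query.
query = manufacturer_preprocessor({"q": "coffee"}, control)
print(query)  # {'q': 'coffee', 'fq': ['mfName:*'], 'sort': 'name asc'}

# 3. Query postprocessor changes the results (a canned result list here).
results = [{"name": "Dark roast", "mfName": "Coffee King"}]
extra = {"name": "Coffee grinder", "mfName": "Grind Co"}
print(len(featured_postprocessor(results, extra)))  # 2
```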
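To show how autosuggest and spellcheck differ, here is a toy version of each. Autosuggest is sketched as a prefix match and spellcheck as a closest-match lookup using Python's standard difflib module. This is a swapped-in technique for illustration, not the Solr components that Commerce Search actually uses, and the term list and function names are assumptions.

```python
import difflib

INDEXED_TERMS = ["coffee", "coffee maker", "shirt", "shorts"]


def autosuggest(prefix, terms=INDEXED_TERMS):
    """Return indexed terms that start with what the user has typed so far."""
    prefix = prefix.lower()
    return [term for term in terms if term.startswith(prefix)]


def did_you_mean(term, terms=INDEXED_TERMS, min_results=1):
    """Suggest a close indexed term when the search has too few matches."""
    exact_matches = [t for t in terms if t == term]
    if len(exact_matches) >= min_results:
        return None  # enough results, no suggestion needed
    close = difflib.get_close_matches(term, terms, n=1)
    return close[0] if close else None


print(autosuggest("shir"))   # ['shirt']
print(did_you_mean("cofe"))  # coffee
```

The min_results argument mirrors the "0 or only a few" threshold described above: raising it makes the suggestion kick in for searches that return some, but not many, results.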
Facets: Filters for reducing the search results to make them more relevant to the user's expectations. For example, a size facet can be used to only display search results that are available in a particular size.
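A facet does two jobs: it shows the shopper which values are available (with counts), and it filters the results once a value is picked. A minimal sketch of both, using made-up result data and hypothetical function names:

```python
from collections import Counter

results = [
    {"name": "Polo shirt", "size": "Large"},
    {"name": "T-shirt", "size": "Large"},
    {"name": "Dress shirt", "size": "Small"},
]


def facet_counts(items, field):
    """Count how many results carry each value of the facet field."""
    return Counter(item[field] for item in items)


def apply_facet(items, field, value):
    """Keep only the results whose facet field matches the selected value."""
    return [item for item in items if item[field] == value]


print(facet_counts(results, "size"))
# Counter({'Large': 2, 'Small': 1})
print([r["name"] for r in apply_facet(results, "size", "Large")])
# ['Polo shirt', 'T-shirt']
```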
Descriptive attribute: Used to describe a catalog entry. For example, we can assign a t-shirt a descriptive attribute like material, with a value of cotton. It can be used as a facet if the attribute is considered facetable.
Defining attribute: Used to define a characteristic for a catalog entry. For example, we can assign a t-shirt a defining attribute like size, with a value of Large. It can be used as a facet if the attribute is considered facetable.
Search rule: Used to influence the ordering and/or contents of a search based on specific triggers. For example, if a user searches for coffee, you can boost the relevancy of products made by manufacturer Coffee King.
Search term association: Used to modify or add search terms in the search query, or redirect the user to a specific page. Synonyms are used to add words to the search phrase (if X is searched for, also search for Y). Replacements are used to replace words in the search phrase (if X is searched for, instead search for Y). Landing Pages are used to direct the user to a specific page if a specific search term is in the search phrase (if X is searched for, redirect the user to page Y).
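The three kinds of search term association can be sketched as three lookup tables. The sample terms, the landing page path, and the order in which the three are checked (landing pages first, then replacements, then synonyms) are my assumptions for the sake of a runnable example.

```python
SYNONYMS = {"sofa": ["couch"]}                 # if X is searched, also search Y
REPLACEMENTS = {"tee": "t-shirt"}              # if X is searched, search Y instead
LANDING_PAGES = {"returns": "/returns-policy"}  # if X is searched, redirect to Y


def apply_search_term_associations(phrase):
    """Return either a redirect or the final list of terms to search for."""
    words = phrase.lower().split()

    for word in words:
        if word in LANDING_PAGES:
            return {"redirect": LANDING_PAGES[word]}

    terms = []
    for word in words:
        word = REPLACEMENTS.get(word, word)
        terms.append(word)
        terms.extend(SYNONYMS.get(word, []))
    return {"terms": terms}


print(apply_search_term_associations("red sofa"))  # {'terms': ['red', 'sofa', 'couch']}
print(apply_search_term_associations("tee"))       # {'terms': ['t-shirt']}
print(apply_search_term_associations("returns"))   # {'redirect': '/returns-policy'}
```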
Search result grouping: Used to search across groups of catalog entries, returning the group representative if there is a match on any results in the group. By default, the group representatives are products, and each group is made up of a product and its associated items. With this, you can search against the product and its items, and return the product for display if there is a match on the product or any of its items.
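Here is a minimal sketch of the grouping idea: the search runs against every item, but only the parent product of each matching item is returned, and each product is returned once. The data layout, field names, and the simple substring match are assumptions; real grouping is handled by the search engine.

```python
items = [
    {"sku": "SHIRT-S", "product": "Shirt", "text": "small cotton shirt"},
    {"sku": "SHIRT-L", "product": "Shirt", "text": "large cotton shirt"},
    {"sku": "MUG-1", "product": "Mug", "text": "ceramic coffee mug"},
]


def grouped_search(term, entries):
    """Return each group's representative product once if any item matches."""
    representatives = []
    for entry in entries:
        if term in entry["text"] and entry["product"] not in representatives:
            representatives.append(entry["product"])
    return representatives


print(grouped_search("large", items))   # ['Shirt']  (only one item matched)
print(grouped_search("cotton", items))  # ['Shirt']  (two items, one product)
```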
Search Architecture Terminology
REST-based search: Search requests are sent to the Search server as a REST URL. Most of the search scenario is processed on the Search server itself, and the search results are returned as a JSON response.
BOD-based search: Search requests are sent to the Commerce server as a BOD request (constructed through XML). Most of the search scenario is processed on the Commerce server, with the search query being sent over to the Search server for processing, and the response is returned back to the Commerce server as a BOD response (constructed through XML).
Standard configuration: Search server is deployed locally to the Commerce server.
Advanced configuration: Search server is federated and clustered, managed by the deployment manager (DMGR).
Managed configuration: Search server is federated and clustered, as in the advanced configuration. Solr template files (master, repeater, and subordinate) are managed by the deployment manager (DMGR), allowing these configuration files to be modified in one location and pushed across all the corresponding nodes.
If there are any other terms that you hear regularly when working with Commerce Search but aren't sure what they refer to, feel free to post a comment below and we can figure it out.