IBM Support

Workitem Quick Search in IBM Rational Team Concert

Question & Answer


Question

How does the Workitem Quick Search perform a search in IBM Rational Team Concert?

Cause

You want to know how the search is performed to be able to understand the expected results.

Answer

The Workitem Quick Search is implemented with Lucene, a full-text search engine library. The algorithm used by IBM Rational Team Concert (RTC) is described below. This entry applies to the 3.x product stream.

In standard mode, RTC generates the search requests that are submitted to Lucene.

Step one:
The text that was entered is tokenized. Each token typically maps to a word, but not always.

For instance, CamelCase words are broken up (resulting in 'camel case'). Common words (such as 'a') are stripped out. Mixed languages, alphanumerics, and other cases receive special handling. Wildcard tokens are treated as a separate case from normal text.
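
The tokenizing step can be pictured with a minimal sketch. This is not the analyzer that RTC or Lucene actually uses; the function name `tokenize` and the stop-word list are assumptions made for illustration, and the special handling for mixed languages and alphanumerics is left out.

```python
import re

# Hypothetical stop-word list; the list used by the product is not given here.
STOP_WORDS = {"a", "an", "the", "of", "and"}

def tokenize(text):
    """Split text into lowercase tokens, breaking CamelCase and dropping stop words."""
    tokens = []
    # Keep a trailing '*' attached so wildcard tokens can be handled separately later.
    for word in re.findall(r"[A-Za-z0-9]+\*?", text):
        # Insert a space at each lower-to-upper boundary: 'CamelCase' -> 'Camel Case'
        parts = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", word).split()
        for part in parts:
            part = part.lower()
            if part not in STOP_WORDS:
                tokens.append(part)
    return tokens

print(tokenize("a CamelCase word"))  # ['camel', 'case', 'word']
```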

Step two:
For a work item search, a query is constructed based on the field types in the work item. When more than one word or token is entered, every one of them must match for a work item to be returned.

For example, the following query is generated when a user types in 'de'.

+((_name:de^2.0 | _content:de | _tags:de* | _meta:de)) +((_artifactType:com.ibm.team.workitem.WorkItem _containerType:com.ibm.team.workitem.WorkItem)^0.0)

Therefore, the query will:

  • Check whether 'de' appears as a word in workitem fields of type _name, and weight such a match twice as heavily as a match elsewhere (the 2.0 boost in the query above),
  • Look in the workitem fields marked as _tags, and in this context, match 'de*' (any tag starting with 'de') as well as 'de',
  • Look in fields marked as _content and _meta for 'de' at the default weight. The 0.0 boost applies to the final clause, which restricts the results to work items and does not contribute to the score.

Each searchable workitem field belongs to one of the four field types in the query above. The fields are grouped as follows:
_name ==> Summary
_content ==> Content, Comments
_tags ==> Tags
_meta ==> Creator, Target, Approvals Descriptor, Found In, Owner, Id
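
As an illustration, the query string shown above for 'de' can be assembled from a list of tokens as follows. This is a sketch, not RTC source code: the helper names are hypothetical, and emitting one required clause per token is an assumption based on the statement that every token must match.

```python
WORK_ITEM_TYPE = "com.ibm.team.workitem.WorkItem"

def token_clause(token):
    """Required clause for one token across the four searchable field types."""
    return (
        f"+((_name:{token}^2.0 | _content:{token} "
        f"| _tags:{token}* | _meta:{token}))"
    )

def type_clause():
    """Required clause limiting hits to work items; boost 0.0 adds nothing to the score."""
    return (
        f"+((_artifactType:{WORK_ITEM_TYPE} "
        f"_containerType:{WORK_ITEM_TYPE})^0.0)"
    )

def build_query(tokens):
    # Assumption: each token gets its own required (+) clause.
    return " ".join([token_clause(t) for t in tokens] + [type_clause()])

print(build_query(["de"]))
```

For the single token 'de', the output is the same string as the generated query quoted earlier in this entry.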

The search hit criteria are based on Lucene. Even for a normal token search (no wildcards), Lucene uses a complex, statistically based algorithm. There are two aspects to the search process:
  1. Building the index of text to search, and
  2. Searching for the text.

Lucene searches that do not include wildcards compare the search tokens that the user enters to the tokens in the index. The indexed tokens are created in more or less the same way as the search tokens. Each match is weighted according to the Lucene Scoring Algorithm. Normal, non-wildcard searches should not return partial matches unless one of the filters breaks the word at an earlier point.
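
The difference between whole-token and wildcard matching can be sketched as below. The function `matches` and the sample index tokens are hypothetical, and the sketch ignores scoring and the analysis filters entirely; it only shows why a partial word returns nothing without a trailing '*'.

```python
def matches(query_token, indexed_tokens):
    """Return the indexed tokens hit by one query token."""
    if query_token.endswith("*"):
        # Wildcard search: any indexed token starting with the prefix is a hit.
        prefix = query_token[:-1]
        return sorted(t for t in indexed_tokens if t.startswith(prefix))
    # Normal search: only a whole-token match is a hit.
    return sorted(t for t in indexed_tokens if t == query_token)

index = {"define", "team", "members"}  # hypothetical indexed tokens
print(matches("mem", index))   # []
print(matches("mem*", index))  # ['members']
print(matches("team", index))  # ['team']
```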

Let us use the following test cases and results to demonstrate this further.

Test Cases and Results:
Create a new project area from the Scrum process. By default, it also creates the following work items in the new project area:
  • Define permissions
  • Define team members
  • Define categories and releases for work items
  • Define sprints/iterations
  • Share code with Jazz Source Control
  • Define a new build

Test Case #1:
In the project area, a search for 'iter' returns the following workitem, which contains 'iterations' in its name. The search appears to behave like the wildcard search 'iter*'.
  • Define sprints/iterations

Test Case #2:
If you search for 'ite', the query returns no work items. However, if you add * to the search keyword ('ite*'), it returns the following workitems:
  • Define sprints/iterations
  • Define categories and releases for work items

The following parameters are used when searching for 'iter' and 'ite':

[Figure: search parameters for 'iter']
[Figure: search parameters for 'ite']

As mentioned earlier, non-wildcard searches should not return partial matches unless one of the filters breaks the word at an earlier point. In the case of the word 'iteration', the English stemming filter reduces the word to its stem and adds 'iter' to the index.

You can see this behavior by entering 'iter', which matches the stemmed form of 'iteration' in the index. If you enter 'itera', it does not match, because the search function cannot stem 'itera' to 'iter'. The same is true for 'iterat'. However, if you enter 'iterate', a word that does not appear in the workitem at all, the workitem is returned. In this case, 'iterate' is stemmed to 'iter' (dropping 'ate'), and 'iter' matches 'iter' in the index. If 'iteration' is entered in the Tags field of a workitem, then 'itera' does match, because the Tags field is searched with a wildcard every time.
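
The test case results can be reproduced with a toy model. The suffix list below is a deliberately crude stand-in for the English stemming filter, which is far more elaborate; the names `stem` and `search` and the sample words are assumptions for illustration only.

```python
# Toy suffix stripper standing in for the English stemming filter (assumption).
SUFFIXES = ("ations", "ation", "ate", "s")

def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def search(query, index):
    """Wildcard queries match by prefix; normal queries are stemmed, then matched whole."""
    if query.endswith("*"):
        prefix = query[:-1]
        return sorted(t for t in index if t.startswith(prefix))
    stemmed = stem(query)
    return sorted(t for t in index if t == stemmed)

# Words taken from the two work item summaries returned in Test Case #2.
index = {stem(w) for w in ["define", "sprints", "iterations", "work", "items"]}

print(search("iter", index))     # ['iter']
print(search("ite", index))      # []
print(search("ite*", index))     # ['item', 'iter']
print(search("itera", index))    # []
print(search("iterate", index))  # ['iter']
```

In this model, 'ite*' hits both 'iter' (from 'iterations') and 'item' (from 'items'), which is consistent with the two workitems returned in Test Case #2.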

Leverage the Jazz Community

Jazz and Rational Team Concert have an active community that can provide you with additional resources. Browse and contribute to the User forums, contribute to the Team Blog and review the Team wiki.
Refer to technote 1319600 for details and links.

[{"Product":{"code":"SSUC3U","label":"IBM Engineering Workflow Management"},"Business Unit":{"code":"BU055","label":"Cognitive Applications"},"Component":"Repository","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF016","label":"Linux"},{"code":"PF033","label":"Windows"},{"code":"PF027","label":"Solaris"}],"Version":"3.0.1;3.0.1.1;3.0.1.2","Edition":"","Line of Business":{"code":"LOB02","label":"AI Applications"}}]

Product Synonym

Rational Team Concert

Document Information

Modified date:
16 June 2018

UID

swg21586008