Configuring name similarity

Configure the name similarity feature by tuning parameters that govern how the system processes multiple resources when performing pattern matching.

About this task

Pattern matching enables Event Analytics to identify types of events that tend to occur together on a specific network resource. The name similarity feature extends pattern matching by enabling it to identify types of events that tend to occur together on more than one resource, where the resources within the pattern have a similar name. For examples of similar resource names that might be discovered by the name similarity feature, see Examples of name similarity.

Depending on how name similarity is configured, pattern matching treats such resource names as similar and creates a single pattern that includes events from all of them.

Similarity threshold value: Name similarity is determined in three stages.

1. A third-party algorithm calculates an edit distance. The edit distance is the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters.
2. The algorithm calculates a normalized similarity distance in the range 0.0 to 1.0, where 0.0 means that the strings are identical and 1.0 means that the strings are completely different. This distance is calculated by using a contribution of the edit distance weighted according to the first string length, the second string length, and the number of transpositions.
3. The name similarity algorithm calculates a normalized threshold value, also in the range 0.0 to 1.0, by subtracting the normalized similarity distance from 1.0.

A threshold value of 0.0 means that strings can be completely different. A threshold value of 1.0 means that strings must match exactly.
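
The calculation described above can be illustrated with a minimal Python sketch. The edit distance here is a standard optimal string alignment distance, which counts the same four operations. The normalization is an assumption made for illustration only: it divides the edit distance by the length of the longer string, whereas the product's third-party algorithm uses its own weighting that is not specified here. The function names `edit_distance` and `similarity` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: counts insertions, deletions,
    substitutions, and transpositions of two adjacent characters."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # transposition of two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str) -> float:
    """Return 1.0 minus a normalized distance.

    ASSUMPTION: the distance is normalized by the longer string length.
    This is not the product's formula, only an illustrative stand-in.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - edit_distance(a, b) / longest
```

Under this simplified normalization, `similarity("router01", "router02")` is 0.875, because one substitution is spread over eight characters. Longer names that differ by a single character score closer to 1.0, which is why a threshold such as 0.9 tends to group long, nearly identical resource names.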

By default, name similarity is configured so that only very similarly named resources are grouped together: resources whose names are 90% similar and that have the same first character. This behavior is controlled by the name_similarity_default_threshold and name_similarity_default_lead_restriction parameters, which are described in Table 1.

These configuration settings should work effectively in most environments. To change them, use the following procedure.

Note: Only change name similarity settings if you understand the underlying algorithm.

Procedure

Generate a properties file containing the latest Event Analytics system settings.
1. Navigate to the directory $IMPACT_HOME/bin.
2. Run the following command to generate a properties file containing the latest Event Analytics system settings.
```
nci_trigger server_name username/password NOI_DefaultValues_Export FILENAME directory/filename
```
  Where:
  - server_name is the name of the server where Event Analytics is installed.
  - username is the user name of the Event Analytics user.
  - password is the password of the Event Analytics user.
  - NOI_DefaultValues_Export is a Netcool®/Impact policy that performs an export of the current Event Analytics system settings to a designated properties file.
  - directory is the directory where the properties file is stored.
  - filename is the name of the properties file.
  For example:
```
nci_trigger NCI impactadmin/impactpass NOI_DefaultValues_Export FILENAME /tmp/properties.props
```
Edit the properties file that you generated in the previous step.
For example:
```
vi /tmp/properties.props
```

Find the section of the properties file that reads as follows:

This code snippet shows the default values of the name similarity parameters.

```
##################################################################################
# The following properties are used to configure Name Similarity NS             ##
##################################################################################

name_similarity_feature_enable=true
name_similarity_default_pattern_enable=false
name_similarity_default_threshold=0.9
name_similarity_default_lead_restriction=1
name_similarity_default_tail_restriction=0
```
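
Before you edit the file, it can help to extract the name similarity settings and check them. The following Python sketch is an illustration only and is not part of the product. It assumes simple `key=value` lines with `#` comments, as in the snippet above. The function names and the rule that restriction values must not be negative are assumptions.

```python
SAMPLE = """\
# The following properties are used to configure Name Similarity NS
name_similarity_feature_enable=true
name_similarity_default_pattern_enable=false
name_similarity_default_threshold=0.9
name_similarity_default_lead_restriction=1
name_similarity_default_tail_restriction=0
"""


def read_name_similarity_settings(text: str) -> dict:
    """Collect name_similarity_* key=value pairs from properties text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, value = line.split("=", 1)
        if key.strip().startswith("name_similarity_"):
            settings[key.strip()] = value.strip()
    return settings


def check_settings(settings: dict) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if "name_similarity_default_threshold" in settings:
        threshold = float(settings["name_similarity_default_threshold"])
        if not 0.0 <= threshold <= 1.0:
            problems.append("threshold must be between 0 and 1 inclusive")
    for key in ("name_similarity_default_lead_restriction",
                "name_similarity_default_tail_restriction"):
        # ASSUMPTION: a negative character count is treated as invalid.
        if key in settings and int(settings[key]) < 0:
            problems.append(key + " must not be negative")
    for key in ("name_similarity_feature_enable",
                "name_similarity_default_pattern_enable"):
        if key in settings and settings[key] not in ("true", "false"):
            problems.append(key + " must be true or false")
    return problems
```

For the default values shown above, `check_settings(read_name_similarity_settings(SAMPLE))` returns an empty list.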

Update one or more of the name similarity settings.

The following table describes each of these settings.

Table 1. Name similarity settings
| Parameter | Description | Values |
| --- | --- | --- |
| `name_similarity_feature_enable` | Boolean that switches the name similarity feature on or off. Note: This is a global flag that governs all name similarity functionality. For example, if you set this flag to `false`, then no aspect of name similarity is enabled, and none of the other flags in this table have any effect. | Possible values: `true`: Name similarity is switched on. `false`: Name similarity is switched off. Default value: `true` |
| `name_similarity_default_pattern_enable` | Boolean that specifies whether to apply name similarity processing to historical patterns, meaning patterns that were created before name similarity was introduced into the Netcool Operations Insight® solution. Name similarity was introduced into Netcool Operations Insight in V1.5.0, which corresponds to Netcool/Impact fix pack 14. | Possible values: `true`: Apply name similarity processing to historical patterns. `false`: Do not apply name similarity processing to historical patterns. Default value: `false` |
| `name_similarity_default_threshold` | String comparison threshold value, where `0` equates to completely dissimilar strings, and `1` equates to identical strings. The value specified in this parameter is used to determine whether two strings are similar. Note: The string similarity test is also governed by the lead and tail restriction parameters described in the following rows. | Possible values: `0` to `1` inclusive. Default value: `0.9`, that is, resource names must be 90% similar. |
| `name_similarity_default_lead_restriction` | Number of characters at the beginning of the strings being compared that must be identical. Important: If this number of characters is not identical, then the strings automatically fail the similarity test. | Default value: `1`, that is, resource names must start with the same character. Note: This default setting assumes that the front end of the strings being compared is usually different. |
| `name_similarity_default_tail_restriction` | Number of characters at the end of the strings being compared that must be identical. Important: If this number of characters is not identical, then the strings automatically fail the similarity test. | Default value: `0`. Note: This default setting assumes that the tail end of the strings being compared is usually the same; for example "`.com`". |
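
The lead and tail restrictions act as a gate before the threshold comparison. The following Python sketch shows one way to read that rule; it is not the product's implementation. The function name `passes_restrictions` is hypothetical, and the handling of names shorter than the restriction length is an assumption.

```python
def passes_restrictions(a: str, b: str, lead: int = 1, tail: int = 0) -> bool:
    """Return True if the first `lead` and last `tail` characters match.

    ASSUMPTION: if a name is shorter than the restriction length, the
    available characters are compared as-is, so names of different
    short lengths fail the test.
    """
    if lead > 0 and a[:lead] != b[:lead]:
        return False  # leading characters differ: automatic failure
    if tail > 0 and a[-tail:] != b[-tail:]:
        return False  # trailing characters differ: automatic failure
    return True


# With the defaults (lead=1, tail=0) only the first character is checked.
print(passes_restrictions("router01", "router02"))            # True
print(passes_restrictions("router01", "xouter01"))            # False
# With tail=4, names must also end with the same four characters.
print(passes_restrictions("host1.com", "host2.net", 1, 4))    # False
```

Only names that pass this check would go on to be compared against the value of `name_similarity_default_threshold`.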

Import the modified properties file into Event Analytics.
1. Ensure you are in the directory $IMPACT_HOME/bin.
2. Run the following command to perform an import of Event Analytics system settings from a designated properties file.
```
nci_trigger server_name username/password NOI_DefaultValues_Configure FILENAME directory/filename
```
  Where:
  - server_name is the name of the server where Event Analytics is installed.
  - username is the user name of the Event Analytics user.
  - password is the password of the Event Analytics user.
  - NOI_DefaultValues_Configure is a Netcool/Impact policy that performs an import of Event Analytics system settings from a designated properties file.
  - directory is the directory where the properties file is stored.
  - filename is the name of the properties file.
  For example:
```
nci_trigger NCI impactadmin/impactpass NOI_DefaultValues_Configure FILENAME /tmp/properties.props
```
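
The export and import commands differ only in the policy name, so a script that automates this procedure could build both from one helper. The following Python sketch only assembles the argument list in the form shown in this topic and does not run anything. The helper name `nci_trigger_args` is hypothetical, and the example values are the ones used above.

```python
def nci_trigger_args(server: str, user: str, password: str,
                     policy: str, properties_path: str) -> list:
    """Build the nci_trigger argument list in the form shown in this topic.

    The result is intended to be run from $IMPACT_HOME/bin, for example
    with subprocess.run(args, check=True); that call is not made here.
    """
    return [
        "nci_trigger",
        server,
        user + "/" + password,  # credentials are passed as user/password
        policy,
        "FILENAME",
        properties_path,
    ]


export_args = nci_trigger_args(
    "NCI", "impactadmin", "impactpass",
    "NOI_DefaultValues_Export", "/tmp/properties.props")
import_args = nci_trigger_args(
    "NCI", "impactadmin", "impactpass",
    "NOI_DefaultValues_Configure", "/tmp/properties.props")
print(" ".join(export_args))
print(" ".join(import_args))
```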