IBM InfoSphere Global Name Recognition, Version 4.2

NameHunter overview

The NameHunter® programming library includes functions and classes that enable developers to add enhanced personal and organizational name searching to a new or existing application.

The NameHunter APIs give your application the ability to support user requests such as "Give me the 10 closest names to JAMES SLESINGER from my name list", "Show me all names in a database that match JOHN WONG with similarity of 90% or more", or "Tell me the degree of similarity between PAUL VANESANN and P. VANLESANN".

NameHunter uses integrated linguistic, probabilistic and string-similarity techniques to achieve search results well beyond those delivered by standard string-similarity metrics such as edit-distance, or name-grouping algorithms such as standard Soundex or NYSIIS.
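For comparison, the kind of standard string-similarity baseline mentioned above can be sketched in a few lines. The following is a minimal illustration of classic Levenshtein edit distance, not NameHunter's algorithm and not part of its API. The function names `editDistance` and `similarityPercent`, and the way the percentage is derived, are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Classic Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, and substitutions needed to turn `a` into `b`.
// Illustrative baseline only; this is not how NameHunter scores names.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1);
    std::vector<std::size_t> curr(b.size() + 1);

    // Distance from the empty prefix of `a` to each prefix of `b`.
    for (std::size_t j = 0; j <= b.size(); ++j) {
        prev[j] = j;
    }

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // delete the first i characters of `a`
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical helper: maps the distance to a 0-100 score by dividing by the
// length of the longer string. The scaling is an assumption for this sketch.
int similarityPercent(const std::string& a, const std::string& b) {
    const std::size_t longest = std::max(a.size(), b.size());
    if (longest == 0) {
        return 100;  // two empty strings are treated as identical
    }
    const std::size_t distance = editDistance(a, b);
    return static_cast<int>(100 * (longest - distance) / longest);
}
```

A purely character-based metric like this treats every edit the same way. It has no notion that "P." can stand for "PAUL", or that two spellings are pronounced alike, which is the gap that the linguistic and probabilistic techniques described above are meant to close.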

The NameHunter libraries are written in standard C++, so they can be integrated into any C++ application and used on any platform that supports a C++ compiler. NameHunter was designed for simplicity, ease of integration, maximum run-time flexibility, and extensibility.

Note: NameHunter can process a maximum of six tokens per name field. All tokens after the first six tokens are ignored.
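Because tokens after the sixth are ignored, an application might want to detect long names before submitting them. The following is a minimal sketch of such a pre-check. It assumes that tokens are separated by whitespace, which may differ from how NameHunter tokenizes names, and the names `kMaxTokensPerField`, `firstTokens`, and `wouldBeTruncated` are hypothetical helpers, not part of the NameHunter API.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Limit taken from the note above: six tokens per name field.
constexpr std::size_t kMaxTokensPerField = 6;

// Returns at most `maxTokens` whitespace-separated tokens from a name field.
// Whitespace splitting is an assumption made for this sketch.
std::vector<std::string> firstTokens(const std::string& nameField,
                                     std::size_t maxTokens = kMaxTokensPerField) {
    std::vector<std::string> tokens;
    std::istringstream in(nameField);
    std::string token;
    while (tokens.size() < maxTokens && in >> token) {
        tokens.push_back(token);
    }
    return tokens;
}

// Returns true if the field holds more tokens than the limit, meaning that
// the trailing tokens would not take part in a comparison.
bool wouldBeTruncated(const std::string& nameField) {
    std::istringstream in(nameField);
    std::string token;
    std::size_t count = 0;
    while (in >> token) {
        if (++count > kMaxTokensPerField) {
            return true;
        }
    }
    return false;
}
```

A check like `wouldBeTruncated` lets the calling application log or flag long organization names instead of silently losing their trailing tokens.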
  • Culture-specific and configurable NameHunter searches
    Each NameHunter search can be configured individually by adjusting run-time comparison parameters.
  • Titles, affixes, and qualifier (TAQ) data
    IBM provides a list of multi-cultural titles, affixes, and qualifiers (TAQs) that are used in name matching. TAQs are not discarded during a name comparison because, depending on the search parameters, they can contribute to the overall name score.
  • Name token variants
    A name token variant is an alternative form of a specified name that is considered equivalent to that name but differs from it in its external form. Variants usually arise from spelling variations. NameHunter provides functions to load files of name token variants.
  • Terms
    Terms are name tokens that refer to a concept. They are most commonly used for organization names.
  • Name regularization
    Name regularization is a feature that, if enabled, generates normalized versions of name tokens. These normalized spellings enable NameHunter to identify names that are widely understood to be related (usually on the basis of highly similar or equivalent pronunciations), even though their spellings are quite different.
  • Integrating the NameHunter API in applications
    To integrate the NameHunter API into your applications, use the NameHunter.h header file, which contains a complete definition of the API.
  • Linking to other data
    NameHunter stores and searches name data. Most systems also store and use other kinds of data, for example addresses, account numbers, physical attributes, and images. For this reason, NameHunter provides an ID field for each entry in the SearchList class, so that a name can be linked to its related data.
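The ID-based linking described in the last topic can be pictured with a small application-side lookup table. The following is a minimal sketch that makes no NameHunter calls. `CustomerRecord` and `RecordStore` are hypothetical application types, and the use of a string as the ID type is an assumption, because the ID field's type is not described here.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical application data that lives outside the name search list.
struct CustomerRecord {
    std::string address;
    std::string accountNumber;
};

// Hypothetical application-side store, keyed by the same ID that the
// application assigns to each name entry. String IDs are an assumption.
class RecordStore {
public:
    void add(const std::string& id, CustomerRecord record) {
        records_[id] = std::move(record);
    }

    // Returns a pointer to the record for `id`, or nullptr if none exists.
    const CustomerRecord* find(const std::string& id) const {
        const auto it = records_.find(id);
        return it == records_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, CustomerRecord> records_;
};
```

In this arrangement, the application gives each name entry the same ID that keys its record in the store. When a search returns a matching entry, the application reads the entry's ID and calls `find` to retrieve the address, account number, or other data that belongs to that name.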

