When it comes to providing secure access for speech applications, a wide array of options is available in the market today, from Q&A sessions conducted by live agents to more automated solutions based on touch-tone PINs or password matching with traditional speech recognition technology. All of these solutions are prone to fraud, because they authenticate on information that anyone who obtains it can supply.
The ability to authenticate someone's identity based on his or her voice is referred to as speaker verification. It significantly reduces the risk of unauthorized access, since the authentication mechanism is based on the unique features of someone's voiceprint (much as a fingerprint is unique to a person's touch).
IBM's WebSphere Voice Server integrates with WebSphere Application Server and provides voice authentication. Speaker verification in WebSphere Voice Server takes speech as input and matches it against an enrolled voiceprint for authentication. It adds an extra layer of protection to sensitive information. For example, if a user's account ID and password are stolen, the system would detect the imposter when he or she tries to get in pretending to be someone else.
IBM's speaker verification technology provides a grammar-, language-, and text-independent authentication mechanism. You can enroll by saying anything, in any language, and have the system verify you. Some of the benefits of the speaker verification feature of WebSphere Voice Server include:
- Language independence
  - One speaker verification engine can handle all languages. A user can enroll in one language and be verified in another.
- Text independence
  - A user can say anything and is not bound by a grammar.
- Speaker tracking
  - The system can monitor calls to ensure that the verified speaker answered all prompts.
The speaker verification component for WebSphere Voice Server must be tuned before it can be fully utilized. Tuning a speaker verification environment involves enrolling multiple voices, collecting target and non-target score values, and generating a statistical picture of the speaker verification environment in order to predict the score value. This article provides a sample speaker verification tuning application to facilitate this work.
A score represents how closely a particular audio sample matched a given enrolled voiceprint. Scores are expressed as numbers between -1 and +1, with -1 being the least similar (mismatch) and +1 being the most similar (match). A target score reflects how an audio file matched its own speaker's voiceprint. A non-target score reflects how an audio file scored against a voiceprint belonging to a different speaker. To use speaker verification properly, you must know what acceptance threshold to apply when accepting or rejecting a claimant, and determining that threshold requires a statistical model. Generating this model requires processing many enrollments and audio score values.
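The distinction between target and non-target scores can be illustrated with a minimal Python sketch. The `Trial` record, the `split_scores` helper, and the speaker names and score values are all hypothetical, chosen only for illustration; they are not part of WebSphere Voice Server or the tuning package.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    """One scoring attempt (hypothetical record layout, for illustration only)."""
    audio_speaker: str       # who actually spoke in the audio file
    voiceprint_speaker: str  # whose enrolled voiceprint the audio was scored against
    score: float             # similarity between -1 (mismatch) and +1 (match)


def split_scores(trials):
    """Separate scores into target (same speaker) and non-target (different speaker)."""
    target, non_target = [], []
    for t in trials:
        if not -1.0 <= t.score <= 1.0:
            raise ValueError(f"score out of range: {t.score}")
        if t.audio_speaker == t.voiceprint_speaker:
            target.append(t.score)
        else:
            non_target.append(t.score)
    return target, non_target


# Made-up example data: two speakers, each scored against both voiceprints.
trials = [
    Trial("alice", "alice", 0.82),
    Trial("alice", "bob", -0.41),
    Trial("bob", "bob", 0.67),
    Trial("bob", "alice", -0.15),
]
target, non_target = split_scores(trials)
```

With enough trials, the two resulting lists form the distributions from which an acceptance threshold can be estimated.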
That is where this speaker verification tuning package comes in. The tuning package consists of three parts:
- An enterprise application that can be installed on a WebSphere Application Server machine to record audio files needed for tuning.
- A stand-alone application to submit all collected audio files and generate target and non-target score reports.
- A stand-alone application to analyze the target and non-target scores and calculate the acceptance threshold for your speaker verification environment.
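The article does not describe how the analysis application derives the acceptance threshold. As a rough sketch of one common approach, the Python below sweeps candidate thresholds and picks the one where the false-reject rate (target scores below the threshold) and the false-accept rate (non-target scores at or above it) are closest, which approximates the equal error rate point. The function names and sample scores are assumptions made for illustration, not the tuning package's actual algorithm or output.

```python
def error_rates(target, non_target, threshold):
    """Return (false_reject_rate, false_accept_rate) when accepting scores >= threshold."""
    frr = sum(s < threshold for s in target) / len(target)
    far = sum(s >= threshold for s in non_target) / len(non_target)
    return frr, far


def pick_threshold(target, non_target):
    """Pick the observed score at which FRR and FAR are closest to each other."""
    candidates = sorted(set(target) | set(non_target))

    def gap(threshold):
        frr, far = error_rates(target, non_target, threshold)
        return abs(frr - far)

    return min(candidates, key=gap)


# Made-up scores: target scores should cluster high, non-target scores low.
target_scores = [0.9, 0.7, 0.6, 0.2]
non_target_scores = [-0.8, -0.3, 0.1, 0.4]

threshold = pick_threshold(target_scores, non_target_scores)
frr, far = error_rates(target_scores, non_target_scores, threshold)
print(f"threshold={threshold} FRR={frr} FAR={far}")
```

Balancing the two error rates is only one possible criterion. A deployment that protects sensitive data might prefer a higher threshold that trades more false rejections for fewer false acceptances.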
Installing and configuring the tuning package
The tuning package must be installed on the same machine as the WVS Feature Pack for speaker verification. Download one of the packages below and extract its contents to $WSV_ROOT/tuning. Follow the instructions in $WSV_ROOT/tuning/Readme.htm to install and configure the tuning package.
For the speaker verification engine to perform properly, it must be tuned to accommodate the specific characteristics of your target deployment environment. The samples supplied with this article facilitate that activity and help you determine the acceptance threshold best suited to your environment.
| Name | Size | Download method |
|---|---|---|
| tuningpackage.tar | 390 KB | HTTP |
| tuningpackage.zip | 311 KB | HTTP |