Tweak your audio model for better speech recognition

Fine-tune audio input for accuracy with these strategies and tools

From the developerWorks archives

Colin Beckingham

Date archived: January 4, 2017 | First published: June 12, 2012

Dealing with an inadequately prepared audio model can be frustrating, particularly for beginners in speech recognition who are working with their own speaker-dependent models. Keyboard and mouse input is relatively unambiguous and easily interpreted by the operating system; audio input to a speech recognizer is far less definite, and its accuracy depends heavily on the breadth and depth of the audio model. Programmers can make recognition errors easier to analyze by building tools for the job. A reasonable goal is to move from five errors in ten to fewer than one in a thousand. Find out how, using tools built with Python and PostgreSQL.
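To make the error-rate goal concrete, here is a minimal sketch of the kind of tally such a tool might produce. It is not taken from the article: the function name `error_report`, the sample phrases, and the case-insensitive comparison are all assumptions for illustration, and the data is held in memory rather than in PostgreSQL to keep the example self-contained.

```python
from collections import Counter


def error_report(trials):
    """Summarize recognition trials.

    trials: iterable of (expected, recognized) string pairs.
    Returns (error_rate, [(expected_phrase, miss_count), ...]),
    with the most frequently missed phrases first.
    """
    trials = list(trials)
    # Count a miss whenever the recognizer's output differs from the
    # phrase that was actually spoken (ignoring case and outer spaces).
    misses = Counter(
        expected
        for expected, recognized in trials
        if expected.strip().lower() != recognized.strip().lower()
    )
    total = len(trials)
    errors = sum(misses.values())
    rate = errors / total if total else 0.0
    return rate, misses.most_common()


# Hypothetical sample data: what was said vs. what was recognized.
sample = [
    ("open file", "open file"),
    ("close window", "clothes window"),
    ("save", "save"),
    ("close window", "close window"),
    ("close window", "clothes window"),
    ("undo", "and do"),
]

rate, worst = error_report(sample)
print(f"error rate: {rate:.2f}")
for phrase, count in worst:
    print(f"{count} x {phrase}")
```

With this sample, three of six trials fail, which matches the "five errors in ten" starting point. Listing the most frequently missed phrases shows where additional training audio is likely to help most.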

This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some steps and illustrations may have changed.
