
IBM Research AI at INTERSPEECH 2019


The 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019) took place in Graz, Austria, earlier this month. IBM Research AI was proud to support the conference as a silver sponsor and to share our latest research results in 17 papers presented at the conference (listed below).

Many of IBM’s clients need to provide customer care over a wide variety of channels, including e-mail, text chat, and spoken interactions over the telephone. They must give their customers a uniformly good experience across these different channels and extract actionable insights from these interactions. To meet these goals, we need to keep improving the underlying speech technologies: speech-to-text, text-to-speech, and spoken language understanding.

IBM serves a wide range of clients operating in many different industries, each with its own unique terminology and requirements. To support such a broad set of speech applications, we focus on two goals:

  1. building strong base models that give our clients good “out-of-the-box” performance, and
  2. supplying tools that empower our clients to customize models for their own use cases.

As researchers, we also pursue fundamental, exploratory work that may not go into products or services in the near term.

Our INTERSPEECH 2019 papers reflect these short- and long-term goals and show the depth and diversity of our speech research. We invite you to read the papers that interest you and to reach out to the authors if you would like to learn more.

“Identifying Mood Episodes Using Dialogue Features from Clinical Interviews,” Z. Aldeneh, M. Jaiswal, M. Picheny, M. McInnis, and E. Mower Provost

“Forget a Bit to Learn Better: Soft Forgetting for CTC-based Automatic Speech Recognition,” K. Audhkhasi, G. Saon, Z. Tüske, B. Kingsbury, and M. Picheny

“Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text,” M. K. Baskar, S. Watanabe, R. Astudillo, T. Hori, L. Burget, and J. Černocký

“Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition,” X. Cui and M. Picheny

“Direct Neuron-wise Fusion of Cognate Neural Networks,” T. Fukuda, M. Suzuki, and G. Kurata

“Adversarial Black-Box Attacks on Automatic Speech Recognition Systems Using Multi-Objective Evolutionary Optimization,” S. Khare, R. Aralikatte, and S. Mani

“High quality, lightweight and adaptable TTS using LPCNet,” Z. Kons, S. Shechtman, A. Sorin, C. Rabinovitz, and R. Hoory

“Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation,” G. Kurata and K. Audhkhasi

“Multi-task CTC Training with Auxiliary Feature Reconstruction for End-to-end Speech Recognition,” G. Kurata and K. Audhkhasi

“Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition,” K. C. Mac, X. Cui, W. Zhang, and M. Picheny

“Challenging the Boundaries of Speech Recognition: The MALACH Corpus,” M. Picheny, Z. Tüske, B. Kingsbury, K. Audhkhasi, X. Cui, and G. Saon

“A New Approach for Automating Analysis of Responses on Verbal Fluency Tests from Subjects At-Risk for Schizophrenia,” M. Pietrowicz, C. Agurto, R. Norel, E. Eyigoz, G. Cecchi, Z. Bilgrami, and C. Corcoran

“Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks,” L. Sarı, S. Thomas, and M. Hasegawa-Johnson

“Detection and Recovery of OOVs for Improved English Broadcast News Captioning,” S. Thomas, K. Audhkhasi, Z. Tüske, Y. Huang, and M. Picheny

“Advancing sequence-to-sequence based speech recognition,” Z. Tüske, K. Audhkhasi, and G. Saon

“Few-Shot Audio Classification with Attentional Graph Neural Networks,” S. Zhang, Y. Qin, K. Sun, and Y. Lin

“A Highly-Efficient Distributed Deep Learning System For Automatic Speech Recognition,” W. Zhang, X. Cui, U. Finkler, G. Saon, A. Kayi, A. Buyuktosunoglu, B. Kingsbury, D. Kung, and M. Picheny


Distinguished Research Staff Member, IBM Research

Ron Hoory

Senior Technical Staff Member, IBM Research

Gakuto Kurata

Senior Technical Staff Member, IBM Research-Tokyo
