IBM Research AI at INTERSPEECH 2019

The 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019) took place in Graz, Austria, earlier this month. IBM Research AI was proud to support the conference as a silver sponsor and to share our latest research results in the 17 papers listed below.

Many of IBM’s clients need to provide customer care over a wide variety of channels, including e-mail, text chat, and spoken interactions over the telephone. Our clients must provide their customers with a uniformly good experience across these different channels and extract actionable insights from these interactions. To meet these goals, we need to improve the underlying speech technologies: speech-to-text, text-to-speech, and spoken language understanding.

IBM serves a wide range of clients operating in many different industries, each with its own unique terminology and requirements. In order to support a broad set of speech applications, we focus on two goals:

  1. building strong base models that give our clients good “out-of-the-box” performance, and
  2. supplying tools that empower our clients to customize models for their own use cases.

Because we are researchers, we also pursue fundamental, exploratory work that may not go into products or services in the near term.

Our INTERSPEECH 2019 papers reflect our short- and long-term goals and show the depth and diversity of our speech research. We invite you to read the papers that interest you and to reach out to the authors if you want to learn more.

“Identifying Mood Episodes Using Dialogue Features from Clinical Interviews,” Z. Aldeneh, M. Jaiswal, M. Picheny, M. McInnis, and E. Mower Provost

“Forget a Bit to Learn Better: Soft Forgetting for CTC-based Automatic Speech Recognition,” K. Audhkhasi, G. Saon, Z. Tüske, B. Kingsbury, and M. Picheny

“Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text,” M. K. Baskar, S. Watanabe, R. Astudillo, T. Hori, L. Burget, and J. Černocký

“Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition,” X. Cui and M. Picheny

“Direct Neuron-wise Fusion of Cognate Neural Networks,” T. Fukuda, M. Suzuki, and G. Kurata

“Adversarial Black-Box Attacks on Automatic Speech Recognition Systems Using Multi-Objective Evolutionary Optimization,” S. Khare, R. Aralikatte, and S. Mani

“High quality, lightweight and adaptable TTS using LPCNet,” Z. Kons, S. Shechtman, A. Sorin, C. Rabinovitz, and R. Hoory

“Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation,” G. Kurata and K. Audhkhasi

“Multi-task CTC Training with Auxiliary Feature Reconstruction for End-to-end Speech Recognition,” G. Kurata and K. Audhkhasi

“Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition,” K. C. Mac, X. Cui, W. Zhang, and M. Picheny

“Challenging the Boundaries of Speech Recognition: The MALACH Corpus,” M. Picheny, Z. Tüske, B. Kingsbury, K. Audhkhasi, X. Cui, and G. Saon

“A New Approach for Automating Analysis of Responses on Verbal Fluency Tests from Subjects At-Risk for Schizophrenia,” M. Pietrowicz, C. Agurto, R. Norel, E. Eyigoz, G. Cecchi, Z. Bilgrami, and C. Corcoran

“Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks,” L. Sarı, S. Thomas, and M. Hasegawa-Johnson

“Detection and Recovery of OOVs for Improved English Broadcast News Captioning,” S. Thomas, K. Audhkhasi, Z. Tüske, Y. Huang, and M. Picheny

“Advancing sequence-to-sequence based speech recognition,” Z. Tüske, K. Audhkhasi, and G. Saon

“Few-Shot Audio Classification with Attentional Graph Neural Networks,” S. Zhang, Y. Qin, K. Sun, and Y. Lin

“A Highly-Efficient Distributed Deep Learning System For Automatic Speech Recognition,” W. Zhang, X. Cui, U. Finkler, G. Saon, A. Kayi, A. Buyuktosunoglu, B. Kingsbury, D. Kung, and M. Picheny

 

Distinguished Research Staff Member, IBM Research

Ron Hoory

Senior Technical Staff Member, IBM Research

Gakuto Kurata

Senior Technical Staff Member, IBM Research-Tokyo
