IBM Research AI at ICASSP 2020


The 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020) is taking place virtually from May 4-8. IBM Research AI is pleased to support the conference as a bronze patron and to share our latest research results, described in 17 papers that will be presented at the conference (listed below).

These papers span a range of disciplines, including automatic speech recognition, training of large deep learning models, spoken language understanding, speaker diarization and speaker change detection, and multimodal machine learning. They are the result of work done at four of IBM’s global research laboratories — in Haifa, Israel; São Paulo; Tokyo; and Yorktown Heights, New York — as well as of collaborations with our academic partners at the University of Illinois at Urbana-Champaign, the University of Thessaly, Waseda University, the University of Tokyo, and the University of Iowa.

“Fast training of deep neural networks for speech recognition,” Guojing Cong, Brian Kingsbury, Chih-Chieh Yang, and Tianyi Liu, Wednesday, Speech Recognition: Acoustic Modelling I (Poster)

“Speaker embeddings incorporating acoustic conditions for diarization,” Yosuke Higuchi, Masayuki Suzuki, and Gakuto Kurata, Speaker Diarization and Characterization (Poster)

“Converting written language to spoken language with neural machine translation for language modeling,” Shintaro Ando, Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, and Nobuaki Minematsu, Language Understanding and Modeling (Poster)

“Training spoken language understanding systems with non-parallel speech and text,” Leda Sarı, Samuel Thomas, and Mark Hasegawa-Johnson, Language Understanding and Modeling (Poster)

“Improving efficiency in large-scale decentralized distributed training,” Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, and Michael Picheny, Industry Session on Large-Scale Distributed Learning Strategies (Oral)

“Context and Uncertainty Modeling for Speaker Change Detection,” Hagai Aronowitz and Weizhong Zhu, COLL-L2: Session 3R: Robustness, Reproducibility, Replicability (Oral)

“Audio-assisted image inpainting for talking faces,” Alexandros Koumparoulis, Gerasimos Potamianos, Samuel Thomas, and Edmilson da Silva Morais, Machine Learning for Speech Synthesis III (Poster)

“Leveraging unpaired text data for training end-to-end speech-to-intent systems,” Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, and Michael Picheny, Spoken Language Understanding and Dialogue II (Oral)

“Alignment-length synchronous decoding for RNN transducer,” George Saon, Zoltán Tüske, and Kartik Audhkhasi, Large Vocabulary Continuous Speech Recognition and Search (Poster)

“Towards an Efficient and General Framework of Robust Training for Graph Neural Networks,” Kaidi Xu, Sijia Liu, Pin-Yu Chen, Mengshu Sun, Caiwen Ding, Bhavya Kailkhura, and Xue Lin, A Signal-Processing View of Graph Neural Networks (Oral)

“Preservation of Anomalous Subgroups on Variational Autoencoder Transformed Data,” Samuel C. Maina, Reginald E. Bryant, William Ogallo, Kush R. Varshney, Skyler Speakman, Celia Cintas, Aisha Walcott-Bryant, and Robert-Florian Samoilescu, Adversarial Attacks and Fast Algorithms (Poster)

“Learn-By-Calibrating: Using Calibration as a Training Objective,” Jayaraman J. Thiagarajan, Bindya Venkatesh, and Deepta Rajan, Adversarial Attacks and Fast Algorithms (Poster)

“Characterizing Adversarial Speech Examples Using Self-Attention U-Net Enhancement,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Chin-Hui Lee, Adversarial Machine Learning (Oral)

“Decentralized Stochastic Non-convex Optimization over Weakly Connected Time-varying Digraphs,” Songtao Lu and Chai Wah Wu, Optimization Techniques II (Poster)

“Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data,” Takayuki Katsuki, Kun Zhao, and Takayuki Yoshizumi, Signal Processing for Emerging Industry Applications (Oral)

“ADVMS: A Multi-source Multi-cost Defense against Adversarial Attacks,” Xiao Wang, Siyue Wang, Pin-Yu Chen, Xue Lin, and Peter Chin, Anonymization, Security and Privacy (Poster)

“Enhanced Adversarial Strategically-Timed Attacks against Deep Reinforcement Learning,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Yi Ouyang, Chin-Hui Lee, and Xiaoli Ma, Sequential Learning (Oral)

“Investigating Generalization in Neural Networks Under Optimally Evolved Training Perturbations,” Subhajit Chaudhury and Toshihiko Yamasaki, Deep Learning Techniques (Poster)



Senior Technical Staff Member, IBM Research-Tokyo

Brian Kingsbury

Distinguished Research Staff Member, IBM Research

Ron Hoory

Senior Technical Staff Member, IBM Research
