IBM Research AI at ICASSP 2020

The 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020) is taking place virtually from May 4 to 8. IBM Research AI is pleased to support the conference as a bronze patron and to share our latest research results, described in 17 papers that will be presented at the conference (listed below).

These papers span a range of disciplines, including automatic speech recognition, training of large deep learning models, spoken language understanding, speaker diarization and speaker change detection, and multimodal machine learning. They are the result of work done at four of IBM’s global research laboratories (Haifa, Israel; São Paulo; Tokyo; and Yorktown Heights, New York) and of collaborations with our academic partners at the University of Illinois at Urbana-Champaign, the University of Thessaly, Waseda University, the University of Tokyo, and the University of Iowa.

“Fast training of deep neural networks for speech recognition,” Guojing Cong, Brian Kingsbury, Chih-Chieh Yang, and Tianyi Liu, Wednesday, Speech Recognition: Acoustic Modelling I (Poster)

“Speaker embeddings incorporating acoustic conditions for diarization,” Yosuke Higuchi, Masayuki Suzuki, and Gakuto Kurata, Speaker Diarization and Characterization (Poster)

“Converting written language to spoken language with neural machine translation for language modeling,” Shintaro Ando, Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, and Nobuaki Minematsu, Language Understanding and Modeling (Poster)

“Training spoken language understanding systems with non-parallel speech and text,” Leda Sarı, Samuel Thomas, and Mark Hasegawa-Johnson, Language Understanding and Modeling (Poster)

“Improving efficiency in large-scale decentralized distributed training,” Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, and Michael Picheny, Industry Session on Large-Scale Distributed Learning Strategies (Oral)

“Context and Uncertainty Modeling for Speaker Change Detection,” Hagai Aronowitz and Weizhong Zhu, COLL-L2: Session 3R: Robustness Reproducibility Replicability (Oral)

“Audio-assisted image inpainting for talking faces,” Alexandros Koumparoulis, Gerasimos Potamianos, Samuel Thomas, and Edmilson da Silva Morais, Machine Learning for Speech Synthesis III (Poster)

“Leveraging unpaired text data for training end-to-end speech-to-intent systems,” Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, and Michael Picheny, Spoken Language Understanding and Dialogue II (Oral)

“Alignment-length synchronous decoding for RNN transducer,” George Saon, Zoltán Tüske, and Kartik Audhkhasi, Large Vocabulary Continuous Speech Recognition and Search (Poster)

“Towards an Efficient and General Framework of Robust Training for Graph Neural Networks,” Kaidi Xu, Sijia Liu, Pin-Yu Chen, Mengshu Sun, Caiwen Ding, Bhavya Kailkhura, and Xue Lin, A Signal-Processing View of Graph Neural Networks (Oral)

“Preservation of Anomalous Subgroups on Variational Autoencoder Transformed Data,” Samuel C. Maina, Reginald E. Bryant, William Ogallo, Kush R. Varshney, Skyler Speakman, Celia Cintas, Aisha Walcott-Bryant, and Robert-Florian Samoilescu, Adversarial Attacks and Fast Algorithms (Poster)

“Learn-By-Calibrating: Using Calibration as a Training Objective,” Jayaraman J. Thiagarajan, Bindya Venkatesh, and Deepta Rajan, Adversarial Attacks and Fast Algorithms (Poster)

“Characterizing Adversarial Speech Examples Using Self-Attention U-Net Enhancement,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Chin-Hui Lee, Adversarial Machine Learning (Oral)

“Decentralized Stochastic Non-convex Optimization over Weakly Connected Time-varying Digraphs,” Songtao Lu and Chai Wah Wu, Optimization Techniques II (Poster)

“Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data,” Takayuki Katsuki, Kun Zhao, and Takayuki Yoshizumi, Signal Processing for Emerging Industry Applications (Oral)

“ADVMS: A Multi-source Multi-cost Defense against Adversarial Attacks,” Xiao Wang, Siyue Wang, Pin-Yu Chen, Xue Lin, and Peter Chin, Anonymization, Security and Privacy (Poster)

“Enhanced Adversarial Strategically-Timed Attacks against Deep Reinforcement Learning,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Yi Ouyang, Chin-Hui Lee, and Xiaoli Ma, Sequential Learning (Oral)

“Investigating Generalization in Neural Networks Under Optimally Evolved Training Perturbations,” Subhajit Chaudhury and Toshihiko Yamasaki, Deep Learning Techniques (Poster)



Senior Technical Staff Member, IBM Research-Tokyo

Brian Kingsbury

Distinguished Research Staff Member, IBM Research

Ron Hoory

Senior Technical Staff Member, IBM Research
