IBM Research AI at ICASSP 2020


The 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020) is taking place virtually from May 4-8. IBM Research AI is pleased to support the conference as a bronze patron and to share our latest research results, described in 17 papers that will be presented at the conference (listed below).

These papers span a range of disciplines, including automatic speech recognition, training of large deep learning models, spoken language understanding, speaker diarization and speaker change detection, and multimodal machine learning. They are the result of work done at four of IBM’s global research laboratories, in Haifa, Israel; São Paulo; Tokyo; and Yorktown Heights, New York, as well as collaborations with our academic partners at the University of Illinois at Urbana-Champaign, the University of Thessaly, Waseda University, the University of Tokyo, and the University of Iowa.

“Fast training of deep neural networks for speech recognition,” Guojing Cong, Brian Kingsbury, Chih-Chieh Yang, and Tianyi Liu, Wednesday, Speech Recognition: Acoustic Modelling I (Poster)

“Speaker embeddings incorporating acoustic conditions for diarization,” Yosuke Higuchi, Masayuki Suzuki, and Gakuto Kurata, Speaker Diarization and Characterization (Poster)

“Converting written language to spoken language with neural machine translation for language modeling,” Shintaro Ando, Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, and Nobuaki Minematsu, Language Understanding and Modeling (Poster)

“Training spoken language understanding systems with non-parallel speech and text,” Leda Sarı, Samuel Thomas, and Mark Hasegawa-Johnson, Language Understanding and Modeling (Poster)

“Improving efficiency in large-scale decentralized distributed training,” Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, and Michael Picheny, Industry Session on Large-Scale Distributed Learning Strategies (Oral)

“Context and Uncertainty Modeling for Speaker Change Detection,” Hagai Aronowitz and Weizhong Zhu, COLL-L2: Session 3R: Robustness, Reproducibility, Replicability (Oral)

“Audio-assisted image inpainting for talking faces,” Alexandros Koumparoulis, Gerasimos Potamianos, Samuel Thomas, and Edmilson da Silva Morais, Machine Learning for Speech Synthesis III (Poster)

“Leveraging unpaired text data for training end-to-end speech-to-intent systems,” Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, and Michael Picheny, Spoken Language Understanding and Dialogue II (Oral)

“Alignment-length synchronous decoding for RNN transducer,” George Saon, Zoltán Tüske, and Kartik Audhkhasi, Large Vocabulary Continuous Speech Recognition and Search (Poster)

“Towards an Efficient and General Framework of Robust Training for Graph Neural Networks,” Kaidi Xu, Sijia Liu, Pin-Yu Chen, Mengshu Sun, Caiwen Ding, Bhavya Kailkhura, and Xue Lin, A Signal-Processing View of Graph Neural Networks (Oral)

“Preservation of Anomalous Subgroups on Variational Autoencoder Transformed Data,” Samuel C. Maina, Reginald E. Bryant, William Ogallo, Kush R. Varshney, Skyler Speakman, Celia Cintas, Aisha Walcott-Bryant, and Robert-Florian Samoilescu, Adversarial Attacks and Fast Algorithms (Poster)

“Learn-By-Calibrating: Using Calibration as a Training Objective,” Jayaraman J. Thiagarajan, Bindya Venkatesh, and Deepta Rajan, Adversarial Attacks and Fast Algorithms (Poster)

“Characterizing Adversarial Speech Examples Using Self-Attention U-Net Enhancement,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Chin-Hui Lee, Adversarial Machine Learning (Oral)

“Decentralized Stochastic Non-convex Optimization over Weakly Connected Time-varying Digraphs,” Songtao Lu and Chai Wah Wu, Optimization Techniques II (Poster)

“Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data,” Takayuki Katsuki, Kun Zhao, and Takayuki Yoshizumi, Signal Processing for Emerging Industry Applications (Oral)

“ADVMS: A Multi-source Multi-cost Defense against Adversarial Attacks,” Xiao Wang, Siyue Wang, Pin-Yu Chen, Xue Lin, and Peter Chin, Anonymization, Security and Privacy (Poster)

“Enhanced Adversarial Strategically-Timed Attacks against Deep Reinforcement Learning,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Yi Ouyang, Chin-Hui Lee, and Xiaoli Ma, Sequential Learning (Oral)

“Investigating Generalization in Neural Networks Under Optimally Evolved Training Perturbations,” Subhajit Chaudhury and Toshihiko Yamasaki, Deep Learning Techniques (Poster)



Brian Kingsbury

Distinguished Research Staff Member, IBM Research

Ron Hoory

Senior Technical Staff Member, IBM Research
