
IBM Research AI at ICASSP 2020


The 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020) is taking place virtually from May 4-8. IBM Research AI is pleased to support the conference as a bronze patron and to share our latest research results, described in 17 papers that will be presented at the conference (listed below).

These papers span a range of disciplines, including automatic speech recognition, training of large deep learning models, spoken language understanding, speaker diarization and speaker change detection, and multimodal machine learning. They are the result of work done at four of IBM’s global research laboratories, in Haifa, Israel; São Paulo, Brazil; Tokyo, Japan; and Yorktown Heights, New York, as well as collaborations with our academic partners at the University of Illinois at Urbana-Champaign, the University of Thessaly, Waseda University, the University of Tokyo, and the University of Iowa.

“Fast training of deep neural networks for speech recognition,” Guojing Cong, Brian Kingsbury, Chih-Chieh Yang, and Tianyi Liu, Wednesday, Speech Recognition: Acoustic Modelling I (Poster)

“Speaker embeddings incorporating acoustic conditions for diarization,” Yosuke Higuchi, Masayuki Suzuki, and Gakuto Kurata, Speaker Diarization and Characterization (Poster)

“Converting written language to spoken language with neural machine translation for language modeling,” Shintaro Ando, Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, and Nobuaki Minematsu, Language Understanding and Modeling (Poster)

“Training spoken language understanding systems with non-parallel speech and text,” Leda Sarı, Samuel Thomas, and Mark Hasegawa-Johnson, Language Understanding and Modeling (Poster)

“Improving efficiency in large-scale decentralized distributed training,” Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, and Michael Picheny, Industry Session on Large-Scale Distributed Learning Strategies (Oral)

“Context and Uncertainty Modeling for Speaker Change Detection,” Hagai Aronowitz and Weizhong Zhu, COLL-L2: Session 3R: Robustness, Reproducibility, Replicability (Oral)

“Audio-assisted image inpainting for talking faces,” Alexandros Koumparoulis, Gerasimos Potamianos, Samuel Thomas, and Edmilson da Silva Morais, Machine Learning for Speech Synthesis III (Poster)

“Leveraging unpaired text data for training end-to-end speech-to-intent systems,” Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, and Michael Picheny, Spoken Language Understanding and Dialogue II (Oral)

“Alignment-length synchronous decoding for RNN transducer,” George Saon, Zoltán Tüske, and Kartik Audhkhasi, Large Vocabulary Continuous Speech Recognition and Search (Poster)

“Towards an Efficient and General Framework of Robust Training for Graph Neural Networks,” Kaidi Xu, Sijia Liu, Pin-Yu Chen, Mengshu Sun, Caiwen Ding, Bhavya Kailkhura, and Xue Lin, A Signal-Processing View of Graph Neural Networks (Oral)

“Preservation of Anomalous Subgroups on Variational Autoencoder Transformed Data,” Samuel C. Maina, Reginald E. Bryant, William Ogallo, Kush R. Varshney, Skyler Speakman, Celia Cintas, Aisha Walcott-Bryant, and Robert-Florian Samoilescu, Adversarial Attacks and Fast Algorithms (Poster)

“Learn-By-Calibrating: Using Calibration as a Training Objective,” Jayaraman J. Thiagarajan, Bindya Venkatesh, and Deepta Rajan, Adversarial Attacks and Fast Algorithms (Poster)

“Characterizing Adversarial Speech Examples Using Self-Attention U-Net Enhancement,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Chin-Hui Lee, Adversarial Machine Learning (Oral)

“Decentralized Stochastic Non-convex Optimization over Weakly Connected Time-varying Digraphs,” Songtao Lu and Chai Wah Wu, Optimization Techniques II (Poster)

“Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data,” Takayuki Katsuki, Kun Zhao, and Takayuki Yoshizumi, Signal Processing for Emerging Industry Applications (Oral)

“ADVMS: A Multi-source Multi-cost Defense against Adversarial Attacks,” Xiao Wang, Siyue Wang, Pin-Yu Chen, Xue Lin, and Peter Chin, Anonymization, Security and Privacy (Poster)

“Enhanced Adversarial Strategically-Timed Attacks against Deep Reinforcement Learning,” Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Yi Ouyang, Chin-Hui Lee, and Xiaoli Ma, Sequential Learning (Oral)

“Investigating Generalization in Neural Networks Under Optimally Evolved Training Perturbations,” Subhajit Chaudhury and Toshihiko Yamasaki, Deep Learning Techniques (Poster)

Senior Technical Staff Member, IBM Research-Tokyo

Brian Kingsbury

Distinguished Research Staff Member, IBM Research

Ron Hoory

Senior Technical Staff Member, IBM Research
