AI Year in Review: Highlights of Papers from IBM Research in 2019
January 17, 2020 | Written by: John R. Smith
IBM Research has a long history as a leader in the field of Artificial Intelligence (AI). IBM’s pioneering work in AI dates back to the field’s inception in the 1950s, when IBM developed one of the first instances of machine learning, which was applied to the game of checkers. Since then, IBM has been responsible for achieving major milestones in AI, ranging from Deep Blue – the first chess-playing computer to defeat a reigning world champion, to Watson – the first natural language question-answering system able to win at Jeopardy!, to last year’s Project Debater – the first AI system that can build persuasive arguments on its own and effectively engage in debates on complex topics.
IBM’s leadership in AI continued in earnest in 2019, which was notable for a growing focus on critical topics such as making trustworthy AI work in practice, creating new AI engineering paradigms to scale AI for a broader use, and continuing to advance core AI capabilities in language, speech, vision, knowledge & reasoning, human-centered AI, and more. While recent years have seen incredible progress in “narrow AI,” built on technologies like deep learning, IBM Research pushed its AI research in 2019 towards developing a new foundational underpinning of AI for enterprise applications by addressing important problems like learning more from less, enabling trusted AI by ensuring the fairness, explainability, adversarial robustness, and transparency of AI systems, and integrating learning and reasoning as a way to understand more in order to do more.
This shift towards “broad AI” as required for enterprise use cases — which in many ways reflects a maturing of the AI field — and a growing set of requirements stemming from putting AI into production drove a marked increase in our scientific papers in 2019. IBM Research saw a surge in paper counts across the top AI conferences – AAAI, ICLR, ICML, CVPR, ACL, KDD, IJCAI, and NeurIPS. Our active presence at top AI conferences was punctuated by prominent leadership in workshop organization, challenge task organization, invited talks, tutorials, and a continuous stream of demos of new and novel AI technologies. IBM Research organized numerous workshops on key emerging topics such as reasoning and learning for dialog systems, complex question answering, multi-modal learning, and deep learning on graphs. Our work was featured in keynote talks, including Project Debater at EMNLP-2019 and Advancing, Trusting, and Scaling AI at SenSys-2019, and IBM’s AI Ethics Global Leader Francesca Rossi was recognized with the prestigious IJCAI Distinguished Service Award. Our foundational work on AI will continue to benefit from deepening partnerships with leading academic institutions, such as MIT, as part of the MIT-IBM Watson AI Lab, and universities as part of the AI Horizons Network.
As we look forward to a continued strong focus in 2020 on further developing and advancing AI for enterprise applications, some of which are described in our 2020 AI predictions, we reflect back on some of our key topics and notable papers in 2019. The more complete set of AI papers is available on our IBM Research AI publications site.
Human-AI Collaboration in Data Science
Data science is at the heart of training machine learning models for AI tasks. There is a growing focus on increasing automation of data science, known as AutoAI, where tools and platforms for AI model development, such as IBM’s Watson Studio, are becoming increasingly capable of automatically ingesting and pre-processing data, engineering new features, conducting training — including hyperparameter search and neural architecture search — and scoring models based on target metrics. Given the early stage of AutoAI, IBM Research published an important study, “Human-AI Collaboration in Data Science: Exploring Data Scientists’ Perceptions of Automated AI,” at CSCW-2019 to understand how AutoAI will impact the practice of data science. Based on interviews with 20 practicing data scientists, the study reveals new insights on how AutoAI systems can be crafted with the goal of augmenting rather than automating the work and practices of data science teams.
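To make the AutoAI pipeline concrete, here is a minimal sketch of the kind of automated model selection such systems perform, written with scikit-learn. The candidate pipelines and the scoring metric are purely illustrative assumptions, not Watson Studio’s actual search strategy.

```python
# Minimal sketch of an AutoAI-style search loop: evaluate candidate
# preprocessing + model pipelines and keep the best by a target metric.
# Illustrative only -- not Watson Studio's actual search strategy.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# A tiny hand-rolled search space over preprocessors and estimators.
candidates = [
    Pipeline([("scale", StandardScaler()),
              ("clf", LogisticRegression(max_iter=1000))]),
    Pipeline([("scale", MinMaxScaler()),
              ("clf", LogisticRegression(max_iter=1000))]),
    Pipeline([("scale", StandardScaler()),
              ("clf", RandomForestClassifier(n_estimators=100))]),
]

# Score each pipeline by cross-validated accuracy and keep the winner.
best = max(candidates, key=lambda p: cross_val_score(p, X, y, cv=5).mean())
print(best)
```

A real AutoAI system replaces this enumerated list with learned search over feature transforms, hyperparameters, and architectures, but the select-by-metric loop is the same shape.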
Explainable AI
IBM Research developed a new comprehensive open source toolkit, called AI Explainability 360 (AIX360), that incorporates diverse state-of-the-art methods and evaluation metrics for explainability and interpretability. As described in “One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques,” AIX360 helps identify gaps in AI systems where explainability methods are needed and provides tools and a platform to incorporate them. The result is AI systems that allow users to gain better insight into the machine’s decision-making process. The AI Explainability 360 toolkit includes numerous algorithms for explainability published by IBM Research in 2019, including “Representation by Selecting Prototypes with Importance Weights” (IEEE ICDM-2019), “TED: Teaching AI to Explain Its Decisions” (AAAI/ACM AIES-2019), and “Generating Contrastive Explanations with Monotonic Attribute Functions” and “Generalized Linear Rule Models” (both ICML-2019).
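To give a flavor of prototype-based explanation, the sketch below greedily selects a handful of “prototype” examples that summarize a dataset, in the spirit of the prototype-selection work above. It is a deliberately simplified, self-contained illustration, not the toolkit’s actual ProtoDash algorithm or API.

```python
# Deliberately simplified sketch of prototype selection: greedily pick
# examples that are similar to the dataset as a whole (high mean kernel
# similarity) while penalizing redundancy with prototypes already chosen.
# Illustrative of the idea only -- not AIX360's ProtoDash algorithm.
import numpy as np

def rbf_kernel(X, gamma=0.5):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def select_prototypes(X, m=3, gamma=0.5):
    K = rbf_kernel(X, gamma)
    mu = K.mean(axis=1)        # similarity of each point to the whole set
    chosen = []
    for _ in range(m):
        gain = mu.copy()
        if chosen:
            gain -= K[:, chosen].mean(axis=1)   # redundancy penalty
        gain[chosen] = -np.inf                  # don't pick twice
        chosen.append(int(np.argmax(gain)))
    return chosen

X = np.random.default_rng(0).normal(size=(50, 2))
print("prototype indices:", select_prototypes(X))
```

The selected examples can then be shown to a user as “this decision resembles these known cases,” which is the kind of explanation the toolkit supports alongside contrastive and rule-based methods.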
Learning Causal Relationships from Data
Understanding causal relationships from data is important and challenging. For one, knowing causal relationships gives essential information on how certain changes produce certain effects. Causal modeling also provides insights that can improve the fairness of statistical inference involving outcome variables. IBM Research, together with Columbia University and Purdue University, developed a new approach for learning plausible causal graphs from observational and interventional data with latent variables. The paper, “Learning and Characterizing Causal Graphs with Latent Variables,” published at NeurIPS-2019, makes a key finding that exploits “do-calculus” rules to reverse-engineer the class of possible causal graphs that describe an underlying system. The result is a better understanding of how the variables that represent factors of variation causally relate to each other in AI systems.
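A toy simulation helps convey why interventional data matters: under an intervention do(X), the statistical dependence between X and Y survives only if X actually causes Y. This is the do-calculus intuition the paper builds on, not the paper’s algorithm itself.

```python
# Toy illustration of observational vs. interventional data.
# True model: X -> Y. Observationally, X and Y are correlated; under
# do(X) the correlation survives only in the true causal direction.
# This shows the do-calculus intuition, not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Observational data generated from X -> Y.
X = rng.normal(size=n)
Y = 2.0 * X + rng.normal(size=n)
print("observational corr(X, Y):", np.corrcoef(X, Y)[0, 1].round(3))

# Intervene on X: do(X) sets X externally; the X -> Y mechanism remains.
X_do = rng.normal(size=n)
Y_do = 2.0 * X_do + rng.normal(size=n)
print("corr under do(X):", np.corrcoef(X_do, Y_do)[0, 1].round(3))

# Intervene on Y: since Y does not cause X, X keeps its own distribution
# and the dependence vanishes.
Y_set = rng.normal(size=n)
X_after = rng.normal(size=n)
print("corr under do(Y):", np.corrcoef(X_after, Y_set)[0, 1].round(3))
```

The asymmetry between do(X) and do(Y) is exactly the kind of signal, unavailable from observational data alone, that interventional data contributes to narrowing down the class of plausible causal graphs.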
Learning More from Less
Learning from less data is important to allow AI systems to rapidly adapt to new tasks. Towards this goal, IBM Research, in collaboration with the University of California, San Diego and the University of Texas at Austin, published a novel method for transfer learning in “SpotTune: Transfer Learning through Adaptive Fine-Tuning” at CVPR-2019 that automatically decides which layers of a neural network to fine-tune or freeze during training for a new task, thus balancing the optimal use of new and prior information. This is important for learning efficiently and effectively on new tasks when little new training data is available. SpotTune outperformed other state-of-the-art transfer learning methods on 12 out of 14 standard datasets and achieved the highest score on the Visual Decathlon challenge, a competitive benchmark for testing the performance of multi-domain learning algorithms across a total of 10 datasets.
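The sketch below illustrates the SpotTune routing idea in PyTorch: each block exists in a frozen pre-trained copy and a trainable fine-tuned copy, and a small policy network makes a per-input, per-block choice between them via the Gumbel-softmax trick. The toy blocks, the policy head, and all hyperparameters are illustrative stand-ins for the paper’s actual architecture.

```python
# Minimal sketch of SpotTune-style adaptive routing: for each input,
# a policy network decides, block by block, whether to run the frozen
# pre-trained copy or the trainable fine-tuned copy. Toy blocks and
# hyperparameters are illustrative only.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpotTuneNet(nn.Module):
    def __init__(self, blocks, feat_dim):
        super().__init__()
        self.finetune = nn.ModuleList(blocks)                      # updated for the new task
        self.frozen = nn.ModuleList([copy.deepcopy(b) for b in blocks])
        for p in self.frozen.parameters():
            p.requires_grad_(False)                                # keep pre-trained weights
        # Policy head: one (frozen vs. fine-tune) logit pair per block.
        self.policy = nn.Linear(feat_dim, 2 * len(blocks))

    def forward(self, x):
        logits = self.policy(x).view(x.size(0), -1, 2)
        # Gumbel-softmax gives differentiable, per-example hard decisions.
        route = F.gumbel_softmax(logits, tau=1.0, hard=True)[..., 1]
        for i, (ft, fr) in enumerate(zip(self.finetune, self.frozen)):
            r = route[:, i].unsqueeze(1)
            x = r * ft(x) + (1 - r) * fr(x)
        return x

blocks = [nn.Sequential(nn.Linear(16, 16), nn.ReLU()) for _ in range(4)]
net = SpotTuneNet(blocks, feat_dim=16)
print(net(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```

Because the routing is decided per example rather than globally, easy inputs can flow through the frozen pre-trained path while harder, domain-specific inputs use the fine-tuned path.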
Neural Program Induction
Viewing a scene and answering questions about it is a fairly easy task for people. This is not so for deep learning models, which typically struggle to do this accurately. To address this challenge, MIT and IBM published “The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision” at ICLR-2019. This work uses two trained neural network models: one to analyze the visual scene and another to parse natural language questions about it. From these models’ outputs, a symbolic program is generated and populated with the information extracted about the scene. This induced symbolic program is then run to answer the question. This compelling example of neural program induction shows the power and promise of integrating neural and symbolic techniques as a way to advance AI systems.
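The toy sketch below illustrates the execution step: a parsed scene representation (hard-coded here; produced by a vision model in the paper) and an induced symbolic program are run together to answer a question. The operators shown are a simplified, hypothetical subset of a real program vocabulary.

```python
# Toy sketch of neuro-symbolic execution: a structured scene (here
# hard-coded; in the paper, extracted by a vision model) and a symbolic
# program induced from the question are run together to get the answer.
# The operators are a simplified, hypothetical subset.
scene = [
    {"shape": "cube",   "color": "red",  "size": "large"},
    {"shape": "sphere", "color": "blue", "size": "small"},
    {"shape": "cube",   "color": "blue", "size": "small"},
]

def filter_attr(objs, attr, value):
    return [o for o in objs if o[attr] == value]

def count(objs):
    return len(objs)

# Question: "How many blue cubes are there?"
# Induced program: count(filter(shape=cube, filter(color=blue, scene)))
program = [("filter", "color", "blue"), ("filter", "shape", "cube"), ("count",)]

result = scene
for op, *args in program:
    result = filter_attr(result, *args) if op == "filter" else count(result)
print(result)  # 1
```

Because the reasoning happens in this symbolic layer, each intermediate result is inspectable, which is part of what makes the neuro-symbolic approach attractive.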
Domain Adaptation for Conversational Systems
Conversational systems are essential for enterprise applications such as customer care. They require multiple AI components including speech-to-text (STT), text-to-speech (TTS), dialog, and question answering, which need to be adapted to the specific domain and usage context for each deployment. This introduces significant technical challenges such as efficiently fine-tuning corresponding AI models with minimal training data. As part of this direction, IBM Research published “High quality, lightweight and adaptable TTS using LPCNet” at InterSpeech-2019, which presents a new lightweight adaptable neural TTS system for generating high quality speech that can learn new voices with a small amount of data. The system can synthesize speech with close to natural quality while running three times faster than real-time on a standard CPU, which greatly reduces computational requirements for TTS deployment in conversational systems.
Question Answering for Technical Support
IBM Research published a new natural language question-answering dataset and leaderboard for the technical support domain called TechQA. The goal, as described in the accompanying TechQA paper, is to stimulate research on transfer learning and domain adaptation that can leverage today’s common question-answering scenarios — driven by existing large-scale open-domain datasets with short questions and answers — to address critical enterprise use cases that involve complex questions with potentially long answers, notably in the technical support domain. TechQA comprises real questions posed by users on an existing technical forum. The TechQA site provides more information on how to access the question/answer pair data, as well as a collection of more than 800,000 technotes, which are available for research.
Computational Argumentation
IBM Research created Project Debater, the first AI system that can debate humans on complex topics, which in 2019 successfully engaged in a live public debate with champion debater Harish Natarajan. Behind this important AI milestone has been a significant scientific effort that produced more than 40 scientific papers and numerous datasets. At the ACL-2019 conference, IBM Research published several notable papers on computational argumentation that address essential components of Project Debater, including data-driven speech writing and delivery, listening comprehension, and modeling of human dilemmas. IBM Research further extended Project Debater with Speech by Crowd, which collects crowd-sourced free-text arguments on the fly and constructs persuasive pro and con viewpoints to support human decision making.

Adversarial Robustness
IBM Research published a first major release of the open-source Adversarial Robustness 360 Toolbox (ART) in 2019 as a library for adversarial machine learning that provides state-of-the-art tools to defend and verify AI models against adversarial attacks. ART also provides a foundation for investigating new research related to adversarial robustness. IBM’s paper “AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks,” published at AAAI-2019, presents one such direction in the form of a new query-efficient framework that allows testing of the adversarial robustness of AI models in challenging black-box settings. AutoZOOM reduces the number of queries required to achieve a target attack performance by over 90% compared to prior methods, making it an efficient and practical tool for evaluating adversarial robustness where there is limited access to the AI models.
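The core idea, zeroth-order optimization, can be sketched in a few lines: estimate the gradient of a black-box loss from queries alone using random-direction finite differences, then descend on the estimate. The snippet below shows this estimator on a stand-in loss; AutoZOOM’s autoencoder-based dimension reduction, which drives its query efficiency, is omitted here.

```python
# Minimal sketch of zeroth-order (query-only) gradient estimation, the
# core idea behind black-box attacks like AutoZOOM: approximate the
# gradient from function evaluations alone via random-direction finite
# differences. AutoZOOM's autoencoder dimension reduction is omitted.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Stand-in for a black-box model loss we can only query.
    return np.sum((x - 1.0) ** 2)

def zo_gradient(f, x, n_queries=50, mu=1e-3):
    """Average random-direction finite differences to estimate grad f(x)."""
    g = np.zeros_like(x)
    for _ in range(n_queries):
        u = rng.normal(size=x.shape)
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / n_queries

x = np.zeros(10)
for step in range(100):
    x -= 0.05 * zo_gradient(f, x)     # gradient-free descent iterations
print("final loss:", round(f(x), 4))  # approaches 0 without true gradients
```

Every gradient estimate costs queries to the target model, which is why reducing the query count, as AutoZOOM does, is the key practical metric in black-box settings.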
Hardware Acceleration with Reduced-Precision Training
IBM Research continued to advance its 8-bit training platform to improve performance and maintain accuracy for the most challenging emerging deep learning models, as presented in “Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks” at NeurIPS-2019. IBM developed a new hybrid format that fully preserves model accuracy across a wide spectrum of deep learning models in image classification, natural language processing, and speech and object detection. HFP8 is part of IBM Research’s work on Digital AI Cores within the IBM Research AI Hardware Center, opened earlier in 2019, and part of the center’s ambitious roadmap for AI acceleration. These advances support a critical need of AI hardware to handle increased model processing power while managing energy consumption.
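To illustrate what reduced-precision training formats do, the sketch below rounds values to a low-precision floating-point grid with configurable exponent and mantissa bits, in the spirit of the hybrid formats HFP8 pairs for forward and backward passes. It uses simple nearest rounding and ignores denormals, saturation, and stochastic rounding, so it is an illustration rather than the paper’s exact format.

```python
# Simplified sketch of quantizing values to a low-precision float grid
# with configurable exponent/mantissa bits, in the spirit of hybrid
# 8-bit formats like HFP8. Nearest rounding only; real formats also
# handle denormals, saturation, and stochastic rounding.
import numpy as np

def quantize_float(x, exp_bits, man_bits, bias=None):
    if bias is None:
        bias = 2 ** (exp_bits - 1) - 1
    sign = np.sign(x)
    mag = np.abs(x)
    e = np.floor(np.log2(np.where(mag > 0, mag, 1.0)))
    e = np.clip(e, -bias + 1, bias)       # representable exponent range
    scale = 2.0 ** (e - man_bits)         # spacing of the mantissa grid
    return sign * np.round(mag / scale) * scale

x = np.random.default_rng(0).normal(size=5).astype(np.float32)
print("fp32:        ", x)
print("4-exp/3-man: ", quantize_float(x, exp_bits=4, man_bits=3))
print("5-exp/2-man: ", quantize_float(x, exp_bits=5, man_bits=2))
```

The hybrid insight is that different tensors tolerate different trade-offs: activations and weights favor more mantissa precision, while gradients favor more exponent range, so pairing two 8-bit splits preserves accuracy where a single format cannot.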
