Deep learning

Using machine learning to solve a dense hydrogen conundrum

Hydrogen is the simplest element in the universe, yet its behavior under extreme conditions such as very high pressure and temperature is still far from well understood. Dense hydrogen constitutes the bulk of giant gas planets and brown dwarf stars, and it’s a material of interest for both fundamental physics and […]

Reducing Speech-to-Text Model Training Time on Switchboard-2000 from a Week to Under Two Hours

In our recent ICASSP 2020 paper, we shortened the training time on the 2000-hour Switchboard dataset, one of the largest public ASR benchmarks, from over a week to less than two hours on a 128-GPU IBM high-performance computing cluster. To the best of our knowledge, this is the fastest training time recorded on this dataset.

A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition

In a paper recently published at this year’s INTERSPEECH, we further improved the efficiency of Asynchronous Decentralized Parallel Stochastic Gradient Descent, reducing training time from 11.5 hours to 5.2 hours on 64 NVIDIA V100 GPUs.
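
The teaser above only names the algorithm, so here is a toy, single-process simulation of the core idea behind decentralized parallel SGD, offered as an illustration rather than the paper’s implementation: each worker updates its own copy of the weights with a local gradient step and then averages only with its ring neighbors instead of synchronizing with every other worker (the asynchronous variant additionally removes the barrier between steps). The least-squares objective and hyperparameters below are illustrative assumptions.

    import torch

    n_workers, dim, lr = 4, 10, 0.1
    target = torch.randn(dim)                       # toy least-squares target
    weights = [torch.zeros(dim) for _ in range(n_workers)]

    for step in range(100):
        # Local gradient step on each worker (gradient of 0.5 * ||w - target||^2).
        for i in range(n_workers):
            grad = weights[i] - target
            weights[i] = weights[i] - lr * grad
        # Mixing step: average each worker only with its left and right ring neighbors.
        mixed = []
        for i in range(n_workers):
            left = weights[(i - 1) % n_workers]
            right = weights[(i + 1) % n_workers]
            mixed.append((left + weights[i] + right) / 3.0)
        weights = mixed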

Making Sense of Neural Architecture Search

Following the massive success of deep learning in solving complicated tasks, it is no surprise that there is a growing demand for automated deep learning. Even though deep learning is a highly effective technology, a tremendous amount of human effort still goes into designing a deep learning algorithm.

Novel AI tools to accelerate cancer research

At the 18th European Conference on Computational Biology and the 27th Conference on Intelligent Systems for Molecular Biology, IBM will present significant, novel research that led to the implementation of three machine learning solutions aimed at accelerating and guiding cancer research.

Distributed Software-Defined Networking Control by Deep Reinforcement Learning for 5G and Beyond

Our IEEE ICC 2019 “Best Paper” details a novel deep reinforcement learning approach to maximizing the overall performance of Software-Defined Networking in support of 5G.

AI Models Predict Breast Cancer with Radiologist-Level Accuracy

Our team of IBM researchers published research in Radiology on a new AI model that predicts the development of malignant breast cancer in patients within the year, with accuracy comparable to that of human radiologists.

High-Efficiency Distributed Learning for Speech Modeling

A distributed deep learning architecture for automatic speech recognition that shortens run time without compromising model accuracy.

NeuNetS: Automating Neural Network Model Synthesis for Broader Adoption of AI

NeuNetS uses AI to automatically synthesize deep neural networks faster and more easily than ever before, scaling up the deployment and adoption of AI.

Efficient Deep Learning Training on the Cloud with Small Files

Here I describe an approach to efficiently train deep learning models on machine learning cloud platforms (e.g., IBM Watson Machine Learning) when the training dataset consists of a large number of small files (e.g., JPEG format) and is stored in an object store like IBM Cloud Object Storage (COS). As an example, I train a […]
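
The post’s exact pipeline is not reproduced here, but the usual remedy for the small-files problem is to pack many tiny objects into a few large archives before uploading them to the object store, so a training job issues a handful of large sequential reads instead of one request per JPEG. The sketch below illustrates that idea with the Python standard library; the directory names and shard size are illustrative assumptions.

    import os
    import tarfile

    def pack_into_shards(image_dir, out_dir, files_per_shard=1000):
        """Group many small image files into a few large tar shards."""
        os.makedirs(out_dir, exist_ok=True)
        files = sorted(
            f for f in os.listdir(image_dir) if f.lower().endswith(".jpg")
        )
        for start in range(0, len(files), files_per_shard):
            shard_files = files[start:start + files_per_shard]
            shard_name = "shard-{:05d}.tar".format(start // files_per_shard)
            with tarfile.open(os.path.join(out_dir, shard_name), "w") as tar:
                for name in shard_files:
                    # Store each JPEG under its original name inside the shard.
                    tar.add(os.path.join(image_dir, name), arcname=name)

    # Example usage: pack a local copy of the dataset, upload the resulting
    # shards to COS with any S3-compatible client, and let the training job
    # stream whole shards instead of issuing one GET per tiny file.
    # pack_into_shards("train_images/", "train_shards/")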

Delta-Encoder: Synthesizing a Full Set of Samples From One Image

Delta-encoder is a novel approach to few- and one-shot object recognition. A modified auto-encoder (called the delta-encoder) extracts transferable intra-class deformations (deltas) between same-class pairs of training examples, then applies them to a few examples of a new class, unseen during training, to efficiently synthesize samples from that class. The synthesized samples are then […]
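
As a rough sketch of the architecture described above, and not the authors’ reference implementation, the code below pairs an encoder that compresses the difference between two same-class feature vectors into a small delta code with a decoder that applies that code to an anchor example; the feature dimension and layer sizes are illustrative assumptions.

    import torch
    import torch.nn as nn

    class DeltaEncoder(nn.Module):
        """Toy delta-encoder over precomputed feature vectors."""

        def __init__(self, feat_dim=2048, delta_dim=16, hidden=512):
            super().__init__()
            # The encoder sees a same-class pair (x, anchor) and emits a small delta code.
            self.encoder = nn.Sequential(
                nn.Linear(2 * feat_dim, hidden), nn.LeakyReLU(),
                nn.Linear(hidden, delta_dim),
            )
            # The decoder combines a delta code with an anchor to reconstruct x.
            self.decoder = nn.Sequential(
                nn.Linear(delta_dim + feat_dim, hidden), nn.LeakyReLU(),
                nn.Linear(hidden, feat_dim),
            )

        def forward(self, x, anchor):
            delta = self.encoder(torch.cat([x, anchor], dim=-1))
            return self.decoder(torch.cat([delta, anchor], dim=-1))

        def synthesize(self, x_seen, anchor_seen, anchor_novel):
            # Harvest a deformation from a seen-class pair, apply it to a novel-class example.
            delta = self.encoder(torch.cat([x_seen, anchor_seen], dim=-1))
            return self.decoder(torch.cat([delta, anchor_novel], dim=-1))

    # Training would minimize a reconstruction loss on seen classes, e.g.
    # loss = nn.functional.l1_loss(model(x, anchor), x)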

Probabilistic Programming with Pyro in WML

In a previous post we explained how to write a probabilistic model using Edward and run it on the IBM Watson Machine Learning (WML) platform. In this post, we discuss the same example written in Pyro, a deep probabilistic programming language built on top of PyTorch. Deep probabilistic programming languages (DPPLs) such as Edward and […]
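
The example from the post is not reproduced here; as a generic taste of Pyro’s style, the sketch below defines a tiny Bayesian model of a coin’s bias and fits it with stochastic variational inference. The data and variable names are illustrative.

    import torch
    import pyro
    import pyro.distributions as dist
    from pyro.infer import SVI, Trace_ELBO
    from pyro.infer.autoguide import AutoNormal
    from pyro.optim import Adam

    def model(flips):
        # Latent bias of the coin with a uniform Beta(1, 1) prior.
        bias = pyro.sample("bias", dist.Beta(1.0, 1.0))
        with pyro.plate("data", len(flips)):
            pyro.sample("obs", dist.Bernoulli(bias), obs=flips)

    flips = torch.tensor([1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0])
    guide = AutoNormal(model)                       # mean-field variational guide
    svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
    for step in range(1000):
        svi.step(flips)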
