IBM Research-Ireland

Using AI to Design Deep Learning Architectures

Selecting the best architecture for a deep learning model is typically a time-consuming process that requires expert input, but AI can streamline it. I am developing an evolutionary algorithm for architecture selection that is up to 50,000 times faster than other methods, with only a small increase in error rate.

Deep learning models are applied in many IBM Watson products and services and can perform challenging tasks such as visual recognition, text to speech and vice versa, playing board games and much more. These models emulate the workings of the human brain, and, like the brain, their architecture is crucial to their function.

At IBM, engineers and scientists select the best architecture for a deep learning model from a large set of possible candidates. Today this is a time-consuming manual process; however, using a more powerful automated AI solution to select the neural network can save time and enable non-experts to apply deep learning faster. My evolutionary algorithm is designed to reduce the search time for the right deep learning architecture to just hours, making the optimization of deep learning network architecture affordable for everyone.

An evolutionary algorithm for deep learning networks

My proposed method treats a convolutional neural network architecture as a sequence of neuro-cells, then applies a series of mutations to find a structure that improves the performance of the neural network for a given dataset and machine learning task. This approach substantially shortens network training time. The mutations alter the structure of the network but do not change the network’s predictions, and can include adding layers, adding new connections, or widening kernels or layers.
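The search loop behind this idea can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the paper's implementation: `mutate` and `train_and_score` are stand-in placeholders (a real system would apply function-preserving operators to an actual network and briefly fine-tune it), but the shape of the loop — mutate the current best candidate, evaluate the child, keep a small population of the fittest — matches the approach described above.

```python
import random

# Illustrative mutation operators, named after those mentioned in the text.
MUTATIONS = ["add_layer", "add_connection", "widen_layer", "widen_kernel"]

def mutate(architecture):
    """Append one randomly chosen mutation (a stand-in for a real
    function-preserving structural change)."""
    return architecture + [random.choice(MUTATIONS)]

def train_and_score(architecture):
    """Placeholder for briefly training the mutated network and returning
    validation accuracy; here it simply favors larger structures so the
    loop is runnable on its own."""
    return min(1.0, 0.5 + 0.01 * len(architecture) + random.uniform(0.0, 0.05))

def evolve(generations=20, population_size=4):
    seed = ["conv_cell"]
    population = [(seed, train_and_score(seed))]
    for _ in range(generations):
        parent, _ = max(population, key=lambda p: p[1])  # mutate the current best
        child = mutate(parent)
        population.append((child, train_and_score(child)))
        # Keep only the fittest few candidates.
        population = sorted(population, key=lambda p: p[1])[-population_size:]
    return max(population, key=lambda p: p[1])

best_arch, best_acc = evolve()
print(len(best_arch), round(best_acc, 3))
```

Because every mutation preserves the parent's function, each child starts from the parent's learned behavior rather than from scratch, which is what makes the short per-candidate training budget viable.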

Function-preserving mutation in optimizing deep learning networks

Figure 1. Example of a function-preserving mutation. The architecture on the right has a mutation but gives the same prediction as the architecture on the left (represented by the same colors).
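To make the function-preserving property concrete, here is a small NumPy sketch of one such mutation, widening a hidden layer in the spirit of Net2Net-style widening (an assumption for illustration, not the paper's code): a hidden unit is duplicated and the outgoing weights of both copies are halved, so the widened network computes exactly the same output.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))   # hidden layer: 3 units, 4 inputs
W2 = rng.normal(size=(2, 3))   # output layer: 2 units

def forward(W1, W2, x):
    """Two-layer network: y = W2 @ relu(W1 @ x)."""
    return W2 @ np.maximum(W1 @ x, 0.0)

# Widen: duplicate hidden unit 0, then halve the outgoing weights of
# both the original and the copy so their contributions sum to the old one.
W1_wide = np.vstack([W1, W1[0:1]])        # 4 hidden units now
W2_wide = np.hstack([W2, W2[:, 0:1]])     # outgoing column for the copy
W2_wide[:, 0] *= 0.5
W2_wide[:, 3] *= 0.5

x = rng.normal(size=4)
# The mutated (wider) network gives the same prediction as the original.
assert np.allclose(forward(W1, W2, x), forward(W1_wide, W2_wide, x))
print("outputs match after widening")
```

The widened network is strictly larger, so subsequent training can only improve on the parent's function; this is why the search never wastes the work already invested in a candidate.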

Experimental evaluation

I compared my new neuro-evolutional approach with several other methods on the task of image classification on the CIFAR-10 and CIFAR-100 datasets. These datasets are collections of images commonly used to train machine learning and computer vision algorithms. My algorithm had slightly higher classification error but required significantly less time, compared with state-of-the-art human-designed architectures, results of architecture search methods based on reinforcement learning, and results for other automated methods based on evolutionary algorithms. It was up to 50,000 times faster than some other methods, with an error rate at most 0.6% higher than the best competitor on the benchmark dataset CIFAR-10.

The figures below visualize the algorithm’s optimization process. In Figure 2, each dot represents a different structure, and the connecting lines represent mutations. The color scale shows the accuracy of each structure, and the x axis represents time. Accuracy increases quickly over the first 10 hours, then progress is slow but steady afterwards.

Optimization of the evolutionary algorithm for designing deep learning networks

Figure 2. Optimization of the evolutionary algorithm over time.

Figure 3 shows the mutations to the network structure at each hour of the process.

Evolution of a deep learning network structure over time

Figure 3. Evolution of a network structure over time. Some intermediate states are not shown.

In the future, I hope to integrate this optimization method into IBM’s cloud services and make it available to clients. Furthermore, I plan to extend it to larger datasets such as ImageNet and to additional kinds of data, such as time series and text.

I will present this approach at the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD) in Dublin, Ireland, in September. This event is the premier machine learning and data mining conference in Europe, and it will be co-hosted by IBM Research-Ireland, the University College Dublin School of Computer Science, and the Insight Centre for Data Analytics. As part of the extensive conference programme, our adversarial AI experts will host the 1st Workshop on Recent Advances in Adversarial Machine Learning (Nemesis’18), which aims to bring together researchers and practitioners to discuss recent advances in the rapidly evolving field of adversarial machine learning.

Deep Learning Architecture Search by Neuro-Cell-based Evolution with Function-Preserving Mutations
Martin Wistuba
European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2018, Dublin, Ireland, September 10-14, 2018
