IBM Research-Ireland

Using AI to Design Deep Learning Architectures


Selecting the best architecture for a deep learning model is typically a time-consuming process that requires expert input, but AI can streamline it. I am developing an evolutionary algorithm for architecture selection that is up to 50,000 times faster than other methods, with only a small increase in error rate.

Deep learning models are applied in many IBM Watson products and services and can perform challenging tasks such as visual recognition, text to speech and vice versa, playing board games and much more. These models emulate the workings of the human brain, and, like the brain, their architecture is crucial to their function.

At IBM, engineers and scientists select the best architecture for a deep learning model from a large set of possible candidates. Today this is a time-consuming manual process; however, using a more powerful automated AI solution to select the neural network can save time and enable non-experts to apply deep learning faster. My evolutionary algorithm is designed to reduce the search time for the right deep learning architecture to just hours, making the optimization of deep learning network architecture affordable for everyone.

An evolutionary algorithm for deep learning networks

My proposed method treats a convolutional neural network architecture as a sequence of neuro-cells, then applies a series of mutations to find a structure that improves the performance of the neural network for a given dataset and machine learning task. This approach substantially shortens network training time. The mutations alter the structure of the network but do not change the network’s predictions, and can include adding layers, adding new connections, or widening kernels or layers.


Figure 1. Example of a function-preserving mutation. The architecture on the right has a mutation but gives the same prediction as the architecture on the left (represented by the same colors).
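
As a rough illustration of what "function-preserving" means, the following Python sketch applies two such mutations to a tiny fully connected ReLU network: inserting a layer initialised to the identity, and widening a layer by duplicating a unit and splitting its outgoing weights. This is a minimal sketch under stated assumptions, not the implementation from the paper: the network is a plain multilayer perceptron rather than a convolutional one, and every name in it (`forward`, `add_identity_layer`, `widen_layer`) is hypothetical.

```python
# Illustrative only: a tiny fully connected ReLU network in pure Python.
# All function names are hypothetical and not taken from the paper.

def relu(values):
    return [max(0.0, v) for v in values]

def dense(weights, bias, values):
    # `weights` is a list of rows; each row produces one output unit.
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]

def forward(layers, values):
    # Every layer except the last one is followed by a ReLU.
    for i, (weights, bias) in enumerate(layers):
        values = dense(weights, bias, values)
        if i < len(layers) - 1:
            values = relu(values)
    return values

def add_identity_layer(layers, index):
    """Insert a new layer after hidden layer `index`, initialised to the identity.

    The new layer receives ReLU outputs, which are non-negative, so applying
    the identity followed by another ReLU leaves them unchanged.
    """
    width = len(layers[index][0])  # number of output units of layer `index`
    identity = [[1.0 if r == c else 0.0 for c in range(width)]
                for r in range(width)]
    return layers[:index + 1] + [(identity, [0.0] * width)] + layers[index + 1:]

def widen_layer(layers, index, unit):
    """Duplicate hidden unit `unit` of layer `index` and split its outgoing weights."""
    weights, bias = layers[index]
    next_weights, next_bias = layers[index + 1]
    wider = ([row[:] for row in weights] + [weights[unit][:]],
             bias + [bias[unit]])
    split = []
    for row in next_weights:
        half = row[unit] / 2.0
        # The original unit and its copy each carry half of the old weight.
        split.append(row[:unit] + [half] + row[unit + 1:] + [half])
    return layers[:index] + [wider, (split, next_bias[:])] + layers[index + 2:]

layers = [
    ([[0.5, -1.0], [1.5, 2.0], [-0.5, 0.25]], [0.1, -0.2, 0.0]),  # 2 inputs -> 3 hidden
    ([[1.0, -2.0, 0.5]], [0.3]),                                   # 3 hidden -> 1 output
]
x = [1.0, 2.0]

original = forward(layers, x)
deeper = forward(add_identity_layer(layers, 0), x)
wider = forward(widen_layer(layers, 0, 1), x)

# Both mutated networks should agree with the original up to rounding error.
print(abs(original[0] - deeper[0]) < 1e-9)
print(abs(original[0] - wider[0]) < 1e-9)
```

Because the mutated network starts from the same predictions as its parent, it can continue training from where the parent left off instead of starting from scratch, which is consistent with the shorter training time described above.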

Experimental evaluation

I compared my new neuro-evolutionary approach with several other methods on the task of image classification on the CIFAR-10 and CIFAR-100 datasets. These datasets are collections of images commonly used to train machine learning and computer vision algorithms. My algorithm had slightly higher classification error but required significantly less time than state-of-the-art human-designed architectures, architecture search methods based on reinforcement learning, and other automated methods based on evolutionary algorithms. It was up to 50,000 times faster than some other methods, with an error rate at most 0.6% higher than the best competitor on the benchmark dataset CIFAR-10.

The figures below visualize the algorithm’s optimization process. In Figure 2, each dot represents a different structure, and the connecting lines represent mutations. The color scale shows the accuracy of each structure, and the x axis represents time. Accuracy increases quickly over the first 10 hours, then progress is slow but steady afterwards.


Figure 2. Optimization of the evolutionary algorithm over time.

Figure 3 shows the mutations to the network structure at each hour of the process.


Figure 3. Evolution of a network structure over time. Some intermediate states are not shown.
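
The search process visualized in Figures 2 and 3 can be approximated by a generic evolutionary loop: pick a promising structure, mutate it, score the result, and keep the best candidates. The Python sketch below shows only that general pattern and is not the algorithm from the paper. It assumes an architecture can be summarised as a list of layer widths, uses tournament selection as one common selection scheme, and replaces real training with a made-up scoring function (`toy_fitness`); all names and constants are hypothetical.

```python
import random

def mutate(architecture, rng):
    """Return a mutated copy: either add a layer or widen an existing one."""
    child = list(architecture)
    if rng.random() < 0.5:
        position = rng.randrange(len(child) + 1)
        child.insert(position, 16)  # hypothetical default width for a new layer
    else:
        position = rng.randrange(len(child))
        child[position] *= 2        # double the width of one layer
    return child

def toy_fitness(architecture):
    """Stand-in for validation accuracy. This is NOT a real training run.

    It rewards total width up to 256 units and penalises anything beyond that.
    """
    capacity = sum(architecture)
    return min(capacity, 256) / 256 - 0.001 * max(0, capacity - 256)

def evolve(initial, generations, population_size, rng):
    """Run a simple evolutionary search and return (best_score, best_architecture)."""
    population = [(toy_fitness(initial), initial)]
    for _ in range(generations):
        # Tournament selection: sample a few individuals and mutate the best one.
        contestants = rng.sample(population, min(3, len(population)))
        _, parent = max(contestants, key=lambda item: item[0])
        child = mutate(parent, rng)
        population.append((toy_fitness(child), child))
        # Keep only the fittest individuals for the next round.
        population.sort(key=lambda item: item[0], reverse=True)
        del population[population_size:]
    return population[0]

best_score, best_architecture = evolve(
    [16], generations=50, population_size=8, rng=random.Random(0)
)
print(best_score, best_architecture)
```

In a real search, the scoring step would train and validate each candidate network, which is where function-preserving mutations save time: a child can inherit its parent's weights rather than being trained from scratch.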

In future, I hope to integrate this optimization method into IBM’s cloud services and make it available to clients. Furthermore, I plan to extend it to larger datasets like ImageNet and additional kinds of data such as time-series and text.

I will present this approach at the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD) in Dublin, Ireland, in September. This event is the premier machine learning and data mining conference in Europe, and it will be co-hosted by IBM Research-Ireland, University College Dublin School of Computer Science, and the Insight Centre for Data Analytics. As part of the extensive conference programme, our adversarial AI experts will host the 1st Workshop on Recent Advances in Adversarial Machine Learning (Nemesis’18), which aims to bring together researchers and practitioners to discuss recent advances in the rapidly evolving field of adversarial machine learning.

Deep Learning Architecture Search by Neuro-Cell-based Evolution with Function-Preserving Mutations
Martin Wistuba
European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2018, Dublin, Ireland, September 10-14, 2018
