Biologically-inspired deep learning predicts chords of Bach

Today, as reported in Nature Machine Intelligence,¹ my colleagues and I have demonstrated a novel approach to deep learning that incorporates biologically-inspired neural dynamics and enables in-memory acceleration, bringing it closer to the way in which the human brain works. The results point towards the broad adoption of more biologically-realistic deep learning for applications in artificial intelligence (AI).

Spiking neural networks (SNNs) possess inherently temporal dynamics that rely on the timing of sparse all-or-none spikes and on an internal state stored in the neuronal membrane potential, reflecting the way real neurons work. Exploiting these dynamics has long been considered very promising, and SNNs were embraced as the next generation of AI after the less biologically-realistic artificial neural networks (ANNs). Despite significant efforts to harness the unique capabilities of SNN dynamics, most SNN-related research has had limited success compared to the spectacular progress in deep learning with ANNs.

At IBM Research Europe we have been investigating both SNNs and ANNs for more than a decade, and one day we were struck with the thought: “Could we combine the characteristics of the neural dynamics of a spiking neuron and an ANN?”

The answer is yes, we could. More specifically, we have modelled a spiking neuron using a construct comprising two recurrently-connected artificial neurons — we call it a spiking neural unit (SNU).

Introducing the Spiking Neural Unit

The SNU bridges the SNN and ANN worlds. From a practical perspective, the SNU enables direct application of deep learning advances to developing and training sophisticated SNNs with backpropagation through time. It enables the reuse of architectures, frameworks, training algorithms and infrastructure. From a theoretical perspective, the unique biologically-realistic dynamics of SNNs become available to the deep learning community. The SNU may operate in either the spiking mode or the non-spiking soft mode. In both cases, its dynamics provide capabilities different from those of recurrent neural networks and LSTM- or GRU-based networks, and it requires the fewest parameters per unit among the existing ANN models for temporal data processing.
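To make the idea concrete, here is a minimal NumPy sketch of a single time step for a layer of SNU-like units. This post does not spell out the update equations, so the rule below is an assumption modelled on leaky integrate-and-fire dynamics: a state that accumulates input, decays over time, and is reset after the unit fires. The function name snu_step, the decay value and the toy weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def snu_step(x, s_prev, y_prev, W, b, decay=0.8, soft=False):
    """One time step of a layer of SNU-like units (illustrative sketch only)."""
    # State ("membrane potential"): weighted input plus the decayed previous
    # state, which is reset wherever the unit produced an output last step.
    s = np.maximum(0.0, W @ x + decay * s_prev * (1.0 - y_prev))
    if soft:
        # Soft mode: graded output through a sigmoid.
        y = 1.0 / (1.0 + np.exp(-(s + b)))
    else:
        # Spiking mode: all-or-none output through a threshold (here -b).
        y = (s + b > 0.0).astype(float)
    return s, y

# Toy run: 4 binary inputs driving 3 units for 10 time steps.
rng = np.random.default_rng(0)
W = rng.uniform(0.0, 0.5, size=(3, 4))  # hypothetical input weights
b = np.full(3, -1.0)                    # hypothetical bias, i.e. threshold of 1
s, y = np.zeros(3), np.zeros(3)
for t in range(10):
    x = rng.integers(0, 2, size=4).astype(float)
    s, y = snu_step(x, s, y, W, b)
    print(t, y)
```

In this sketch the state and the output feed back into the next step, which mirrors the two recurrently-connected artificial neurons described above. Each unit adds only its input weights and a bias, plus a decay that is shared across the layer here.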

Furthermore, the SNU lends itself to efficient implementation in ANN accelerators and is particularly well-suited for applications using in-memory computing. In-memory computing is a promising new approach for AI hardware that takes inspiration from the architecture of the brain, in which memory and computations are combined in the neurons. By performing computations in memory, it avoids the energy cost of shuffling data back and forth between separate memory and processors. Phase-change memory is a promising candidate for such an implementation: the technology is well understood and is on its way to commercialization in the coming years. Our work includes an experimental demonstration of an in-memory SNU implementation that is more robust to hardware imperfections than other state-of-the-art ANN units.

Demonstrating SNUs with Handwritten Digits, Language Modeling and Bach’s Music

To demonstrate the SNU’s capabilities in solving practical tasks, we applied it to handwritten digit recognition using the MNIST dataset, language modelling using the Penn Treebank dataset, and polyphonic music prediction using the Johann Sebastian Bach chorales dataset. In each task, our approach achieved state-of-the-art performance for SNNs and, in the soft mode, surpassed the performance of similar architectures using ANNs.

For the handwritten digit recognition task, we used a challenging spiking variant of the dataset. We first evaluated the impact of depth on fully connected SNU-based SNNs. Accuracy increased with network depth, reaching a mean recognition accuracy of 98.47% for a 7-layer SNN and 98.5% for a 4-layer network of soft SNUs (sSNUs). To further improve the recognition accuracy, and to illustrate how the SNU concept directly benefits from deep learning advancements, we replaced artificial neurons with SNUs in a standard convolutional architecture, leaving all other settings and hyperparameters unchanged. After 100 epochs, we obtained an average accuracy of 99.21% for training with the standard dataset and 99.53% for training with inputs preprocessed with elastic distortions, surpassing that of the various state-of-the-art SNN implementations.

The language modelling task involves predicting the next word based on the context of the previously observed sequence of words. For example, given any part of the sentence “offered rates for the dollar deposits in the London market”, our system predicted the following words almost perfectly. Having seen “offered rates for the dollar deposits in the”, it made an interesting mistake, predicting “USA” instead of “London”. Even so, this mistake shows that the system captured the meaning: it expected a location as the next word, and proposed “USA” because “the dollar” had been mentioned a few words earlier.

Diagram showing how SNU-based networks can be used for language modelling and music modelling.

Language modelling (top): the network predicts the consecutive words based on the context obtained from the past words. An actual example from an SNU-based network is presented. Music prediction (bottom): the model predicts the probabilities of the consecutive notes that are to be played. Credit: Nature Machine Intelligence

A feed-forward SNU-based version of this architecture achieved a test perplexity of 137.7, which is better than that of traditional natural language processing approaches, such as the ‘5-gram’ method. To the best of our knowledge, this is the first example of language modelling performed with SNNs on the Penn Treebank dataset, and our result sets the SNN state-of-the-art performance. Using sSNUs with recurrent connections lowered the perplexity to 108.4, which surpassed that of the corresponding LSTM-based architecture without dropout.
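For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-probability that a model assigns to the correct next word, so lower values are better. The sketch below uses the standard definition; the function name, the probabilities and the vocabulary size of 10,000 words are illustrative assumptions, not values from our experiments.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability of the correct tokens)."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical baseline: a model guessing uniformly over 10,000 words.
uniform = [1.0 / 10_000] * 5
print(perplexity(uniform))    # approximately 10000

# Hypothetical model that is fairly confident about the correct words.
confident = [0.2, 0.05, 0.01, 0.1, 0.02]
print(perplexity(confident))  # far lower than the uniform baseline
```

Intuitively, a perplexity of 108.4 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly 108 candidate words at each step.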

The task of polyphonic music prediction on the Johann Sebastian Bach dataset was to predict, at each time step, the set of notes (that is, a chord) to be played in the next time step. We used an SNU-based architecture with an output layer of sigmoidal neurons, which allows a direct comparison of the obtained loss values with those from ANNs. The SNU-based network achieved an average loss of 8.72 and set the SNN state-of-the-art performance for the Bach chorales dataset. An sSNU-based network further reduced the average loss to 8.39 and surpassed corresponding architectures using state-of-the-art ANN units.
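Because several notes can sound at once, chord prediction is a multi-label problem: each sigmoidal output gives the probability that one note is played. A common way to score such predictions is the negative log-likelihood of the chord, computed as binary cross-entropy summed over notes and averaged over time steps. The sketch below assumes that definition; the exact loss used in the paper is not detailed in this post, and the function name and the four-note example are hypothetical.

```python
import numpy as np

def chord_loss(probs, targets, eps=1e-12):
    """Negative log-likelihood of one chord: binary cross-entropy summed over notes."""
    # Clip to keep the logarithms finite for probabilities of exactly 0 or 1.
    p = np.clip(probs, eps, 1.0 - eps)
    return float(-np.sum(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)))

# Hypothetical example with only 4 possible notes, two of which are played.
targets = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.9, 0.1, 0.8, 0.2])   # mostly right
vague = np.array([0.5, 0.5, 0.5, 0.5])  # no information
print(chord_loss(good, targets), chord_loss(vague, targets))
```

Under this definition, an uninformative prediction of 0.5 for every note costs ln 2 per note, so the loss scales with the number of notes the model has to decide on.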

For a long time, SNN and ANN research on algorithms and AI hardware accelerator architectures developed separately. In this paper we bridged the two by proposing the SNU, which incorporates biologically-inspired neural dynamics in the form of a novel ANN unit. This opens the way to broad adoption of such dynamics in challenging applications and to new avenues for neuromorphic hardware acceleration.
