A Biologically Plausible Learning Algorithm for Neural Networks


In spite of the great success of deep learning on a range of computationally challenging tasks, questions remain as to how similar the computational properties of deep neural networks are to those of the human brain. A particularly nonbiological aspect of deep learning is the supervised training process with the backpropagation algorithm, which requires massive amounts of labeled data and a nonlocal learning rule for changing the weights. My colleague and I developed a learning algorithm inspired by synaptic plasticity rules conceptually similar to those operating in real biological neural networks. We describe this algorithm in our paper “Unsupervised Learning by Competing Hidden Units,” published last week in the journal Proceedings of the National Academy of Sciences of the United States of America. The proposed algorithm learns the weights of the lower layer of a neural network in a completely unsupervised fashion. These weights are agnostic about the task that the network will eventually have to solve in the higher layers. In spite of this, they can be used to train a good classifier in the higher layers that is tailored to a specific task. The entire algorithm uses local learning rules to update the weights.

The key idea of the algorithm relies on a network motif with lateral inhibition between the hidden neurons and a learning rule inspired by Hebbian-like plasticity mechanisms known to exist in biology. For every data point, such as an image, the hidden neurons compete with each other so that eventually only a handful of them remain active while the majority fall below the activation threshold. The activities of those active hidden neurons, together with the activities of the visible neurons, are then used to update the weights. This learning rule uses only local information that is directly knowable by the two neurons connected by a given weight. The growth of the weights as training progresses is shown in the video below.
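The competition-plus-local-update step described above can be sketched roughly as follows. This is a simplified illustration, not the exact rule from the paper: the learning rate `lr`, the anti-Hebbian strength `delta`, and the rank `k` of the inhibited unit are hypothetical choices, and the competition is emulated by ranking the input currents rather than by simulating the lateral-inhibition dynamics.

```python
import numpy as np

def local_update(W, v, lr=0.02, delta=0.4, k=2):
    """One step of a simplified competitive Hebbian-style update.

    W : (hidden, visible) weight matrix
    v : (visible,) input vector

    Each row of W is updated using only that hidden unit's own current
    and the visible activities -- i.e., information local to the synapse.
    """
    I = W @ v                            # input currents to hidden units
    order = np.argsort(I)[::-1]          # rank units: competition stand-in
    g = np.zeros(W.shape[0])
    g[order[0]] = 1.0                    # strongest unit: Hebbian push toward v
    g[order[k - 1]] = -delta             # k-th unit: anti-Hebbian push away
    # Drive each selected weight vector toward (or away from) the input;
    # the subtracted I*W term keeps the weight norms from growing unboundedly.
    W += lr * g[:, None] * (v[None, :] - I[:, None] * W)
    return W
```

Note that only the winning unit and one lower-ranked unit change their weights on a given input; all other hidden units are left untouched, which is what makes the rule sparse and local.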

The paper compares the performance of two networks. The first is trained using a two-stage procedure: the proposed “biological” training of the first layer, followed by standard gradient descent training of the classifier in the top layer. The second is trained end-to-end with the backpropagation algorithm on a supervised task. In our paper we investigate the proposed “biological” algorithm in the framework of fully connected neural networks with one hidden layer on the pixel-permutation-invariant MNIST and CIFAR-10 datasets.
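The two-stage procedure can be sketched end to end on toy data. Everything here is a hypothetical miniature stand-in for the real experiments: the random data replaces MNIST, and the layer sizes, learning rates, ranks, and epoch counts are illustrative rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for MNIST: 200 samples, 64 "pixels", 3 classes
X = rng.random((200, 64))
y = rng.integers(0, 3, size=200)

# --- Stage 1: unsupervised, local, competitive Hebbian learning ---
H = 20                                    # hidden units
W = rng.normal(scale=0.1, size=(H, 64))
for epoch in range(5):
    for v in X:
        I = W @ v
        order = np.argsort(I)[::-1]
        g = np.zeros(H)
        g[order[0]] = 1.0                 # winner: Hebbian update
        g[order[1]] = -0.4                # a lower-ranked unit: anti-Hebbian
        W += 0.02 * g[:, None] * (v[None, :] - I[:, None] * W)

# --- Stage 2: supervised top layer on frozen hidden activations ---
Hid = np.maximum(X @ W.T, 0.0)            # ReLU hidden activations, W frozen
C = np.zeros((3, H))                      # top-layer classifier weights
for epoch in range(50):
    logits = Hid @ C.T
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    onehot = np.eye(3)[y]
    # Gradient descent confined to the top layer: no error signal
    # ever propagates back into W
    C -= 0.1 * (probs - onehot).T @ Hid / len(X)

acc = (np.argmax(Hid @ C.T, axis=1) == y).mean()
```

The point of the structure is that stage 1 never sees the labels `y`, and stage 2 never modifies `W`: the only place gradients flow is the single top layer, so no weight update requires nonlocal information.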

In the case of MNIST, the weights of the hidden layer, together with the errors on the training and test sets for the two networks, are shown in the figure below. The network trained end-to-end with the backpropagation algorithm achieves the well-known benchmark results: 0% training error and 1.5% test error. The network trained in the “biological” way reaches a training error of 0.4%; thus, it never fits the training data perfectly. At the same time, its error on the held-out test set is 1.46%, the same as that of the network trained end-to-end. This is surprising because the proposed “biological” algorithm learned the weights of the first layer without knowing what task those weights would be used for, unlike the network trained end-to-end. Moreover, the “biological” algorithm was constrained by the requirement to use only local plasticity rules for learning those weights.

Another interesting aspect is that the weights of the proposed “biological” algorithm (left panel) are very different from the weights learned by the backpropagation algorithm (middle panel). At the same time, they are not just copies of the individual training examples, and encode both the presence (red color) and the absence (blue color) of ink in the MNIST images. The full “biological” network learns a distributed representation of the training data over multiple hidden units.

A similar comparison was done on the CIFAR-10 dataset. In this case the accuracy of our “biological” algorithm is slightly worse than that of the network trained end-to-end, but it still demonstrates good performance.

Learn more by reading the full paper or watching a video lecture discussing the results. You can also download the code used for the “biological” training.

Research Staff Member, IBM Research
