2016 was a big year for brain-inspired computing. In our paper “Convolutional networks for fast, energy-efficient neuromorphic computing,” my team and I demonstrated that the value of brain-inspired computing lies in its ability to perform neural network inference at unprecedented ultra-low energy consumption. Simply stated, our TrueNorth chip’s non-von Neumann architecture mimics the brain’s neural architecture, giving it unprecedented efficiency and scalability over today’s computers.
The brain-inspired TrueNorth processor [is] a 70mW reconfigurable silicon chip with 1 million neurons, 256 million synapses, and 4096 parallel and distributed neural cores. For systems, we present a scale-out system loosely coupling 16 single-chip boards and a scale-up system tightly integrating 16 chips in a 4×4 configuration by exploiting TrueNorth’s native tiling.
For the scale-up systems we summarize our approach to physical placement of neural networks, to reduce intra- and inter-chip network traffic. The ecosystem is in use at over 30 universities and government / corporate labs. Our platform is a substrate for a spectrum of applications from mobile and embedded computing to cloud and supercomputers.
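As a quick sanity check on the figures quoted above, the per-core and 16-chip totals can be worked out directly. This is a back-of-the-envelope sketch rather than anything taken from the paper: it assumes that "1 million" and "256 million" are binary-prefixed counts (2^20 and 256 × 2^20), and all constant names are ours, chosen for illustration.

```python
# Chip figures quoted in the text. Treating "million" as 2**20 is an
# assumption made here so that the counts divide evenly across cores.
CORES_PER_CHIP = 4096
NEURONS_PER_CHIP = 1 * 2**20
SYNAPSES_PER_CHIP = 256 * 2**20
CHIPS_IN_SYSTEM = 16  # both the scale-out and scale-up systems use 16 chips

# Per-core breakdown of a single chip.
neurons_per_core = NEURONS_PER_CHIP // CORES_PER_CHIP
synapses_per_core = SYNAPSES_PER_CHIP // CORES_PER_CHIP
synapses_per_neuron = SYNAPSES_PER_CHIP // NEURONS_PER_CHIP

# Totals for a 16-chip system.
system_neurons = CHIPS_IN_SYSTEM * NEURONS_PER_CHIP
system_synapses = CHIPS_IN_SYSTEM * SYNAPSES_PER_CHIP

if __name__ == "__main__":
    print(f"neurons per core:    {neurons_per_core}")
    print(f"synapses per core:   {synapses_per_core}")
    print(f"synapses per neuron: {synapses_per_neuron}")
    print(f"16-chip neurons:     {system_neurons:,}")
    print(f"16-chip synapses:    {system_synapses:,}")
```

Under that assumption, each core holds 256 neurons and 65,536 synapses, and a 16-chip system totals about 16.8 million neurons and 4.3 billion synapses.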
TrueNorth, once loaded with a neural network model, can be used in real time as a sensory streaming inference engine, performing rapid and accurate classifications while using minimal energy. TrueNorth’s 1 million neurons consume only 70 mW, which is like having a neurosynaptic supercomputer the size of a postage stamp that can run on a smartphone battery for a week.
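The week-on-a-battery comparison can be checked with simple arithmetic. The sketch below assumes a hypothetical 3,000 mAh, 3.8 V smartphone battery (our assumption, not a figure from the paper) and counts only the chip's 70 mW, ignoring the board, sensors, and power-conversion losses.

```python
CHIP_POWER_W = 0.070  # 70 mW, as quoted in the text

# Hypothetical battery parameters, assumed for illustration only.
BATTERY_CAPACITY_MAH = 3000
BATTERY_VOLTAGE_V = 3.8


def energy_wh(power_w: float, hours: float) -> float:
    """Energy in watt-hours drawn by a constant load over a period."""
    return power_w * hours


def battery_wh(capacity_mah: float, voltage_v: float) -> float:
    """Nominal energy stored in a battery, in watt-hours."""
    return capacity_mah / 1000.0 * voltage_v


def runtime_days(capacity_mah: float, voltage_v: float, power_w: float) -> float:
    """Days a constant load could run from the battery, ignoring losses."""
    return battery_wh(capacity_mah, voltage_v) / power_w / 24.0


if __name__ == "__main__":
    week_hours = 7 * 24
    print(f"energy for one week: {energy_wh(CHIP_POWER_W, week_hours):.2f} Wh")
    print(f"assumed battery:     {battery_wh(BATTERY_CAPACITY_MAH, BATTERY_VOLTAGE_V):.2f} Wh")
    days = runtime_days(BATTERY_CAPACITY_MAH, BATTERY_VOLTAGE_V, CHIP_POWER_W)
    print(f"estimated runtime:   {days:.1f} days")
```

With these assumed values the battery stores about 11.4 Wh, while a week at 70 mW needs about 11.8 Wh, so the chip alone would run for a little under seven days.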
Recently, in collaboration with Lawrence Livermore National Laboratory, U.S. Air Force Research Laboratory, and U.S. Army Research Laboratory, we published our fifth paper at IEEE’s prestigious Supercomputing 2016 conference that summarizes the results of the team’s 12.5-year journey (see the associated graphic) to unlock this value proposition.
Applying the mind of a chip
Three of our partners, U.S. Army Research Lab, U.S. Air Force Research Lab and Lawrence Livermore National Lab, contributed sections to the Supercomputing paper, each showcasing a different TrueNorth system, as summarized by my colleagues Jun Sawada, Brian Taba, Pallab Datta, and Ben Shaw:
U.S. Army Research Lab (ARL) prototyped a computational offloading scheme to illustrate how TrueNorth’s low power profile enables computation at the point of data collection. Using the single-chip NS1e board and an Android tablet, ARL researchers created a demonstration system that lets visitors to their lab handwrite arithmetic expressions on the tablet. The handwriting is streamed to the NS1e for character recognition, and the recognized characters are sent back to the tablet for arithmetic calculation.
Of course, the point here is not to make a handwriting calculator; it is to show how TrueNorth’s low power and real-time pattern recognition might be deployed at the point of data collection to reduce latency, complexity and transmission bandwidth, as well as back-end data storage requirements in distributed systems.
U.S. Air Force Research Lab (AFRL) contributed another prototype application utilizing a TrueNorth scale-out system to perform a data-parallel text extraction and recognition task. In this application, an image of a document is segmented into individual characters that are streamed to AFRL’s NS1e16 TrueNorth system for parallel character recognition. Classification results are then sent to an inference-based natural language model to reconstruct words and sentences. This system can process 16,000 characters per second! AFRL plans to implement the word and sentence inference algorithms on TrueNorth, as well.
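To put the 16,000 characters per second in perspective, here is a minimal sketch of how a data-parallel split across the 16 boards might look. The round-robin assignment and the 3,000-character page are illustrative assumptions; the paper summary above does not say how AFRL actually distributes characters among boards.

```python
NUM_BOARDS = 16          # single-chip boards in the scale-out system
AGGREGATE_CPS = 16_000   # characters per second, as quoted in the text


def assign_round_robin(num_chars: int, num_boards: int = NUM_BOARDS) -> list[list[int]]:
    """Assign character indices to boards in round-robin order.

    Returns one list of character indices per board. This scheme is an
    assumption made for illustration, not AFRL's documented method.
    """
    batches: list[list[int]] = [[] for _ in range(num_boards)]
    for index in range(num_chars):
        batches[index % num_boards].append(index)
    return batches


def per_board_rate(aggregate_cps: float, num_boards: int = NUM_BOARDS) -> float:
    """Characters per second each board handles if the load is spread evenly."""
    return aggregate_cps / num_boards


def seconds_for_page(num_chars: int, aggregate_cps: float = AGGREGATE_CPS) -> float:
    """Time to classify a page of characters at the aggregate rate."""
    return num_chars / aggregate_cps


if __name__ == "__main__":
    batches = assign_round_robin(3000)  # hypothetical 3,000-character page
    print(f"characters on board 0: {len(batches[0])}")
    print(f"per-board rate:        {per_board_rate(AGGREGATE_CPS):.0f} chars/s")
    print(f"time for the page:     {seconds_for_page(3000):.4f} s")
```

If the load is spread evenly, each board handles about 1,000 characters per second, and a 3,000-character page would take under 0.2 seconds to classify.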
Lawrence Livermore National Lab (LLNL) has a 16-chip NS16e scale-up system to explore the potential of post-von Neumann computation through larger neural models and more complex algorithms, enabled by the native tiling characteristics of the TrueNorth chip. For the Supercomputing paper, they contributed a single-chip application performing in-situ process monitoring in an additive manufacturing process. LLNL trained a TrueNorth network to recognize seven classes related to track weld quality in welds produced by a selective laser melting machine. Real-time weld quality determination allows for closed-loop process improvement and immediate rejection of defective parts. This is one of several applications LLNL is developing to showcase TrueNorth as a scalable platform for low-power, real-time inference.
Looking to the future: Real-time sense & response
As TrueNorth is still a proof-of-concept research prototype, we will continue gathering feedback from our ecosystem partners to imagine the future generations of processors that achieve fundamental limits of time, space, and energy, while pushing the boundaries of what is achievable with neural network inference to fundamentally transform how people live and work.
Soon we will release APIs to enable our ecosystem partners to hook up many real-time sensors to TrueNorth. Early prototypes, like Samsung’s digital eye and UC Irvine’s self-driving robot, are already under experimentation. I told a crowd of researchers at this year’s IEEE International Electron Devices Meeting (IEDM) that I am confident we will achieve business-at-scale within the next four years.
For more information and regular updates, see my blog.
The TrueNorth Timeline