October 11, 2017 | Written by: IBM Research Editorial Staff
I recently participated in a panel at Applied Materials’ 2017 Analyst Day to talk about artificial intelligence (AI). Yes, a materials company asked me, an executive overseeing semiconductor research, to join other technologists to give our view of AI – demonstrating how interest in AI has permeated all aspects of the IT industry!
To lead into those hardware elements, my fellow panelists and I were first asked what AI and the explosion of data imply for computing devices. Where should AI data be processed – the cloud or the edge? To that, I told the audience: “We believe that data is our next natural resource. We’re not going to discard it. We’re going to figure out how to generate value from it.”
AI has two components. Model building and training is the computation-intensive, expensive portion; inferencing, or scoring, is the application of the trained model to new data. The surge in AI is a consequence of exponential growth in data (specifically unstructured data), advances in algorithm development, and cheap compute. Much of today’s computation remains tied to the same hardware built for spreadsheets and databases, and even with GPU clusters it can take weeks or months to generate the most complex models in cloud data centers.
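To make the split between the two components concrete, here is a minimal sketch in Python. It is an illustration only, not a workload discussed on the panel: the one-feature logistic model, the synthetic data, and the names `train` and `infer` are all assumptions made for this example. Training loops over the whole dataset many times and updates the weights on every sample; inferencing is a single forward pass with the weights held fixed.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=200, lr=0.5):
    """Training: many passes over the data, adjusting the weights each step."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = sigmoid(w * x + b) - y  # prediction error on one sample
            w -= lr * err * x             # gradient step for the weight
            b -= lr * err                 # gradient step for the bias
    return w, b


def infer(model, x):
    """Inferencing (scoring): one forward pass, weights are not changed."""
    w, b = model
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical toy dataset: label is 1 when the feature is positive.
rng = random.Random(0)
samples = [rng.uniform(-1.0, 1.0) for _ in range(100)]
labels = [1 if x > 0 else 0 for x in samples]

model = train(samples, labels)   # 200 epochs x 100 samples = 20,000 updates
print(infer(model, 0.8), infer(model, -0.8))  # two single-pass predictions
```

Even in this toy case, training performs 20,000 weight updates while each prediction costs one multiply, one add, and one sigmoid. That asymmetry is why scoring fits on small devices today while training mostly does not.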
Right now, most training is still done in the cloud, whereas inferencing – making use of the trained model – is done at the edge. Going forward, however, driving both training and inferencing to the edge will be essential to leverage full datasets at the edge with real-time model updates. This trend is in direct conflict with the power-hungry, inefficient computation hardware used today for AI workloads. The automotive industry, autonomous vehicles in particular, exemplifies the drive to migrate AI computation to “the edge.” Matt Johnson, NXP’s senior vice president of automotive, said, for example, “a vehicle would be ‘the edge’ but it needs access to the cloud. The edge values efficiency…and certain functions have to happen at the edge.”
We (the industry) have to innovate so that some of the compute currently done in the cloud can be transferred to a growing number of edge devices, which must operate cheaply and at lower power. Eventually, more and more training will migrate to the edge. It will be cloud and edge. Not cloud or edge. The change to something new in this so-called era of AI will come from innovation lower in the stack.
Optimizing architectures and compute models for AI
The rise of AI is an inflection of a kind the world hasn’t seen in the last 60 years, since the move from tabulating machines to the programmable systems we still use. And even though we at IBM proclaimed the onset of the cognitive era in 2011, exemplified by Watson’s debut on Jeopardy!, programmable systems, like logic and memory, continue to dominate global computation.
Panelist and former Intel executive Christos Georgiopoulos, who now teaches at Florida State University, put it this way: “Traditional workloads we’ve known for the last 40 years don’t apply. AI requires different capabilities from the machines we build. As we move into AI workloads, it will require new system design.”
Honestly, I was excited to see Applied Materials turn its attention to AI. A common misconception is that AI is all software. No! We need optimization at the algorithm level, the systems level, the chip design level, the device level, and eventually the materials level. This “new design” means going down the stack, from solutions to devices.
AI workloads are different. Our brains are much more efficient than computers at classifying unstructured data, in tasks like facial recognition and natural language processing, due in part to the reduced precision a brain needs to make a reasonable classification. By exploiting the reduced precision requirements of unstructured data workloads [1], the innate efficiency advantage of analog computing can be harnessed [2]. A roadmap propelled by differentiating architecture and analog accelerators will drive exponential improvements in AI computation efficiency, fueling the acceleration of AI learning cycles and AI solutions.
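The reduced-precision point can be illustrated with a short sketch. It is a toy example under stated assumptions, not the method of the cited papers: the weights, the features, the 8-bit word with 4 fractional bits, and the helper names `quantize` and `score` are all hypothetical, and it uses plain round-to-nearest fixed-point arithmetic. A linear classifier only needs the sign of its score, so coarse rounding of every operand can leave the decision unchanged.

```python
def quantize(value, frac_bits=4, word_bits=8):
    """Round to the nearest signed fixed-point value, saturating at the range limits."""
    scale = 1 << frac_bits                 # 4 fractional bits -> steps of 1/16
    lo = -(1 << (word_bits - 1))           # smallest representable integer code
    hi = (1 << (word_bits - 1)) - 1        # largest representable integer code
    code = max(lo, min(hi, round(value * scale)))
    return code / scale


def score(weights, features):
    """Dot product; the sign of the result is the predicted class."""
    return sum(w * x for w, x in zip(weights, features))


# Hypothetical model and input, chosen only for illustration.
weights = [0.73, -1.21, 0.38]
features = [1.5, 0.4, -0.9]

full = score(weights, features)
low = score([quantize(w) for w in weights], [quantize(x) for x in features])

print(f"full precision: {full:.4f}  class: {int(full > 0)}")
print(f"8-bit fixed:    {low:.4f}  class: {int(low > 0)}")
```

Here the two scores differ by less than 0.1 while the predicted class is the same. Inputs whose score sits very close to the decision boundary can still flip, which is why how much precision a given workload tolerates is an empirical question.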
We closed the panel by answering “what will deliver the analog compute that will deliver AI workloads?”
On that question, and on the promise of a reduction in power (or improvement in efficiency) of several orders of magnitude, the panel agreed the answer will come from materials innovation – at the bottom of the stack. That’s where we will build the new devices with new features. It’s innovation at the materials level that will let model building move from the cloud alone to cloud and edge.
Again, Professor Georgiopoulos: “I truly believe we’re at the inflection point where, when we exit it, society may look completely different – from our healthcare to how we drive. And all of that is going to be driven by materials innovation.”
[1] Gupta S, Agrawal A, Gopalakrishnan K and Narayanan P (2015) Deep Learning with Limited Numerical Precision. Proceedings of Machine Learning Research. 37:1737-1746.
[2] Gokmen T and Vlasov Y (2016) Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices: Design Considerations. Front. Neurosci. 10:333.