Date: 2019-12-12

A GPU is a piece of hardware capable of performing math computations over a huge amount of data at the same time. For any single operation it is not as fast as a central processing unit (CPU), but when given a large amount of data it processes that data massively in parallel. Even though each operation runs more slowly, applying math operations to far more data at once beats CPU performance by a wide margin, so you get your answers faster.
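The trade-off between per-operation speed and parallelism can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only a simplified model: the function `batch_time`, the lane counts and the per-operation timings are all hypothetical figures chosen for illustration, not measurements of any real CPU or GPU.

```python
import math


def batch_time(num_items: int, lanes: int, seconds_per_op: float) -> float:
    """Estimated time to process num_items when `lanes` operations run at once."""
    # Each round handles up to `lanes` items in parallel.
    rounds = math.ceil(num_items / lanes)
    return rounds * seconds_per_op


# Hypothetical figures, chosen only to illustrate the trade-off.
ITEMS = 1_000_000
cpu_time = batch_time(ITEMS, lanes=8, seconds_per_op=1e-9)     # few fast lanes
gpu_time = batch_time(ITEMS, lanes=4096, seconds_per_op=4e-9)  # many slower lanes

print(f"CPU-like device: {cpu_time * 1e6:.2f} microseconds")
print(f"GPU-like device: {gpu_time * 1e6:.2f} microseconds")
```

Under these assumed numbers the GPU-like device finishes the large batch far sooner, even though each of its operations is four times slower. With a single item the picture flips: `batch_time(1, 4096, 4e-9)` is larger than `batch_time(1, 8, 1e-9)`, which is why the GPU only pays off when there is a lot of data to process.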

Big data and the GPU have provided the breakthroughs we needed to put neural networks into practice, and that brings us to where we are with AI today. Organizations can now apply this combination to their business and uncover insights from their vast universe of data by training a neural network on it.

To successfully apply AI in your business, the first step is to make sure you have lots of data. A neural network performs poorly if it is trained with too little data or with inadequate data. The second step is to prepare the data. If you’re creating a model capable of detecting malfunctioning insulators in power lines, you must provide it with data about working insulators and about every type of malfunctioning one. The third step is to train a neural network, which requires lots of computing power. Finally, once the trained neural network performs satisfactorily, it can be put into production to do inferencing.
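As a rough illustration of the data-preparation step, one simple check is whether every class the model must recognize actually appears in the training set often enough. This is a minimal sketch: the helper `missing_or_rare_classes`, the class names and the minimum count are hypothetical and would depend on the real problem.

```python
from collections import Counter


def missing_or_rare_classes(labels, expected_classes, min_examples):
    """Return the expected classes that have fewer than min_examples samples."""
    counts = Counter(labels)  # a class that never appears counts as 0
    return sorted(c for c in expected_classes if counts[c] < min_examples)


# Hypothetical labels for a tiny insulator data set.
training_labels = ["working"] * 5 + ["cracked"] * 2
expected = ["working", "cracked", "contaminated"]

print(missing_or_rare_classes(training_labels, expected, min_examples=3))
```

Any class returned by the check needs more examples collected before training starts.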

Inferencing

Inferencing is the term that describes the act of using a neural network to provide insights after it has been trained. Think of it like someone who’s studying something (being trained) and then, after graduation, goes to work in a real-world scenario (inferencing). It takes years of study to become a doctor, just as it takes lots of processing power to train a neural network. But doctors don’t take years to perform surgery on a patient, and, likewise, a neural network can take less than a second to provide an answer given real-world data. This happens because the inferencing phase of a neural network-based solution requires only a fraction of the processing power needed for training. As a consequence, you don’t need a powerful piece of hardware to put a trained neural network into production; you can use a more modest server, called an inference server, whose only purpose is to execute a trained AI model.
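The gap between the cost of training and the cost of inferencing shows up even in a toy model. The sketch below uses a single perceptron, which is far simpler than a real deep neural network, and the two-feature samples and their labels are made up for illustration. Training runs the model over the whole data set again and again, while answering one new question takes a single forward pass.

```python
def predict(weights, bias, features):
    """Inferencing: one forward pass, just a few multiply-adds."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(samples, labels, epochs=20, lr=0.1):
    """Training: many passes over the data, nudging the weights after each mistake."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    forward_passes = 0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)
            forward_passes += 1
            # Perceptron update rule: no change when the prediction was right.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias, forward_passes


# Hypothetical sensor readings: label 1 means "malfunctioning", 0 means "working".
samples = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias, forward_passes = train(samples, labels)
print(f"Training used {forward_passes} forward passes")
print("Inferencing uses 1 forward pass:", predict(weights, bias, (0.85, 0.8)))
```

In a real project the ratio is far more lopsided, since training also computes gradients across millions of parameters and examples, but the shape of the workload is the same.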

What the AI lifecycle looks like:

Deep learning projects have a peculiar lifecycle because of the way the training process works.

Organizations these days are facing the challenge of how to apply deep learning to analyze their data and obtain insights from it. They need to have enough data to train a neural network model. That data has to be representative of the problem they’re trying to solve; otherwise the results won’t be accurate. And they need a robust IT infrastructure made up of GPU-rich clusters of servers to train their AI models on. The training phase may go on for several iterations until the results are satisfactory and accurate. Once that happens, the trained neural network is put into production on much less powerful hardware. The data processed during the inferencing phase can be fed back into the neural network model to correct it or enhance it according to the latest trends in newly acquired data. Therefore, this process of training and retraining happens iteratively over time. A neural network that’s never retrained will age and potentially become inaccurate on new data.
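One way to decide when a model in production has aged enough to need retraining is to track its accuracy on recent data whose true outcome later becomes known. The sketch below is one possible approach, not a prescribed one: the `DriftMonitor` class, the window size and the accuracy threshold are all hypothetical choices.

```python
from collections import deque


class DriftMonitor:
    """Tracks accuracy over the most recent labelled predictions."""

    def __init__(self, window=100, min_accuracy=0.9):
        self.outcomes = deque(maxlen=window)  # oldest results drop off automatically
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_retraining(self):
        # Decide only once the window is full, so a few early errors
        # don't trigger a retrain on their own.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = DriftMonitor(window=10, min_accuracy=0.8)
for _ in range(10):
    monitor.record(predicted=1, actual=1)   # model agrees with reality
print(monitor.needs_retraining())
for _ in range(3):
    monitor.record(predicted=1, actual=0)   # newer data starts to disagree
print(monitor.needs_retraining())
```

When the monitor reports that retraining is needed, the newly collected data goes back to the GPU-rich training cluster and the cycle starts again.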

This post offers a high-level view of how data, training and inferencing are all key aspects of deep learning solutions. There's a lot more to be said about the hardware, software and services that can help businesses achieve successful AI implementations, and in upcoming articles I'll take a deeper dive into each area.