Both AI inference and AI training involve a model making predictions about input data. The difference lies in their respective purposes and, in the case of AI training, in the extra steps taken toward that purpose.
Training is where the “learning” in machine learning occurs. In model training, a machine learning model makes predictions on a batch of training data examples. In supervised learning, a loss function calculates the error (or “loss”) of each prediction, averaged over the batch, and an optimization algorithm updates the model’s parameters in a way that reduces that loss. This process repeats iteratively until loss has been minimized to an acceptable level. Reinforcement learning works similarly, albeit with the goal of maximizing a reward function rather than minimizing a loss function.
In short, AI training typically entails both a forward pass, in which the model generates an output in response to each input, and a backward pass, in which updates to the model’s parameters are calculated. The parameter values accumulated through these updates constitute a machine learning model’s “knowledge.”
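The forward pass, loss calculation, and backward pass can be made concrete with a toy example. The sketch below is illustrative, not a real training pipeline: it fits a single-parameter linear model with plain gradient descent, and the data, learning rate, and step count are all arbitrary choices.

```python
def forward(w, x):
    """Forward pass: the model's prediction for input x."""
    return w * x

def mse_loss(preds, targets):
    """Loss function: average squared error over a batch of predictions."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def grad(w, xs, ys):
    """Backward pass: gradient of the MSE loss with respect to w."""
    return sum(2 * (forward(w, x) - y) * x for x, y in zip(xs, ys)) / len(xs)

# Toy training data generated by the "true" rule y = 3x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0        # initial parameter, before any "learning"
lr = 0.01      # learning rate
for step in range(200):          # repeat until loss is acceptably low
    w -= lr * grad(w, xs, ys)    # update the parameter to reduce loss

print(round(w, 3))  # -> 3.0, the parameter value the model "learned"
```

Each pass through the loop is one training iteration: predict, measure loss, compute the gradient, and nudge the parameter in the direction that reduces loss.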
In AI inference, the trained model makes predictions on real-world input data. It works by using what it has “learned,” that is, the parameter values arrived at during training to improve its performance on the training data, to infer the correct output for the new input data. Unlike model training, inference entails only a forward pass.
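Inference on the same toy model is correspondingly simpler. This sketch assumes training has already produced a parameter value (here, w = 3.0 for a hypothetical model y = w * x): there is no loss function, no gradient, and no parameter update, just a forward pass on new input.

```python
w = 3.0  # parameter value produced by a prior training run (assumed here)

def predict(x):
    """Inference is a forward pass only: no loss, no parameter updates."""
    return w * x

print(predict(5.0))  # -> 15.0, the model's output for an unseen input
```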
While training and inference are usually distinct stages, it’s worth noting that they’re not quite mutually exclusive. For instance, a social media platform’s recommendation algorithm has already been trained on large data sets of user behavior before you join the platform, and it performs inference each time it provides content suggestions to you. But that trained model is also continually fine-tuned on your individual behavior, refining its suggestions based on how you personally engage with content.
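Continual fine-tuning can be sketched with the same toy model. The scenario below is entirely hypothetical: a parameter already learned from aggregate data (w = 3.0) is nudged by a few additional gradient steps on one user’s new examples, which happen to follow a slightly different rule.

```python
w = 3.0  # parameter from the initial large-scale training run (assumed)

# Hypothetical per-user data: this user behaves more like y = 3.5x.
user_xs = [1.0, 2.0]
user_ys = [3.5, 7.0]

lr = 0.05
for _ in range(100):  # a short fine-tuning loop on the new examples only
    g = sum(2 * (w * x - y) * x
            for x, y in zip(user_xs, user_ys)) / len(user_xs)
    w -= lr * g       # same update rule as training, on fresh data

print(round(w, 2))  # -> 3.5, drifting from 3.0 toward the user's behavior
```

The mechanics are identical to training; what changes is when it happens and on whose data, which is why the boundary between the two stages can blur in deployed systems.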