Date: 2023-09-13

Generative models

Generative algorithms, which usually entail unsupervised learning, model the distribution of the data itself, aiming to estimate the joint probability P(x,y) of a data point (x) and its label (y) occurring together, or simply P(x) when the data has no labels. A generative computer vision model might thereby identify correlations like “things that look like cars usually have four wheels” or “eyes are unlikely to appear above eyebrows.”

These predictions can inform the generation of outputs the model deems highly probable. For example, a generative model trained on text data can power spelling and autocomplete suggestions; at the most complex level, it can generate entirely new text. Essentially, when an LLM outputs text, it has computed a high probability of that sequence of words being assembled in response to the prompt it was given.
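As a rough illustration of how predicted probabilities can drive autocomplete suggestions, the sketch below trains a tiny bigram model: it counts which word follows which, then suggests the most probable next word. The corpus and the function names (train_bigram_model, next_word_probabilities, suggest) are assumptions made for this example; an LLM conditions on far more context than a single previous word.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Estimate P(next word | previous word) from relative frequencies."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in followers.items()}


def suggest(counts, prev):
    """Return the most probable next word, or None for an unseen word."""
    probs = next_word_probabilities(counts, prev)
    return max(probs, key=probs.get) if probs else None


# Hypothetical toy corpus, invented for this example.
corpus = [
    "the car has four wheels",
    "the car is red",
    "the truck has six wheels",
]
model = train_bigram_model(corpus)
print(suggest(model, "the"))
```

Generating new text with such a model would amount to repeatedly sampling a next word from these probabilities rather than always taking the most likely one.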

Other common use cases for generative models include image synthesis, music composition, style transfer and language translation.

Examples of generative models include:

Diffusion models: Diffusion models gradually add Gaussian noise to training data until it’s unrecognizable, then learn a reverse “denoising” process that can synthesize output (usually images) from random seed noise.

Variational autoencoders (VAEs): VAEs consist of an encoder that compresses input data into a compact latent representation and a decoder that learns to reverse the process, mapping points in that latent space back to likely data.

Transformer models: Transformer models use a mathematical technique called “attention” or “self-attention” to identify how different elements in a series of data influence one another. The “GPT” in OpenAI’s ChatGPT stands for “Generative Pre-trained Transformer.”
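To make the idea of self-attention slightly more concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It assumes the token vectors are used directly as queries, keys and values; a real transformer first passes them through learned projection matrices, which are omitted here. The vectors and function names are illustrative only.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a sequence of vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # How strongly this element "attends" to every element in the series.
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # The output is the attention-weighted average of the value vectors.
        out = [
            sum(w_j * v[i] for w_j, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs, weights


# Hypothetical 2-dimensional "token" vectors, used as Q, K and V alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights describes how much one element draws on every other element, which is the sense in which attention captures how elements of a sequence influence one another.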

Discriminative models

Discriminative algorithms, which usually entail supervised learning, model the boundaries between classes of data (or “decision boundaries”), aiming to predict the conditional probability P(y|x) of a given data point (x) falling into a certain class (y). A discriminative computer vision model might learn the difference between “car” and “not car” by discerning a few key differences (like “if it doesn’t have wheels, it’s not a car”), allowing it to ignore many correlations that a generative model must account for. Discriminative models thus tend to require less computing power.
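One common way to model P(y|x) directly is logistic regression. The sketch below fits a one-feature logistic regression by gradient descent. The data (a detected wheel count per image), the learning rate and the epoch count are all assumptions chosen for illustration, not values from any real system.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_proba(w, b, x):
    """Conditional probability that x belongs to class 1 ("car")."""
    return sigmoid(w * x + b)


# Hypothetical data: number of wheels detected; 1 = "car", 0 = "not car".
wheel_counts = [0, 0, 1, 2, 3, 4, 4, 4]
is_car = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic_regression(wheel_counts, is_car)
print(round(predict_proba(w, b, 4), 3), round(predict_proba(w, b, 0), 3))
```

Note that the model never learns what cars or non-cars look like in general; it only learns where the boundary between the two classes falls along the feature it is given.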

Discriminative models are naturally well suited to classification tasks like sentiment analysis, but they have many other uses. For example, decision tree and random forest models break down complex decision-making processes into a series of nodes, ending in “leaves” that each represent a potential classification decision.
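The node-and-leaf structure of a decision tree can be sketched as nested dictionaries. The tree below is written by hand purely for illustration, with made-up feature names; a real decision tree algorithm learns its questions from training data, and a random forest combines the votes of many such trees.

```python
# Internal nodes ask a yes/no question about one feature;
# leaves hold the final class label.
tree = {
    "feature": "has_wheels",
    "yes": {
        "feature": "has_engine",
        "yes": {"label": "car"},
        "no": {"label": "not car"},
    },
    "no": {"label": "not car"},
}


def classify(node, sample):
    """Walk from the root to a leaf, following the answer at each node."""
    while "label" not in node:
        branch = "yes" if sample[node["feature"]] else "no"
        node = node[branch]
    return node["label"]


print(classify(tree, {"has_wheels": True, "has_engine": True}))
print(classify(tree, {"has_wheels": False, "has_engine": True}))
```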

Use cases

While one type of model may generally outperform the other for certain real-world use cases, many tasks can be achieved with either type. For example, discriminative models have many uses in natural language processing (NLP) and often outperform generative AI for tasks like machine translation (which entails the generation of translated text).

Similarly, generative models can be used for classification using Bayes’ theorem. Rather than determining which side of a decision boundary an instance is on (like a discriminative model would), a generative model could determine the probability of each class generating the instance and pick the class with the highest probability.
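A minimal sketch of this approach, assuming a single numeric feature and a Gaussian likelihood for each class (a one-dimensional Gaussian naive Bayes classifier), might look like the following. The vehicle lengths and class names are invented for the example.

```python
import math
from statistics import mean, pstdev


def fit_generative_classifier(xs, ys):
    """Estimate a prior P(y) and a Gaussian likelihood P(x | y) per class."""
    params = {}
    for label in set(ys):
        points = [x for x, y in zip(xs, ys) if y == label]
        params[label] = {
            "prior": len(points) / len(xs),
            "mean": mean(points),
            "std": pstdev(points) or 1e-6,  # guard against zero variance
        }
    return params


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def posterior(params, x):
    """Bayes' theorem: P(y | x) = P(x | y) * P(y) / P(x)."""
    joint = {
        y: gaussian_pdf(x, p["mean"], p["std"]) * p["prior"]
        for y, p in params.items()
    }
    evidence = sum(joint.values())  # P(x), summed over all classes
    return {y: j / evidence for y, j in joint.items()}


def classify_generative(params, x):
    """Pick the class most likely to have generated x."""
    post = posterior(params, x)
    return max(post, key=post.get)


# Hypothetical feature: vehicle length in meters.
lengths = [4.2, 4.5, 4.8, 4.4, 1.8, 2.0, 2.2, 1.9]
labels = ["car"] * 4 + ["motorcycle"] * 4

params = fit_generative_classifier(lengths, labels)
print(classify_generative(params, 4.6), classify_generative(params, 2.1))
```

Because the classifier models how each class produces data, the same fitted parameters could also be used to sample new, plausible lengths for either class, which a purely discriminative model cannot do.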

Many AI systems employ both in tandem. In a generative adversarial network, for example, a generative model generates sample data and a discriminative model determines whether that data is “real” or “fake.” Output from the discriminative model is used to train the generative model until the discriminator can no longer discern “fake” generated data.