Self-supervised learning is a subset of unsupervised learning: all self-supervised techniques are a form of unsupervised learning, but most unsupervised learning does not entail self-supervision.
Neither unsupervised nor self-supervised learning uses labels in the training process: both learn intrinsic correlations and patterns in unlabeled data, rather than correlations externally imposed by annotated datasets. Apart from this shared focus on unlabeled data, the differences between self-supervised and unsupervised learning largely mirror the differences between unsupervised and supervised learning.
Conventional unsupervised learning does not measure results against any pre-known ground truth. For example, an unsupervised association model could power an e-commerce recommendation engine by learning which products are frequently purchased together. The model's utility derives not from replicating human predictions, but from discovering correlations that are not apparent to human observers.
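As a minimal illustration of this idea, the Python sketch below ranks recommendations purely by how often products appear together in purchase histories; the basket data and the recommend helper are invented for the example, and no labels or ground truth are involved.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories; each inner list is one customer's basket.
baskets = [
    ["laptop", "mouse", "keyboard"],
    ["laptop", "mouse"],
    ["keyboard", "mouse"],
    ["laptop", "monitor"],
]

# Count how often each pair of products is bought together.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1

def recommend(product, top_n=3):
    """Rank co-purchased products by joint frequency -- no labels involved."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == product:
            scores[b] += count
        elif b == product:
            scores[a] += count
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("laptop"))  # e.g. ['mouse', 'keyboard', 'monitor']
```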
Self-supervised learning does measure results against a ground truth, albeit one implicitly derived from the unlabeled training data. Like supervised models, self-supervised models are optimized using a loss function: an algorithm that measures the divergence (“loss”) between ground truth and model predictions. During training, self-supervised models use backpropagation and gradient descent to adjust model weights in a way that minimizes loss (and thereby improves accuracy).
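A minimal sketch of this training loop, assuming a masked-reconstruction pretext task in PyTorch (the network size, masking rate, and learning rate below are arbitrary illustrative choices): the "ground truth" is simply the original, unlabeled input itself.

```python
import torch
import torch.nn as nn

# Hypothetical pretext task: reconstruct randomly masked features.
# The ground truth is the original, unlabeled input itself.
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 16))
loss_fn = nn.MSELoss()                      # divergence between prediction and ground truth
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

data = torch.randn(256, 16)                 # unlabeled training data

for epoch in range(10):
    mask = (torch.rand_like(data) > 0.25).float()
    corrupted = data * mask                 # hide roughly 25% of each input
    reconstruction = model(corrupted)
    loss = loss_fn(reconstruction, data)    # pseudo-label = the original input

    optimizer.zero_grad()
    loss.backward()                         # backpropagation computes gradients
    optimizer.step()                        # gradient descent updates weights
```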
Driven by this key difference, the two methods focus on different use cases: unsupervised models are used for tasks like clustering, anomaly detection and dimensionality reduction, which do not require a loss function computed against a ground truth, whereas self-supervised models are used for the classification and regression tasks typical of supervised learning.