In this discussion I will refer to records, rather than what SPSS usually calls cases, to avoid confusion with case as in case-control.
FUZZY takes two datasets as input (the demander and supplier datasets), matches the records according to a set of BY variables, and provides various ways of writing the output. It does not have a dialog box interface, but running FUZZY /HELP displays the complete syntax. Matches can be required to be exact on all variables, or a tolerance or "fuzz" factor, which could be zero, can be specified for each matching variable. (String matches can only have fuzz 0.)
FUZZY proceeds by finding, for each demander record, all of the supplier records that are close enough on the BY variables. This requires a lot of comparisons! It then proceeds through the demander records and picks a supplier record at random from all those eligible for that record. (You can request multiple supplier records for each demander if needed.) No attempt is made to find the closest eligible record, in part because there is no single measure of closeness across the set of BY variables.
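The two passes described above can be sketched in Python. This is only an illustration of the idea, not FUZZY's actual code: the function names `within_tolerance` and `match_records`, the list-of-dicts record layout, and the rule that a supplier record is used at most once are all assumptions made for the example.

```python
import random

def within_tolerance(demander, supplier, by, fuzz):
    """Return True if the two records agree on every BY variable within its fuzz."""
    for var, tol in zip(by, fuzz):
        a, b = demander[var], supplier[var]
        if isinstance(a, str) or isinstance(b, str):
            # String variables can only be matched exactly (fuzz 0).
            if a != b:
                return False
        elif abs(a - b) > tol:
            return False
    return True

def match_records(demanders, suppliers, by, fuzz, seed=None):
    """Return, for each demander, the index of a randomly chosen eligible supplier or None."""
    rng = random.Random(seed)
    # Pass 1: compare every demander with every supplier and keep the eligible ones.
    eligible = [
        [j for j, s in enumerate(suppliers) if within_tolerance(d, s, by, fuzz)]
        for d in demanders
    ]
    # Pass 2: walk through the demanders and draw one eligible supplier at random.
    # Assumption for this sketch: a supplier that has been drawn is not reused.
    used = set()
    matches = []
    for candidates in eligible:
        free = [j for j in candidates if j not in used]
        if free:
            choice = rng.choice(free)
            used.add(choice)
            matches.append(choice)
        else:
            matches.append(None)
    return matches
```

Note that the draw in the second pass ignores how close each candidate is; any supplier inside the tolerance is as likely to be chosen as any other.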
While one would generally want to specify an exact match for categorical variables, at least those with nominal measurement level, continuous variables such as income or age might require some fuzz. New output from FUZZY can help to diagnose which BY specifications cause a record to go unmatched. Here is a table produced by FUZZY that shows how the BY criteria restrict the matches.
Next, the table shows that, among the comparisons remaining after removing the 5% that matched exactly, 85% did not match on origin. Then, considering only the comparisons where there was an exact match on origin, 75% did not match on cylinder. Each row of the table is based on the comparisons that passed, i.e., were within tolerance, on all of the preceding rows.
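The conditional logic behind such a table, where each row counts only the comparisons that survived every earlier row, can be sketched as follows. This is a hypothetical reconstruction rather than FUZZY's own computation: the function name `attrition_by_variable` and the record layout are assumptions, and the sketch does not reproduce the initial step of setting aside exact matches.

```python
def attrition_by_variable(demanders, suppliers, by, fuzz):
    """For each BY variable, count comparisons that reached it and that failed on it."""
    tested = {var: 0 for var in by}
    failed = {var: 0 for var in by}
    for d in demanders:
        for s in suppliers:
            # Check the BY variables in order and stop at the first failure,
            # so later variables only see comparisons that passed earlier ones.
            for var, tol in zip(by, fuzz):
                tested[var] += 1
                a, b = d[var], s[var]
                if isinstance(a, str) or isinstance(b, str):
                    ok = a == b
                else:
                    ok = abs(a - b) <= tol
                if not ok:
                    failed[var] += 1
                    break
    return [(var, tested[var], failed[var]) for var in by]
```

Dividing the failed count by the tested count in each row gives a percentage of the kind quoted above. Because the rows are conditional, the order of the BY variables affects the percentages for every variable after the first.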
The next table shows the distribution of eligible matches for the pairing pass, that is, how many eligible supplier records there were for each demander record (this example is based on a very small dataset). There were two demander records with zero eligible supplier records, three with only one, and one with two to ten. This gives you a good idea of how rich the supplier dataset is in matchable records, but it says nothing about which variables have the biggest effects on pairing.
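A tally of this kind is easy to reproduce from a list of eligible counts, one per demander record. The sketch below is illustrative only: the names `bin_label` and `eligibility_distribution` are made up for the example, and the catch-all bin for more than ten eligibles is an assumption, since only the 0, 1, and 2-10 bins are mentioned above.

```python
from collections import Counter

def bin_label(n):
    """Map a count of eligible supplier records to a display bin."""
    if n == 0:
        return "0"
    if n == 1:
        return "1"
    if n <= 10:
        return "2-10"
    return "more than 10"  # assumed catch-all bin, not taken from the FUZZY output

def eligibility_distribution(counts):
    """Return (bin, number of demander records) pairs in display order."""
    tally = Counter(bin_label(n) for n in counts)
    order = ["0", "1", "2-10", "more than 10"]
    return [(label, tally[label]) for label in order]

# Counts consistent with the small example described above:
# two demanders with no eligibles, three with one, one with several.
example_counts = [0, 0, 1, 1, 1, 4]
distribution = eligibility_distribution(example_counts)
```

A large share of demanders in the "0" bin suggests the tolerances are too tight or the supplier dataset is too small.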