Posted in: AI

Contrastive Explanations Help AI Explain Itself by Identifying What is Missing

One could describe a pirate as a sailor with a missing eye, or a tripod as a table with a missing leg. Descriptions like these, based on what is missing, are called contrastive explanations. Such explanations are a requirement for doctors when evaluating patients. Just saying that the heartbeat was normal is not enough; a general practitioner must also list the abnormalities, such as an irregular heartbeat, that were absent. This ensures that any doctor who later reads the practitioner’s report can identify all the conditions that were checked. The conditions that were absent, referred to as pertinent negatives, are also critical in performing differential diagnosis. For instance, a patient showing symptoms of fever, cough and cold but no sputum or chills will most likely be diagnosed as having flu rather than pneumonia. The presence of fever, cough and cold could indicate either flu or pneumonia; however, the absence of sputum and chills favors the diagnosis of flu. The importance of pertinent negatives is seen in criminology as well. An empty safe with valuables missing at a crime scene strongly suggests theft as the primary motive over others (such as personal vendetta). Hence, the aspects that are absent in a given situation can be critical in forming meaningful and accurate explanations or judgements.


Comparison of our CEM with LRP and LIME on MNIST. PP/PN stand for Pertinent Positive/Negative and are highlighted in cyan/pink, respectively. For LRP, green is neutral, red/yellow is positive relevance, and blue is negative relevance. For LIME, red is positive relevance and white is neutral.

At IBM Research, we applied this premise to improve the explanations generated by artificial intelligence (AI) technologies. We recently created and implemented an AI algorithm that can provide contrastive explanations for black-box models such as deep neural networks. Our method, called the contrastive explanations method (CEM), highlights not only what should be minimally and sufficiently present to justify the classification of an input example by a neural network (pertinent positives), but also what should be minimally and necessarily absent (pertinent negatives), in order to form a more complete and well-rounded explanation. We thus want to generate explanations of the form:

An input x is classified in class y because features F_i, …, F_k are present and because features F_m, …, F_p are absent.

To the best of our knowledge, this is the first AI method that provides such contrastive explanations. The code for our method is available on GitHub. One may argue that this form of explanation loses some information; however, we believe such explanations are lucid and easily understandable by humans, who can always delve further into the details of a generated explanation, such as the precise feature values, which are readily available.
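To make the pertinent-negative idea concrete, below is a minimal sketch, not our released implementation, of the kind of search involved: find a small, sparse, non-negative perturbation that, when added to the input, flips the classifier’s prediction; the features it activates are candidate pertinent negatives. The function name find_pertinent_negative, the toy model, and the hyperparameters are illustrative assumptions, and the sketch keeps only the spirit of the objective (a hinge-style loss that changes the predicted class plus an elastic-net penalty on the perturbation), omitting pieces of the full method such as the autoencoder term and the projected optimization updates.

import torch
import torch.nn.functional as F

def find_pertinent_negative(model, x, n_steps=500, lr=0.05, c=1.0, beta=0.01, kappa=0.1):
    """Gradient search for a sparse, non-negative delta with
    argmax model(x + delta) != argmax model(x).

    Illustrative sketch only: valid pixel ranges and the autoencoder
    regularizer of the full method are ignored here.
    """
    model.eval()
    orig_class = model(x).argmax(dim=1)  # class predicted for the original input

    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(n_steps):
        logits = model(x + delta.clamp(min=0.0))  # only additions are allowed
        orig_logit = logits.gather(1, orig_class.unsqueeze(1)).squeeze(1)

        # Best logit among all classes other than the original one.
        mask = F.one_hot(orig_class, num_classes=logits.shape[1]).float() * 1e9
        best_other = (logits - mask).max(dim=1).values

        # Hinge-style loss: push some other class above the original by a margin.
        attack_loss = F.relu(orig_logit - best_other + kappa).mean()
        # Elastic-net penalty keeps the perturbation small and sparse.
        reg = beta * delta.abs().sum() + delta.pow(2).sum()

        loss = c * attack_loss + reg
        opt.zero_grad()
        loss.backward()
        opt.step()

    pn = delta.detach().clamp(min=0.0)
    new_class = model(x + pn).argmax(dim=1)
    return pn, orig_class.item(), new_class.item()


if __name__ == "__main__":
    # Toy demo with an untrained classifier on a random "image"; in practice
    # x would be, e.g., an MNIST digit and model a trained network.
    torch.manual_seed(0)
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)
    pn, before, after = find_pertinent_negative(model, x)
    print(f"class before: {before}, after adding the pertinent negative: {after}")

A pertinent positive can be sought analogously, by searching for a minimal, sparse part of the input that, on its own, still yields the original prediction.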

Beyond generating meaningful explanations, we believe our method has other applications, notably model selection and model debugging. If two models have similar test-set performance but one of them provides much better explanations under our method, that model might be preferred, as it is likely to be more robust once deployed. Moreover, analyzing the explanations, especially for misclassified instances, can provide insight into the limitations of the learned model and, ideally, directions for improving it. We are now exploring these and many more applications of the contrastive explanations method to push the pace of AI development.

Amit Dhurandhar

Research Scientist, IBM Research