The deadliest skin cancer is melanoma, which will be responsible for over 9,000 deaths in the United States in 2017 [1]. Melanoma is unique among cancers in that it arises as a visible and identifiable mark on the surface of the skin, unlike cancers of the breast, lung, or colon that develop hidden from our view. This would suggest that computer vision, which has demonstrated human-level performance in visual recognition tasks such as facial and object identification, would be ideally suited to aid in early detection of melanoma. However, physicians and patients continue to rely on the naked eye to recognize melanoma. This raises an obvious question: why aren’t computers aiding the human eye in melanoma detection?
The reason, in my opinion, is neither a deficiency in computer vision technology nor an innate complexity of melanoma detection. Rather, the biggest roadblock to date has been the inability of the medical community to generate large, well-designed, public datasets of skin images with the metadata required to train systems for accurate detection. This dataset bottleneck has impeded the study of computer-aided melanoma detection on a large and meaningful scale and prevented comparative studies of the few algorithms developed by researchers fortunate enough to have access to non-public skin image datasets. Studies published in this environment contribute to the ongoing “replication crisis” in medicine today; independent researchers cannot reproduce (or improve upon) results if datasets are hidden in private silos.
The International Skin Imaging Collaboration (ISIC) is beginning to address this unmet need through the creation of a large, open-source, public archive of high-quality, annotated skin images. At present, the ISIC Archive contains over 13,000 images of skin lesions, including more than 1,000 images of melanomas, with a long-term goal of housing millions of images from multiple imaging modalities for use by: (a) physicians and educators to improve teaching and identification of skin cancer, (b) the general public for self-education, and (c) computer vision scientists to develop and test algorithms for skin cancer detection.
Using a dataset curated from the ISIC Archive, our academia-industry team from Memorial Sloan Kettering Cancer Center, Emory University, IBM Research, and Kitware, Inc. organized the first international melanoma image detection challenge at the 2016 International Symposium on Biomedical Imaging in Prague, Czech Republic. Twenty-five teams participated, and we recently published our results in the Journal of the American Academy of Dermatology, comparing the performance of the automated computer algorithms to that of dermatologists who specialize in skin cancer detection. In this challenge, the average performance of the dermatologists equaled the melanoma diagnostic accuracy of the top individual computer algorithms, but was surpassed by a machine learning fusion algorithm using predictions from 16 algorithms.
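To give a sense of what "fusion" means here, the sketch below combines several algorithms' melanoma scores by simple averaging. This is an assumption made for illustration only, not the method used in the challenge; the function names, the 0.5 threshold, and the scores are all hypothetical.

```python
from statistics import mean


def fuse_scores(per_algorithm_scores):
    """Average each image's melanoma score across algorithms.

    per_algorithm_scores: one list per algorithm, each holding one
    score in [0, 1] per image, in the same image order.
    Plain averaging is an assumed, illustrative fusion rule.
    """
    if not per_algorithm_scores:
        raise ValueError("need at least one algorithm's scores")
    n_images = len(per_algorithm_scores[0])
    if any(len(s) != n_images for s in per_algorithm_scores):
        raise ValueError("every algorithm must score the same images")
    # zip(*...) regroups the scores so each tuple holds one image's scores.
    return [mean(image_scores) for image_scores in zip(*per_algorithm_scores)]


def classify(fused_scores, threshold=0.5):
    """Flag an image as suspicious when its fused score reaches the threshold."""
    return [score >= threshold for score in fused_scores]


# Hypothetical scores from three algorithms on four images.
scores = [
    [0.9, 0.2, 0.6, 0.1],
    [0.7, 0.4, 0.3, 0.2],
    [0.8, 0.3, 0.7, 0.3],
]
fused = fuse_scores(scores)
labels = classify(fused)
```

On the third image the second algorithm alone would fall below the threshold, yet the fused score does not. That is the intuition behind fusion: errors made by one algorithm can be offset by the others.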
Based on these results, do I anticipate being replaced by a computer over the next 5-10 years? No, for two reasons: 1) the study had a number of limitations, including a dataset that did not fully represent the diversity of the human population or the range of possible diseases, and 2) clinicians employ skills beyond image recognition. Our study was also conducted in a highly artificial setting that doesn’t come close to everyday clinical practice involving patients.
For example, when examining a suspicious skin lesion, a dermatologist would not only consider relevant clinical data, such as age, lesion history and symptoms, personal or family history of skin cancer, and the context of the lesion relative to the appearance of the patient’s other skin lesions, but might also palpate its texture, wipe it with rubbing alcohol, adjust lighting, or re-position the patient. The contribution of these additional historical and physical examination factors to melanoma diagnosis is unknown, but likely to be significant, and unfortunately we were not able to include these data in our study. Dermatologists also consider dozens of possible diagnoses (as well as the potential medical, psychosocial, cosmetic, financial, and legal ramifications of their decisions) during an examination of a patient, whereas we tested only two diagnoses, melanoma and moles, in the computer challenge.
Nonetheless, having made our dataset available to the broader scientific community, I hope that our efforts represent a new, transparent path forward that spurs interest in melanoma detection among the computer vision community. In the meantime, I will continue to work with my colleagues to build larger, more varied datasets in the ISIC Archive that will accelerate the development of deep learning methods for melanoma detection and more closely replicate the challenges encountered when examining skin lesions on patients. Our recently concluded 2017 challenge is a small step in this direction, but there is a lot of work left to do.