Objective: Diseases such as age-related macular degeneration (AMD) are classified using human-defined rubrics that are prone to bias. Supervised neural networks trained on human-generated labels require labor-intensive annotation and are restricted to the specific tasks they were trained for. Here, we trained a self-supervised deep learning network on unlabeled fundus images, enabling data-driven feature classification of AMD severity and discovery of ocular phenotypes.
Design: Development of a self-supervised training pipeline to evaluate fundus photographs from the Age-Related Eye Disease Study (AREDS).
Participants: One hundred thousand eight hundred forty-eight human-graded fundus images from 4757 AREDS participants between 55 and 80 years of age.
Methods: We trained a deep neural network with self-supervised Non-Parametric Instance Discrimination (NPID) on AREDS fundus images without labels, then used a supervised classifier to evaluate its performance in grading AMD severity on 2-step, 4-step, and 9-step classification schemes. We compared the balanced and unbalanced accuracies of NPID against those of supervised-trained networks and ophthalmologists, explored network behavior using hierarchical learning of image subsets and spherical k-means clustering of feature vectors, then searched for ocular features that can be identified without labels.
Main Outcome Measures: Accuracy and kappa statistics.
Results: NPID demonstrated versatility across different AMD classification schemes without re-training and achieved balanced accuracies comparable with those of supervised-trained networks or human ophthalmologists in classifying advanced AMD (82% vs. 81–92% or 89%), referable AMD (87% vs. 90–92% or 96%), or on the 4-step AMD severity scale (65% vs. 63–75% or 67%), despite never directly using these labels during self-supervised feature learning. Drusen area drove network predictions on the 4-step scale, while depigmentation and geographic atrophy (GA) areas correlated with advanced AMD classes. Self-supervised learning revealed grader-mislabeled images and showed that some classes within the more granular AMD scales are susceptible to misclassification by both ophthalmologists and neural networks. Importantly, self-supervised learning enabled data-driven discovery of AMD features such as GA and of other ocular phenotypes of the choroid (e.g., tessellated or blonde fundi), vitreous (e.g., asteroid hyalosis), and lens (e.g., nuclear cataracts) that were not predefined by human labels.
Conclusions: Self-supervised learning enables AMD severity grading comparable with that of ophthalmologists and supervised networks, reveals biases of human-defined AMD classification systems, and allows unbiased, data-driven discovery of AMD and non-AMD ocular phenotypes.
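The outcome measures named above (unbalanced accuracy, balanced accuracy, and kappa) can be illustrated with a minimal sketch. It assumes the standard definitions: balanced accuracy as the mean of per-class recall, and unweighted Cohen's kappa. The abstract does not state which kappa variant was used, so the unweighted form is an assumption, and the toy labels are invented for illustration, not AREDS data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Unbalanced accuracy: fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies of both raters.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels (0 = not advanced, 1 = advanced), imbalanced 8:2.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f} balanced={bal_acc:.3f} kappa={kappa:.3f}")
```

On this imbalanced toy set, unbalanced accuracy is higher than balanced accuracy because the majority class dominates the count, which is why both are worth reporting when class sizes differ.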
- Age-related macular degeneration
- Artificial intelligence
- Deep learning
- Machine learning