Expert dermatologists maintain a higher diagnostic accuracy for skin cancer than current artificial intelligence models when operating in realistic, complex clinical environments. A study published in JAMA Dermatology found that while AI systems can outperform junior clinicians, they struggle to replicate the nuanced judgment of specialists who possess over a decade of experience in identifying rare or atypical lesions.
Why do AI models struggle with real-world skin lesions?
Diagnostic accuracy for AI often declines when models move from controlled laboratory environments to the messy reality of a clinic. According to the study led by J. Anriot, algorithms frequently stumble when encountering diverse patient metadata and atypical lesion presentations that do not fit standard training patterns. A first-generation convolutional neural network (CNN) achieved only 56.7% accuracy. Newer foundation models improved on that figure, yet they still fell short of the 74.2% accuracy rate set by veteran dermatologists.
Researchers tested 1,117 skin lesion cases, using a mix of clinical and dermoscopic images to mimic the actual workflow of a dermatology office rather than relying on curated, “perfect” datasets.
How does AI performance compare across experience levels?
AI has closed the gap with early-career doctors but still trails specialists. Data from the JAMA Dermatology report indicates that the PanDerm unimodal foundation model achieved 72.2% accuracy, outperforming clinicians with fewer than three years of experience, who averaged 68.2%. However, the AI reached a performance ceiling comparable only to mid-level dermatologists (those with three to ten years of experience) and did not surpass the top-tier experts.
Performance Comparison: Human vs. AI
| Evaluator | Accuracy (%) |
|---|---|
| Expert Dermatologist (>10 years) | 74.2 |
| PanDerm Unimodal AI | 72.2 |
| Junior Physician (<3 years) | 68.2 |
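
To make the size of these gaps concrete, the reported figures can be compared in percentage points. The snippet below is only an illustrative sketch: the accuracy values come from this article (the first-generation CNN figure is taken from the text above rather than the table), while the function name `gap_to_reference` and the data layout are hypothetical and not part of the study.

```python
# Accuracy figures (percent) as reported in this article.
ACCURACY = {
    "Expert Dermatologist (>10 years)": 74.2,
    "PanDerm Unimodal AI": 72.2,
    "Junior Physician (<3 years)": 68.2,
    "First-generation CNN": 56.7,
}


def gap_to_reference(accuracy, reference):
    """Return each evaluator's gap to the reference, in percentage points.

    Hypothetical helper for illustration only; negative values mean the
    evaluator scored below the reference.
    """
    ref = accuracy[reference]
    return {
        name: round(value - ref, 1)
        for name, value in accuracy.items()
        if name != reference
    }


gaps = gap_to_reference(ACCURACY, "Expert Dermatologist (>10 years)")

# Print the smallest shortfall first.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:+.1f} percentage points vs. experts")
```

Read this way, PanDerm sits 2.0 points below the experts and 4.0 points above junior physicians. These are simple differences between reported accuracies; they say nothing about statistical significance, which the article does not report.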
What is the future of human-AI collaboration in dermatology?
Industry consensus points toward AI functioning as a support tool rather than a replacement for human clinicians. By handling triage and assisting junior staff, AI can help reduce diagnostic errors caused by physician fatigue, according to the research team. The goal is to integrate these models into existing digital diagnostic workflows to provide a second opinion, but the study asserts that expert human judgment remains the gold standard for clinical safety.

When using AI-assisted diagnostic tools in a clinical setting, always treat the output as a supplemental data point rather than a definitive diagnosis, especially for complex or non-standard lesions.
Frequently Asked Questions
Can AI replace dermatologists in skin cancer screening?
No. While AI models can outperform junior doctors, they do not yet match the accuracy of expert dermatologists in realistic clinical settings, according to findings in JAMA Dermatology.
Is AI useful for dermatology clinics today?
Yes. AI acts as a valuable clinical support tool, particularly for assisting junior staff with triage and reducing errors linked to clinician fatigue.
Why is AI accuracy lower in clinics than in labs?
Algorithms often struggle with the “real-world” diversity of skin lesions, including rare or atypical presentations that are underrepresented in the highly controlled, clean datasets used during initial AI training.
