Surprising variability in melanoma diagnostic findings

Figure (Elmore et al.): In one case, 36 pathologists used 18 different diagnostic terms to describe the same tissue section, shown here at two magnifications.

Although pathologists are likely to agree when evaluating skin biopsies that are benign or highly malignant, they often disagree when lesions fall into intermediate categories, new research finds. Pathologists’ diagnostic interpretations of melanoma in situ and early-stage invasive melanoma, categories that are not well characterized, were neither reproducible nor accurate. This variability may lead to both overdiagnosis and underdiagnosis of melanoma, according to the new study, whose co-authors include two OHSU researchers, Heidi D. Nelson, M.D., M.P.H., and Patricia Carney, Ph.D.

The researchers, including first author Joann Elmore, M.D., M.P.H., at the University of Washington, compared the diagnostic findings of 187 pathologists practicing in 10 states with one another and with a consensus diagnosis reached by a panel of three experienced skin pathologists. To measure reproducibility, the researchers also had the 187 pathologists interpret the same set of skin biopsies on two separate occasions at least eight months apart.

Accuracy was reasonably good for cases at the extremes of disease severity. Agreement with the expert consensus was 92 percent for benign lesions classified as mild atypia (lesions containing cells that multiply in abnormal patterns and show some of the early features of cancer) and 72 percent for lesions classified as high-stage invasive melanoma.

But accuracy plummeted for cases in the middle of the spectrum. Fewer than half of the diagnoses agreed with the expert consensus for cases classified as severely atypical lesions, melanoma in situ, or early-stage invasive melanoma. Likewise, pathologists’ interpretations of the same case on two occasions lacked reproducibility in the mid-range. Intraobserver reproducibility dropped to about 35 percent for cases classified as moderate atypia, for example. The paper describing the findings was published in the British Medical Journal.
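For readers curious how figures like these are typically tallied, here is a minimal Python sketch. It is not the authors' analysis code. The toy readings, the 1-to-5 severity classes, the consensus labels, and the function names `accuracy_by_class` and `reproducibility_by_class` are all hypothetical, and grouping reproducibility by the first reading is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data, not taken from the study. Each record is
# (pathologist, case, first reading, second reading); classes run from
# 1 (least severe) to 5 (most severe) purely for illustration.
readings = [
    ("A", "case1", 1, 1),
    ("B", "case1", 1, 2),
    ("A", "case2", 3, 4),
    ("B", "case2", 4, 4),
    ("A", "case3", 5, 5),
    ("B", "case3", 5, 5),
]

# Hypothetical consensus (reference) class for each case.
reference = {"case1": 1, "case2": 3, "case3": 5}


def accuracy_by_class(readings, reference):
    """Share of first readings that match the consensus, per consensus class."""
    hits, totals = defaultdict(int), defaultdict(int)
    for _, case, first, _ in readings:
        ref = reference[case]
        totals[ref] += 1
        hits[ref] += int(first == ref)
    return {ref: hits[ref] / totals[ref] for ref in totals}


def reproducibility_by_class(readings):
    """Share of first readings the same pathologist repeated, per first-reading class."""
    same, totals = defaultdict(int), defaultdict(int)
    for _, _, first, second in readings:
        totals[first] += 1
        same[first] += int(first == second)
    return {cls: same[cls] / totals[cls] for cls in totals}


print(accuracy_by_class(readings, reference))
print(reproducibility_by_class(readings))
```

In this toy data the two measures can diverge: a pathologist may repeat the same reading on both occasions and still disagree with the consensus, which is why the study reports accuracy and reproducibility separately.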

Elmore said the study was inspired by her experience as a patient undergoing a skin biopsy ten years ago, which resulted in three different independent interpretations, ranging from benign to invasive melanoma:

Because my skin biopsy only looked “suspicious” for melanoma, we obtained a second biopsy. The biopsy specimens were then sent to two independent pathologists—who returned two different diagnoses at the polar extremes: One pathologist said it was benign; the other said it was suspicious for invasive melanoma. We then did what most physicians do at that point: we sought yet another opinion, in this case, from a pathologist who has written textbooks on the topic and has decades of experience. His assessment fell in the middle of the diagnostic spectrum: not invasive melanoma, rather an atypical Spitz lesion that can mimic melanoma but is benign.

A previous study co-authored by the same researchers found similar variability in the interpretation of breast biopsies. Nearly 1 in 5 women given a diagnosis of ductal carcinoma in situ, or DCIS, had a biopsy specimen interpreted as either benign or atypia by the consensus panel.

In the new paper, the authors said pathologists would do well to adopt a standardized classification system for skin lesions. As it stands, more than 50 terms may be used to describe the same melanocytic lesion, they found. A smaller and standardized set of classifications could make it easier for pathologists, primary clinicians and patients to communicate with each other.

The authors also proposed adding standardized statements to pathology reports reminding readers that melanocytic lesions are challenging to interpret, especially in the middle diagnostic classes, leaving room for considerable uncertainty in diagnostic findings.

Going forward, they said, new objective techniques need to be developed to support pathologists’ visual assessments, such as digital whole-slide imaging platforms for obtaining second opinions, or molecular analysis of skin biopsies.

◊ ◊ ◊


Pathologists’ diagnosis of invasive melanoma and melanocytic proliferations: observer accuracy and reproducibility study by Joann G. Elmore, Raymond L. Barnhill, David E. Elder, Gary M. Longton, Margaret S. Pepe, Lisa M. Reisch, Patricia A. Carney, Linda J. Titus, Heidi D. Nelson, Tracy Onega, Anna N. A. Tosteson, Martin A. Weinstock, Stevan R. Knezevich, and Michael W. Piepkorn. BMJ, June 28, 2017.

Variability in pathologists’ interpretations of individual breast biopsy slides: a population perspective by Joann G. Elmore, Heidi D. Nelson, Margaret S. Pepe, Gary M. Longton, Anna N. A. Tosteson, Berta Geller, Tracy Onega, Patricia A. Carney, Sara L. Jackson, Kimberly H. Allison, and Donald L. Weaver. Annals of Internal Medicine, 2016.