Can model observers be developed to reproduce radiologists' diagnostic performances? Our study says not so fast!

Juhun Lee, Robert M. Nishikawa, Ingrid Reiser, John M Boone

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The purpose of this study was to determine radiologists' diagnostic performances on different image reconstruction algorithms, which could then be used to optimize image-based model observers. We included a total of 102 pathology-proven breast computed tomography (CT) cases (62 malignant). An iterative image reconstruction (IIR) algorithm was used to obtain 24 reconstructions with different image appearances for each case. Using quantitative image feature analysis, three IIRs and one clinical reconstruction of 50 lesions (25 malignant) were selected for a reader study. The reconstructions spanned a range of image appearances from smooth/low noise to sharp/high noise. The trained classifiers' AUCs on these reconstructions ranged from 0.61 (smooth reconstruction) to 0.95 (sharp reconstruction). Six experienced MQSA radiologists read 200 cases (50 lesions × 4 reconstructions) and provided the likelihood of malignancy for each lesion. Radiologists' diagnostic performances (AUC) ranged from 0.70 to 0.89. However, there was no agreement among the six radiologists on which image appearance was best, i.e., which yielded the highest diagnostic performance. Specifically, for two radiologists the sharper image appearance was diagnostically superior, for another two the smoother image appearance was diagnostically superior, and for the remaining two all image appearances were diagnostically similar. Given this poor agreement among radiologists on the diagnostic ranking of the images, it may not be possible to develop a model observer for this particular imaging task.
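The figure of merit reported above for both the trained classifiers and the radiologists is the empirical AUC computed from likelihood-of-malignancy scores. As a minimal illustrative sketch (not the authors' code; the synthetic ratings and the use of scikit-learn's roc_auc_score are assumptions made here for illustration), the following Python snippet shows how one reader's AUC on one reconstruction could be computed from such ratings over a 50-lesion set with 25 malignant cases, matching the case mix described in the abstract.

```python
# Illustrative sketch only: empirical AUC from likelihood-of-malignancy ratings.
# Ratings below are synthetic; lesion counts (50 total, 25 malignant) follow the abstract.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Ground truth for the 50 lesions: 25 malignant (1), 25 benign (0).
truth = np.array([1] * 25 + [0] * 25)

# Hypothetical likelihood-of-malignancy ratings (0-100) for one reader on one
# reconstruction; malignant lesions get somewhat higher ratings on average.
ratings = np.clip(
    np.where(truth == 1, rng.normal(70, 15, 50), rng.normal(45, 15, 50)),
    0, 100,
)

# Empirical AUC for this reader/reconstruction combination.
auc = roc_auc_score(truth, ratings)
print(f"Reader AUC on this reconstruction: {auc:.2f}")
```

In the study itself, each of the six radiologists would contribute one such AUC per reconstruction (four per reader), and the disagreement reported in the abstract concerns which reconstruction yields each reader's highest value.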

Original language: English (US)
Title of host publication: Medical Imaging 2016: Image Perception, Observer Performance, and Technology Assessment
Publisher: SPIE
Volume: 9787
ISBN (Electronic): 9781510600225
DOIs: 10.1117/12.2216253
State: Published - 2016
Event: Medical Imaging 2016: Image Perception, Observer Performance, and Technology Assessment - San Diego, United States
Duration: Mar 2, 2016 – Mar 3, 2016

Other

Other: Medical Imaging 2016: Image Perception, Observer Performance, and Technology Assessment
Country: United States
City: San Diego
Period: 3/2/16 – 3/3/16

Keywords

  • Breast cancer
  • Breast computed tomography
  • Diagnostic performance
  • Model observers
  • Reader study

ASJC Scopus subject areas

  • Atomic and Molecular Physics, and Optics
  • Electronic, Optical and Magnetic Materials
  • Biomaterials
  • Radiology, Nuclear Medicine and Imaging

Cite this

Lee, J., Nishikawa, R. M., Reiser, I., & Boone, J. M. (2016). Can model observers be developed to reproduce radiologists' diagnostic performances? Our study says not so fast! In Medical Imaging 2016: Image Perception, Observer Performance, and Technology Assessment (Vol. 9787, Article 978707). SPIE. https://doi.org/10.1117/12.2216253
