Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline

Daniel J. Goff, Thomas W Loehfelm

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing anatomic coverage of both the current and prior studies. It is incumbent on the radiologist to review the full text report and/or images from those prior studies, a process that is time-consuming and confers substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal is to assess the feasibility of natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86% of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The performance of the automated pipeline was good, and the overall accuracy was similar to the interobserver agreement between the two manual annotators.
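The reported metrics can be cross-checked for internal consistency: F1 is the harmonic mean of precision and recall (sensitivity). A minimal sketch in Python, using only the values quoted in the abstract above; the third decimal differs slightly from the reported 0.74, consistent with the precision and sensitivity themselves being rounded:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values quoted in the abstract (sensitivity serves as recall).
precision = 0.66
recall = 0.86

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.747, i.e. ~0.74 given rounded inputs
```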

Original language: English (US)
Pages (from-to): 1-8
Number of pages: 8
Journal: Journal of Digital Imaging
DOI: 10.1007/s10278-017-0030-2
State: Accepted/In press - Oct 30 2017


Keywords

  • Data extraction
  • NLP
  • Radiology report
  • Report summarization

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Radiology Nuclear Medicine and imaging
  • Computer Science Applications

Cite this

Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline. / Goff, Daniel J.; Loehfelm, Thomas W.

In: Journal of Digital Imaging, 30.10.2017, p. 1-8.


@article{f59c5cdd0cf54027a5e1149693ce309b,
title = "Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline",
abstract = "Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing anatomic coverage of both the current and prior studies. It is incumbent on the radiologist to review the full text report and/or images from those prior studies, a process that is time-consuming and confers substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal is to assess the feasibility of natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86{\%} of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The performance of the automated pipeline was good, and the overall accuracy was similar to the interobserver agreement between the two manual annotators.",
keywords = "Data extraction, NLP, Radiology report, Report summarization",
author = "Goff, {Daniel J.} and Loehfelm, {Thomas W}",
year = "2017",
month = "10",
day = "30",
doi = "10.1007/s10278-017-0030-2",
language = "English (US)",
pages = "1--8",
journal = "Journal of Digital Imaging",
issn = "0897-1889",
publisher = "Springer New York",

}

TY - JOUR

T1 - Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline

AU - Goff, Daniel J.

AU - Loehfelm, Thomas W

PY - 2017/10/30

Y1 - 2017/10/30

N2 - Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing anatomic coverage of both the current and prior studies. It is incumbent on the radiologist to review the full text report and/or images from those prior studies, a process that is time-consuming and confers substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal is to assess the feasibility of natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86% of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The performance of the automated pipeline was good, and the overall accuracy was similar to the interobserver agreement between the two manual annotators.

AB - Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing anatomic coverage of both the current and prior studies. It is incumbent on the radiologist to review the full text report and/or images from those prior studies, a process that is time-consuming and confers substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal is to assess the feasibility of natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86% of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The performance of the automated pipeline was good, and the overall accuracy was similar to the interobserver agreement between the two manual annotators.

KW - Data extraction

KW - NLP

KW - Radiology report

KW - Report summarization

UR - http://www.scopus.com/inward/record.url?scp=85032672275&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85032672275&partnerID=8YFLogxK

U2 - 10.1007/s10278-017-0030-2

DO - 10.1007/s10278-017-0030-2

M3 - Article

C2 - 29086081

AN - SCOPUS:85032672275

SP - 1

EP - 8

JO - Journal of Digital Imaging

JF - Journal of Digital Imaging

SN - 0897-1889

ER -