Coding neuroradiology reports for the Northern Manhattan Stroke Study

A comparison of natural language processing and manual review

Jacob S. Elkins, Carol Friedman, Bernadette Boden-Albala, Ralph L. Sacco, George Hripcsak

Research output: Contribution to journal › Article

45 Citations (Scopus)

Abstract

Automated systems using natural language processing may greatly speed chart review tasks for clinical research, but their accuracy in this setting is unknown. The objective of this study was to compare the accuracy of automated and manual coding in the data acquisition tasks of an ongoing clinical research study, the Northern Manhattan Stroke Study (NOMASS). We identified 471 neuroradiology reports of brain images used in the NOMASS study. Using both automated and manual coding, we completed a standardized NOMASS imaging form with the information contained in these reports. We then generated ROC curves for both manual and automated coding by comparing our results to the original NOMASS data, where study investigators directly coded their interpretations of brain images. The areas under the ROC curves for both manual and automated coding were the main outcome measure. The overall predictive value of the automated system (ROC area 0.85, 95% CI 0.84-0.87) was not statistically different from the predictive value of the manual coding (ROC area 0.87, 95% CI 0.83-0.91). Measured in terms of accuracy, the automated system performed slightly worse than manual coding. The overall accuracy of the automated system was 84% (CI 83-85%). The overall accuracy of manual coding was 86% (CI 84-88%). The difference in accuracy between the two methods was small but statistically significant (P = 0.026). Errors in manual coding appeared to be due to differences between neurologists' and neuroradiologists' interpretations, different use of detailed anatomic terms, and lack of clinical information. Automated systems can use natural language processing to rapidly perform complex data acquisition tasks. Although there is a small decrease in the accuracy of the data as compared to traditional methods, automated systems may greatly expand the power of chart review in clinical research design and implementation. (C) 2000 Academic Press.
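The evaluation design described in the abstract — scoring a coder's output against a gold standard by overall accuracy and by area under the ROC curve — can be sketched in a few lines. The data, confidence scores, and threshold below are invented for illustration only and are not the study's data:

```python
# Hypothetical sketch of the paper's evaluation setup: a coding method
# (e.g., automated NLP) is scored against gold-standard labels, using
# accuracy and the area under the ROC curve (AUC). All values below are
# made up for illustration; they are NOT the NOMASS data.

def accuracy(preds, gold):
    """Fraction of binary codes that match the gold standard."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

def roc_auc(scores, gold):
    """AUC computed via the Mann-Whitney U statistic: the probability
    that a randomly chosen positive case scores higher than a randomly
    chosen negative case (ties count half)."""
    pos = [s for s, g in zip(scores, gold) if g == 1]
    neg = [s for s, g in zip(scores, gold) if g == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Gold standard: investigators' direct coding of the brain images.
gold = [1, 1, 1, 0, 0, 0, 1, 0]
# Confidence scores from a hypothetical automated coder.
auto_conf = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
# Binarize at an arbitrary 0.5 threshold to measure accuracy.
auto_pred = [int(s >= 0.5) for s in auto_conf]

print(f"accuracy = {accuracy(auto_pred, gold):.2f}")  # → 0.75
print(f"ROC area = {roc_auc(auto_conf, gold):.2f}")   # → 0.94
```

Comparing two methods as in the paper would repeat this for the manual codes and contrast the resulting AUCs and accuracies (the study found 0.85 vs. 0.87 AUC and 84% vs. 86% accuracy).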

Original language: English
Pages (from-to): 1-10
Number of pages: 10
Journal: Computers and Biomedical Research
Volume: 33
Issue number: 1
DOIs: 10.1006/cbmr.1999.1535
State: Published - Feb 1, 2000
Externally published: Yes

Fingerprint

  • Natural Language Processing
  • Stroke
  • Data acquisition
  • Brain
  • Natural language processing systems
  • Processing
  • ROC Curve
  • Research
  • Area Under Curve
  • Imaging techniques
  • Research Design
  • Research Personnel
  • Outcome Assessment (Health Care)

ASJC Scopus subject areas

  • Medicine (miscellaneous)

Cite this

Coding neuroradiology reports for the Northern Manhattan Stroke Study: A comparison of natural language processing and manual review. / Elkins, Jacob S.; Friedman, Carol; Boden-Albala, Bernadette; Sacco, Ralph L.; Hripcsak, George.

In: Computers and Biomedical Research, Vol. 33, No. 1, 01.02.2000, p. 1-10.


Elkins, Jacob S.; Friedman, Carol; Boden-Albala, Bernadette; Sacco, Ralph L.; Hripcsak, George. / Coding neuroradiology reports for the Northern Manhattan Stroke Study: A comparison of natural language processing and manual review. In: Computers and Biomedical Research. 2000; Vol. 33, No. 1. pp. 1-10.
@article{2ef6af76fd904596a5f98e29a8f45f58,
title = "Coding neuroradiology reports for the Northern Manhattan Stroke Study: A comparison of natural language processing and manual review",
abstract = "Automated systems using natural language processing may greatly speed chart review tasks for clinical research, but their accuracy in this setting is unknown. The objective of this study was to compare the accuracy of automated and manual coding in the data acquisition tasks of an ongoing clinical research study, the Northern Manhattan Stroke Study (NOMASS). We identified 471 neuroradiology reports of brain images used in the NOMASS study. Using both automated and manual coding, we completed a standardized NOMASS imaging form with the information contained in these reports. We then generated ROC curves for both manual and automated coding by comparing our results to the original NOMASS data, where study investigators directly coded their interpretations of brain images. The areas under the ROC curves for both manual and automated coding were the main outcome measure. The overall predictive value of the automated system (ROC area 0.85, 95{\%} CI 0.84-0.87) was not statistically different from the predictive value of the manual coding (ROC area 0.87, 95{\%} CI 0.83-0.91). Measured in terms of accuracy, the automated system performed slightly worse than manual coding. The overall accuracy of the automated system was 84{\%} (CI 83-85{\%}). The overall accuracy of manual coding was 86{\%} (CI 84-88{\%}). The difference in accuracy between the two methods was small but statistically significant (P = 0.026). Errors in manual coding appeared to be due to differences between neurologists' and neuroradiologists' interpretations, different use of detailed anatomic terms, and lack of clinical information. Automated systems can use natural language processing to rapidly perform complex data acquisition tasks. Although there is a small decrease in the accuracy of the data as compared to traditional methods, automated systems may greatly expand the power of chart review in clinical research design and implementation. (C) 2000 Academic Press.",
author = "Elkins, {Jacob S.} and Carol Friedman and Bernadette Boden-Albala and Sacco, {Ralph L.} and George Hripcsak",
year = "2000",
month = "2",
day = "1",
doi = "10.1006/cbmr.1999.1535",
language = "English",
volume = "33",
pages = "1--10",
journal = "Computers and Biomedical Research",
issn = "0010-4809",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Coding neuroradiology reports for the Northern Manhattan Stroke Study

T2 - A comparison of natural language processing and manual review

AU - Elkins, Jacob S.

AU - Friedman, Carol

AU - Boden-Albala, Bernadette

AU - Sacco, Ralph L.

AU - Hripcsak, George

PY - 2000/2/1

Y1 - 2000/2/1

N2 - Automated systems using natural language processing may greatly speed chart review tasks for clinical research, but their accuracy in this setting is unknown. The objective of this study was to compare the accuracy of automated and manual coding in the data acquisition tasks of an ongoing clinical research study, the Northern Manhattan Stroke Study (NOMASS). We identified 471 neuroradiology reports of brain images used in the NOMASS study. Using both automated and manual coding, we completed a standardized NOMASS imaging form with the information contained in these reports. We then generated ROC curves for both manual and automated coding by comparing our results to the original NOMASS data, where study investigators directly coded their interpretations of brain images. The areas under the ROC curves for both manual and automated coding were the main outcome measure. The overall predictive value of the automated system (ROC area 0.85, 95% CI 0.84-0.87) was not statistically different from the predictive value of the manual coding (ROC area 0.87, 95% CI 0.83-0.91). Measured in terms of accuracy, the automated system performed slightly worse than manual coding. The overall accuracy of the automated system was 84% (CI 83-85%). The overall accuracy of manual coding was 86% (CI 84-88%). The difference in accuracy between the two methods was small but statistically significant (P = 0.026). Errors in manual coding appeared to be due to differences between neurologists' and neuroradiologists' interpretations, different use of detailed anatomic terms, and lack of clinical information. Automated systems can use natural language processing to rapidly perform complex data acquisition tasks. Although there is a small decrease in the accuracy of the data as compared to traditional methods, automated systems may greatly expand the power of chart review in clinical research design and implementation. (C) 2000 Academic Press.

UR - http://www.scopus.com/inward/record.url?scp=0034054634&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034054634&partnerID=8YFLogxK

U2 - 10.1006/cbmr.1999.1535

DO - 10.1006/cbmr.1999.1535

M3 - Article

VL - 33

SP - 1

EP - 10

JO - Computers and Biomedical Research

JF - Computers and Biomedical Research

SN - 0010-4809

IS - 1

ER -