Reproducibility and feasibility of strategies for morphologic assessment of renal biopsies using the nephrotic syndrome study network digital pathology scoring system

Jarcy Zee, Jeffrey B. Hodgin, Laura H. Mariani, Joseph P. Gaut, Matthew B. Palmer, Serena M. Bagnasco, Avi Z. Rosenberg, Stephen M. Hewitt, Lawrence B. Holzman, Brenda W. Gillespie, Laura Barisoni-Thomas

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Context.—Testing reproducibility is critical for the development of methodologies for morphologic assessment. Our previous study using the descriptor-based Nephrotic Syndrome Study Network Digital Pathology Scoring System (NDPSS) on glomerular images revealed variable reproducibility. Objective.—To test reproducibility and feasibility of alternative scoring strategies for digital morphologic assessment of glomeruli and explore use of alternative agreement statistics. Design.—The original NDPSS was modified (NDPSS1 and NDPSS2) to evaluate (1) independent scoring of each individual biopsy level, (2) use of continuous measures, (3) groupings of individual descriptors into classes and subclasses prior to scoring, and (4) indication of pathologists’ confidence/uncertainty for any given score. Three and 5 pathologists scored 157 and 79 glomeruli using the NDPSS1 and NDPSS2, respectively. Agreement was tested using conventional (Cohen κ) and alternative (Gwet agreement coefficient 1 [AC1]) agreement statistics and compared with previously published data (original NDPSS). Results.—Overall, pathologists’ uncertainty was low, favoring application of the Gwet AC1. Greater agreement was achieved using the Gwet AC1 compared with the Cohen κ across all scoring methodologies. Mean (standard deviation) differences in agreement estimates using the NDPSS1 and NDPSS2 compared with the single-level original NDPSS were 0.09 (0.17) and 0.17 (0.17), respectively. Using the Gwet AC1, 79% of the original NDPSS descriptors had good or excellent agreement. Pathologist feedback indicated the NDPSS1 and NDPSS2 were time-consuming. Conclusions.—The NDPSS1 and NDPSS2 increased pathologists’ scoring burden without improving reproducibility. Use of alternative agreement statistics was strongly supported. We suggest using the original NDPSS on whole slide images for glomerular morphology assessment and for guiding future automated technologies.
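The contrast between the two agreement statistics named in the abstract can be illustrated with a minimal sketch. It uses the standard two-rater textbook formulas for Cohen's kappa and Gwet's AC1; it is not the authors' analysis code, and the study itself involved 3 and 5 pathologists, which would call for multi-rater versions. The function names and the toy ratings are hypothetical, chosen to show how kappa can drop when one category dominates while AC1 stays closer to the observed agreement.

```python
from collections import Counter


def observed_agreement(a, b):
    """Fraction of items on which the two raters give the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Two-rater Cohen's kappa: chance agreement from each rater's own marginals.

    Assumes the raters do not both use one single category for every item
    (otherwise the denominator is zero).
    """
    n = len(a)
    cats = set(a) | set(b)
    ca, cb = Counter(a), Counter(b)
    po = observed_agreement(a, b)
    pe = sum((ca[k] / n) * (cb[k] / n) for k in cats)
    return (po - pe) / (1 - pe)


def gwet_ac1(a, b):
    """Two-rater Gwet AC1: chance agreement from the averaged category prevalence.

    Assumes at least two categories appear in the ratings.
    """
    n = len(a)
    cats = set(a) | set(b)
    q = len(cats)
    ca, cb = Counter(a), Counter(b)
    po = observed_agreement(a, b)
    # Average proportion of ratings falling in each category across both raters.
    pi = {k: (ca[k] + cb[k]) / (2 * n) for k in cats}
    pe = sum(p * (1 - p) for p in pi.values()) / (q - 1)
    return (po - pe) / (1 - pe)


# Hypothetical toy data: a descriptor that is rarely present.
rater_a = ["absent"] * 8 + ["present", "absent"]
rater_b = ["absent"] * 8 + ["absent", "present"]

print(f"observed agreement: {observed_agreement(rater_a, rater_b):.2f}")
print(f"Cohen kappa:        {cohen_kappa(rater_a, rater_b):.3f}")
print(f"Gwet AC1:           {gwet_ac1(rater_a, rater_b):.3f}")
```

With these toy ratings the raters agree on 8 of 10 items, yet kappa is slightly negative because its chance-agreement term is inflated by the skewed marginals, whereas AC1 remains well above zero. This is only an illustration of the general behavior of the two statistics, not a reproduction of the reported results.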

Original language: English (US)
Pages (from-to): 613-625
Number of pages: 13
Journal: Archives of Pathology and Laboratory Medicine
Volume: 142
Issue number: 5
DOIs: 10.5858/arpa.2017-0181-OA
State: Published - May 1 2018

Fingerprint

Nephrotic Syndrome
Pathology
Kidney
Biopsy
Uncertainty
Pathologists
Technology

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Medical Laboratory Technology

Cite this

Reproducibility and feasibility of strategies for morphologic assessment of renal biopsies using the nephrotic syndrome study network digital pathology scoring system. / Zee, Jarcy; Hodgin, Jeffrey B.; Mariani, Laura H.; Gaut, Joseph P.; Palmer, Matthew B.; Bagnasco, Serena M.; Rosenberg, Avi Z.; Hewitt, Stephen M.; Holzman, Lawrence B.; Gillespie, Brenda W.; Barisoni-Thomas, Laura.

In: Archives of Pathology and Laboratory Medicine, Vol. 142, No. 5, 01.05.2018, p. 613-625.

@article{c19a3e7513894f219016a359028ecad6,
title = "Reproducibility and feasibility of strategies for morphologic assessment of renal biopsies using the nephrotic syndrome study network digital pathology scoring system",
abstract = "Context.—Testing reproducibility is critical for the development of methodologies for morphologic assessment. Our previous study using the descriptor-based Nephrotic Syndrome Study Network Digital Pathology Scoring System (NDPSS) on glomerular images revealed variable reproducibility. Objective.—To test reproducibility and feasibility of alternative scoring strategies for digital morphologic assessment of glomeruli and explore use of alternative agreement statistics. Design.—The original NDPSS was modified (NDPSS1 and NDPSS2) to evaluate (1) independent scoring of each individual biopsy level, (2) use of continuous measures, (3) groupings of individual descriptors into classes and subclasses prior to scoring, and (4) indication of pathologists’ confidence/uncertainty for any given score. Three and 5 pathologists scored 157 and 79 glomeruli using the NDPSS1 and NDPSS2, respectively. Agreement was tested using conventional (Cohen $\kappa$) and alternative (Gwet agreement coefficient 1 [AC1]) agreement statistics and compared with previously published data (original NDPSS). Results.—Overall, pathologists’ uncertainty was low, favoring application of the Gwet AC1. Greater agreement was achieved using the Gwet AC1 compared with the Cohen $\kappa$ across all scoring methodologies. Mean (standard deviation) differences in agreement estimates using the NDPSS1 and NDPSS2 compared with the single-level original NDPSS were 0.09 (0.17) and 0.17 (0.17), respectively. Using the Gwet AC1, 79{\%} of the original NDPSS descriptors had good or excellent agreement. Pathologist feedback indicated the NDPSS1 and NDPSS2 were time-consuming. Conclusions.—The NDPSS1 and NDPSS2 increased pathologists’ scoring burden without improving reproducibility. Use of alternative agreement statistics was strongly supported. We suggest using the original NDPSS on whole slide images for glomerular morphology assessment and for guiding future automated technologies.",
author = "Zee, Jarcy and Hodgin, {Jeffrey B.} and Mariani, {Laura H.} and Gaut, {Joseph P.} and Palmer, {Matthew B.} and Bagnasco, {Serena M.} and Rosenberg, {Avi Z.} and Hewitt, {Stephen M.} and Holzman, {Lawrence B.} and Gillespie, {Brenda W.} and Barisoni-Thomas, Laura",
year = "2018",
month = "5",
day = "1",
doi = "10.5858/arpa.2017-0181-OA",
language = "English (US)",
volume = "142",
pages = "613--625",
journal = "Archives of Pathology and Laboratory Medicine",
issn = "0003-9985",
publisher = "College of American Pathologists",
number = "5",

}

TY - JOUR

T1 - Reproducibility and feasibility of strategies for morphologic assessment of renal biopsies using the nephrotic syndrome study network digital pathology scoring system

AU - Zee, Jarcy

AU - Hodgin, Jeffrey B.

AU - Mariani, Laura H.

AU - Gaut, Joseph P.

AU - Palmer, Matthew B.

AU - Bagnasco, Serena M.

AU - Rosenberg, Avi Z.

AU - Hewitt, Stephen M.

AU - Holzman, Lawrence B.

AU - Gillespie, Brenda W.

AU - Barisoni-Thomas, Laura

PY - 2018/5/1

Y1 - 2018/5/1

N2 - Context.—Testing reproducibility is critical for the development of methodologies for morphologic assessment. Our previous study using the descriptor-based Nephrotic Syndrome Study Network Digital Pathology Scoring System (NDPSS) on glomerular images revealed variable reproducibility. Objective.—To test reproducibility and feasibility of alternative scoring strategies for digital morphologic assessment of glomeruli and explore use of alternative agreement statistics. Design.—The original NDPSS was modified (NDPSS1 and NDPSS2) to evaluate (1) independent scoring of each individual biopsy level, (2) use of continuous measures, (3) groupings of individual descriptors into classes and subclasses prior to scoring, and (4) indication of pathologists’ confidence/uncertainty for any given score. Three and 5 pathologists scored 157 and 79 glomeruli using the NDPSS1 and NDPSS2, respectively. Agreement was tested using conventional (Cohen κ) and alternative (Gwet agreement coefficient 1 [AC1]) agreement statistics and compared with previously published data (original NDPSS). Results.—Overall, pathologists’ uncertainty was low, favoring application of the Gwet AC1. Greater agreement was achieved using the Gwet AC1 compared with the Cohen κ across all scoring methodologies. Mean (standard deviation) differences in agreement estimates using the NDPSS1 and NDPSS2 compared with the single-level original NDPSS were 0.09 (0.17) and 0.17 (0.17), respectively. Using the Gwet AC1, 79% of the original NDPSS descriptors had good or excellent agreement. Pathologist feedback indicated the NDPSS1 and NDPSS2 were time-consuming. Conclusions.—The NDPSS1 and NDPSS2 increased pathologists’ scoring burden without improving reproducibility. Use of alternative agreement statistics was strongly supported. We suggest using the original NDPSS on whole slide images for glomerular morphology assessment and for guiding future automated technologies.

AB - Context.—Testing reproducibility is critical for the development of methodologies for morphologic assessment. Our previous study using the descriptor-based Nephrotic Syndrome Study Network Digital Pathology Scoring System (NDPSS) on glomerular images revealed variable reproducibility. Objective.—To test reproducibility and feasibility of alternative scoring strategies for digital morphologic assessment of glomeruli and explore use of alternative agreement statistics. Design.—The original NDPSS was modified (NDPSS1 and NDPSS2) to evaluate (1) independent scoring of each individual biopsy level, (2) use of continuous measures, (3) groupings of individual descriptors into classes and subclasses prior to scoring, and (4) indication of pathologists’ confidence/uncertainty for any given score. Three and 5 pathologists scored 157 and 79 glomeruli using the NDPSS1 and NDPSS2, respectively. Agreement was tested using conventional (Cohen κ) and alternative (Gwet agreement coefficient 1 [AC1]) agreement statistics and compared with previously published data (original NDPSS). Results.—Overall, pathologists’ uncertainty was low, favoring application of the Gwet AC1. Greater agreement was achieved using the Gwet AC1 compared with the Cohen κ across all scoring methodologies. Mean (standard deviation) differences in agreement estimates using the NDPSS1 and NDPSS2 compared with the single-level original NDPSS were 0.09 (0.17) and 0.17 (0.17), respectively. Using the Gwet AC1, 79% of the original NDPSS descriptors had good or excellent agreement. Pathologist feedback indicated the NDPSS1 and NDPSS2 were time-consuming. Conclusions.—The NDPSS1 and NDPSS2 increased pathologists’ scoring burden without improving reproducibility. Use of alternative agreement statistics was strongly supported. We suggest using the original NDPSS on whole slide images for glomerular morphology assessment and for guiding future automated technologies.

UR - http://www.scopus.com/inward/record.url?scp=85046662315&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046662315&partnerID=8YFLogxK

U2 - 10.5858/arpa.2017-0181-OA

DO - 10.5858/arpa.2017-0181-OA

M3 - Article

VL - 142

SP - 613

EP - 625

JO - Archives of Pathology and Laboratory Medicine

JF - Archives of Pathology and Laboratory Medicine

SN - 0003-9985

IS - 5

ER -