An international multicenter study to evaluate reproducibility of automated scoring for assessment of Ki67 in breast cancer

David L. Rimm, Samuel C.Y. Leung, Lisa M. McShane, Yalai Bai, Anita L. Bane, John M.S. Bartlett, Jane Bayani, Martin C. Chang, Michelle Dean, Carsten Denkert, Emeka K. Enwere, Chad Galderisi, Abhi Gholap, Judith C. Hugh, Anagha Jadhav, Elizabeth N. Kornaga, Arvydas Laurinavicius, Richard M. Levenson, Joema Lima, Keith Miller, Liron Pantanowitz, Tammy Piper, Jason Ruan, Malini Srinivasan, Shakeel Virk, Ying Wu, Hua Yang, Daniel F. Hayes, Torsten O. Nielsen, Mitch Dowsett

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

The nuclear proliferation biomarker Ki67 has potential prognostic, predictive, and monitoring roles in breast cancer. Unacceptable between-laboratory variability has limited its clinical value. The International Ki67 in Breast Cancer Working Group investigated whether Ki67 immunohistochemistry can be analytically validated and standardized across laboratories using automated machine-based scoring. Sets of pre-stained core-cut biopsy sections of 30 breast tumors were circulated to 14 laboratories for scanning and automated assessment of the average and maximum percentage of tumor cells positive for Ki67. Seven unique scanners and 10 software platforms were involved in this study. Pre-specified analyses included evaluation of reproducibility between all laboratories (primary) as well as among those using scanners from a single vendor (secondary). The primary reproducibility metric was intraclass correlation coefficient between laboratories, with success considered to be intraclass correlation coefficient >0.80. Intraclass correlation coefficient for automated average scores across 16 operators was 0.83 (95% credible interval: 0.73–0.91) and intraclass correlation coefficient for maximum scores across 10 operators was 0.63 (95% credible interval: 0.44–0.80). For the laboratories using scanners from a single vendor (8 score sets), intraclass correlation coefficient for average automated scores was 0.89 (95% credible interval: 0.81–0.96), which was similar to the intraclass correlation coefficient of 0.87 (95% credible interval: 0.81–0.93) achieved using these same slides in a prior visual-reading reproducibility study. Automated machine assessment of average Ki67 has the potential to achieve between-laboratory reproducibility similar to that for a rigorously standardized pathologist-based visual assessment of Ki67. 
The observed intraclass correlation coefficient was worse for maximum than for average scoring, suggesting that maximum-score methods may be suboptimal for consistent measurement of proliferation. Automated average scoring methods show promise for Ki67 assessment but require further standardization and subsequent clinical validation.
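The study's reproducibility criterion is stated in terms of the intraclass correlation coefficient (ICC). The paper's analysis used a Bayesian model that yields credible intervals; as a rough illustration only, a classical one-way random-effects ICC over a tumors-by-laboratories score matrix can be sketched as follows (the function name and the example scores are hypothetical, not from the study):

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for an n x k score matrix.

    scores: list of rows, one row per subject (tumor), one column
    per rater (laboratory/operator).
    """
    n = len(scores)        # number of subjects (tumors)
    k = len(scores[0])     # number of raters (labs)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # between-subject mean square
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # within-subject mean square
    msw = sum((x - m) ** 2
              for row, m in zip(scores, row_means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Illustrative Ki67 "average" scores (%) for 4 tumors read by 3 labs
labs = [[12, 14, 13], [35, 33, 36], [8, 9, 7], [22, 25, 24]]
print(icc_oneway(labs))
```

Under the study's success criterion, a value above 0.80 would indicate acceptable between-laboratory agreement; the actual study estimates were produced by a pre-specified Bayesian model, not this simple ANOVA-based formula.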

Original language: English (US)
Journal: Modern Pathology
DOI: 10.1038/s41379-018-0109-4
State: Accepted/In press - Jan 1 2018

Fingerprint

  • Multicenter Studies
  • Breast Neoplasms
  • Research Design
  • Reading
  • Software
  • Biomarkers
  • Immunohistochemistry
  • Biopsy
  • Neoplasms

ASJC Scopus subject areas

  • Pathology and Forensic Medicine

Cite this

Rimm, D. L., Leung, S. C. Y., McShane, L. M., Bai, Y., Bane, A. L., Bartlett, J. M. S., ... Dowsett, M. (Accepted/In press). An international multicenter study to evaluate reproducibility of automated scoring for assessment of Ki67 in breast cancer. Modern Pathology. https://doi.org/10.1038/s41379-018-0109-4

@article{f3f136a4a677481187cf953353bac267,
title = "An international multicenter study to evaluate reproducibility of automated scoring for assessment of Ki67 in breast cancer",
abstract = "The nuclear proliferation biomarker Ki67 has potential prognostic, predictive, and monitoring roles in breast cancer. Unacceptable between-laboratory variability has limited its clinical value. The International Ki67 in Breast Cancer Working Group investigated whether Ki67 immunohistochemistry can be analytically validated and standardized across laboratories using automated machine-based scoring. Sets of pre-stained core-cut biopsy sections of 30 breast tumors were circulated to 14 laboratories for scanning and automated assessment of the average and maximum percentage of tumor cells positive for Ki67. Seven unique scanners and 10 software platforms were involved in this study. Pre-specified analyses included evaluation of reproducibility between all laboratories (primary) as well as among those using scanners from a single vendor (secondary). The primary reproducibility metric was intraclass correlation coefficient between laboratories, with success considered to be intraclass correlation coefficient >0.80. Intraclass correlation coefficient for automated average scores across 16 operators was 0.83 (95{\%} credible interval: 0.73–0.91) and intraclass correlation coefficient for maximum scores across 10 operators was 0.63 (95{\%} credible interval: 0.44–0.80). For the laboratories using scanners from a single vendor (8 score sets), intraclass correlation coefficient for average automated scores was 0.89 (95{\%} credible interval: 0.81–0.96), which was similar to the intraclass correlation coefficient of 0.87 (95{\%} credible interval: 0.81–0.93) achieved using these same slides in a prior visual-reading reproducibility study. Automated machine assessment of average Ki67 has the potential to achieve between-laboratory reproducibility similar to that for a rigorously standardized pathologist-based visual assessment of Ki67. 
The observed intraclass correlation coefficient was worse for maximum than for average scoring, suggesting that maximum-score methods may be suboptimal for consistent measurement of proliferation. Automated average scoring methods show promise for Ki67 assessment but require further standardization and subsequent clinical validation.",
author = "Rimm, {David L.} and Leung, {Samuel C.Y.} and McShane, {Lisa M.} and Yalai Bai and Bane, {Anita L.} and Bartlett, {John M.S.} and Jane Bayani and Chang, {Martin C.} and Michelle Dean and Carsten Denkert and Enwere, {Emeka K.} and Chad Galderisi and Abhi Gholap and Hugh, {Judith C.} and Anagha Jadhav and Kornaga, {Elizabeth N.} and Arvydas Laurinavicius and Levenson, {Richard M.} and Joema Lima and Keith Miller and Liron Pantanowitz and Tammy Piper and Jason Ruan and Malini Srinivasan and Shakeel Virk and Ying Wu and Hua Yang and Hayes, {Daniel F.} and Nielsen, {Torsten O.} and Mitch Dowsett",
year = "2018",
month = "1",
day = "1",
doi = "10.1038/s41379-018-0109-4",
language = "English (US)",
journal = "Modern Pathology",
issn = "0893-3952",
publisher = "Nature Publishing Group",

}
