Power estimation for non-standardized multisite studies

Anisha Keshavan, Friedemann Paul, Mona K. Beyer, Alyssa H. Zhu, Nico Papinutto, Russell T. Shinohara, William Stern, Michael Amann, Rohit Bakshi, Antje Bischof, Alessandro Carriero, Manuel Comabella, Jason C. Crane, Sandra D'Alfonso, Philippe Demaerel, Benedicte Dubois, Massimo Filippi, Vinzenz Fleischer, Bertrand Fontaine, Laura Gaetano, An Goris, Christiane Graetz, Adriane Gröger, Sergiu Groppa, David A. Hafler, Hanne F. Harbo, Bernhard Hemmer, Kesshi Jordan, Ludwig Kappos, Gina Kirkish, Sara Llufriu, Stefano Magon, Filippo Martinelli-Boneschi, Jacob L McCauley, Xavier Montalban, Mark Mühlau, Daniel Pelletier, Pradip Pattany, Margaret A Pericak-Vance, Isabelle Cournu-Rebeix, Maria A. Rocca, Alex Rovira, Regina Schlaeger, Albert Saiz, Till Sprenger, Alessandro Stecco, Bernard M J Uitdehaag, Pablo Villoslada, Mike P. Wattjes, Howard Weiner, Jens Wuerfel, Claus Zimmer, Frauke Zipp, Stephen L. Hauser, Jorge R. Oksenberg, Roland G. Henry, International Multiple Sclerosis Genetics Consortium

Research output: Contribution to journal (article)

16 Citations (Scopus)

Abstract

A concern for researchers planning multisite studies is that scanner and T1-weighted sequence-related biases on regional volumes could overshadow true effects, especially for studies with a heterogeneous set of scanners and sequences. Current approaches attempt to harmonize data by standardizing hardware, pulse sequences, and protocols, or by calibrating across sites using phantom-based corrections to ensure the same raw image intensities. We propose to avoid harmonization and phantom-based correction entirely. We hypothesized that the bias of estimated regional volumes is scaled between sites due to the contrast and gradient distortion differences between scanners and sequences. Given this assumption, we provide a new statistical framework and derive a power equation to define inclusion criteria for a set of sites based on the variability of their scaling factors. We estimated the scaling factors of 20 scanners with heterogeneous hardware and sequence parameters by scanning a single set of 12 subjects at sites across the United States and Europe. Regional volumes and their scaling factors were estimated for each site using Freesurfer's segmentation algorithm and ordinary least squares, respectively. The scaling factors were validated by comparing the theoretical and simulated power curves, performing a leave-one-out calibration of regional volumes, and evaluating the absolute agreement of all regional volumes between sites before and after calibration. Using our derived power equation, we were able to define the conditions under which harmonization is not necessary to achieve 80% power. This approach can inform choice of processing pipelines and outcome metrics for multisite studies based on scaling factor variability across sites, enabling collaboration between clinical and research institutions.
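The scaling-factor idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' framework or their derived power equation. It assumes synthetic volumes in arbitrary units, uses the across-site mean volume per subject as the reference for an ordinary least squares fit through the origin, and approximates power by Monte Carlo with a pooled two-sample z-test that ignores site. The function names `estimate_scaling_factors` and `simulated_power`, and all numeric settings, are hypothetical choices made for illustration.

```python
import numpy as np


def estimate_scaling_factors(volumes):
    """OLS slope (no intercept) of each site's volumes on a common reference.

    volumes: array of shape (n_sites, n_subjects), one regional volume per
    subject per site. The reference is the across-site mean per subject,
    which is an assumption of this sketch.
    """
    reference = volumes.mean(axis=0)
    return volumes @ reference / (reference @ reference)


def simulated_power(n_sites, n_per_site, effect, sd, scale_cv,
                    n_sims=1000, seed=0):
    """Monte Carlo power of a pooled two-sample z-test (|z| > 1.96)."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        # One multiplicative scaling factor per site, shared by both groups.
        scale = rng.normal(1.0, scale_cv, size=(n_sites, 1))
        controls = scale * rng.normal(100.0, sd, size=(n_sites, n_per_site))
        cases = scale * rng.normal(100.0 - effect, sd,
                                   size=(n_sites, n_per_site))
        diff = controls.mean() - cases.mean()
        se = np.sqrt(controls.var(ddof=1) / controls.size
                     + cases.var(ddof=1) / cases.size)
        rejections += int(abs(diff / se) > 1.96)
    return rejections / n_sims


# Synthetic stand-in for the calibration data: 12 subjects seen at 20 sites.
rng = np.random.default_rng(42)
true_volume = rng.normal(100.0, 10.0, size=12)
true_scale = rng.normal(1.0, 0.05, size=(20, 1))
volumes = true_scale * true_volume + rng.normal(0.0, 0.1, size=(20, 12))

scale_hat = estimate_scaling_factors(volumes)
print("estimated scaling factors:", np.round(scale_hat, 3))
for cv in (0.0, 0.05, 0.2):
    power = simulated_power(20, 10, 2.5, 10.0, cv)
    print(f"scaling-factor CV = {cv}: simulated power = {power:.2f}")
```

Because the reference is the across-site mean, the estimated factors average to one by construction, so they describe each site relative to the pool rather than to a ground truth. In this toy setup, raising the spread of the scaling factors inflates the pooled variance and lowers the simulated power, which is the qualitative trade-off the abstract describes for deciding whether harmonization is needed.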

Original language: English (US)
Pages (from-to): 281-294
Number of pages: 14
Journal: NeuroImage
Volume: 134
DOI: https://doi.org/10.1016/j.neuroimage.2016.03.051
State: Published - Jul 1 2016

Fingerprint

Calibration
Least-Squares Analysis
Research Personnel
Research

ASJC Scopus subject areas

  • Cognitive Neuroscience
  • Neurology

Cite this

Keshavan, A., Paul, F., Beyer, M. K., Zhu, A. H., Papinutto, N., Shinohara, R. T., ... International Multiple Sclerosis Genetics Consortium (2016). Power estimation for non-standardized multisite studies. NeuroImage, 134, 281-294. https://doi.org/10.1016/j.neuroimage.2016.03.051

@article{05f79476a0784a55bd7eca3daf46a05e,
title = "Power estimation for non-standardized multisite studies",
abstract = "A concern for researchers planning multisite studies is that scanner and T1-weighted sequence-related biases on regional volumes could overshadow true effects, especially for studies with a heterogeneous set of scanners and sequences. Current approaches attempt to harmonize data by standardizing hardware, pulse sequences, and protocols, or by calibrating across sites using phantom-based corrections to ensure the same raw image intensities. We propose to avoid harmonization and phantom-based correction entirely. We hypothesized that the bias of estimated regional volumes is scaled between sites due to the contrast and gradient distortion differences between scanners and sequences. Given this assumption, we provide a new statistical framework and derive a power equation to define inclusion criteria for a set of sites based on the variability of their scaling factors. We estimated the scaling factors of 20 scanners with heterogeneous hardware and sequence parameters by scanning a single set of 12 subjects at sites across the United States and Europe. Regional volumes and their scaling factors were estimated for each site using Freesurfer's segmentation algorithm and ordinary least squares, respectively. The scaling factors were validated by comparing the theoretical and simulated power curves, performing a leave-one-out calibration of regional volumes, and evaluating the absolute agreement of all regional volumes between sites before and after calibration. Using our derived power equation, we were able to define the conditions under which harmonization is not necessary to achieve 80{\%} power. This approach can inform choice of processing pipelines and outcome metrics for multisite studies based on scaling factor variability across sites, enabling collaboration between clinical and research institutions.",
author = "Anisha Keshavan and Friedemann Paul and Beyer, {Mona K.} and Zhu, {Alyssa H.} and Nico Papinutto and Shinohara, {Russell T.} and William Stern and Michael Amann and Rohit Bakshi and Antje Bischof and Alessandro Carriero and Manuel Comabella and Crane, {Jason C.} and Sandra D'Alfonso and Philippe Demaerel and Benedicte Dubois and Massimo Filippi and Vinzenz Fleischer and Bertrand Fontaine and Laura Gaetano and An Goris and Christiane Graetz and Adriane Gr{\"o}ger and Sergiu Groppa and Hafler, {David A.} and Harbo, {Hanne F.} and Bernhard Hemmer and Kesshi Jordan and Ludwig Kappos and Gina Kirkish and Sara Llufriu and Stefano Magon and Filippo Martinelli-Boneschi and McCauley, {Jacob L} and Xavier Montalban and Mark M{\"u}hlau and Daniel Pelletier and Pradip Pattany and Pericak-Vance, {Margaret A} and Isabelle Cournu-Rebeix and Rocca, {Maria A.} and Alex Rovira and Regina Schlaeger and Albert Saiz and Till Sprenger and Alessandro Stecco and Uitdehaag, {Bernard M J} and Pablo Villoslada and Wattjes, {Mike P.} and Howard Weiner and Jens Wuerfel and Claus Zimmer and Frauke Zipp and Hauser, {Stephen L.} and Oksenberg, {Jorge R.} and Henry, {Roland G.} and {International Multiple Sclerosis Genetics Consortium}",
year = "2016",
month = "7",
day = "1",
doi = "10.1016/j.neuroimage.2016.03.051",
language = "English (US)",
volume = "134",
pages = "281--294",
journal = "NeuroImage",
issn = "1053-8119",
publisher = "Academic Press Inc.",

}
