The Neuro/PsyGRID calibration experiment: identifying sources of variance and bias in multicenter MRI studies.
Suckling J., Barnes A., Job D., Brennan D., Lymer K., Dazzan P., Marques TR., MacKay C., McKie S., Williams SR., Williams SCR., Deakin B., Lawrie S.
Calibration experiments precede multicenter trials to identify potential sources of variance and bias. In support of future imaging studies of mental health disorders and their treatment, the Neuro/PsyGRID consortium commissioned a calibration experiment to acquire functional and structural MRI from twelve healthy volunteers attending five centers on two occasions. Three measures were derived: task activation from a working memory paradigm, fractal scaling (Hurst exponent) from resting fMRI, and grey matter distributions from T1-weighted sequences. At each intracerebral voxel, a fixed-effects analysis of variance estimated components of variance corresponding to the factors of center, subject, occasion, and within-occasion order, and to the interactions of center-by-occasion, subject-by-occasion, and center-by-subject. Because there is no intervention, the center-by-subject interaction is a surrogate of the expected variance of the treatment effect standard error across centers. A rank order test of between-center differences indicated whether subject-by-center interactions were of the crossover or noncrossover type. In general, the center, subject, and error components constituted >90% of the total variance, whereas occasion, order, and all interactions were generally <5%. Subject was the primary source of variance (70%-80%) for grey matter, with error variance the dominant component for fMRI-derived measures. Spatially, variance was broadly homogeneous, with the exception of fractal scaling measures, which delineated white matter, an effect related to the flip angle of the EPI sequence. Maps of P values for the associated F-tests were also derived. Rank tests were highly significant, indicating that the order of measures across centers was preserved. In summary, center effects should be modeled at the voxel level using existing and long-standing statistical recommendations.
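The voxel-wise decomposition described above can be illustrated with a minimal sketch. It is not the consortium's implementation: it assumes a balanced design stored as a hypothetical array `y[center, subject, occasion]` for a single voxel, omits the within-occasion order factor, treats the three-way residual as error, and reports each term's share of the total sum of squares rather than a formally estimated variance component. The function name `ss_fractions` and the simulated effect sizes are illustrative assumptions only.

```python
import numpy as np


def ss_fractions(y):
    """Share of the total sum of squares per ANOVA term for one voxel.

    y: balanced array of shape (centers, subjects, occasions).
    Assumes y is not constant (total sum of squares > 0).
    """
    y = np.asarray(y, dtype=float)
    grand = y.mean()

    # Marginal means, kept broadcastable against y.
    m_c = y.mean(axis=(1, 2), keepdims=True)   # center
    m_s = y.mean(axis=(0, 2), keepdims=True)   # subject
    m_o = y.mean(axis=(0, 1), keepdims=True)   # occasion
    m_cs = y.mean(axis=2, keepdims=True)       # center x subject cells
    m_co = y.mean(axis=1, keepdims=True)       # center x occasion cells
    m_so = y.mean(axis=0, keepdims=True)       # subject x occasion cells

    effects = {
        "center": m_c - grand,
        "subject": m_s - grand,
        "occasion": m_o - grand,
        "center:subject": m_cs - m_c - m_s + grand,
        "center:occasion": m_co - m_c - m_o + grand,
        "subject:occasion": m_so - m_s - m_o + grand,
    }
    # Residual after all main effects and two-way interactions
    # (the three-way term, used here as the error estimate).
    effects["error"] = y - (grand + sum(effects.values()))

    total = np.sum((y - grand) ** 2)
    return {
        name: float(np.sum(np.broadcast_to(e, y.shape) ** 2) / total)
        for name, e in effects.items()
    }


# Simulated voxel: 5 centers, 12 subjects, 2 occasions (illustrative values).
rng = np.random.default_rng(0)
center_effect = np.linspace(-1.0, 1.0, 5)[:, None, None]
subject_effect = np.linspace(-3.0, 3.0, 12)[None, :, None]
y = center_effect + subject_effect + rng.normal(scale=0.1, size=(5, 12, 2))

fractions = ss_fractions(y)
for name, frac in fractions.items():
    print(f"{name:>17s}: {frac:.3f}")
```

Because the design is balanced, the terms are orthogonal and their shares sum to one. Applying the function independently at every voxel would yield one map per term, in the spirit of the variance maps reported here.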