Erratum: Autism-related dietary preferences mediate autism-gut microbiome associations (Cell (2021) 184(24) (5916–5931.e17), (S0092867421012319), (10.1016/j.cell.2021.10.015))
Yap CX., Henders AK., Alvares GA., Wood DLA., Krause L., Tyson GW., Restuadi R., Wallace L., McLaren T., Hansell NK., Cleary D., Grove R., Hafekost C., Harun A., Holdsworth H., Jellett R., Khan F., Lawson LP., Leslie J., Frenk ML., Masi A., Mathew NE., Muniandy M., Nothard M., Miller JL., Nunn L., Holtmann G., Strike LT., de Zubicaray GI., Thompson PM., McMahon KL., Wright MJ., Visscher PM., Dawson PA., Dissanayake C., Eapen V., Heussler HS., McRae AF., Whitehouse AJO., Wray NR., Gratten J.
(Cell 184, 5916–5931; November 24, 2021) Our paper reported evidence that autism-related dietary preferences mediate autism-microbiome associations. Since publication, we have become aware of an error in our paper that we are now correcting. Specifically, in the code we wrote and used to transform the microbiome count matrices in our variance component analysis, we inadvertently missed a matrix transposition, which affected their centered-log-ratio (clr) transformation and, in turn, the variance estimates in Figure 2 and Table S1 (listed in detail below). The clr transform should take the quotient of a microbiome/taxa quantity by the geometric mean of microbiome quantities across the sample/individual; by missing the matrix transposition, we incorrectly calculated the geometric mean per-taxa rather than per-individual. However, the error does not affect the conclusions of the paper because the per-taxa and per-individual geometric means are similar, and so the resulting clr transformed matrices are similar as well. To show that this is the case, we compared the correctly (geometric mean calculated per-individual) and incorrectly (geometric mean calculated per-taxa) clr transformed matrices by taking the nth column of both matrices (each column representing one of the 247 individuals’ microbiome data) and calculating the Pearson's correlation coefficient between them. The median Pearson's correlation coefficient ranged from 0.90 to 0.94 across the common species, rare species, common genes, and rare genes matrices. As the correctly and incorrectly transformed matrices are highly correlated, this error has negligible impact on the variance component analysis results and does not change the overall conclusions of our work. The code error did not affect which microbiome features were identified as being differentially abundant, as the method used for this analysis (ANCOMv2.1) takes un-transformed count data as input.
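The difference between the two geometric-mean directions, and the column-wise comparison described above, can be sketched in a few lines of Python. This is an illustrative sketch only and is not the authors' code: the orientation (rows as taxa, columns as individuals), the pseudocount of 1 used to avoid log(0), the function names, and the purely synthetic Poisson counts are all assumptions made for the example. The correlation it produces on synthetic data is not comparable to the 0.90 to 0.94 reported for the real matrices.

```python
import numpy as np

def clr(counts, axis, pseudocount=1.0):
    """Centered log-ratio: log(x) minus the mean of log(x) along `axis`.

    Subtracting the mean log is equivalent to dividing by the geometric mean.
    The pseudocount is an assumption of this sketch (it avoids log(0)).
    """
    logged = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return logged - logged.mean(axis=axis, keepdims=True)

def columnwise_pearson(a, b):
    """Pearson correlation between matching columns of two same-shaped matrices."""
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)
    num = (a_c * b_c).sum(axis=0)
    den = np.sqrt((a_c ** 2).sum(axis=0) * (b_c ** 2).sum(axis=0))
    return num / den

# Hypothetical synthetic counts: rows are taxa, columns are individuals.
rng = np.random.default_rng(0)
n_taxa, n_individuals = 50, 247
counts = rng.poisson(lam=20.0, size=(n_taxa, n_individuals))

# Intended transform: geometric mean taken over taxa within each individual.
clr_per_individual = clr(counts, axis=0)
# The slip: geometric mean taken over individuals within each taxon,
# which is what a missing transposition produces.
clr_per_taxon = clr(counts, axis=1)

# One correlation per individual (per column), then the median across them.
r = columnwise_pearson(clr_per_individual, clr_per_taxon)
median_r = float(np.median(r))
```

With this orientation, the only difference between the two transforms is the `axis` argument, which is why an omitted transpose silently yields a matrix of the right shape but with the wrong centering.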
However, the data visualization for the differential abundance analysis was affected with respect to the x-axes of Figures 3A–3C, 3E, 3F, and S4, which reflect the degree and directionality of differential abundance. In the updated plots, all the significant or near-significant microbiome features have identical directions of effect to the original plots as well as similar magnitudes of effect. We can confirm that the other instances of clr transformation were performed correctly; namely, in generating the dietary PCs, CD4+ T cells, and the PCA plot (Figure 5A). We have updated the following:
(1) Figure 2 has been amended with the updated data:
• Under age, species_common has changed from 33 to 35, transporter(TCDB)_common from 42 to 36, pathway(MetaCyc)_common from 40 to 39, and food(AES) from 33 to 46.
• Under BMI, species, transporter(TCDB)_common has changed from 1 to 4, pathway(MetaCyc)_common from 0 to 1, genes(Microba)_common from 7 to 10, and food(AES) from 12 to 22.
• Under ASD, genes(Microba)_rare has changed from 7 to 9.
• Under IQ_DQ, species_common has changed from 3 to 5, genes(Microba)_common from 7 to 14, and food(AES) from 3 to 14.
• Under Sleep, species_common has changed from 10 to 11, and genes(Microba)_common has changed from 0 to 6.
• Under rBSC, species_common has changed from 5 to 6, species_rare from 41 to 40, transporter(TCDB)_common from 3 to 4, and genes(Microba)_common from 49 to 50.
• Under dietary_PC1, enzyme(ECL4)_common has changed from 48 to 46, pathway(MetaCyc)_common from 25 to 24, and genes(Microba)_common from 48 to 47.
• Under dietary_PC2, species_rare has changed from 1 to 0, genes(Microba)_common from 7 to 8, and genes(Microba)_rare from 3 to 0.
• Under dietary_PC3, species_common has changed from 4 to 6, species_rare from 1 to 0, transporter(TCDB)_common from 21 to 17, pathway(MetaCyc)_common from 11 to 10, and genes(Microba)_common from 27 to 28.
• Under diet_diversity, species_rare has changed from 20 to 14, transporter(TCDB)_common from 26 to 23, and genes(Microba)_common from 26 to 23.
• We have also taken this opportunity to switch the y-axis order for “species_rare” and “enzyme(ECL4)_common” to better separate the taxonomic and functional datasets.
(2) Table S1, which contains the raw data presented in Figure 2, has been amended with the updated OREML results.
(3) In the main text, the third to fifth paragraphs of the section titled “Negligible variance in ASD diagnostic status is associated with the microbiome compared to age, stool and dietary traits” have been amended:
• The age common species b2 estimate and standard error have changed from 33% (SE = 8%) to 35% (SE = 7%).
• The p value for the BMI common species analysis has changed from p = 3.5e-2 (not FDR-significant) to p = 1.8e-2 (FDR-significant).
• With reference to the age gene-level ORM analyses, the range of standard errors has been changed from 13%–17% to 14%–17%.
• The BMI rare genes b2 estimate has changed from 46% to 47%, and the p value has changed from 8.4e-3 to 1.1e-2.
• The ASD rare genes b2 estimate has changed from 7% to 9%, and the p value has changed from 0.33 to 0.29.
• The IQ-DQ common species b2 estimate has changed from 7% (SE = 13%, p = 0.39) to 5% (SE = 6%, p = 0.20).
• The sleep problems common species b2 estimate has changed from 10% to 11%, and the p value has changed from 0.17 to 8.2e-2.
• The stool consistency rare species b2 estimate has changed from 41% to 40%, and the p value has changed from 8.7e-6 to 2.8e-5.
• The stool consistency rare genes standard error has changed from 20% to 21%, and the p value has changed from 2.5e-5 to 5.8e-5.
• We have corrected an error where the dietary PC1 common genes b2 estimate (b2 = 48%, SE = 15%, p = 3.8e-4) was mislabeled as the rare genes analysis. We have also updated the common genes b2 estimate from 48% to 47% and updated the p value from 3.8e-4 to 4.5e-5.
(4) Figure S1, which visualizes the diagonals and off-diagonals of the omics relationship matrix (ORM; which, in turn, is based on the centered-log-ratio transformed microbiome matrices), has been amended with the updated OREML results.
(5) Figure S2, which draws upon ORMs using rare microbiome features to compare the effects of prior clr transformation versus binary coding as a sensitivity analysis, has been amended with the updated OREML results.
(6) Figure S3, which provides a variety of OREML estimates to support Figure 2 (including the impact of estimating b2 with a combination of multiple ORMs and collapsing taxonomic microbiome data into higher levels of hierarchy), has been amended with the updated OREML results.
(7) Methods S1, which provides results from extensive sensitivity analyses to support the main results, has also been amended with the updated OREML results. We have also updated the section “Estimating the upper limit of predictivity using non-additive models,” for which we used adaboost as a sensitivity analysis for a method that does not assume additivity. In this analysis, the mean prediction accuracy for ASD changed from 53% (SD = 7%) to 53% (SD = 8%), and the prediction accuracy for age changed from 62% (SD = 7%) to 63% (SD = 9%).
(8) Figures 3A–3C, 3E, and 3F, which visualize differentially abundant microbiome features, now have updated x-axes.
(9) Figure S4, which supports Figure 3 by providing results from sensitivity analyses for differential abundance, also has updated x-axes.
(10) Tables S2.1, S2.3, S2.8, S2.13, and S2.14, which provide data (including x-axis coordinates) for Figures 3A–3C, 3E, and 3F, have been updated.
(11) Unrelated to the clr transformation error, we have also updated the heading of the upper plot in Figure 4I to read “Diet ∼ Sensory score” rather than “Taxa ∼ Sensory score.”
(12) The accompanying Zenodo code has been updated, and the link has been changed from https://zenodo.org/records/5558047 to https://zenodo.org/records/5558046. The specific code updates can be viewed on the linked GitHub page.
These errors have now been corrected in the online version of the paper. We apologize for any inconvenience that this may have caused the readers.