Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries.
Zheng Z., Liu S., Sidorenko J., Wang Y., Lin T., Yengo L., Turley P., Ani A., Wang R., Nolte IM., Snieder H., LifeLines Cohort Study None., Yang J., Wray NR., Goddard ME., Visscher PM., Zeng J.
We develop a method, SBayesRC, that integrates genome-wide association study (GWAS) summary statistics with functional genomic annotations to improve polygenic prediction of complex traits. Our method is scalable to whole-genome variant analysis and refines signals from functional annotations by allowing them to affect both causal variant probability and causal effect distribution. We analyze 50 complex traits and diseases using ∼7 million common single-nucleotide polymorphisms (SNPs) and 96 annotations. SBayesRC improves prediction accuracy by 14% in European ancestry and up to 34% in cross-ancestry prediction compared to the baseline method SBayesR, which does not use annotations, and outperforms other methods, including LDpred2, LDpred-funct, MegaPRS, PolyPred-S and PRS-CSx. Investigation of factors affecting prediction accuracy identifies a significant interaction between SNP density and annotation information, suggesting whole-genome sequence variants with annotations may further improve prediction. Functional partitioning analysis highlights a major contribution of evolutionary constrained regions to prediction accuracy and the largest per-SNP contribution from nonsynonymous SNPs.