Accurate prediction of an individual's phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to SBayesR, a model that utilises summary statistics from genome-wide association studies (GWAS). In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R² by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding.

Journal article


Nat Commun

Adipose Tissue; Alopecia; Basal Metabolism; Bayes Theorem; Biological Specimen Banks; Birth Weight; Body Composition; Body Height; Body Mass Index; Bone Density; Diabetes Mellitus, Type 2; Forced Expiratory Volume; Genetic Association Studies; Genome-Wide Association Study; Humans; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Regression Analysis; Statistics as Topic; Vital Capacity; Waist-Hip Ratio