Article open access publication

Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores

American Journal of Human Genetics, Elsevier, ISSN 1537-6605

Volume 97, Issue 4, 2015

DOI: 10.1016/j.ajhg.2015.09.001, Dimensions: pub.1046213550, PMC: PMC4596916, PMID: 26430803

Authors

Vilhjálmsson, Bjarni J. * (1) (2) (3)
Finucane, Hilary K. (2) (3) (5)
Ripke, Stephan (3) (6) (7)
Loh, Po-Ru (2) (3)
Bhatia, Gaurav (2) (3)
Do, Ron (8)
Hayeck, Tristan (2) (3)
Won, Hong-Hee (3) (7)
Tamimi, Rulla (2) (10)
Stahl, Eli (3) (8)
De Jager, Philip (2) (3) (10)
Purcell, Shaun (3) (8)
Chasman, Daniel (2) (10)
Goddard, Michael (13) (14)
Kraft, Peter (2) (3)
Price, Alkes L. * (2) (3)

* Corresponding author

Affiliations

Organisations

  (1) Aarhus University, grid.7048.b, AU
  (2) Harvard University, grid.38142.3c
  (3) Broad Institute, grid.66859.34
  (4) University of Queensland, grid.1003.2
  (5) Massachusetts Institute of Technology, grid.116068.8
  (6) Charité, grid.6363.0
  (7) Massachusetts General Hospital, grid.32224.35
  (8) Icahn School of Medicine at Mount Sinai, grid.59734.3c
  (9) University of Southern California, grid.42505.36
  (10) Brigham and Women's Hospital, grid.62560.37
  (11) University of California, San Francisco, grid.266102.1
  (12) University of California, Los Angeles, grid.19006.3e
  (13) Department of Environment, Land, Water and Planning, grid.452205.4
  (14) University of Melbourne, grid.1008.9

Description

Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
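
As a rough illustration of the standard pruning-and-thresholding approach mentioned in the description, the following minimal Python sketch may help. It is not the LDpred software and not the authors' code. The function names, the greedy p value-ordered pruning (often called clumping), the default cutoffs `p_max` and `r2_max`, and the toy numbers are all assumptions made for this example.

```python
def prune_and_threshold(betas, pvalues, r2, p_max=0.05, r2_max=0.2):
    """Return per-marker weights after LD pruning and p value thresholding.

    betas   : marginal effect estimates, one per marker
    pvalues : association p values, one per marker
    r2      : square matrix (list of lists) of pairwise LD r^2 values
    p_max, r2_max : illustrative cutoffs (assumed, not taken from the paper)
    """
    # Visit markers from most to least significant.
    order = sorted(range(len(betas)), key=lambda i: pvalues[i])
    kept = []
    for i in order:
        if pvalues[i] > p_max:
            break  # every remaining marker is less significant
        # Keep the marker only if it is not in high LD with a kept marker.
        if all(r2[i][j] <= r2_max for j in kept):
            kept.append(i)
    # Markers that were pruned or thresholded out get weight zero.
    weights = [0.0] * len(betas)
    for i in kept:
        weights[i] = betas[i]
    return weights


def risk_score(genotypes, weights):
    """Weighted sum of allele counts (0, 1, or 2) for one individual."""
    return sum(g * w for g, w in zip(genotypes, weights))


# Toy data (hypothetical): markers 0 and 1 are in high LD with each other.
betas = [0.30, 0.28, -0.10, 0.02]
pvalues = [1e-6, 1e-5, 0.01, 0.6]
r2 = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.05, 0.0],
    [0.0, 0.05, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]

weights = prune_and_threshold(betas, pvalues, r2)
score = risk_score([2, 1, 1, 0], weights)
```

In this toy example marker 1 is dropped because it is in high LD with the more significant marker 0, and marker 3 is dropped by the p value cutoff. Both estimates are simply discarded, which is the loss of information the description refers to. LDpred instead keeps all markers and shrinks their effects using a prior and the LD matrix.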

Links & Metrics

NORA University Profiles

Aarhus University

Danish Open Access Indicator

2015: Unused

Research area: Science & Technology

Danish Bibliometrics Indicator

2015: Level 2

Research area: Science & Technology

Dimensions Citation Indicators

Times Cited: 411

Relative Citation Ratio (RCR): 12.61

Open Access Info

Hybrid