Article open access publication

Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

Nature, Springer Nature, ISSN 1476-4687

Volume 548, Issue 7665, 2017

DOI: 10.1038/nature23264, Dimensions: pub.1090898815, PMID: 28746312

Affiliations

Organisations

  1. University of Copenhagen, grid.5254.6, KU
  2. Aarhus University, grid.7048.b, AU
  3. Technical University of Denmark, grid.5170.3, DTU
  4. BGI-Europe, Ole Maaløes Vej 3, 2200, Copenhagen, Denmark
  5. Lundbeck Foundation, grid.452548.a
  6. South China University of Technology, grid.79703.3a
  7. Beijing Genomics Institute, grid.21155.32
  8. Department of Psychology, University of Oslo, 0317, Oslo, Norway
  9. University of Bergen, grid.7914.b
  10. Haukeland University Hospital, grid.412008.f
  11. Karolinska Institute, grid.4714.6
  12. University of North Carolina System, grid.410711.2
  13. Bispebjerg Hospital, grid.411702.1, Capital Region

Description

Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark.

Links & Metrics

NORA University Profiles

University of Copenhagen

Aarhus University

Technical University of Denmark

Danish Open Access Indicator

2017: Realized

Research area: Science & Technology

Danish Bibliometrics Indicator

2017: Level 2

Research area: Science & Technology

Dimensions Citation Indicators

Times Cited: 65

Field Citation Ratio (FCR): 13.66

Relative Citation Ratio (RCR): 2.64

Open Access Info

Hybrid