Grant

Ensembl and enabling genetics and genomics research in farmed animal species

Funder: Biotechnology and Biological Sciences Research Council

Dimensions: grant.2767620

Investigators

Affiliations

Organisations

  1. University of Edinburgh, grid.4305.2

Research Organisations

Countries

United Kingdom

Continents

Europe

Abstract

The sequence of almost all genes (a draft genome sequence) has been determined for several farmed and companion animals including cattle, pigs, chickens, turkeys, dogs and horses. Draft genome sequences for several other species such as sheep, ducks and salmon will be completed soon. The strings of billions of bases (symbolised as four letters A, C, G, T) that constitute these genome sequences are not immediately useful to biological research scientists. Annotating these draft genome sequences with features such as the coding and regulatory parts of genes, and bases which differ between individuals within a species (genetic variants), greatly enhances the value and utility of the genome sequence. Visualising the genome sequences complete with annotations in a freely accessible manner further improves the value of the information. The web-mounted Ensembl genome browser, databases and associated annotation tools have been shown to be powerful and effective means of annotating the complex genomes of animal species including humans, mice and, more recently, farmed and companion animals. This project is concerned with improving the quality of genome annotation for farmed and companion animal genomes. International consortia of scientists are using so-called next-generation sequencing technology not only to sequence the genomes of more economically important species, but also to sequence the genomes of multiple individuals of each species of interest and to improve or finish the reference genome sequences for key species. These new sequencing technologies are also being used increasingly in assays, for example, of the extent of gene expression in different cells or under different conditions (transcriptomics) or of the state of the genome (epigenomics).
Mapping the sequence read-outs from these assays back to the relevant genome sequence not only provides a genome-wide framework for analysis but also provides further information with which to annotate the genome sequence itself. Thus, there is a recurring need to refresh the genome sequence annotation for important animal species. We will use the Ensembl system to annotate the genome sequences of key farmed and companion animal species. The resulting annotated genome sequences will be made freely available as resources mounted on the World Wide Web. Recently developed features within the Ensembl system enable the analysis and visualisation of genetic variation (i.e. sequence differences) between individuals of the same species. This genetic variation explains the differences in traits such as growth, milk yield and susceptibility to disease. We will populate the Ensembl-animal Variation databases with sequence and genotype data acquired from the animal genetics research community. Visualising these variation data and making them accessible to the scientific community and the animal breeding industry will facilitate research to understand the genetic control of complex traits in animals and the genetic improvement of farmed animals. A high quality annotated reference genome sequence is a critical bioinformatics resource for the effective prosecution of contemporary research in the biological sciences. The value and utility of such bioinformatics resources are critically dependent upon the currency of the resource. Thus, this project is concerned with delivering high quality, up-to-date annotated reference genomes for key farmed and companion animal species to enable research on these economically or socially important animal species.

Technical Summary

A high quality annotated reference genome sequence is critical to contemporary research in the biological sciences.
The Ensembl browser and associated annotation tools and databases have been shown to be robust and effective means for making genomic information useful to a wide range of users. Draft reference genome sequences have been established for several farmed animal species (chicken, cattle, pig, horse, turkey) and sequencing is well advanced for several others (including sheep, duck, salmon). Annotated assemblies have already been made available through the Ensembl (chicken, cattle, pig, horse) and Pre-Ensembl (duck, turkey, sheep) genome browsers. However, the utility of a bioinformatics resource is critically dependent upon the currency of the resource. Genome sequence assemblies, including the 'finished' human and mouse sequences, are subject to continual revision as new data are acquired and errors corrected. This proposal is concerned with maintaining the currency of Ensembl in respect of farmed and companion animal species, including poultry and farmed fish. Whilst first draft genome sequences have been established for several of the species of interest, improved genome assemblies and increased volumes of ancillary data, including RNAseq and ChIPseq data, are also being generated for these species. Thus, we will use these growing and improving data to develop up-to-date and enhanced annotation for these species. Not only the genomes of more farmed animal species, but also the genomes of multiple individuals within a species, are being sequenced. The recently developed Ensembl variation resources allow these additional data to be captured and visualised for the benefit of scientists engaged in genetics, genomics and other lines of research on the target species. We will work with the animal sciences research community to acquire re-sequencing data and SNP and CNV genotypes with which to populate the Ensembl-animal variation databases.

Funding information

Funding period: 2012-2015

Funding amount: EUR 392611

Grant number: BB/I025328/1

Research Categories

Main Subject Area