Mathematical, Statistical and Computational Aspects of the New Science of Metagenomics
Linking taxa to function through contig clustering of microbial metagenomes

Update: 2014-03-31
Description

Co-authors: Johannes Alneberg (KTH Royal Institute of Technology, Stockholm, Sweden), Brynjar Smaari Bjarnason (KTH Royal Institute of Technology, Stockholm, Sweden), Ino de Bruijn (KTH Royal Institute of Technology, Stockholm, Sweden), Melanie Schirmer (University of Glasgow), Joshua Quick (University of Birmingham), Nicholas J. Loman (University of Birmingham), Anders F. Andersson (KTH Royal Institute of Technology, Stockholm, Sweden), Konstantinos Gerasimidis (University of Glasgow)

Taxonomic profiling of microbial communities can answer the question “Who is there?” This can be achieved through either marker gene sequencing or true shotgun metagenomics. Because shotgun metagenomics sequences the functional genes of all community members, it also lets us answer a second question: “What are they doing?” However, a third question is key to understanding microbial communities: “Who is doing what?” This question has received much less attention because answering it requires extracting complete genomes from metagenomes. Assembly of metagenomes can generate millions of contigs (assembled genome fragments), with no information on which contig derives from which genome. Here I will present CONCOCT, a novel algorithm that combines sequence composition, coverage across multiple samples, and read-pair linkage to automatically cluster contigs into genomes. CONCOCT couples dimensionality reduction to a Gaussian mixture model, fit using a variational Bayesian algorithm that automatically identifies the optimal number of clusters. We demonstrate high recall and precision on artificial as well as real human gut metagenome datasets. Linking contigs into genome clusters allows the frequencies of those clusters to be related to metadata, revealing function. We apply this approach to fecal metagenomes obtained from the E. coli O104:H4 epidemic (Germany, 2011) and are able to directly extract the outbreak genome. We also use it to identify organisms associated with inflammation in samples from children with Crohn’s disease.
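
As a rough illustration of how sequence composition and multi-sample coverage might be combined into one feature vector per contig before clustering, here is a minimal sketch using only the Python standard library. The abstract does not state the k-mer length, the pseudocounts, or the normalisation, so the choices below (canonical tetramers, log transform, the function names `composition_profile`, `coverage_profile` and `contig_features`) are assumptions made for illustration and should not be read as CONCOCT's actual implementation. Read-pair linkage, the dimensionality reduction and the variational Bayesian mixture model are not shown.

```python
import math
from itertools import product

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k):
    """Sorted k-mers, keeping one representative per reverse-complement pair."""
    kmers = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        kmers.add(min(kmer, reverse_complement(kmer)))
    return sorted(kmers)


def composition_profile(seq, k=4, pseudocount=1.0):
    """Log relative frequency of each canonical k-mer in seq.

    The pseudocount (an assumed value) keeps unseen k-mers finite after
    the log transform.
    """
    index = canonical_kmers(k)
    counts = dict.fromkeys(index, pseudocount)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        canon = min(kmer, reverse_complement(kmer))
        if canon in counts:  # silently skips k-mers containing N etc.
            counts[canon] += 1
    total = sum(counts.values())
    return [math.log(counts[kmer] / total) for kmer in index]


def coverage_profile(mean_coverage, pseudocount=0.01):
    """Log-transformed mean read coverage of one contig in each sample."""
    return [math.log(c + pseudocount) for c in mean_coverage]


def contig_features(seq, mean_coverage, k=4):
    """Concatenate composition and coverage into one feature vector."""
    return composition_profile(seq, k) + coverage_profile(mean_coverage)


if __name__ == "__main__":
    # Hypothetical contig with made-up coverage values in three samples.
    features = contig_features("ACGTACGTTTGACA", [12.0, 0.0, 3.5])
    print(len(features))
```

With k = 4 there are 136 canonical tetramers, so a contig observed in three samples yields a vector of 139 values. Merging each k-mer with its reverse complement means a contig and its reverse-complemented copy get the same composition profile, which matters because assembly orientation is arbitrary. Vectors like these could then be passed to a dimensionality reduction step and a mixture model of the kind the abstract describes.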

Related Links

http://arxiv.org/abs/1312.4038 - arXiv preprint
Vincenzo Abete
