Webserver-supported storage of metagenomic datasets using MEGANv5
Update: 2014-03-31
Description
Co-author: Daniel Huson (University of Tuebingen, Algorithms in Bioinformatics, Tuebingen, Germany)
Background: Metagenomics is a rapidly growing field of research that aims to study assemblages of uncultured organisms by sequencing, in the hope of understanding the true diversity of microbes, their functions, cooperation and evolution. While early papers studied isolated samples or small numbers of samples, a growing number of projects now systematically collect multiple samples, and, as sequencing costs fall, those samples are growing in size. Moreover, more attention is being paid to the problem of recording relevant environmental parameters (so-called metadata). There is a need for tools that allow one to store and analyze multiple metagenomic datasets in the context of their metadata.
Results: We announce an extension to our metagenome analysis tool MEGAN, called MeganServer, that allows one to store metagenomic datasets on a secure server in order to reduce redundancy and make it easier to share large datasets between project members or to make them publicly available. The software also allows one to capture the metadata associated with datasets and then use it to form new composite datasets by combining primary datasets based on the values of their environmental parameters. The user can analyze any such combined dataset in MEGAN exactly like a primary dataset. Internally, however, a combined dataset refers back to the primary datasets and thus does not duplicate any reads or matches.
Conclusions: With falling sequencing costs, metagenomic datasets are growing to sizes too large to be stored locally. Installing MeganServer on a computer cluster or using a publicly available instance allows one to store datasets on a server without losing the benefits of using MEGAN locally. In addition, combining datasets based on environmental features is an important step in the comparative analysis of metagenome datasets.
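The idea of a combined dataset that refers back to primary datasets, rather than copying their reads, can be illustrated with a small sketch. This is not MeganServer's implementation or API: the class names, the `combine_by_metadata` helper, the metadata keys and the sample reads are all hypothetical, chosen only to show selection by metadata values and reference-based composition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List


@dataclass
class PrimaryDataset:
    """Hypothetical primary dataset: reads plus the metadata of the sample."""
    name: str
    metadata: Dict[str, object]
    reads: List[str] = field(default_factory=list)


@dataclass
class CombinedDataset:
    """Hypothetical composite dataset holding references to primary datasets."""
    name: str
    members: List[PrimaryDataset]

    def iter_reads(self) -> Iterator[str]:
        # Reads are pulled from the primary datasets on demand, never copied.
        for dataset in self.members:
            yield from dataset.reads


def combine_by_metadata(
    name: str,
    datasets: List[PrimaryDataset],
    predicate: Callable[[Dict[str, object]], bool],
) -> CombinedDataset:
    """Select primary datasets whose metadata satisfies the predicate."""
    return CombinedDataset(name, [d for d in datasets if predicate(d.metadata)])


# Made-up samples, for illustration only.
samples = [
    PrimaryDataset("soil_A", {"habitat": "soil", "ph": 6.5}, ["ACGT", "TTGA"]),
    PrimaryDataset("soil_B", {"habitat": "soil", "ph": 7.2}, ["GGCA"]),
    PrimaryDataset("sea_A", {"habitat": "marine", "ph": 8.1}, ["CCAT"]),
]

soil = combine_by_metadata(
    "all_soil", samples, lambda meta: meta.get("habitat") == "soil"
)
soil_reads = list(soil.iter_reads())
```

Because `soil.members` holds the same objects as `samples`, the combined dataset stores no second copy of any read, which mirrors the property described in the Results.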