Handle:
http://hdl.handle.net/10033/596930
Title:
PhyloPythiaS+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes.
Authors:
Gregor, Ivan; Dröge, Johannes; Schirmer, Melanie; Quince, Christopher; McHardy, Alice C
Abstract:
Background. Metagenomics is an approach for characterizing environmental microbial communities in situ; it allows their functional and taxonomic characterization and the recovery of sequences from uncultured taxa. This is often achieved by a combination of sequence assembly and binning, where sequences are grouped into 'bins' representing taxa of the underlying microbial community. Assignment to low-ranking taxonomic bins is an important challenge for binning methods, as is scalability to Gb-sized datasets generated with deep sequencing techniques. One of the best available methods for recovering species bins from deep-branching phyla is the expert-trained PhyloPythiaS package, where a human expert decides on the taxa to incorporate in the model and identifies 'training' sequences based on marker genes directly from the sample. Due to the manual effort involved, this approach does not scale to multiple metagenome samples and requires substantial expertise, which researchers who are new to the area do not have. Results. We have developed PhyloPythiaS+, a successor to our PhyloPythia(S) software. The new (+) component performs the work previously done by the human expert. PhyloPythiaS+ also includes a new k-mer counting algorithm, which accelerated the simultaneous counting of 4-6-mers used for taxonomic binning 100-fold and reduced the overall execution time of the software by a factor of three. Our software allows the analysis of Gb-sized metagenomes with inexpensive hardware and the recovery of species- or genus-level bins with low error rates in a fully automated fashion. PhyloPythiaS+ was compared to MEGAN, taxator-tk, Kraken and the generic PhyloPythiaS model. The results showed that PhyloPythiaS+ performs especially well, in comparison to the other methods, for samples originating from novel environments. Availability. PhyloPythiaS+ in a virtual machine is available for installation under Windows, Unix systems or OS X at: https://github.com/algbioi/ppsp/wiki.
Affiliation:
Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany.
Citation:
PhyloPythiaS+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes. 2016, 4:e1603 PeerJ
Journal:
PeerJ
Issue Date:
2016
URI:
http://hdl.handle.net/10033/596930
DOI:
10.7717/peerj.1603
PubMed ID:
26870609
Type:
Article
Language:
en
ISSN:
2167-8359
Appears in Collections:
publications of the research group bioinformatics in infection research ([BRICS] BIFO)

Full metadata record

DC Field | Value | Language
dc.contributor.author | Gregor, Ivan | en
dc.contributor.author | Dröge, Johannes | en
dc.contributor.author | Schirmer, Melanie | en
dc.contributor.author | Quince, Christopher | en
dc.contributor.author | McHardy, Alice C | en
dc.date.accessioned | 2016-02-22T15:15:41Z | en
dc.date.available | 2016-02-22T15:15:41Z | en
dc.date.issued | 2016 | en
dc.identifier.citation | PhyloPythiaS+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes. 2016, 4:e1603 PeerJ | en
dc.identifier.issn | 2167-8359 | en
dc.identifier.pmid | 26870609 | en
dc.identifier.doi | 10.7717/peerj.1603 | en
dc.identifier.uri | http://hdl.handle.net/10033/596930 | en
dc.description.abstract | Background. Metagenomics is an approach for characterizing environmental microbial communities in situ; it allows their functional and taxonomic characterization and the recovery of sequences from uncultured taxa. This is often achieved by a combination of sequence assembly and binning, where sequences are grouped into 'bins' representing taxa of the underlying microbial community. Assignment to low-ranking taxonomic bins is an important challenge for binning methods, as is scalability to Gb-sized datasets generated with deep sequencing techniques. One of the best available methods for recovering species bins from deep-branching phyla is the expert-trained PhyloPythiaS package, where a human expert decides on the taxa to incorporate in the model and identifies 'training' sequences based on marker genes directly from the sample. Due to the manual effort involved, this approach does not scale to multiple metagenome samples and requires substantial expertise, which researchers who are new to the area do not have. Results. We have developed PhyloPythiaS+, a successor to our PhyloPythia(S) software. The new (+) component performs the work previously done by the human expert. PhyloPythiaS+ also includes a new k-mer counting algorithm, which accelerated the simultaneous counting of 4-6-mers used for taxonomic binning 100-fold and reduced the overall execution time of the software by a factor of three. Our software allows the analysis of Gb-sized metagenomes with inexpensive hardware and the recovery of species- or genus-level bins with low error rates in a fully automated fashion. PhyloPythiaS+ was compared to MEGAN, taxator-tk, Kraken and the generic PhyloPythiaS model. The results showed that PhyloPythiaS+ performs especially well, in comparison to the other methods, for samples originating from novel environments. Availability. PhyloPythiaS+ in a virtual machine is available for installation under Windows, Unix systems or OS X at: https://github.com/algbioi/ppsp/wiki. | en
dc.language.iso | en | en
dc.title | PhyloPythiaS+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes. | en
dc.type | Article | en
dc.contributor.department | Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany. | en
dc.identifier.journal | PeerJ | en

This item is licensed under a Creative Commons License
All Items in HZI are protected by copyright, with all rights reserved, unless otherwise indicated.