Binning sequences using very sparse labels within a metagenome

Background: In metagenomic studies, a process called binning is necessary to assign contigs that belong to multiple species to their respective phylogenetic groups. Most of the current methods of binning, such as BLAST, k-mer and PhyloPythia, involve assigning sequence fragments by comparing sequence similarity or sequence composition with already-sequenced genomes that are still far from comprehensive. We propose a semi-supervised seeding method for binning that does not depend on knowledge of completed genomes. Instead, it extracts the flanking sequences of highly conserved 16S rRNA from the metagenome and uses them as seeds (labels) to assign other reads based on their compositional similarity.

Results: The proposed seeding method is implemented on an unsupervised Growing Self-Organising Map (GSOM), and called Seeded GSOM (S-GSOM). We compared it with four well-known semi-supervised learning methods in a preliminary test, separating random-length prokaryotic sequence fragments samp...
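The seeding idea in the Background can be illustrated with a small sketch. It is not the S-GSOM described in the abstract: a plain nearest-seed rule with Euclidean distance stands in for the GSOM clustering, the seed sequences are assumed to be given rather than extracted around 16S rRNA genes, and the choice of tetranucleotides (`K = 4`) as well as the names `kmer_profile`, `distance` and `assign_to_seeds` are hypothetical.

```python
from itertools import product
from math import sqrt

K = 4  # assumed k-mer length (tetranucleotides); not taken from the abstract
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_profile(seq):
    """Return the normalised k-mer frequency vector of a DNA sequence."""
    counts = [0.0] * len(KMERS)
    seq = seq.upper()
    for i in range(len(seq) - K + 1):
        idx = INDEX.get(seq[i:i + K])
        if idx is not None:  # skip windows containing ambiguous bases
            counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def distance(p, q):
    """Euclidean distance between two composition profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign_to_seeds(seeds, fragments):
    """Label each fragment with the seed whose composition is closest.

    seeds:     dict mapping a bin label to a seed sequence
    fragments: dict mapping a fragment name to its sequence
    """
    seed_profiles = {label: kmer_profile(s) for label, s in seeds.items()}
    result = {}
    for name, seq in fragments.items():
        profile = kmer_profile(seq)
        result[name] = min(
            seed_profiles,
            key=lambda label: distance(profile, seed_profiles[label]),
        )
    return result


if __name__ == "__main__":
    # Toy sequences with very different composition, for illustration only.
    seeds = {
        "seed_at": "AATTAATATTTAAATT" * 10,
        "seed_gc": "GGCCGCGGCCCGGGCC" * 10,
    }
    fragments = {
        "frag_at": "TTAATATTTAAATTAA" * 5,
        "frag_gc": "CCGCGGCCCGGGCCGG" * 5,
    }
    print(assign_to_seeds(seeds, fragments))
```

With only a handful of seeds, a nearest-seed rule like this ignores the structure of the unlabelled fragments, which is the kind of information a semi-supervised method is meant to exploit.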

Chon-Kit Kenneth Chan, Arthur L. Hsu, Saman K. Halgamuge, Sen-Lin Tang

16S RRNA | BMCBI 2008 | Flanking Sequences | Semi-supervised Learning Methods |

Added	09 Dec 2010
Updated	09 Dec 2010
Type	Journal
Year	2008
Where	BMCBI
Authors	Chon-Kit Kenneth Chan, Arthur L. Hsu, Saman K. Halgamuge, Sen-Lin Tang
