6533b830fe1ef96bd129664d
RESEARCH PRODUCT
DySC: software for greedy clustering of 16S rRNA reads.
Stefan KramerBertil SchmidtZejun Zhengsubject
Statistics and ProbabilityComputer sciencebusiness.industrySequence Analysis RNA16S ribosomal RNAcomputer.software_genreBiochemistryComputer Science ApplicationsComputational MathematicsSoftwareComputational Theory and MathematicsRNA Ribosomal 16SCluster AnalysisMetagenomeData miningCluster analysisbusinessMolecular BiologycomputerSoftwaredescription
Abstract Summary: Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering approach that uses a dynamic seeding strategy. Evaluations based on the normalized mutual information (NMI) criterion show that DySC produces higher quality clusters than UCLUST and CD-HIT at a comparable runtime. Availability and implementation: DySC, implemented in C, is available at http://code.google.com/p/dysc/ under GNU GPL license. Contact: bertil.schmidt@uni-mainz.de Supplementary Information: Supplementary data are available at Bioinformatics online.
year | journal | country | edition | language |
---|---|---|---|---|
2012-06-26 | Bioinformatics (Oxford, England) |