AUTHOR

Adrianto Wirawan

HECTOR: a parallel multistage homopolymer spectrum based error corrector for 454 sequencing data

Background: Current-generation sequencing technologies are able to produce low-cost, high-throughput reads. However, the produced reads are imperfect and may contain various sequencing errors. Although many error correction methods have been developed in recent years, none explicitly targets homopolymer-length errors in the 454 sequencing reads. Results: We present HECTOR, a parallel multistage homopolymer spectrum based error corrector for 454 sequencing data. In this algorithm, for the first time we have investigated a novel homopolymer spectrum based approach to handle homopolymer insertions or deletions, which are the dominant sequencing errors in 454 pyrosequencing reads. We have evaluat…
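To make the notion of a homopolymer-length error concrete, the following is a minimal Python sketch, not HECTOR's actual algorithm. It collapses a read into (base, run length) pairs and checks whether two reads differ only in the lengths of their homopolymer runs. The function names `homopolymer_runs` and `differs_only_in_run_lengths` are hypothetical and chosen for illustration.

```python
from itertools import groupby


def homopolymer_runs(read):
    """Collapse a read into (base, run_length) pairs, e.g. AAAC -> A3, C1."""
    return [(base, len(list(group))) for base, group in groupby(read)]


def differs_only_in_run_lengths(a, b):
    """Return True if both reads have the same order of bases once runs are
    collapsed, so any difference is an insertion or deletion inside a run.
    Illustrative helper only; not taken from the HECTOR paper."""
    bases_a = [base for base, _ in homopolymer_runs(a)]
    bases_b = [base for base, _ in homopolymer_runs(b)]
    return bases_a == bases_b
```

For example, `AAACGGT` and `AACGGGT` collapse to the same base order (A, C, G, T) and differ only in run lengths, which is the kind of error the abstract describes as dominant in 454 pyrosequencing reads.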

research product

Additional file 1: Figure S1. of CLOVE: classification of genomic fusions into structural variation events

Description of data: Sensitivity of individual tools and one run on CLOVE for different event types. Sensitivity is measured including half true positives (wrong event type). Events are considered recalled if any one of its fusions is found in the output. (PDF 9 kb)

research product

CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions

Background: The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. Results: We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU …
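The quadratic cost mentioned in the abstract comes from filling a dynamic-programming table with one cell per pair of positions in the two sequences. The sketch below is a plain, scalar Python illustration of the Smith-Waterman local alignment score, not the CUDASW++ implementation. It assumes a simple match/mismatch score and a linear gap penalty (the parameter values are arbitrary); protein search tools typically use a substitution matrix and affine gap penalties instead.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b.

    Simplified scoring (assumption): fixed match/mismatch values and a
    linear gap penalty. Runs in O(len(a) * len(b)) time, keeping only
    two rows of the table in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its upper, left, and upper-left neighbours, which is what makes the recurrence amenable to SIMD and GPU parallelisation of the kind the abstract refers to.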

research product

CLOVE: classification of genomic fusions into structural variation events

Background: A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and do not collect related breakpoints and classify them as a particular type of SV. Due to the rapidly increasing usage of high throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than …

research product

Additional file 2: Table S1. of CLOVE: classification of genomic fusions into structural variation events

Description of data: Detailed results of simulated data analysis. The spreadsheet shows runs of the tested structural variant tools as well as CLOVE re-classified results by variant type and for the individual runs of simulated data. (XLSX 139 kb)

research product

Additional file 3: of CLOVE: classification of genomic fusions into structural variation events

Data S1. Description of data: VCF file of variant calls of CLOVE on the NA12878 genome. (VCF 271 kb)

research product
