Search results for "DNA sequencing"
showing 10 items of 237 documents
Data mining approaches to identify biomineralization related sequences.
2015
Proteomics is an efficient high-throughput technique developed to identify proteins from a crude extract using sequence homology. Advances in Next Generation Sequencing (NGS) have led to increased knowledge of several non-model species. In the field of calcium carbonate biomineralization, the paucity of available sequences (such as those of mollusc shells) is still a bottleneck in most proteomic studies. Indeed, this technique needs protein databases to find homology. The aim of this study was to apply different data mining approaches in order to identify novel shell proteins. To this end, we had at our disposal several publicly available non-model mollusc databases. Previously identified molluscan she…
A new parallel pipeline for DNA methylation analysis of long reads datasets
2017
Background: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that new sequencers, and those expected in the near future, yield sequences of increasing length, making these software tools inefficient and obsolete. Results: In this paper, we propose new software based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while…
Deep learning network for exploiting positional information in nucleosome related sequences
2017
A nucleosome is a DNA-histone complex, wrapping about 150 base pairs of double-stranded DNA. The role of nucleosomes is to pack the DNA into the nucleus of eukaryotic cells to form chromatin. Genome-wide nucleosome positioning plays an important role in the regulation of cell type-specific gene activities. Several biological studies have shown sequence specificity of nucleosome presence, clearly underlined by the organization of particular nucleotide substrings. Taking into consideration such advances, the identification of nucleosomes on a genomic scale has been successfully performed using DNA sequence feature representations and classical supervised classification methods such as Support Vec…
A Comparison of Techniques to Evaluate the Effectiveness of Genome Editing
2018
Genome editing using engineered nucleases (meganucleases, zinc finger nucleases, transcription activator-like effector nucleases) has created many recent breakthroughs. Prescreening for efficiency and specificity is a critical step prior to using any newly designed genome editing tool for experimental purposes. The current standard screening methods of evaluation are based on DNA sequencing or use mismatch-sensitive endonucleases. They can be time-consuming and costly or lack reproducibility. Here, we review and critically compare standard techniques with those more recently developed in terms of reliability, time, cost, and ease of use.
Directional high-throughput sequencing of RNAs without gene-specific primers.
2018
Ribosomal RNA analysis is a useful tool for characterization of microbial communities. However, the lack of broad-range primers has hampered the simultaneous analysis of eukaryotic and prokaryotic members by amplicon sequencing. We present a complete workflow for directional, primer-independent sequencing of size-selected small subunit ribosomal RNA fragments. The library preparation protocol includes gel extraction of the target RNA, ligation of an RNA oligo to the 5′-end of the target, and cDNA synthesis with a tailed random-hexamer primer and further barcoding. The sequencing results of a phytoplankton mock community showed a highly similar profile to the biomass indicators. This method…
FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications
2017
Summary: MapReduce Hadoop bioinformatics applications require special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, developing these routines is not easy, both because of the diversity of these formats and because of the need to efficiently manage sequence datasets that may contain up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…
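To make the two formats named in this abstract concrete, the following is a minimal plain-Python sketch of reading FASTA and FASTQ records from a text stream. It is not FASTdoop's API (FASTdoop is a Hadoop library, and its interfaces are not described in the excerpt above); the function names `read_fasta` and `read_fastq` are hypothetical, and the FASTQ reader assumes the common four-lines-per-record layout.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA text stream.

    A record starts with a '>' header line; the sequence may span
    several following lines, which are concatenated.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) triples from a FASTQ text stream.

    Assumes four lines per record: '@header', sequence, '+...', quality.
    """
    while True:
        first = handle.readline()
        if not first:
            return  # end of stream
        header = first.strip()
        if not header:
            continue  # tolerate blank lines between records
        seq = handle.readline().strip()
        plus = handle.readline().strip()
        qual = handle.readline().strip()
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


if __name__ == "__main__":
    demo = io.StringIO(">seq1 example\nACGT\nGG\n>seq2\nTTAA\n")
    for name, sequence in read_fasta(demo):
        print(name, len(sequence))
```

Both readers are generators, so records are produced one at a time and a large file does not have to fit in memory. That is only a single-machine analogue of the input-splitting problem a Hadoop library has to solve across nodes.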
Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.
2020
Genetic diseases are driven by aberrations of the human genome. Identification of such aberrations, including structural variations (SVs), is key to understanding these diseases. Conventional short-read whole genome sequencing (cWGS) can identify SVs to base-pair resolution, but utilizes only short-range information and suffers from a high false discovery rate (FDR). Linked-read sequencing (10XWGS) utilizes long-range information by linking short reads that originate from the same large DNA molecule. This can mitigate alignment-based artefacts, especially in repetitive regions, and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, w…
CLOVE: classification of genomic fusions into structural variation events
2017
Background: A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and so do not collect related breakpoints or classify them as a particular type of SV. Due to the rapidly increasing usage of high-throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than …
Generation of a novel next-generation sequencing-based method for the isolation of new human papillomavirus types
2018
With the advent of new molecular tools, the discovery of new papillomaviruses (PVs) has accelerated during the past decade, enabling the expansion of knowledge about the viral populations that inhabit the human body. Human PVs (HPVs) are etiologically linked to benign or malignant lesions of the skin and mucosa. The detection of HPV types can vary widely, depending mainly on the methodology and the quality of the biological sample. Next-generation sequencing is one of the most powerful tools, enabling the discovery of novel viruses in a wide range of biological material. Here, we report a novel protocol for the detection of known and unknown HPV types in human skin and oral gargle …
Halorhabdus rudnickae sp. nov., a halophilic archaeon isolated from a salt mine borehole in Poland
2016
Two halophilic archaea, designated strains WSM-64 and WSM-66, were isolated from a sample taken from a borehole in the currently unexploited Barycz mining area belonging to the Wieliczka Salt Mine Company, in Poland. The strains are red-pigmented and form non-motile cocci that stain Gram-negative. Strains WSM-64 and WSM-66 showed optimum growth at 40 °C, in 20% NaCl and at pH 6.5-7.5. The strains were facultative anaerobes. The major polar lipids of the two strains were phosphatidylglycerol (PG2), phosphatidylglycerol phosphate methyl ester (PGP-Me) and sulfated diglycosyl diether (S-DGD). Menaquinone MK-8 was the major respiratory quinone. The DNA G+C content of strain WSM-64 was 61.2 mol% b…
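As a small illustration of the G+C content figure quoted in this abstract, the sketch below computes G+C as a percentage of the unambiguous bases in a sequence string. The excerpt is truncated before it says how the 61.2 mol% value was obtained, so the code does not reproduce the authors' method. It shows only the sequence-based definition, and the function name `gc_content_percent` is hypothetical.

```python
def gc_content_percent(sequence: str) -> float:
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored rather than counted as AT or GC.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Toy input for illustration only, not data from the paper.
    print(round(gc_content_percent("GGCCATNN"), 1))
```

Excluding ambiguous bases from the denominator is a design choice made here; counting them would lower the reported percentage for sequences with many unresolved positions.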