Search results for "sequence"

showing 10 items of 4987 documents

A Methodology to Study Pseudogenized lincRNAs

2021

Long intergenic noncoding RNAs (lincRNAs) are known to be expressed in a tissue-specific manner and able to regulate functional protein-coding genes; some can even act as competing endogenous RNAs (ceRNAs), because microRNAs can bind to them instead of the corresponding mRNA binding sites. Some lincRNAs contain remnants of protein-coding sequences, and it has been hypothesized that they might arise through a pseudogenization process. However, a major limitation in the study of this phenomenon is the lack of proper computational tools designed to align and analyze protein-coding and noncoding sequences. To overcome this limitation, we published a method that finds the remnants of protein-coding…

0301 basic medicine; Competing endogenous RNA; Pseudogene; Sequence alignment; Computational biology; Biology; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Intergenic region; microRNA; Single point; Gene; 030217 neurology & neurosurgery; Sequence (medicine)
researchProduct

Efficient Algorithms for Sequence Analysis with Entropic Profiles

2017

Entropy, being closely related to repetitiveness and compressibility, is a widely used information-related measure to assess the degree of predictability of a sequence. Entropic profiles are based on information theory principles, and can be used to study the under-/over-representation of subwords, by also providing information about the scale of conserved DNA regions. Here, we focus on the algorithmic aspects related to entropic profiles. In particular, we propose linear time algorithms for their computation that rely on suffix-based data structures, more specifically on the truncated suffix tree (TST) and on the enhanced suffix array (ESA). We performed an extensive experimental campaign …

0301 basic medicine; Compressed suffix array; Theoretical computer science; Entropy; Suffix tree; 0206 medical engineering; Generalized suffix tree; 02 engineering and technology; String searching algorithm; Information theory; law.invention; 03 medical and health sciences; law; Genetics; Animals; Humans; Mathematics; Applied Mathematics; Suffix array; Computational Biology; DNA; Sequence Analysis DNA; Data structure; 030104 developmental biology; Suffix; Alignment free Entropy Sequence analysis Sequence comparison; Algorithms; 020602 bioinformatics; Biotechnology; IEEE/ACM Transactions on Computational Biology and Bioinformatics
researchProduct

HPG pore: an efficient and scalable framework for nanopore sequencing data.

2016

The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a USB-connected, portable device that allows real-time DNA analysis. In addition, other new instruments are expected to be released soon, which promise to outperform the current short-read technologies in terms of throughput. Despite the flood of data expected from this technology, the data analysis solutions currently available are only designed to manage small projects and are not scalable. Here we present HPG Pore, a toolkit for …

0301 basic medicine; Computer science; Applied Mathematics; Distributed computing; DNA; Sequence Analysis DNA; Data science; Biochemistry; Computer Science Applications; 03 medical and health sciences; chemistry.chemical_compound; Nanopore; Nanopores; 030104 developmental biology; 0302 clinical medicine; chemistry; Structural Biology; 030220 oncology & carcinogenesis; Scalability; Nanopore sequencing; DNA microarray; Throughput (business); Molecular Biology; DNA; Software; BMC bioinformatics
researchProduct

Deep learning architectures for prediction of nucleosome positioning from sequences data

2018

Background: Nucleosomes are DNA-histone complexes, each wrapping about 150 base pairs of double-stranded DNA. Their function is fundamental to one of the primary functions of chromatin, i.e. packing the DNA into the nucleus of eukaryotic cells. Several biological studies have shown that nucleosome positioning influences the regulation of cell type-specific gene activities. Moreover, computational studies have shown evidence of sequence specificity concerning the DNA fragment wrapped into nucleosomes, clearly underlined by the organization of particular DNA substrings. As the main consequence, the identification of nucleosomes on a genomic scale has been successfully performed by com…

0301 basic medicine; Computer science; Cell; Biochemistry; chemistry.chemical_compound; 0302 clinical medicine; Structural Biology; lcsh:QH301-705.5; Nucleosome classification; Sequence; Settore INF/01 - Informatica; biology; Applied Mathematics; Epigenetic; Computer Science Applications; Chromatin; Nucleosomes; medicine.anatomical_structure; lcsh:R858-859.7; Eukaryote; DNA microarray; Databases Nucleic Acid; Computational biology; Saccharomyces cerevisiae; lcsh:Computer applications to medicine. Medical informatics; 03 medical and health sciences; Deep Learning; medicine; Nucleosome; Animals; Humans; Epigenetics; Molecular Biology; Gene; Base Sequence; business.industry; Deep learning; Research; Reproducibility of Results; biology.organism_classification; Yeast; Nucleosome classification Epigenetic Deep learning networks Recurrent neural networks; 030104 developmental biology; lcsh:Biology (General); chemistry; Recurrent neural networks; ROC Curve; Deep learning networks; Artificial intelligence; Neural Networks Computer; business; 030217 neurology & neurosurgery; DNA; BMC Bioinformatics
researchProduct

Next-generation sequencing: big data meets high performance computing

2017

The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and t…

0301 basic medicine; Computer science; Distributed computing; Genomic research; Big data; Terabyte; Computing Methodologies; DNA sequencing; 03 medical and health sciences; 0302 clinical medicine; Databases Genetic; Drug Discovery; Humans; Throughput (business); Pharmacology; Genome; business.industry; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Precision medicine; Supercomputer; Data science; Cancer treatment; 030104 developmental biology; 030220 oncology & carcinogenesis; business; Algorithms; Drug Discovery Today
researchProduct

Data mining approaches to identify biomineralization related sequences.

2015

Proteomics is an efficient high-throughput technique developed to identify proteins from a crude extract using sequence homology. Advances in Next Generation Sequencing (NGS) have led to increased knowledge of several non-model species. In the field of calcium carbonate biomineralization, the paucity of available sequences (such as those of mollusc shells) is still a bottleneck in most proteomic studies. Indeed, this technique needs protein databases to find homology. The aim of this study was to apply different data mining approaches in order to identify novel shell proteins. To this end, we had at our disposal several publicly available non-model mollusc databases. Previously identified molluscan she…

0301 basic medicine; Computer science; Mechanical Engineering; Proteomics; computer.software_genre; [ SDV.IB.BIO ] Life Sciences [q-bio]/Bioengineering/Biomaterials; Bottleneck; DNA sequencing; [SDV.IB.BIO] Life Sciences [q-bio]/Bioengineering/Biomaterials; 03 medical and health sciences; Annotation; 030104 developmental biology; Sequence homology; Mechanics of Materials; [ SDV.BBM.GTP ] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Shell matrix; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; General Materials Science; Data mining; KEGG; computer; ComputingMilieux_MISCELLANEOUS; Biomineralization
researchProduct

Transcriptome Analysis of PA Gain and Loss of Function Mutants

2017

Functional genomics has become a forefront methodology for plant science thanks to the widespread development of microarray technology. While technical difficulties associated with the process of obtaining raw expression data have been diminishing, allowing the appearance of tremendous amounts of transcriptome data in different databases, a common problem using "omic" technologies remains: the interpretation of these data and the inference of their biological meaning. In order to assist with this complex task, a wide variety of software tools have been developed. In this chapter we describe our current workflow for applying some of these analyses. We have used it to compare the transcr…

0301 basic medicine; Computer science; Microarray analysis techniques; Process (engineering); Mutant; Computational biology; Omics; Transcriptome; Gene expression profiling; 03 medical and health sciences; 030104 developmental biology; Molecular Sequence Annotation; Gene chip analysis; Functional genomics; Loss function
researchProduct

A new parallel pipeline for DNA methylation analysis of long reads datasets

2017

Background: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that the new sequencers, and those expected in the near future, yield sequences of increasing length, making these software tools inefficient and obsolete. Results: In this paper, we propose a new software tool based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while…

0301 basic medicine; Computer science; Parallel pipeline; ADN; 02 engineering and technology; computer.software_genre; Biochemistry; Sensitivity and Specificity; DNA sequencing; Epigenesis Genetic; 03 medical and health sciences; chemistry.chemical_compound; Structural Biology; RNA analysis; Informàtica; Databases Genetic; 0202 electrical engineering electronic engineering information engineering; Humans; Epigenetics; Molecular Biology; 020203 distributed computing; DNA methylation; Genome Human; Applied Mathematics; Parallel pipeline; Methylation; Sequence Analysis DNA; Supercomputer; Computer Science Applications; Genòmica; 030104 developmental biology; chemistry; Gene Expression Regulation; DNA methylation; Mutation; Data mining; High performance computing; DNA microarray; computer; Sequence Alignment; DNA; Software
researchProduct

2019

As rats learn to search for multiple sources of food or water in a complex environment, they generate increasingly efficient trajectories between reward sites. Such spatial navigation capacity involves the replay of hippocampal place-cells during awake states, generating small sequences of spatially related place-cell activity that we call "snippets". These snippets occur primarily during sharp-wave-ripples (SWRs). Here we focus on the role of such replay events, as the animal is learning a traveling salesperson task (TSP) across multiple trials. We hypothesize that snippet replay generates synthetic data that can substantially expand and restructure the experience available and make learni…

0301 basic medicine; Computer science; Place cell; Machine learning; computer.software_genre; Spatial memory; Synthetic data; 03 medical and health sciences; Cellular and Molecular Neuroscience; 0302 clinical medicine; Models of neural computation; Genetics; Reinforcement learning; Molecular Biology; Ecology Evolution Behavior and Systematics; Ecology; business.industry; Reservoir computing; Snippet; 030104 developmental biology; Computational Theory and Mathematics; Modeling and Simulation; Sequence learning; Artificial intelligence; business; computer; 030217 neurology & neurosurgery; PLOS Computational Biology
researchProduct

miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal

2017

Hundreds of bioinformatics tools have been developed for microRNA (miRNA) investigations, including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, a query to locate the appropriate tool is expedited by being searchable, filterable and …

0301 basic medicine; Computer science; Process (engineering); media_common.quotation_subject; miRToolsGallery; computer.software_genre; Bioinformatics; General Biochemistry Genetics and Molecular Biology; 03 medical and health sciences; Upload; 0302 clinical medicine; työvälineet; Function (engineering); Data Curation; media_common; Structure (mathematical logic); Database; Data curation; Sequence Analysis RNA; bioinformatiikka; bioinformatics; MicroRNAs; Identification (information); Database Tool; 030104 developmental biology; Ranking; Feature (computer vision); tools; ta1181; Databases Nucleic Acid; General Agricultural and Biological Sciences; computer; Algorithms; 030217 neurology & neurosurgery; Information Systems; Database
researchProduct