Search results for " data"

showing 10 items of 7516 documents

The colored longest common prefix array computed via sequential scans

2018

Due to the increased availability of large datasets of biological sequences, tools for sequence comparison now rely to a greater extent on efficient alignment-free approaches. Most alignment-free approaches require the computation of statistics of the sequences in the dataset. Such computations become impractical in internal memory when very large collections of long sequences are considered. In this paper, we present a new conceptual data structure, the colored longest common prefix array (cLCP), that makes it possible to tackle several problems efficiently with an alignment-free approach. In fact, we show that such a data structure can be computed via sequential scans in semi-exter…

0301 basic medicine; FOS: Computer and information sciences; Alignment-free methods; Burrows–Wheeler transform; Computer science; Computation; Average common substring; 0206 medical engineering; Matching statistics; Scale (descriptive set theory); 02 engineering and technology; Theoretical Computer Science; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Burrows-wheeler transform; String (computer science); Computer Science (all); LCP array; Matching statistic; Data structure; Substring; 030104 developmental biology; Pairwise comparison; Longest common prefix; Algorithm; 020602 bioinformatics; Alignment-free method

researchProduct

Alignment-free sequence comparison using absent words

2018

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…

0301 basic medicine; FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Sequence alignment; Information System; 0102 computer and information sciences; Circular word; Absent words; 01 natural sciences; Upper and lower bounds; Sequence comparison; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Absent word; Circular words; Mathematics; Sequence; Settore INF/01 - Informatica; Process (computing); q-gram; Computer Science Applications; 1707 Computer Vision and Pattern Recognition; q-grams; Composition (combinatorics); Computer Science Applications; 030104 developmental biology; Computational Theory and Mathematics; Forbidden words; 010201 computation theory & mathematics; Focus (optics); Forbidden word; Word (computer architecture); Information Systems; Integer (computer science)

researchProduct

Biophysics of high density nanometer regions extracted from super-resolution single particle trajectories: application to voltage-gated calcium chann…

2019

The cellular membrane is very heterogeneous and enriched with high-density regions forming microdomains, as revealed by single particle tracking experiments. However, the organization of these regions remains unexplained. We determine here the biophysical properties of these regions, when described as basins of attraction. We develop two methods to recover the dynamics and local potential wells (field of force and boundary). The first method is based on the local density of points distribution of trajectories, which differs inside and outside the wells. The second method focuses on recovering the drift field that is convergent inside wells and uses the transient field to determine the…

0301 basic medicine; Field (physics); 1.1 Normal biological development and functioning; High density; Boundary (topology); lcsh:Medicine; 32 Biomedical and Clinical Sciences; Local field potential; Article; Quantitative Biology::Cell Behavior; Quantitative Biology::Subcellular Processes; Computational biophysics; 03 medical and health sciences; 0302 clinical medicine; Single-molecule biophysics; 1 Underpinning research; lcsh:Science; Physics; Multidisciplinary; 3208 Medical Physiology; Voltage-dependent calcium channel; FOS: Clinical medicine; lcsh:R; Neurosciences; Scientific data; 030104 developmental biology; Particle; Nanometre; lcsh:Q; Biological system; Biological physics; 51 Physical Sciences; 030217 neurology & neurosurgery; Energy (signal processing)

researchProduct

Use of deep learning methods to translate drug-induced gene expression changes from rat to human primary hepatocytes

2020

In clinical trials, animal and cell line models are often used to evaluate the potential toxic effects of a novel compound or candidate drug before progressing to human trials. However, relating the results of animal and in vitro model exposures to relevant clinical outcomes in the human in vivo system still proves challenging, relying on often putative orthologs. In recent years, multiple studies have demonstrated that the repeated dose rodent bioassay, the current gold standard in the field, lacks sufficient sensitivity and specificity in predicting toxic effects of pharmaceuticals in humans. In this study, we evaluate the potential of deep learning techniques to translate the pattern of …

0301 basic medicine; Gene Expression; Gene Expression Regulation/drug effects; Pathology and Laboratory Medicine; Convolutional neural network; TOXICITY; Machine Learning; Voeding Metabolisme en Genomica; Time Measurement; 0302 clinical medicine; Gene expression; Medicine and Health Sciences; Measurement; Clinical Trials as Topic; Multidisciplinary; Artificial neural network; Pharmaceutics; Q; R; Metabolism and Genomics; TOXICOGENOMICS; 030220 oncology & carcinogenesis; Metabolisme en Genomica; Medicine; Engineering and Technology; Nutrition Metabolism and Genomics; Hepatocytes/drug effects; Algorithms; Research Article; Computer and Information Sciences; Clinical Trials as Topic/statistics & numerical data; Neural Networks; Genetic Toxicology; TOXICOLOGY; Science; Predictive Toxicology; Computational biology; Biology; Computer; 03 medical and health sciences; Dose Prediction Methods; Deep Learning; Voeding; Artificial Intelligence; In vivo; Genetics; Life Science; Animals; Humans; Gene; Nutrition; business.industry; Deep learning; Biology and Life Sciences; Gold standard (test); REPRESENTATIONS; Rats; 030104 developmental biology; Gene Expression Regulation; Hepatocytes; Artificial intelligence; Neural Networks Computer; Toxicogenomics; business; Neuroscience

researchProduct

MiasDB: A Database of Molecular Interactions Associated with Alternative Splicing of Human Pre-mRNAs.

2016

Alternative splicing (AS) is pervasive in human multi-exon genes and is a major contributor to expansion of the transcriptome and proteome diversity. The accurate recognition of alternative splice sites is regulated by information contained in networks of protein-protein and protein-RNA interactions. However, the mechanisms leading to splice site selection are not fully understood. Although numerous databases have been built to describe AS, molecular interaction databases associated with AS have only recently emerged. In this study, we present a new database, MiasDB, that provides a description of molecular interactions associated with human AS events. This database covers 938 interactions …

0301 basic medicine; Gene regulatory network; lcsh:Medicine; RNA-binding protein; RNA-binding proteins; computer.software_genre; Biochemistry; Histones; Exon; Database and Informatics Methods; Databases Genetic; Protein Interaction Mapping; RNA Precursors; Gene Regulatory Networks; Database Searching; lcsh:Science; Multidisciplinary; Database; Exons; Genomics; Genomic Databases; Nucleic acids; RNA splicing; Proteome; Sequence Analysis; Research Article; Sequence Databases; Biology; Response Elements; Research and Analysis Methods; Genome Complexity; 03 medical and health sciences; Genetics; Humans; Molecular Biology Techniques; Sequencing Techniques; Protein Interactions; Gene; Molecular Biology; Internet; lcsh:R; Alternative splicing; Intron; Biology and Life Sciences; Computational Biology; Proteins; Genome Analysis; Introns; Alternative Splicing; 030104 developmental biology; Biological Databases; RNA processing; RNA; lcsh:Q; RNA Splice Sites; Gene expression; computer; Protein Kinases; Transcription Factors; PloS one

researchProduct

Diagnostic odyssey in severe neurodevelopmental disorders: toward clinical whole-exome sequencing as a first-line diagnostic test

2016

The current standard of care for diagnosis of severe intellectual disability (ID) and epileptic encephalopathy (EE) results in a diagnostic yield of ∼50%. Affected individuals nonetheless undergo multiple clinical evaluations and low-yield laboratory tests often referred to as a 'diagnostic odyssey'. This study was aimed at assessing the utility of clinical whole-exome sequencing (WES) in individuals with undiagnosed and severe forms of ID and EE, and the feasibility of its implementation in routine practice by a small regional genetic center. We performed WES in a cohort of 43 unrelated individuals with undiagnosed ID and/or EE. All individuals had undergone multiple clinical evaluations a…

0301 basic medicine; Genetics; Pediatrics; medicine.medical_specialty; business.industry; Epileptic encephalopathy; First line; Sequencing data; Data interpretation; Diagnostic test; medicine.disease; 3. Good health; 03 medical and health sciences; 030104 developmental biology; Cohort; Intellectual disability; Genetics; medicine; business; Genetics (clinical); Exome sequencing; Clinical Genetics

researchProduct

Lost Strings in Genomes: What Sense Do They Make?

2017

We studied the sets of avoided strings observed over a family of genomes. It was found that the length of the minimal avoided string rarely exceeds 9 nucleotides, regardless of the phylogeny of the genome under consideration. The lists of avoided strings observed over the sets of (related) genomes were analyzed. A very low correlation between the phylogeny and the set of those strings was found.

0301 basic medicine; Genetics; animal structures; genetic structures; information science; String (physics); Genome; Combinatorics; Set (abstract data type); 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Phylogenetics; cardiovascular system; Low correlation; 030217 neurology & neurosurgery; Selection (genetic algorithm); Mathematics

researchProduct

Exploiting Helminth–Host Interactomes through Big Data

2017

Helminths facilitate their parasitic existence through the production and secretion of different molecules, including proteins. Some helminth proteins can manipulate the host's immune system, a phenomenon that is now being exploited with a view to developing therapeutics for inflammatory diseases. In recent years, hundreds of helminth genomes have been sequenced, but as a community we are still taking baby steps when it comes to identifying proteins that govern host-helminth interactions. The information generated from genomic, immunomic, and proteomic studies, as well as from cutting-edge approaches such as proteogenomics, is leading to a substantial volume of big data that can be utilised…

0301 basic medicine; Genome Helminth; Vaccines; Host (biology); business.industry; Helminth protein; Big data; Computational Biology; Helminth Proteins; Computational biology; Biology; Proteogenomics; Helminth Genomes; Proteomics; Bioinformatics; Host-Parasite Interactions; 03 medical and health sciences; 030104 developmental biology; Infectious Diseases; parasitic diseases; Animals; Humans; Parasitology; business; Trends in Parasitology

researchProduct

Phylogenomics of Lophotrochozoa with Consideration of Systematic Error.

2015

Phylogenomic studies have improved understanding of deep metazoan phylogeny and show promise for resolving incongruences among analyses based on limited numbers of loci. One region of the animal tree that has been especially difficult to resolve, even with phylogenomic approaches, is relationships within Lophotrochozoa (the animal clade that includes molluscs, annelids, and flatworms among others). Lack of resolution in phylogenomic analyses could be due to insufficient phylogenetic signal, limitations in taxon and/or gene sampling, or systematic error. Here, we investigated why lophotrochozoan phylogeny has been such a difficult question to answer by identifying and reducing sources of sys…

0301 basic medicine; Genome; biology; Phylogenetic tree; Lophotrochozoa; biology.organism_classification; Missing data; Classification; Bryozoa; 03 medical and health sciences; 030104 developmental biology; Evolutionary biology; Phylogenetics; Phylogenomics; Genetics; Animals; Spiralia; Clade; Ecology Evolution Behavior and Systematics; Phylogeny; Platyzoa; Systematic biology

researchProduct

Exome-Wide Association Study on Alanine Aminotransferase Identifies Sequence Variants in the GPAM and APOE Associated With Fatty Liver Disease.

2021

BACKGROUND & AIMS: Fatty liver disease (FLD) is a growing epidemic that is expected to be the leading cause of end-stage liver disease within the next decade. Both environmental and genetic factors contribute to the susceptibility of FLD. Several genetic variants contributing to FLD have been identified in exome-wide association studies. However, there is still a missing heritability indicating that other genetic variants are yet to be discovered. METHODS: To find genes involved in FLD, we first examined the association of missense and nonsense variants with alanine aminotransferase at an exome-wide level in 425,671 participants from the UK Biobank. We then validated genetic variants wit…

0301 basic medicine; Genome-wide association study; Liver disease; 0302 clinical medicine; ENRICHMENT ANALYSIS; Non-alcoholic Fatty Liver Disease; Risk Factors; Nonalcoholic fatty liver disease; Exome; CONFERS SUSCEPTIBILITY; Genetics; INSULIN-RESISTANCE; medicine.diagnostic_test; Fatty liver; Gastroenterology; Alanine Transaminase; 1-Acylglycerol-3-Phosphate O-Acyltransferase; 3. Good health; GENOME; Europe; Phenotype; Liver biopsy; 030211 gastroenterology & hepatology; Nonalcoholic Fatty Liver Disease; MAFLD; Single-nucleotide polymorphism; Biology; Transaminase; Risk Assessment; 03 medical and health sciences; Apolipoproteins E; NAFLD; medicine; Genetic predisposition; Humans; Genetic Predisposition to Disease; HEPATIC STEATOSIS; Genetic association; Hepatology; MUTATIONS; Gene Expression Profiling; Genetic Variation; Reproducibility of Results; medicine.disease; X-RECEPTOR; GENE; 030104 developmental biology; 3121 General medicine internal medicine and other clinical medicine; Metabolic Associated Fatty Liver Disease; RNA-SEQ DATA; Transcriptome; PATHOGENICITY; Biomarkers; Genome-Wide Association Study; Gastroenterology

researchProduct