Search results for "COMPUTATION"

showing 10 items of 1478 documents

Alignment-free sequence comparison using absent words

2018

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…

0301 basic medicine; FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Sequence alignment; Information System; 0102 computer and information sciences; Circular word; Absent words; 01 natural sciences; Upper and lower bounds; Sequence comparison; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Absent word; Circular words; Mathematics; Sequence; Settore INF/01 - Informatica; Process (computing); q-gram; Computer Science Applications; 1707 Computer Vision and Pattern Recognition; q-grams; Composition (combinatorics); Computer Science Applications; 030104 developmental biology; Computational Theory and Mathematics; Forbidden words; 010201 computation theory & mathematics; Focus (optics); Forbidden word; Word (computer architecture); Information Systems; Integer (computer science)
researchProduct

Measuring the clustering effect of BWT via RLE

2017

The Burrows–Wheeler Transform (BWT) is a reversible transformation on which several text compressors and many other tools used in Bioinformatics and Computational Biology are based. The BWT is not actually a compressor, but a transformation that performs a context-dependent permutation of the letters of the input text, which often creates runs of equal letters (clusters) longer than the ones in the original text; this is usually referred to as the “clustering effect” of the BWT. In particular, from a combinatorial point of view, great attention has been given to the case in which the BWT produces the fewest clusters (cf. [5], [16], [21], [23]). In this paper we are concerned about t…

0301 basic medicine; General Computer Science; Permutation; Computer Science (all); Binary number; 0102 computer and information sciences; Quantitative Biology::Genomics; 01 natural sciences; Upper and lower bounds; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Permutation; 030104 developmental biology; Transformation (function); BWT; 010201 computation theory & mathematics; Run-length encoding; Computer Science::Data Structures and Algorithms; Cluster analysis; Primitive root modulo n; BWT; Permutation; Run-length encoding; Theoretical Computer Science; Computer Science (all); Word (computer architecture); Run-length encoding; Mathematics
researchProduct
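
The clustering effect described in the abstract above can be illustrated with a minimal sketch. It assumes the rotation-based definition of the BWT with no end-of-string marker (the last letters of the lexicographically sorted cyclic rotations), and it counts clusters as maximal runs of equal letters. The function names `bwt` and `run_count` are hypothetical and are not taken from the paper, and the quadratic-space construction is chosen for readability, not efficiency.

```python
from itertools import groupby


def bwt(text: str) -> str:
    """Naive BWT: last letters of the sorted cyclic rotations of text.

    Assumption: no end-of-string marker is appended.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def run_count(text: str) -> int:
    """Number of maximal runs of equal letters (clusters) in text."""
    return sum(1 for _ in groupby(text))


word = "banana"
transformed = bwt(word)
# Compare the number of runs before and after the transformation.
print(transformed, run_count(word), run_count(transformed))
```

For the word "banana" the transformed word is "nnbaaa", so the number of runs drops from 6 to 3. Other inputs may show a weaker effect, or none.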

Coupling News Sentiment with Web Browsing Data Improves Prediction of Intra-Day Price Dynamics

2015

The new digital revolution of big data is deeply changing our capability of understanding society and forecasting the outcome of many social and economic systems. Unfortunately, information can be very heterogeneous in the importance, relevance, and surprise it conveys, severely affecting the predictive power of semantic and statistical methods. Here we show that the aggregation of web users' behavior can be elicited to overcome this problem in a hard-to-predict complex system, namely the financial market. Specifically, our in-sample analysis shows that the combined use of sentiment analysis of news and browsing activity of users of Yahoo! Finance greatly helps forecasting intra-day and dai…

0301 basic medicine; INFORMATION; Economics; Computer science; Big data; lcsh:Medicine; Social Sciences; Quantitative Finance - Computational Finance; social and economic systems; Mathematical and Statistical Techniques; Sociology; big data; Econometrics; 050207 economics; Computer Networks; Capital Markets; lcsh:Science; Financial Markets; media_common; 050208 finance; Multidisciplinary; 05 social sciences; Commerce; Social Communication; Settore FIS/02 - Fisica Teorica Modelli e Metodi Matematici; Surprise; Models Economic; Social Networks; Physical Sciences; Social Systems; Engineering and Technology; Computational sociology; BEHAVIOR; Statistics (Mathematics); Network Analysis; Research Article; Computer and Information Sciences; Exploit; media_common.quotation_subject; Twitter; Computational Finance (q-fin.CP); Research and Analysis Methods; FOS: Economics and business; 03 medical and health sciences; SEARCH; 0502 economics and business; Humans; Relevance (information retrieval); Web navigation; Investments; Statistical Methods; Internet; Statistical Finance (q-fin.ST); STOCK-MARKET; business.industry; lcsh:R; Sentiment analysis; Financial market; ATTENTION; Quantitative Finance - Statistical Finance; Communications; Noise Reduction; Financial Firms; 030104 developmental biology; Signal Processing; Predictive power; lcsh:Q; Stock market; business; Social Media; Finance; Mathematics; Forecasting; PLOS ONE
researchProduct

Linear-time sequence comparison using minimal absent words & applications

2016

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realized by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this article, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not occur in…

0301 basic medicine; Latin Americans; Computer Science (all); Library science; 0102 computer and information sciences; Circular word; Algorithms on string; 01 natural sciences; Alignment-free comparison; Sequence comparison; Theoretical Computer Science; 03 medical and health sciences; 030104 developmental biology; 010201 computation theory & mathematics; Informatics; Political science; Absent word; Forbidden word
researchProduct

A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.

2018

In this article, a new Python package for nucleotide sequence clustering is proposed. This package, freely available online, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Although we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clust…

0301 basic medicine; Nematoda; 01 natural sciences; Gaussian Mixture Model; [STAT.ML]Statistics [stat]/Machine Learning [stat.ML]; [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; ComputingMilieux_MISCELLANEOUS; computer.programming_language; [STAT.AP]Statistics [stat]/Applications [stat.AP]; Phylogenetic tree; DNA Clustering; Genomics; Helminth Proteins; Computer Science Applications; [STAT]Statistics [stat]; 010201 computation theory & mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; Data analysis; Embedding; A priori and a posteriori; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Health Informatics; 0102 computer and information sciences; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biology; [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; Laplacian Eigenmaps; Animals; Cluster analysis; [SDV.GEN]Life Sciences [q-bio]/Genetics; Models Genetic; business.industry; Pattern recognition; NADH Dehydrogenase; Sequence Analysis DNA; Python (programming language); Mixture model; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Visualization; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; Platyhelminths; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; Programming Languages; Artificial intelligence; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; business; computer; Computers in biology and medicine
researchProduct

A stable brain from unstable components: Emerging concepts and implications for neural computation.

2017

Neuroscientists have often described the adult brain in similar terms to an electronic circuit board, dependent on fixed, precise connectivity. However, with the advent of technologies allowing chronic measurements of neural structure and function, the emerging picture is that neural networks undergo significant remodeling over multiple timescales, even in the absence of experimenter-induced learning or sensory perturbation. Here, we attempt to reconcile the parallel observations that critical brain functions are stably maintained, while synapse- and single-cell properties appear to be reformatted regularly throughout adult life. In this review, we discuss experimental evidence at multiple …

0301 basic medicine; Neurons; Artificial neural network; General Neuroscience; Computation; Models Neurological; Brain; Sensory system; Synapse; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Models of neural computation; Biological neural network; Animals; Humans; Neural Networks Computer; Psychology; Neuroscience; 030217 neurology & neurosurgery; Dynamic equilibrium; Electronic circuit; Neuroscience
researchProduct

Intermittent targeted therapies and stochastic evolution in patients affected by chronic myeloid leukemia

2016

Front line therapy for the treatment of patients affected by chronic myeloid leukemia (CML) is based on the administration of tyrosine kinase inhibitors, namely imatinib or, more recently, axitinib. Although imatinib is highly effective and represents an example of a successful molecular targeted therapy, the appearance of resistance is observed in a proportion of patients, especially those in advanced stages. In this work, we investigate the appearance of resistance in patients affected by CML, by modeling the evolutionary dynamics of cancerous cell populations in a simulated patient treated by an intermittent targeted therapy. We simulate, with the Monte Carlo method, the stochastic evolu…

0301 basic medicine; Oncology; Drug; Statistics and Probability; medicine.medical_specialty; medicine.medical_treatment; media_common.quotation_subject; Targeted therapy; 03 medical and health sciences; Classical Monte Carlo simulations; computational biology; models for evolution (theory); mutational and evolutionary processes (theory); Statistical and Nonlinear Physics; Statistics and Probability; Statistics Probability and Uncertainty; 0302 clinical medicine; computational biology; Internal medicine; medicine; Classical Monte Carlo simulation; mutational and evolutionary processes (theory); media_common; business.industry; Myeloid leukemia; Statistical and Nonlinear Physics; Imatinib; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); Axitinib; 030104 developmental biology; 030220 oncology & carcinogenesis; Cancer cell; Toxicity; Statistics Probability and Uncertainty; business; models for evolution (theory); Tyrosine kinase; medicine.drug; Statistical and Nonlinear Physic
researchProduct

An overview of recent molecular dynamics applications as medicinal chemistry tools for the undruggable site challenge

2018

Molecular dynamics (MD) has become increasingly popular due to the development of hardware and software solutions and the improvement in algorithms, which allowed researchers to scale up calculations in order to speed them up. MD simulations are usually used to address protein folding issues or protein-ligand complex stability through energy profile analysis over time. In recent years, the development of new tools able to deeply explore a potential energy surface (PES) has allowed researchers to focus on the dynamic nature of the binding recognition process and binding-induced protein conformational changes. Moreover, modern approaches have been demonstrated to be effective and reliable in …

0301 basic medicine; Pharmacology; Virtual screening; Drug discovery; Computer science; Organic Chemistry; Rational design; Pharmaceutical Science; Computational biology; Biochemistry; Small molecule; Settore CHIM/08 - Chimica Farmaceutica; Chemistry; 03 medical and health sciences; Molecular dynamics; 030104 developmental biology; 0302 clinical medicine; Docking (molecular); 030220 oncology & carcinogenesis; Drug Discovery; Molecular Medicine; Protein folding; Pharmacophore; Molecular Dynamics undruggable target computational studies
researchProduct

Block Sorting-Based Transformations on Words: Beyond the Magic BWT

2018

The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for Data Compression, and later results have contributed to making it a fundamental tool for the design of self-indexing compressed data structures. The Alternating Burrows-Wheeler Transform (ABWT) is a more recent transformation, studied in the context of Combinatorics on Words, that works in a similar way, using an alternating lexicographical order instead of the usual one. In this paper we study a more general class of block sorting-based transformations. The transformations in this new class prove to be interesting combinatorial tools that offer new research perspectives. In particular, we show that all the tra…

0301 basic medicine; Settore INF/01 - Informatica; Computer science; Data_CODINGANDINFORMATIONTHEORY; 0102 computer and information sciences; Block sorting; Data structure; Lexicographical order; 01 natural sciences; Upper and lower bounds; 03 medical and health sciences; Combinatorics on words; 030104 developmental biology; 010201 computation theory & mathematics; Arithmetic; Compressed Data Structures Block Sorting Combinatorics on Words Algorithms; Data compression
researchProduct

Assessing statistical significance in multivariable genome wide association analysis

2016

Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whe…

0301 basic medicine; Statistics and Probability; 1303 Biochemistry; Genotype; Operations research; Library science; Polymorphism Single Nucleotide; Biochemistry; German; 03 medical and health sciences; 10007 Department of Economics; Political science; Genome-Wide Association Analysis; 1312 Molecular Biology; 1706 Computer Science Applications; Cluster Analysis; Humans; Computer Simulation; 2613 Statistics and Probability; Molecular Biology; European research; Genetics and Population Analysis; Computational Biology; Reproducibility of Results; Original Papers; language.human_language; Computer Science Applications; 330 Economics; Computational Mathematics; Phenotype; 030104 developmental biology; Computational Theory and Mathematics; Linear Models; language; 2605 Computational Mathematics; Genome-Wide Association Study; 1703 Computational Theory and Mathematics
researchProduct