Search results for "algorithm"

showing 10 items of 4887 documents

Pathway analysis of high-throughput biological data within a Bayesian network framework

2011

Abstract Motivation: Most current approaches to high-throughput biological data (HTBD) analysis perform either individual gene/protein analysis or gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Results: The proposed method takes into account the connectivity and relatedness between nodes of the p…

Statistics and Probability, Computer science, High-throughput screening, Gene regulatory network, computer.software_genre, Models Biological, Biochemistry, Synthetic data, Biological pathway, Bayes' theorem, Humans, Gene Regulatory Networks, Carcinoma Renal Cell, Molecular Biology, Gene, Biological data, Microarray analysis techniques, Gene Expression Profiling, Bayesian network, Robustness (evolution), Bayes Theorem, Pathway analysis, Kidney Neoplasms, High-Throughput Screening Assays, Computer Science Applications, Gene expression profiling, Computational Mathematics, Computational Theory and Mathematics, Causal inference, Data mining, computer, Algorithms, Software, Bioinformatics

researchProduct

Blind Source Separation Based on Joint Diagonalization in R: The Packages JADE and BSSasymp

2017

Blind source separation (BSS) is a well-known signal processing tool which is used to solve practical data analysis problems in various fields of science. In BSS, we assume that the observed data consists of linear mixtures of latent variables. The mixing system and the distributions of the latent variables are unknown. The aim is to find an estimate of an unmixing matrix which then transforms the observed data back to latent sources. In this paper we present the R packages JADE and BSSasymp. The package JADE offers several BSS methods which are based on joint diagonalization. Package BSSasymp contains functions for computing the asymptotic covariance matrices as well as their data-based es…

Statistics and Probability, Computer science, JADE (programming language), 02 engineering and technology, Latent variable, Machine learning, computer.software_genre, 01 natural sciences, Blind signal separation, 010104 statistics & probability, Matrix (mathematics), nonstationary source separation, Mixing (mathematics), 0202 electrical engineering electronic engineering information engineering, second order source separation, 0101 mathematics, lcsh:Statistics, lcsh:HA1-4737, computer.programming_language, ta113, Signal processing, ta112, matematiikka, multivariate time series, mathematics, business.industry, Estimator, 020206 networking & telecommunications, riippumattomien komponenttien analyysi, independent component analysis; multivariate time series; nonstationary source separation; performance indices; second order source separation, Independent component analysis, performance indices, statistics, independent component analysis, Artificial intelligence, Statistics Probability and Uncertainty, business, computer, Algorithm, Software, Journal of Statistical Software


Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis

2017

The geometric median covariation matrix is a robust multivariate indicator of dispersion which can be extended without any difficulty to functional data. We define estimators, based on recursive algorithms, that can be simply updated at each new observation and are able to deal rapidly with large samples of high dimensional data without being obliged to store all the data in memory. Asymptotic convergence properties of the recursive algorithms are studied under weak conditions. The computation of the principal components can also be performed online and this approach can be useful for online outlier detection. A simulation study clearly shows that this robust indicat…
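The recursive idea described in this abstract can be illustrated on the simpler geometric median of vectors. The following is a minimal sketch, assuming an averaged stochastic-gradient update; the function name `online_geometric_median`, the step-size constants `c` and `alpha`, and the zero starting point are illustrative assumptions rather than the authors' implementation, and the median covariation matrix itself is not computed here.

```python
import math


def online_geometric_median(stream, dim, c=1.0, alpha=0.66):
    """Averaged stochastic-gradient estimate of the geometric median.

    Each observation is used once and then discarded, so memory use
    does not grow with the sample size. `c` and `alpha` are assumed
    tuning constants for the step size c / n**alpha.
    """
    m = [0.0] * dim    # current stochastic-gradient iterate
    avg = [0.0] * dim  # running average of the iterates
    for n, x in enumerate(stream, start=1):
        diff = [xi - mi for xi, mi in zip(x, m)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:
            step = c / n ** alpha
            # Move a fixed-length step towards the new observation.
            m = [mi + step * d / norm for mi, d in zip(m, diff)]
        # Update the running average without storing past iterates.
        avg = [a + (mi - a) / n for a, mi in zip(avg, m)]
    return avg
```

Because each update moves by a bounded step regardless of how far the observation lies, a few extreme points shift the estimate much less than they would shift a running mean.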

Statistics and Probability, Computer science, Mathematics - Statistics Theory, Statistics Theory (math.ST), 01 natural sciences, 010104 statistics & probability, Matrix (mathematics), Dimension (vector space), Geometric median, Stochastic gradient, FOS: Mathematics, 0101 mathematics, L1-median, 010102 general mathematics, Estimator, [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH], Geometric median, Covariance, [ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH], Functional data, MSC: 62G05 62L20, Principal component analysis, Projection pursuit, Anomaly detection, Recursive robust estimation, Statistics Probability and Uncertainty, Algorithm


Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods

2014

Abstract Motivation: Protein–protein interaction (PPI) networks are powerful models to represent the pairwise protein interactions of the organisms. Clustering PPI networks can be useful for isolating groups of interacting proteins that participate in the same biological processes or that perform together specific biological functions. Evolutionary orthologies can be inferred this way, as well as functions and properties of yet uncharacterized proteins. Results: We present an overview of the main state-of-the-art clustering methods that have been applied to PPI networks over the past decade. We distinguish five specific categories of approaches, describe and compare their main features and …

Statistics and Probability, Computer science, Population, Population based, Machine learning, computer.software_genre, Biochemistry, Protein protein interaction network, genetic algorithms, Protein–protein interaction, Bioinformatics Clustering Biological Networks, PPI networks, complex detection, Protein Interaction Mapping, Animals, Cluster Analysis, Humans, education, Cluster analysis, Molecular Biology, Topology (chemistry), Class (computer programming), education.field_of_study, business.industry, food and beverages, Proteins, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Artificial intelligence, Data mining, business, Focus (optics), computer, Algorithms


mRNAStab—a web application for mRNA stability analysis

2013

Abstract Eukaryotic gene expression is regulated both at the transcription and the mRNA degradation levels. The implementation of functional genomics methods that allow the simultaneous measurement of transcription (TR) and degradation (DR) rates for thousands of mRNAs is a huge improvement in this field. One of the best established methods for mRNA stability determination is genomic run-on (GRO). It allows the measurement of DR, TR and mRNA levels during cell dynamic responses. Here, we offer a software package that provides improved algorithms for determination of mRNA stability during dynamic GRO experiments. Availability and implementation: The program mRNAStab is freely accessible at h…

Statistics and Probability, Computer science, RNA Stability, Cell, Computational biology, Bioinformatics, Biochemistry, Transcription (biology), Gene expression, MRNA degradation, medicine, Humans, Web application, RNA Messenger, Molecular Biology, Internet, Messenger RNA, business.industry, RNA, Genomics, Computer Science Applications, Computational Mathematics, medicine.anatomical_structure, Computational Theory and Mathematics, Mrna level, business, Functional genomics, Algorithms, Software, Bioinformatics


Acceleration of short and long DNA read mapping without loss of accuracy using suffix array

2014

HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20 times for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies.
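To illustrate how a suffix array supports read lookup, here is a minimal sketch of exact pattern matching by binary search. It assumes a naive construction that sorts all suffixes directly and handles exact matches only; the function names are hypothetical and this is not HPG Aligner's implementation, which the abstract does not detail.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction for illustration; practical mappers use much
    faster, memory-efficient builders.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return sorted start positions where `pattern` occurs in `text`."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


genome = "ACGTACGTTACG"
sa = build_suffix_array(genome)
hits = find_occurrences(genome, sa, "ACG")  # positions 0, 4 and 9
```

Once the array is built, each lookup needs only a logarithmic number of prefix comparisons against the reference, which is the property that makes suffix arrays attractive for mapping many reads.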

Statistics and Probability, Computer science, Sequence analysis, Sequence alignment, database searches, computer.software_genre, Biochemistry, law.invention, Acceleration, chemistry.chemical_compound, law, CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL, Animals, Humans, Molecular Biology, Database, sequencing data, Suffix array, Sequence analysis, High-Throughput Nucleotide Sequencing, alignment, Sequence Analysis DNA, Applications Notes, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, chemistry, Drosophila, Suffix, Sequence Alignment, computer, Algorithm, Algorithms, Software, DNA


FLP estimation of semi-parametric models for space-time point processes and diagnostic tools

2015

Abstract The conditional intensity function of a space–time branching model is defined by the sum of two main components: the long-run term intensity and the short-run term intensity. Their simultaneous estimation is a complex issue that usually requires computationally demanding techniques. This paper deals with a new mixed estimation approach for a particular space–time branching model, the Epidemic Type Aftershock Sequence model. This approach uses a simultaneous estimation of the different model components, alternating a parametric step, which estimates the induced component by Maximum Likelihood, with a non-parametric step, which estimates the background intensity by FLP (Forward Predictive Like…

Statistics and Probability, Computer science, Space time, R package, Probability and statistics, Management Monitoring Policy and Law, Space-time point processes, Point process, Semiparametric model, Term (time), ETAS model, Computers in Earth Science, Component (UML), Statistics, Code (cryptography), Computers in Earth Sciences, Algorithm, Etas, FLP, Parametric statistics


Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data

2012

Abstract Motivation: The imperfect sequence data produced by next-generation sequencing technologies have motivated the development of a number of short-read error correctors in recent years. The majority of methods focus on the correction of substitution errors, which are the dominant error source in data produced by Illumina sequencing technology. Existing tools score high in either recall or precision, but not consistently high in both measures. Results: In this article, we present Musket, an efficient multistage k-mer-based corrector for Illumina short-read data. We use the k-mer spectrum approach and introduce three correction techniques in a multistage workflow: two-s…
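The k-mer spectrum idea mentioned in this abstract can be sketched as follows: count every k-mer across all reads, then treat k-mers seen fewer than some threshold number of times as likely to contain a sequencing error. This is a minimal illustration under that assumption; the function names and the `min_count` threshold are hypothetical, and Musket's own multistage correction techniques are not reproduced here.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count how often each k-mer occurs across all reads."""
    spectrum = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            spectrum[read[i:i + k]] += 1
    return spectrum


def weak_kmer_positions(read, spectrum, k, min_count):
    """Return start positions of k-mers seen fewer than `min_count` times.

    Runs of such positions point at bases that may be erroneous.
    """
    return [
        i
        for i in range(len(read) - k + 1)
        if spectrum[read[i:i + k]] < min_count
    ]


# Five error-free copies of a read plus one copy with a substitution (T -> A).
reads = ["ACGTACGT"] * 5 + ["ACGAACGT"]
spectrum = kmer_spectrum(reads, k=4)
suspect = weak_kmer_positions("ACGAACGT", spectrum, k=4, min_count=2)
# suspect == [0, 1, 2, 3]: every 4-mer overlapping the substituted base is rare
```

A corrector would then try base substitutions that turn the rare k-mers into frequent ones; that search step is omitted from this sketch.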

Statistics and Probability, Computer science, business.industry, Sequence assembly, Sequence Analysis DNA, Musket, Biochemistry, Computer Science Applications, Computational Mathematics, CUDA, Software, Computational Theory and Mathematics, k-mer, Escherichia coli, Chromosomes Human, Humans, business, Focus (optics), Molecular Biology, Algorithm, Algorithms, Genome Bacterial, Software, Illumina dye sequencing, Bioinformatics


Community detection algorithm evaluation with ground-truth data

2018

Community structure is of paramount importance for the understanding of complex networks. Consequently, tremendous effort is devoted to developing efficient community detection algorithms. Unfortunately, the fair assessment of these algorithms remains an open question. If the ground-truth community structure is available, various clustering-based metrics are used to compare it with the one discovered by these algorithms. However, these metrics, defined at the node level, are fairly insensitive to the variation of the overall community structure. To overcome these limitations, we propose to exploit the topological features of the ‘communit…

Statistics and Probability, Computer science, ‘Community-graph’, Community structure, Variation (game tree), [INFO.INFO-RO]Computer Science [cs]/Operations Research [cs.RO], Complex network, Condensed Matter Physics, 01 natural sciences, Graph, 010305 fluids & plasmas, Community structure, Set (abstract data type), 0103 physical sciences, Network analysis, 010306 general physics, Cluster analysis, Algorithm, Network analysis


Sequentially Rejective Test Procedures for Detecting Outlying Cells in One- and Two-Sample Multinomial Experiments

1985

For multiple testing of multinomial models in the case of one or two samples we propose using test procedures based on the principle described by MARCUS, PERITZ and GABRIEL (1976). These methods are based in each step of the sequentially rejective strategy on tests which exhaust the full α level (i.e. which are not conservative). The tests can be performed in a finite or asymptotic version.

Statistics and Probability, Contingency table, Test procedures, Statistics, Multiple comparisons problem, Multinomial distribution, General Medicine, Two sample, Statistics Probability and Uncertainty, Algorithm, Configural frequency analysis, Mathematics, Biometrical Journal
