Search results for "probability"

showing 10 items of 3417 documents

Adaptive reference-free compression of sequence quality scores

2014

Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…

Statistics and Probability; FOS: Computer and information sciences; Computer science; media_common.quotation_subject; Reference-free; computer.software_genre; Biochemistry; DNA sequencing; Set (abstract data type); Redundancy (information theory); BWT; Computer Science - Data Structures and Algorithms; Code (cryptography); Animals; Humans; Quality (business); Data Structures and Algorithms (cs.DS); Quantitative Biology - Genomics; Caenorhabditis elegans; Molecular Biology; media_common; Genomics (q-bio.GN); Sequence; Genome; Settore INF/01 - Informatica; reference-free compression; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Data Compression; compression; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Data mining; quality score; Metagenomics; computer; Algorithms; Reference genome
researchProduct

What we look at in paintings: A comparison between experienced and inexperienced art viewers

2016

How do people look at art? Are there any differences between how experienced and inexperienced art viewers look at a painting? We approach these questions by analyzing and modeling eye movement data from a cognitive art research experiment, where the eye movements of twenty test subjects, ten experienced and ten inexperienced art viewers, were recorded while they were looking at paintings. Eye movements consist of stops of the gaze as well as jumps between the stops. Hence, the observed gaze stop locations can be thought of as a spatial point pattern, which can be modeled by a spatio-temporal point process. We introduce some statistical tools to analyze the spatio-temporal eye movement data, a…

Statistics and Probability; FOS: Computer and information sciences; Coverage; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 01 natural sciences; Statistics - Applications; 050105 experimental psychology; Visual arts; 010104 statistics & probability; silmänliikkeet; InformationSystems_MODELSANDPRINCIPLES; 0501 psychology and cognitive sciences; Applications (stat.AP); 0101 mathematics; point process; Painting; Point (typography); 05 social sciences; Eye movement; Cognition; cognitive art research; transition probability; Gaze; Test (assessment); shift function; Modeling and Simulation; art viewers; Statistics Probability and Uncertainty; Psychology; intensity
researchProduct

Latin hypercube sampling with inequality constraints

2010

In some studies requiring predictive, CPU-time-consuming numerical models, the sampling design of the model input variables has to be chosen with caution. For this purpose, Latin hypercube sampling has a long history and has shown its robustness. In this paper we propose and discuss a new algorithm to build a Latin hypercube sample (LHS) taking into account inequality constraints between the sampled variables. This technique, called constrained Latin hypercube sampling (cLHS), consists of performing permutations on an initial LHS to honor the desired monotonic constraints. The relevance of this approach is shown on a real example concerning the numerical w…

Statistics and Probability; FOS: Computer and information sciences; Economics and Econometrics; Mathematical optimization; Design of Experiments; 020209 energy; Monotonic function; Sample (statistics); Mathematics - Statistics Theory; 02 engineering and technology; Statistics Theory (math.ST); 01 natural sciences; Statistics - Computation; 010104 statistics & probability; Robustness (computer science); [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; Sampling design; 0202 electrical engineering electronic engineering information engineering; FOS: Mathematics; 0101 mathematics; Dependence; Uncertainty analysis; Latin hypercube sampling; Computation (stat.CO); Mathematics; Applied Mathematics; Computer experiment; Function (mathematics); [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH]; Modeling and Simulation; Social Sciences (miscellaneous); Analysis
researchProduct

Bayesian survival analysis with BUGS

2020

Survival analysis is one of the most important fields of statistics in medicine and the biological sciences. In addition, computational advances in recent decades have favored the use of Bayesian methods in this context, providing a flexible and powerful alternative to the traditional frequentist approach. The objective of this article is to summarize some of the most popular Bayesian survival models, such as accelerated failure time, proportional hazards, mixture cure, competing risks, multi-state, frailty, and joint models of longitudinal and survival data. Moreover, an implementation of each presented model is provided using a BUGS syntax that can be run with JAGS from the R programmin…

Statistics and Probability; FOS: Computer and information sciences; Epidemiology; Computer science; Bayesian probability; Context (language use); Accelerated failure time model; Machine learning; computer.software_genre; Bayesian inference; 01 natural sciences; Statistics - Applications; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Frequentist inference; Humans; Applications (stat.AP); 030212 general & internal medicine; 0101 mathematics; Models Statistical; Syntax (programming languages); business.industry; R Programming Language; Bayes Theorem; Survival Analysis; Medical statistics; Artificial intelligence; business; computer
researchProduct

Reassessing Accuracy Rates of Median Decisions

2007

We show how Bruno de Finetti's fundamental theorem of prevision (FTP) has computable applications in statistical problems that involve only partial information. Specifically, we assess accuracy rates for median decision procedures used in the radiological diagnosis of asbestosis. Conditional exchangeability of individual radiologists' diagnoses is recognized as more appropriate than independence, which is commonly presumed. The FTP yields coherent bounds on probabilities of interest when available information is insufficient to determine a complete distribution. Further assertions that are natural to the problem motivate a partial ordering of conditional probabilities, extending the computation …

Statistics and Probability; FOS: Computer and information sciences; Fundamental theorem of prevision; Computer science; General Mathematics; Computation; Specificity; Quadratic programming; Statistics - Applications; Medical diagnosis; Sensitivity; Linear programming; Probability bound; Applications (stat.AP); Second opinion; Independence (probability theory); Fundamental theorem; Asbestosis; Conditional probability; Distribution (mathematics); Exchangeability; Predictive value; Statistics Probability and Uncertainty; Partially ordered set; Coherence; Mathematical economics
researchProduct

A computationally fast alternative to cross-validation in penalized Gaussian graphical models

2015

We study the problem of selecting the regularization parameter in penalized Gaussian graphical models. When the goal is to obtain a model with good predictive power, cross-validation is the gold standard. We present a new estimator of Kullback-Leibler loss in Gaussian graphical models which provides a computationally fast alternative to cross-validation. The estimator is obtained by approximating leave-one-out cross-validation. Our approach is demonstrated on simulated data sets for various types of graphs. The proposed formula exhibits superior performance, especially in the typical small sample size scenario, compared to other available alternatives to cross-validation, such as Akaike's i…

Statistics and Probability; FOS: Computer and information sciences; Gaussian; Information Criteria; Cross-validation; Methodology (stat.ME); symbols.namesake; Bayesian information criterion; Statistics; Penalized estimation; Generalized approximate cross-validation; Graphical model; SDG 7 - Affordable and Clean Energy; Statistics - Methodology; Mathematics; /dk/atira/pure/sustainabledevelopmentgoals/affordable_and_clean_energy; Kullback-Leibler loss; Applied Mathematics; Estimator; Gaussian graphical model; Sample size determination; Modeling and Simulation; symbols; Statistics Probability and Uncertainty; Akaike information criterion; Settore SECS-S/01 - Statistica; Algorithm
researchProduct

Can the Adaptive Metropolis Algorithm Collapse Without the Covariance Lower Bound?

2011

The Adaptive Metropolis (AM) algorithm is based on the symmetric random-walk Metropolis algorithm. The proposal distribution has the following time-dependent covariance matrix at step $n+1$ \[ S_n = \operatorname{Cov}(X_1,\ldots,X_n) + \epsilon I, \] that is, the sample covariance matrix of the history of the chain plus a (small) constant $\epsilon>0$ multiple of the identity matrix $I$. The lower bound on the eigenvalues of $S_n$ induced by the term $\epsilon I$ is theoretically convenient, but practically cumbersome, as a good value for the parameter $\epsilon$ may not always be easy to choose. This article considers variants of the AM algorithm that do not explicitly bound the eigenvalues of $S_n$ away …

Statistics and Probability; FOS: Computer and information sciences; Identity matrix; Mathematics - Statistics Theory; Statistics Theory (math.ST); Upper and lower bounds; Statistics - Computation; 93E35; 93E15; Combinatorics; 60J27; Mathematics::Probability; Law of large numbers; stochastic approximation; FOS: Mathematics; Eigenvalues and eigenvectors; Computation (stat.CO); Metropolis algorithm; Mathematics; Probability (math.PR); Zero (complex analysis); Covariance; stability; Uniform continuity; Bounded function; 65C40; Statistics Probability and Uncertainty; adaptive Markov chain Monte Carlo; Mathematics - Probability
researchProduct
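
The proposal covariance quoted in the Adaptive Metropolis abstract above, the sample covariance of the chain history plus $\epsilon I$, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the unbiased (n - 1) normalization, and the toy chain are assumptions made for illustration, and any scaling factor the paper may apply to the covariance is omitted.

```python
def sample_covariance(history):
    """Sample covariance matrix of a list of equal-length points.

    Assumption: unbiased normalization (divide by n - 1); needs n >= 2.
    """
    n = len(history)
    d = len(history[0])
    mean = [sum(x[j] for x in history) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for x in history:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (x[i] - mean[i]) * (x[j] - mean[j])
    return [[c / (n - 1) for c in row] for row in cov]


def am_proposal_covariance(history, epsilon):
    """Return S_n = Cov(X_1, ..., X_n) + epsilon * I (hypothetical helper)."""
    s = sample_covariance(history)
    for i in range(len(s)):
        s[i][i] += epsilon  # adds epsilon to every eigenvalue of the covariance
    return s


# Toy chain whose points lie on one line, so the plain sample covariance
# is singular; the epsilon term makes S_n positive definite.
chain = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
S = am_proposal_covariance(chain, epsilon=0.01)
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
```

In this toy case the determinant of the plain covariance is zero, while that of `S` is strictly positive, which is the eigenvalue lower bound the abstract refers to.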

Multiscale Granger causality

2017

In the study of complex physical and biological systems represented by multivariate stochastic processes, an issue of great relevance is the description of the system dynamics spanning multiple temporal scales. While methods to assess the dynamic complexity of individual processes at different time scales are well-established, multiscale analysis of directed interactions has never been formalized theoretically, and empirical evaluations are complicated by practical issues such as filtering and downsampling. Here we extend the very popular measure of Granger causality (GC), a prominent tool for assessing directed lagged interactions between joint processes, to quantify information transfer a…

Statistics and Probability; FOS: Computer and information sciences; Mathematics - Statistics Theory; Statistics Theory (math.ST); 01 natural sciences; Statistics - Applications; Methodology (stat.ME); 03 medical and health sciences; 0302 clinical medicine; Granger causality; Moving average; 0103 physical sciences; Econometrics; FOS: Mathematics; State space; carbon dioxide; Applications (stat.AP); Time series; 010306 general physics; Temporal scales; signal processing; climate; Statistics - Methodology; Mathematics; Stochastic process; Biology and Life Sciences; temperature; Condensed Matter Physics; Science General; System dynamics; Mathematics and Statistics; Autoregressive model; Earth and Environmental Sciences; Settore ING-INF/06 - Bioingegneria Elettronica E Informatica; Algorithm; 030217 neurology & neurosurgery; Statistical and Nonlinear Physics
researchProduct

Multivariate nonparametric estimation of the Pickands dependence function using Bernstein polynomials

2017

Many applications in risk analysis require the estimation of the dependence among multivariate maxima, especially in environmental sciences. Such dependence can be described by the Pickands dependence function of the underlying extreme-value copula. Here, a nonparametric estimator is constructed as the sample equivalent of a multivariate extension of the madogram. Shape constraints on the family of Pickands dependence functions are taken into account by means of a representation in terms of Bernstein polynomials. The large-sample theory of the estimator is developed and its finite-sample performance is evaluated with a simulation study. The approach is illustrated with a dataset of…

Statistics and Probability; FOS: Computer and information sciences; Multivariate statistics; NONPARAMETRIC ESTIMATION; MULTIVARIATE MAX-STABLE DISTRIBUTION; 01 natural sciences; Copula (probability theory); Methodology (stat.ME); 010104 statistics & probability; Statistics; Statistics::Methodology; 0101 mathematics; Extreme-value copula; EXTREMAL DEPENDENCE; EXTREMEVALUE COPULA; [SDU.ENVI]Sciences of the Universe [physics]/Continental interfaces environment; Statistics - Methodology; ComputingMilieux_MISCELLANEOUS; Mathematics; [SDU.OCEAN]Sciences of the Universe [physics]/Ocean Atmosphere; Applied Mathematics; 010102 general mathematics; Nonparametric statistics; Estimator; Extremal dependence; HEAVY RAINFALL; Bernstein polynomial; 13. Climate action; Dependence function; Statistics Probability and Uncertainty; Maxima; Settore SECS-S/01 - Statistica; BERNSTEIN POLYNOMIALS; PICKANDS DEPENDENCE FUNCTION
researchProduct

Comparative Evaluation of Community Detection Algorithms: A Topological Approach

2012

Community detection is one of the most active fields in complex network analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to the development of algorithmic solutions that reveal the network structure in such cohesive subgroups. Comparative studies reported in the literature usually rely on a performance measure considering the community structure as a partition (Rand Index, Normalized Mutual Information, etc.). However, this type of comparison neglects the topological properties of the communities. In this article, we present a comprehensive comparative study of a representative set of commu…

Statistics and Probability; FOS: Computer and information sciences; Physics - Physics and Society; Computer science; [INFO.INFO-OH]Computer Science [cs]/Other [cs.OH]; Rand index; FOS: Physical sciences; 02 engineering and technology; Physics and Society (physics.soc-ph); Topology; 01 natural sciences; Measure (mathematics); 010305 fluids & plasmas; Set (abstract data type); Development (topology); 0103 physical sciences; 0202 electrical engineering electronic engineering information engineering; Equivalence (measure theory); Random graph; Social and Information Networks (cs.SI); Computer Science - Social and Information Networks; Statistical and Nonlinear Physics; Network dynamics; Partition (database); 020201 artificial intelligence & image processing; Statistics Probability and Uncertainty
researchProduct