Search results for "Computation"

showing 10 items of 7362 documents

Extended differential geometric LARS for high-dimensional GLMs with general dispersion parameter

2018

A large class of modeling and prediction problems involves outcomes that belong to an exponential family distribution. Generalized linear models (GLMs) are a standard way of dealing with such situations. Even in high-dimensional feature spaces, GLMs can be extended to deal with such situations. Penalized inference approaches, such as the $\ell_1$ or SCAD penalties, or extensions of least angle regression, such as dgLARS, have been proposed to deal with GLMs with high-dimensional feature spaces. Although the theory underlying these methods is in principle generic, the implementation has remained restricted to dispersion-free models, such as the Poisson and logistic regression models. The aim of this…

Statistics and Probability; Generalized linear model; Mathematical optimization; Generalized linear models; Predictor-corrector algorithm; 02 engineering and technology; Poisson distribution; DANTZIG SELECTOR; 01 natural sciences; Cross-validation; High-dimensional inference; Theoretical Computer Science; 010104 statistics & probability; symbols.namesake; Exponential family; LEAST ANGLE REGRESSION; 0202 electrical engineering electronic engineering information engineering; Applied mathematics; Statistics::Methodology; 0101 mathematics; CROSS-VALIDATION; Mathematics; Least-angle regression; Linear model; 020206 networking & telecommunications; Probability and statistics; VARIABLE SELECTION; Efficient estimator; Computational Theory and Mathematics; Dispersion parameter; LINEAR-MODELS; symbols; SHRINKAGE; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Statistics and Computing

researchProduct

Metagenomics reveals our incomplete knowledge of global diversity

2008

Metagenomic sequencing obtains huge amounts of sequences from environmental and clinical samples, thus providing a glimpse of the global prokaryotic diversity of both species and genes in these sources. The current trend in metagenomic analysis follows the so-called gene-centric approach, focused on describing the environments by the study of the functional roles of the proteins encoded in the sequenced genes. In this way, it is clear that metagenomic analysis relies heavily on the accurate knowledge of the universe of proteins stored in the databases. Nevertheless, it is known that some biases exist in the composition of databases (which are rich in sequences from common, cultivable and ea…

Statistics and Probability; Genetics; Phylogenetic tree; biology; Phylum; Genetic Variation; Genomics; Biodiversity; Genome Analysis; biology.organism_classification; Biochemistry; Computer Science Applications; Computational Mathematics; Taxon; Computational Theory and Mathematics; Evolutionary biology; Metagenomics; GenBank; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; Taxonomic rank; Letter to the Editor; Molecular Biology; Ecosystem; Acidobacteria

researchProduct

Using Statistical and Computer Models to Quantify Volcanic Hazards

2009

Risk assessment of rare natural hazards, such as large volcanic block and ash or pyroclastic flows, is addressed. Assessment is approached through a combination of computer modeling, statistical modeling, and extreme-event probability computation. A computer model of the natural hazard is used to provide the needed extrapolation to unseen parts of the hazard space. Statistical modeling of the available data is needed to determine the initializing distribution for exercising the computer model. In dealing with rare events, direct simulations involving the computer model are prohibitively expensive. The solution instead requires a combination of adaptive design of computer model approximation…

Statistics and Probability; Hazard (logic); Risk analysis; Volcanic hazards; Computer science; Applied Mathematics; Computation; Initialization; Statistical model; computer.software_genre; Modeling and Simulation; Natural hazard; Rare events; Data mining; computer; Technometrics

researchProduct

Robust estimation and inference for bivariate line-fitting in allometry.

2011

In allometry, bivariate techniques related to principal component analysis are often used in place of linear regression, and primary interest is in making inferences about the slope. We demonstrate that the current inferential methods are not robust to bivariate contamination, and consider four robust alternatives to the current methods -- a novel sandwich estimator approach, using robust covariance matrices derived via an influence function approach, Huber's M-estimator and the fast-and-robust bootstrap. Simulations demonstrate that Huber's M-estimators are highly efficient and robust against bivariate contamination, and when combined with the fast-and-robust bootstrap, we can make accurat…

Statistics and Probability; Heteroscedasticity; Analysis of Variance; Covariance matrix; Robust statistics; Estimator; General Medicine; Bivariate analysis; Covariance; Biostatistics; Statistics::Computation; Efficient estimator; Principal component analysis; Statistics; Econometrics; Statistics::Methodology; Body Size; Statistics Probability and Uncertainty; Mathematics; Probability; Biometrical journal. Biometrische Zeitschrift

researchProduct

Importance sampling type estimators based on approximate marginal Markov chain Monte Carlo

2020

We consider importance sampling (IS) type weighted estimators based on Markov chain Monte Carlo (MCMC) targeting an approximate marginal of the target distribution. In the context of Bayesian latent variable models, the MCMC typically operates on the hyperparameters, and the subsequent weighting may be based on IS or sequential Monte Carlo (SMC), but allows for multilevel techniques as well. The IS approach provides a natural alternative to delayed acceptance (DA) pseudo-marginal/particle MCMC, and has many advantages over DA, including a straightforward parallelisation and additional flexibility in MCMC implementation. We detail minimal conditions which ensure strong consistency of the sug…
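The weighting idea in this abstract can be illustrated with a self-normalised importance sampling estimator. The following is only a minimal sketch under stated assumptions, not the paper's method: independent Gaussian draws stand in for MCMC output from the approximate distribution, and the densities, the helper names `normal_pdf` and `is_corrected_mean`, and all numeric settings are hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def is_corrected_mean(samples, target_pdf, approx_pdf, f=lambda x: x):
    """Self-normalised importance sampling estimate of E_target[f(X)].

    `samples` are assumed to come from the approximate distribution;
    each one is reweighted by target density / approximate density.
    """
    weights = [target_pdf(x) / approx_pdf(x) for x in samples]
    total = sum(weights)
    return sum(w * f(x) for w, x in zip(weights, samples)) / total


rng = random.Random(0)

# Assumption: independent draws from N(0, 1.5^2) stand in for an MCMC chain
# that targets the approximation rather than the true target.
samples = [rng.gauss(0.0, 1.5) for _ in range(20000)]

# Assumption: the true target is N(0.5, 1), so the exact mean is 0.5.
estimate = is_corrected_mean(
    samples,
    target_pdf=lambda x: normal_pdf(x, 0.5, 1.0),
    approx_pdf=lambda x: normal_pdf(x, 0.0, 1.5),
)
print(f"IS-corrected estimate of the target mean: {estimate:.3f}")
```

Because the weights are computed independently for each sample, the correction step can be run in parallel after the chain has finished, which is the kind of flexibility the abstract attributes to the IS approach.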

Statistics and Probability; Hyperparameter; 05 social sciences; Bayesian probability; Strong consistency; Estimator; Context (language use); Markov chain Monte Carlo; 01 natural sciences; Statistics::Computation; 010104 statistics & probability; symbols.namesake; 0502 economics and business; symbols; 0101 mathematics; Statistics Probability and Uncertainty; Particle filter; Algorithm; Importance sampling; 050205 econometrics; Mathematics; Scandinavian Journal of Statistics

researchProduct

Visualizing categorical data in ViSta

2003

The modules in the statistical package ViSta related to categorical data analysis are presented. These modules are: visualization of frequency data with mosaic and bar plots, correspondence analysis, multiple correspondence analysis and loglinear analysis. All these methods are implemented in ViSta with a strong emphasis on plots and graphical representations of data, as well as on the user's interactivity with the system. Together they provide a system that has been shown to be easy to use, useful, and powerful for both novice and experienced users.

Statistics and Probability; Information retrieval; Computer science; business.industry; Applied Mathematics; Mosaic (geodemography); computer.software_genre; Correspondence analysis; Visualization; Computational Mathematics; Data visualization; Interactivity; Computational Theory and Mathematics; Multiple correspondence analysis; Log-linear model; Data mining; business; Categorical variable; computer; Computational Statistics & Data Analysis

researchProduct

SeqEditor: an application for primer design and sequence analysis with or without GTF/GFF files

2021

[Motivation]: Sequence analyses oriented to investigate specific features, patterns and functions of protein and DNA/RNA sequences usually require tools based on graphic interfaces whose main characteristic is their intuitiveness and interactivity with the user’s expertise, especially when curation or primer design tasks are required. However, interface-based tools usually pose certain computational limitations when managing large sequences or complex datasets, such as genome and transcriptome assemblies. With these requirements in mind, we have developed SeqEditor, an interactive software tool for the analysis of nucleotide and protein sequences.

Statistics and Probability; Interface (Java); Sequence analysis; Computer science; Pcr assay; Biochemistry; Genome; Transcriptome; 03 medical and health sciences; Sequence Analysis Protein; Multiplex polymerase chain reaction; Humans; Nucleotide; Amino Acid Sequence; Molecular Biology; 030304 developmental biology; chemistry.chemical_classification; 0303 health sciences; Information retrieval; Contig; 030302 biochemistry & molecular biology; Chromosome; Computer Science Applications; Computational Mathematics; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; chemistry; Line (text file); Primer (molecular biology); Sequence Analysis; Software; Reference genome

researchProduct

Clustering of spatial point patterns

2006

Spatial point patterns arise as the natural sampling information in many problems. An ophthalmologic problem gave rise to the problem of detecting clusters of point patterns. A set of human corneal endothelium images is given. Each image is described by using a point pattern, the cell centroids. The main problem is to find groups of images corresponding with groups of spatial point patterns. This is interesting from a descriptive point of view and for clinical purposes. A new image can be compared with prototypes of each group and finally evaluated by the physician. Usual descriptors of spatial point patterns such as the empty-space function, the nearest distribution function or Ripley's K-…

Statistics and Probability; K-function; business.industry; Applied Mathematics; Centroid; Pattern recognition; Function (mathematics); Point process; Computational Mathematics; Computational Theory and Mathematics; Survival function; Statistics; Point (geometry); Artificial intelligence; Point estimation; Cluster analysis; business; Mathematics; Computational Statistics & Data Analysis

researchProduct

A Knowledge Management and Decision Support Model for Enterprises

2011

We propose a novel knowledge management system (KMS) for enterprises. Our system exploits two different approaches for knowledge representation and reasoning: a document-based approach based on data-driven creation of a semantic space and an ontology-based model. Furthermore, we provide an expert system capable of supporting the enterprise decisional processes and a semantic engine which performs intelligent search on the enterprise knowledge bases. The decision support process exploits the Bayesian networks model to improve business planning process when performed under uncertainty. Copyright © 2011 Patrizia Ribino et al.

Statistics and Probability; Knowledge Management Systems; Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Decision support system; Knowledge management; Article Subject; Knowledge representation and reasoning; Exploit; Process (engineering); business.industry; Computer science; lcsh:Mathematics; Applied Mathematics; General Decision Sciences; Bayesian network; Ontology (information science); lcsh:QA1-939; computer.software_genre; Expert system; Computational Mathematics; Knowledge-based systems; business; computer

researchProduct

Prior-based Bayesian information criterion

2019

We present a new approach to model selection and Bayes factor determination, based on Laplace expansions (as in BIC), which we call Prior-based Bayes Information Criterion (PBIC). In this approach, the Laplace expansion is only done with the likelihood function, and then a suitable prior distribution is chosen to allow exact computation of the (approximate) marginal likelihood arising from the Laplace approximation and the prior. The result is a closed-form expression similar to BIC, but now involves a term arising from the prior distribution (which BIC ignores) and also incorporates the idea that different parameters can have different effective sample sizes (whereas BIC only allows one ov…
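For orientation, the standard BIC that this abstract takes as its point of comparison can be written as follows. The notation is an assumption introduced here, not taken from the paper: $\ell(\hat\theta)$ is the maximised log-likelihood, $p$ the number of parameters, $n$ the single overall sample size, and $m(y)$ the marginal likelihood. The PBIC expression itself is not reproduced because the abstract is truncated.

$$
\log m(y) \approx \ell(\hat\theta) - \frac{p}{2}\log n,
\qquad
\mathrm{BIC} = -2\,\ell(\hat\theta) + p \log n
$$

The approximation keeps only the terms of the Laplace expansion that grow with $n$ and drops the contribution of the prior, which is the omission the abstract says PBIC addresses.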

Statistics and Probability; Laplace expansion; Applied Mathematics; Bayes factor; Marginal likelihood; Statistics::Computation; symbols.namesake; Computational Theory and Mathematics; Laplace's method; Bayesian information criterion; Prior probability; symbols; Applied mathematics; Statistics::Methodology; Statistics Probability and Uncertainty; Likelihood function; Fisher information; Analysis; Mathematics

researchProduct