Search results for "model selection"

Showing 10 of 64 documents

Selecting the tuning parameter in penalized Gaussian graphical models

2019

Penalized inference of Gaussian graphical models is a way to assess the conditional independence structure in multivariate problems. In this setting, the conditional independence structure, corresponding to a graph, is related to the choice of the tuning parameter, which determines the model complexity or degrees of freedom. There has been little research on the degrees of freedom for penalized Gaussian graphical models. In this paper, we propose an estimator of the degrees of freedom in $$\ell_1$$-penalized Gaussian graphical models. Specifically, we derive an estimator inspired by the generalized information criterion and propose to use this estimator as the bias term for two informatio…

Statistics and Probability; Statistics::Theory; Kullback–Leibler divergence; Kullback-Leibler divergence; Computer science; Gaussian; Information Criteria; 010103 numerical & computational mathematics; Model complexity; Model selection; 01 natural sciences; Theoretical Computer Science; 010104 statistics & probability; symbols.namesake; Statistics::Machine Learning; Generalized information criterion; Entropy (information theory); Statistics::Methodology; Graphical model; 0101 mathematics; Penalized Likelihood; Kullback-Leibler Divergence; Model Complexity; Model Selection; Generalized Information Criterion; Model selection; Estimator; Statistics::Computation; Computational Theory and Mathematics; Conditional independence; symbols; Penalized likelihood; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Algorithm; Statistics and Computing

researchProduct

Efficient change point detection in genomic sequences of continuous measurements

2010

Motivation: Knowing the exact locations of multiple change points in genomic sequences serves several biological needs, for instance when data represent aCGH profiles and it is of interest to identify possibly damaged genes involved in cancer and other diseases. Only a few of the currently available methods deal explicitly with estimation of the number and location of change points, and moreover these methods may be somewhat vulnerable to deviations from the model assumptions usually employed. Results: We present a computationally efficient method to obtain estimates of the number and location of the change points. The method is based on a simple transformation of data and it provides re…

Statistics and Probability; model selection; Breast Neoplasms; computer.software_genre; Biochemistry; Cell Line; Simple (abstract algebra); Cell Line Tumor; Humans; Computer Simulation; piecewise constant model; Molecular Biology; Mathematics; Oligonucleotide Array Sequence Analysis; Supplementary data; Comparative Genomic Hybridization; Models Statistical; Series (mathematics); Model selection; Genomics; Computer Science Applications; Computational Mathematics; R package; Transformation (function); Computational Theory and Mathematics; Change points; Changepoint; aCGH analysis; Female; Data mining; Settore SECS-S/01 - Statistica; computer; Change detection

researchProduct

Contributed discussion on article by Pratola

2016

The author should be commended for his outstanding contribution to the literature on Bayesian regression tree models. The author introduces three innovative sampling approaches which allow for efficient traversal of the model space. In this response, we add a fourth alternative.

Statistics and Probability; model selection; Markov Chain Monte Carlo (MCMC); Bayesian regression tree; Computer science; Big data; Bayesian regression tree (BRT) models; ComputingMilieux_LEGALASPECTSOFCOMPUTING; birth–death process; Machine learning; computer.software_genre; Sequential Monte Carlo methods; 01 natural sciences; population Markov chain Monte Carlo; 010104 statistics & probability; symbols.namesake; big data; 0502 economics and business; Bayesian Regression Trees (BART); 0101 mathematics; 050205 econometrics; Bayesian treed regression; Multiple Try Metropolis algorithms; statistical inference; business.industry; Applied Mathematics; Model selection; 05 social sciences; Rejection sampling; Data science; Variable-order Bayesian network; Tree (data structure); Tree traversal; Markov chain Monte Carlo; continuous time Markov process; symbols; Artificial intelligence; business; Bayesian linear regression; communication-free; computer; Gibbs sampling; Bayesian Analysis

researchProduct

Not all bull and bear markets are alike: insights from a five-state hidden semi-Markov model

2022

This paper employs the hidden semi-Markov model and a novel model selection procedure to detect different states in the US stock market. The empirical results suggest that the market is switching between five states that can be classified into three bull states and two bear states. The three bull states are categorized as a low volatility bull market, a high volatility bull market, and a stock market bubble. One of the bear states represents a regular bear market, while the other one corresponds to either a stock market crash or a market correction. The paper demonstrates that the five-state model is consistent with a number of stylized facts and provides many valuable insights into the dyn…

Stylized fact; Economics and Econometrics; Model selection; Strategy and Management; Stock market bubble; Stock market crash; Econometrics; Economics; Stock market; Hidden semi-Markov model; Market correction; Volatility (finance); Business and International Management; Finance; Risk Management

researchProduct

ETAS Space–Time Modeling of Chile Triggered Seismicity Using Covariates: Some Preliminary Results

2021

Chilean seismic activity is one of the strongest in the world. As already shown in previous papers, seismic activity can be usefully described by a space–time branching process, such as the ETAS (Epidemic Type Aftershock Sequences) model, which is a semiparametric model with a large time-scale component for the background seismicity and a small time-scale component for the triggered seismicity. The use of covariates can improve the description of triggered seismicity in the ETAS model, so in this paper, we study the Chilean seismicity separately for the North and South area, using some GPS-related data observed together with ordinary catalog data. Our results show evidence that the use of s…

Technology; model selection; QH301-705.5; QC1-999; Induced seismicity; Physics::Geophysics; semiparametric model; Component (UML); Covariate; General Materials Science; triggered seismicity; Biology (General); Instrumentation; QD1-999; Aftershock; Branching process; Fluid Flow and Transfer Processes; Process Chemistry and Technology; Space time; Model selection; T; Physics; General Engineering; covariates; Engineering (General). Civil engineering (General); Computer Science Applications; Semiparametric model; ETAS model; Chemistry; covariate; semiparametric models; TA1-2040; Geology; Seismology; Applied Sciences

researchProduct

Wind Speed Forecasting by Box-Jenkins Models

2008

The possibility of modelling observed wind speed time series and forecasting their future values is presented in this paper. Seasonal autoregressive integrated moving average (SARIMA) models are applied to time series formed by four years of hourly average wind speed measurements at thirty sites in Sicily. Our approach is considerably different from the original one (the Box-Jenkins approach) since it is completely automatic. We use a peculiar feature of wind speed on a land area, its daily period, to identify a class of SARIMA models within which to find the best fitting model by information criteria (here we employ AICC). Here we report the results, concerning the fit and forecast accuracy, …

Wind forecasting; Spectral analysis; Stochastic model; Time series; Model selection

researchProduct

Modelling hydrolysis: Simultaneous versus sequential biodegradation of the hydrolysable fractions

2018

Hydrolysis is considered the limiting step during solid waste anaerobic digestion (including co-digestion of sludge and biosolids). The mechanisms of hydrolysis are not well understood, which is detrimental to model predictive capability. The common approach to multiple substrates is to consider simultaneous degradation of the substrates. This may not have the capacity to separate the different kinetics. Sequential degradation of substrates is theoretically supported by microbial capacity and the composite nature of substrates (bioaccessibility concept). However, this has not been experimentally assessed. Sequential chemical fractionation has been successfully used to define i…

[SDV.BIO]Life Sciences [q-bio]/Biotechnology; Biosolids; SEQUENTIAL EXTRACTION; ANAEROBIC DIGESTION; BIODEGRADATION; 02 engineering and technology; 010501 environmental sciences; TRITICUM AESTIVUM; 01 natural sciences; 7. Clean energy; NUMERICAL MODEL; SLUDGE DIGESTION; Bioreactors; METHANE; BIOLOGICAL MATERIALS; ACTIVATED SLUDGE; 0202 electrical engineering electronic engineering information engineering; Anaerobiosis; Sequential model; PRIORITY JOURNAL; Waste Management and Disposal; ComputingMilieux_MISCELLANEOUS; CALIBRATION; Sewage; CONCENTRATION (PARAMETER); Chemistry; FRACTIONATION; ACID HYDROLYSIS; INCUBATION TIME; MODELLING; HYDROLYSIS; CHEMICAL FRACTIONATION; SEQUENTIAL DEGRADATION; Biodegradation Environmental; WASTE TREATMENT; ORGANIC MATTER; [SDE]Environmental Sciences; ANAEROBIC DIGESTION MODEL; ADM1; SOLID WASTE; 020209 energy; MODELS; Fractionation; CAPACITY; Hydrolysis; DIGESTION; ISOTOPIC FRACTIONATION; NONHUMAN; CHEMICAL OXYGEN DEMAND; ARTICLE; MODEL SELECTION; 0105 earth and related environmental sciences; Chromatography; Models Theoretical; SUBSTRATES; Biodegradation; SIMULTANEOUS DEGRADATION; HOMOGENEOUS MATERIALS; Anaerobic digestion; WASTE WATER MANAGEMENT; Activated sludge; APPLE; Degradation (geology); Waste Management

researchProduct

Prediction Model Selection and Spare Parts Ordering Policy for Efficient Support of Maintenance and Repair of Equipment

2010

The prediction model selection problem via variable subset selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. Several papers have dealt with various aspects of the problem but it appears that the typical regression user has not benefited appreciably. One reason for the lack of resolution of the problem is the fact that it has not been well defined. Indeed, it is apparent that there is not a single probl…

business.industry; Computer science; Model selection; Feature selection; Resolution (logic); Machine learning; computer.software_genre; Variable (computer science); Residual sum of squares; Spare part; Artificial intelligence; business; computer; Selection (genetic algorithm); Parametric statistics

researchProduct

A New Method to Reconstruct Quantitative Food Webs and Nutrient Flows from Isotope Tracer Addition Experiments

2020

Understanding how nutrients flow through food webs is central in ecosystem ecology. Tracer addition experiments are powerful tools to reconstruct nutrient flows by adding an isotopically enriched element into an ecosystem and tracking its fate through time. Historically, the design and analysis of tracer studies have varied widely, ranging from descriptive studies to modeling approaches of varying complexity. Increasingly, isotope tracer data are being used to compare ecosystems and analyze experimental manipulations. Currently, a formal statistical framework for analyzing such experiments is lacking, making it impossible to calculate the estimation errors associated with the model fit, the…

ecosystems (ecology); model selection; state-space models; food webs; Bayesian method; Markov chains; nutrient uptake; biomarkers; hidden Markov model (HMM); nutrients; nutrient uptake (plants); isotope tracer addition

researchProduct

Ecologists overestimate the importance of predictor variables in model averaging: a plea for cautious interpretations.

2014

Abstract: Information-theory procedures are powerful tools for multimodel inference and are now standard methods in ecology. When performing model averaging on a given set of models, the importance of a predictor variable is commonly estimated by summing the weights of models where the variable appears, the so-called sum of weights (SW). However, SWs have received little methodological attention and are frequently misinterpreted. We assessed the reliability of SW by performing model selection and averaging on simulated data sets including variables strongly and weakly correlated to the response variable and a variable unrelated to the response. Our aim was to investigate how useful SWs are …

model selection; Information theory; multimodel inference; Bayesian information criterion; Statistics; Econometrics; Range (statistics); Akaike Information Criterion; [SDV.EE.IEO]Life Sciences [q-bio]/Ecology environment/Symbiosis; baseline sum of weights; Set (psychology); Biology; Ecology Evolution Behavior and Systematics; Mathematics; information theory; [STAT.AP]Statistics [stat]/Applications [stat.AP]; Ecological Modeling; Model selection; model averaging; Chemistry; Variable (computer science); Sample size determination; variable importance; Akaike information criterion

researchProduct