Search results for "model selection"

showing 10 items of 64 documents

A graphical model selection tool for mixed models

2017

Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends to the class of mixed models the deviance plot proposed in the literature within the framework of classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation makes it possible to identify important covariates by focusing only on the fixed-effects component, assuming the random part is properly specified. Nevertheless, we suggest also a standalone figure representing th…

0301 basic medicine; Statistics and Probability; Mixed model; Model selection; Feature selection; 01 natural sciences; Task (project management); Deviance plot; Penalized Weighted Residual Sum of Squares; Variable selection; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Modeling and Simulation; Statistics; Graphical model; 0101 mathematics; Selection (genetic algorithm); Mathematics
researchProduct

Pitfalls of hypothesis tests and model selection on bootstrap samples: Causes and consequences in biometrical applications

2015

The bootstrap method has become a widely used tool applied in diverse areas where results based on asymptotic theory are scarce. It can be applied, for example, for assessing the variance of a statistic, a quantile of interest or for significance testing by resampling from the null hypothesis. Recently, some approaches have been proposed in the biometrical field where hypothesis testing or model selection is performed on a bootstrap sample as if it were the original sample. P-values computed from bootstrap samples have been used, for example, in the statistics and bioinformatics literature for ranking genes with respect to their differential expression, for estimating the variability of p-v…

0301 basic medicine; Statistics and Probability; education.field_of_study; Computer science; Model selection; Bootstrap aggregating; Population; General Medicine; Asymptotic theory (statistics); 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Resampling; Statistics; Econometrics; 0101 mathematics; Statistics Probability and Uncertainty; education; Null hypothesis; Quantile; Statistical hypothesis testing; Biometrical Journal
researchProduct

Bayesian dynamic modeling of time series of dengue disease case counts

2017

The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model’s short-term performance for predicting dengue cases. The methodology shows dynamic Poisson log link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order …

Atmospheric Science; Meteorological Concepts; Urban Population; Epidemiology; Rain; Poisson distribution; Geographical locations; Dengue; Mathematical and Statistical Techniques; 0302 clinical medicine; Statistics; Medicine and Health Sciences; 030212 general & internal medicine; Atmospheric Dynamics; Mathematics; Mathematical Models; lcsh:Public aspects of medicine; Physics; Electromagnetic Radiation; Random walk; Deviance information criterion; Geophysics; Infectious Diseases; Mean absolute percentage error; Physical Sciences; symbols; Solar Radiation; Statistics (Mathematics); Research Article; Generalized linear model; Constant coefficients; lcsh:Arctic medicine. Tropical medicine; lcsh:RC955-962; 030231 tropical medicine; Colombia; Disease Surveillance; Research and Analysis Methods; 03 medical and health sciences; symbols.namesake; Meteorology; Humans; Statistical Methods; Cities; Model selection; Public Health Environmental and Occupational Health; lcsh:RA1-1270; Humidity; Bayes Theorem; Markov chain Monte Carlo; South America; Atmospheric Physics; Random Walk; Earth Sciences; People and places; Mathematics; Forecasting; PLOS Neglected Tropical Diseases
researchProduct

Stochastic models for wind speed forecasting

2011

This paper is concerned with the problem of developing a general class of stochastic models for hourly average wind speed time series. The proposed approach has been applied to the time series recorded during 4 years in two sites of Sicily, a region of Italy, and it has attained valuable results in terms of both modelling and forecasting. Moreover, the 24 h predictions obtained employing only 1-month time series are quite similar to those provided by a feed-forward artificial neural network trained on 2 years of data.

Class (computer programming); Engineering; Series (mathematics); Artificial neural network; Meteorology; Renewable Energy Sustainability and the Environment; Stochastic modelling; business.industry; Model selection; Settore FIS/01 - Fisica Sperimentale; Energy Engineering and Power Technology; Settore FIS/03 - Fisica Della Materia; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); Wind speed; Fuel Technology; Nuclear Energy and Engineering; Spectral analysis; business; stochastic models time series model selection spectral analysis artificial neural networks wind forecasting; Algorithm; Energy Conversion and Management
researchProduct

Stability-Based Model Selection for High Throughput Genomic Data: An Algorithmic Paradigm

2012

Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In the last decade, a few novel techniques for model selection, representing a sharp departure from previous ones in statistics, have been proposed and gained prominence for microarray data analysis. Among those, the stability-based methods are the most robust and best performing in terms of prediction, but the slowest in terms of time. Unfortunately…

Class (computer programming); Settore INF/01 - Informatica; business.industry; Computer science; Heuristic (computer science); Model selection; Stability (learning theory); Machine learning; computer.software_genre; Identification (information); Algorithm design; Artificial intelligence; Cluster analysis; business; Algorithms and Data Structures; Throughput (business); computer
researchProduct

Bayesian versus data driven model selection for microarray data

2014

Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we refer to that instance still as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and are both based on internal knowledge. That is, they use information obtained by processing the input data. A…

Clustering; Model selection; Bayesian information criterion; Akaike information criterion; Minimum message length; Bioinformatics; Settore INF/01 - Informatica; Computer science; business.industry; Model selection; Bayesian probability; computer.software_genre; Machine learning; Computer Science Applications; Data-driven; Determining the number of clusters in a data set; Identification (information); Bayesian information criterion; Data mining; Artificial intelligence; Akaike information criterion; Cluster analysis; business; computer
researchProduct

The Effective Sample Size

2013

Model selection procedures often depend explicitly on the sample size n of the experiment. One example is the Bayesian information criterion (BIC) and another is the use of Zellner–Siow priors in Bayesian model selection. Sample size is well-defined if one has i.i.d. real observations, but is not well-defined for vector observations or in non-i.i.d. settings; extensions of criteria such as BIC to such settings thus require a definition of effective sample size that applies also in such cases. A definition of effective sample size that applies to fairly general linear models is proposed and illustrated in a variety of situations. The definition is also used to propose a suitable ‘sc…

Deviance information criterion; Economics and Econometrics; Bayesian information criterion; Sample size determination; Model selection; Prior probability; Statistics; Linear model; Bayesian inference; Algorithm; Selection (genetic algorithm); Statistics::Computation; Mathematics; Econometric Reviews
researchProduct

WEIGHTED-AVERAGE LEAST SQUARES (WALS): A SURVEY

2014

Model averaging has become a popular method of estimation, following increasing evidence that model selection and estimation should be treated as one joint procedure. Weighted-average least squares (WALS) is a recent model-average approach, which takes an intermediate position between frequentist and Bayesian methods, allows a credible treatment of ignorance, and is extremely fast to compute. We review the theory of WALS and discuss extensions and applications.

Economics and Econometrics; Model selection; 05 social sciences; Bayesian probability; 01 natural sciences; Least squares; 010104 statistics & probability; Frequentist inference; Position (vector); 0502 economics and business; Statistics; Prior probability; 0101 mathematics; Weighted arithmetic mean; 050205 econometrics; Mathematics; Journal of Economic Surveys
researchProduct

DETECTING VOLCANIC ERUPTIONS IN TEMPERATURE RECONSTRUCTIONS BY DESIGNED BREAK-INDICATOR SATURATION

2016

We present a methodology for detecting breaks at any point in time-series regression models using an indicator saturation approach, applied here to modelling climate change. Building on recent developments in econometric model selection for more variables than observations, we saturate a regression model with a full set of designed break functions. By selecting over these break functions using an extended general-to-specific algorithm, we obtain unbiased estimates of the break date and magnitude. Monte Carlo simulations confirm the approximate properties of the approach. We assess the methodology by detecting volcanic eruptions in a time series of Northern Hemisphere mean temperature spanni…

Economics and Econometrics; geography; geography.geographical_feature_category; 010504 meteorology & atmospheric sciences; Model selection; Monte Carlo method; Northern Hemisphere; Climate change; Regression analysis; 01 natural sciences; Physics::Geophysics; 010104 statistics & probability; Volcano; Climatology; Paleoclimatology; Economics; 0101 mathematics; Mean radiant temperature; Physics::Atmospheric and Oceanic Physics; 0105 earth and related environmental sciences; Journal of Economic Surveys
researchProduct

The Euro-Dollar Exchange Rate: Is it Fundamental?

2002

In this paper we have applied two approaches to the study of the dollar real exchange rate in relation to the Euro-area currencies. First, using dynamic panel techniques, we estimate an error correction model for the dollar real exchange rate versus seven developed countries, four of them Euro-area members. Second, we aggregate the European variables and estimate a model for the Euro-dollar real exchange rate using time series techniques. After identification and model selection, the same specification can be adopted in the two cases, in an eclectic model including real interest rate and productivity differentials, together with relative fiscal policy and net foreign asset positions. This…

Error correction model; Exchange rate; Interest rate parity; real exchange rate cointegration time-series panel dollar Euro-zone; Cointegration; Model selection; Economics; Liberian dollar; Context (language use); Monetary economics; Real interest rate; SSRN Electronic Journal
researchProduct