Search results for "model selection"
showing 10 items of 64 documents
A graphical model selection tool for mixed models
2017
Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one, among a potentially very large set of candidate statistical models. We propose a graphical representation to be considered as an extension to the class of mixed models of the deviance plot proposed in the literature within the framework of classical and generalized linear models. This graphical representation makes it possible, once a reduced number of models has been selected, to identify important covariates focusing only on the fixed effects component, assuming the random part is properly specified. Nevertheless, we suggest also a standalone figure representing th…
Pitfalls of hypothesis tests and model selection on bootstrap samples: Causes and consequences in biometrical applications
2015
The bootstrap method has become a widely used tool applied in diverse areas where results based on asymptotic theory are scarce. It can be applied, for example, for assessing the variance of a statistic, a quantile of interest or for significance testing by resampling from the null hypothesis. Recently, some approaches have been proposed in the biometrical field where hypothesis testing or model selection is performed on a bootstrap sample as if it were the original sample. P-values computed from bootstrap samples have been used, for example, in the statistics and bioinformatics literature for ranking genes with respect to their differential expression, for estimating the variability of p-v…
Bayesian dynamic modeling of time series of dengue disease case counts
2017
The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model’s short-term performance for predicting dengue cases. The methodology shows dynamic Poisson log link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order …
Stochastic models for wind speed forecasting
2011
This paper is concerned with the problem of developing a general class of stochastic models for hourly average wind speed time series. The proposed approach has been applied to the time series recorded during 4 years in two sites of Sicily, a region of Italy, and it has attained valuable results in terms both of modelling and forecasting. Moreover, the 24 h predictions obtained employing only 1-month time series are quite similar to those provided by a feed-forward artificial neural network trained on 2 years data.
Stability-Based Model Selection for High Throughput Genomic Data: An Algorithmic Paradigm
2012
Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In the last decade, a few novel techniques for model selection, representing a sharp departure from previous ones in statistics, have been proposed and gained prominence for microarray data analysis. Among those, the stability-based methods are the most robust and best performing in terms of prediction, but the slowest in terms of time. Unfortunately…
Bayesian versus data driven model selection for microarray data
2014
Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we refer to that instance still as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and are both based on internal knowledge. That is, they use information obtained by processing the input data. A…
The Effective Sample Size
2013
Model selection procedures often depend explicitly on the sample size n of the experiment. One example is the Bayesian information criterion (BIC) and another is the use of Zellner–Siow priors in Bayesian model selection. Sample size is well-defined if one has i.i.d. real observations, but is not well-defined for vector observations or in non-i.i.d. settings; extensions of criteria such as BIC to such settings thus require a definition of effective sample size that applies also in such cases. A definition of effective sample size that applies to fairly general linear models is proposed and illustrated in a variety of situations. The definition is also used to propose a suitable ‘sc…
WEIGHTED-AVERAGE LEAST SQUARES (WALS): A SURVEY
2014
Model averaging has become a popular method of estimation, following increasing evidence that model selection and estimation should be treated as one joint procedure. Weighted-average least squares (WALS) is a recent model-average approach, which takes an intermediate position between frequentist and Bayesian methods, allows a credible treatment of ignorance, and is extremely fast to compute. We review the theory of WALS and discuss extensions and applications.
DETECTING VOLCANIC ERUPTIONS IN TEMPERATURE RECONSTRUCTIONS BY DESIGNED BREAK-INDICATOR SATURATION
2016
We present a methodology for detecting breaks at any point in time-series regression models using an indicator saturation approach, applied here to modelling climate change. Building on recent developments in econometric model selection for more variables than observations, we saturate a regression model with a full set of designed break functions. By selecting over these break functions using an extended general-to-specific algorithm, we obtain unbiased estimates of the break date and magnitude. Monte Carlo simulations confirm the approximate properties of the approach. We assess the methodology by detecting volcanic eruptions in a time series of Northern Hemisphere mean temperature spanni…
The Euro-Dollar Exchange Rate: Is it Fundamental?
2002
In this paper we have applied two approaches to the study of the dollar real exchange rate in relation with the Euro-area currencies. First, using dynamic panel techniques, we estimate an error correction model for the dollar real exchange rate versus seven developed countries, four of them Euro-area members. Second, we aggregate the European variables and estimate a model for the Euro-dollar real exchange rate using time series techniques. After identification and model selection, the same specification can be adopted in the two cases, in an eclectic model including real interest rate and productivity differentials, together with relative fiscal policy and net foreign asset positions. This…