Search results for "PROBABILITY"
showing 10 items of 3417 documents
Conditional Bias Robust Estimation of the Total of Curve Data by Sampling in a Finite Population: An Illustration on Electricity Load Curves
2020
For marketing or power grid management purposes, many studies based on the analysis of total electricity consumption curves of groups of customers are now carried out by electricity companies. Aggregated totals or mean load curves are estimated using individual curves measured at fine time grid and collected according to some sampling design. Due to the skewness of the distribution of electricity consumptions, these samples often contain outlying curves which may have an important impact on the usual estimation procedures. We introduce several robust estimators of the total consumption curve which are not sensitive to such outlying curves. These estimators are based on the conditio…
Asymptotic and bootstrap tests for subspace dimension
2022
Most linear dimension reduction methods proposed in the literature can be formulated using an appropriate pair of scatter matrices, see e.g. Ye and Weiss (2003), Tyler et al. (2009), Bura and Yang (2011), Liski et al. (2014) and Luo and Li (2016). The eigen-decomposition of one scatter matrix with respect to another is then often used to determine the dimension of the signal subspace and to separate signal and noise parts of the data. Three popular dimension reduction methods, namely principal component analysis (PCA), fourth order blind identification (FOBI) and sliced inverse regression (SIR) are considered in detail and the first two moments of subsets of the eigenvalues are used to test…
A multi-scale area-interaction model for spatio-temporal point patterns
2018
Models for fitting spatio-temporal point processes should incorporate spatio-temporal inhomogeneity and allow for different types of interaction between points (clustering or regularity). This paper proposes an extension of the spatial multi-scale area-interaction model to a spatio-temporal framework. This model allows for interaction between points at different spatio-temporal scales and the inclusion of covariates. We fit the proposed model to varicella cases registered during 2013 in Valencia, Spain. The fitted model indicates small scale clustering and regularity for higher spatio-temporal scales.
Imputation Procedures in Surveys Using Nonparametric and Machine Learning Methods: An Empirical Comparison
2020
Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse, nonparametric and machine learning procedures may thus provide a useful alternative to traditional imputation procedures for deriving a set of imputed values used next for the estimation of study parameters defined as solution of population estimating equation. In this paper, we conduct an extensive empirical investigation that compares a number of imputation procedures in terms of bias and efficiency in a wide variety of settings, including high-dimens…
An ensemble approach to short-term forecast of COVID-19 intensive care occupancy in Italian Regions
2020
The availability of intensive care beds during the COVID‐19 epidemic is crucial to guarantee the best possible treatment to severely affected patients. In this work we show a simple strategy for short‐term prediction of COVID‐19 intensive care unit (ICU) beds, that has proved very effective during the Italian outbreak in February to May 2020. Our approach is based on an optimal ensemble of two simple methods: a generalized linear mixed regression model, which pools information over different areas, and an area‐specific nonstationary integer autoregressive methodology. Optimal weights are estimated using a leave‐last‐out rationale. The approach has been set up and validated during t…
A New Nonparametric Estimate of the Risk-Neutral Density with Applications to Variance Swaps
2021
We develop a new nonparametric approach for estimating the risk-neutral density of asset prices and reformulate its estimation into a double-constrained optimization problem. We evaluate our approach using the S&P 500 market option prices from 1996 to 2015. A comprehensive cross-validation study shows that our approach outperforms the existing nonparametric quartic B-spline and cubic spline methods, as well as the parametric method based on the Normal Inverse Gaussian distribution. As an application, we use the proposed density estimator to price long-term variance swaps, and the model-implied prices match reasonably well with those of the variance future downloaded from the CBOE websi…
KFAS: Exponential Family State Space Models in R
2017
State space modelling is an efficient and flexible method for statistical inference of a broad class of time series and other data. This paper describes an R package KFAS for state space modelling with the observations from an exponential family, namely Gaussian, Poisson, binomial, negative binomial and gamma distributions. After introducing the basic theory behind Gaussian and non-Gaussian state space models, an illustrative example of Poisson time series forecasting is provided. Finally, a comparison to alternative R packages suitable for non-Gaussian time series modelling is presented.
Community characterization of heterogeneous complex systems
2011
We introduce an analytical statistical method to characterize the communities detected in heterogeneous complex systems. By posing a suitable null hypothesis, our method makes use of the hypergeometric distribution to assess the probability that a given property is over-expressed in the elements of a community with respect to all the elements of the investigated set. We apply our method to two specific complex networks, namely a network of world movies and a network of physics preprints. The characterization of the elements and of the communities is done in terms of languages and countries for the movie network and of journals and subject categories for papers. We find that our method is ab…
Alignment-free Genomic Analysis via a Big Data Spark Platform
2021
Motivation: Alignment-free distance and similarity functions (AF functions, for short) are a well-established alternative to pairwise and multiple sequence alignments for many genomic, metagenomic and epigenomic tasks. Due to data-intensive applications, the computation of AF functions is a Big Data problem, with the recent literature indicating that the development of fast and scalable algorithms computing AF functions is a high-priority task. Somewhat surprisingly, despite the increasing popularity of Big Data technologies in computational biology, the development of a Big Data platform for those tasks has not been pursued, possibly due to its complexity. Results: We fill this impo…
A Unified SVM Framework for Signal Estimation
2013
This paper presents a unified framework to tackle estimation problems in Digital Signal Processing (DSP) using Support Vector Machines (SVMs). The use of SVMs in estimation problems has been traditionally limited to its mere use as a black-box model. Noting such limitations in the literature, we take advantage of several properties of Mercer's kernels and functional analysis to develop a family of SVM methods for estimation in DSP. Three types of signal model equations are analyzed. First, when a specific time-signal structure is assumed to model the underlying system that generated the data, the linear signal model (so called Primal Signal Model formulation) is first stated and analyzed. T…