Search results for "Application"
showing 10 items of 5559 documents
Simulation-based marginal likelihood for cluster strong lensing cosmology
2015
Comparisons between observed and predicted strong lensing properties of galaxy clusters have been routinely used to claim either tension or consistency with $\Lambda$CDM cosmology. However, standard approaches to such cosmological tests are unable to quantify the preference for one cosmology over another. We advocate approximating the relevant Bayes factor using a marginal likelihood that is based on the following summary statistic: the posterior probability distribution function for the parameters of the scaling relation between Einstein radii and cluster mass, $\alpha$ and $\beta$. We demonstrate, for the first time, a method of estimating the marginal likelihood using the X-ray selected …
Sensitivity versus block sensitivity of Boolean functions
2010
Determining the maximal separation between sensitivity and block sensitivity of Boolean functions is of interest for computational complexity theory. We construct a sequence of Boolean functions with bs(f) = 1/2 s(f)^2 + 1/2 s(f). The best known separation previously was bs(f) = 1/2 s(f)^2 due to Rubinstein. We also report results of computer search for functions with at most 12 variables.
Quasi conjunction, quasi disjunction, t-norms and t-conorms: Probabilistic aspects
2013
We make a probabilistic analysis related to some inference rules which play an important role in nonmonotonic reasoning. In a coherence-based setting, we study the extensions of a probability assessment defined on $n$ conditional events to their quasi conjunction, and by exploiting duality, to their quasi disjunction. The lower and upper bounds coincide with some well known t-norms and t-conorms: minimum, product, Lukasiewicz, and Hamacher t-norms and their dual t-conorms. On this basis we obtain Quasi And and Quasi Or rules. These are rules for which any finite family of conditional events p-entails the associated quasi conjunction and quasi disjunction. We examine some cases of logical de…
Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform
2012
Motivation The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results We first used simulated reads to explore the relationship between the level of compression and the error rate, the leng…
The FLUXCOM ensemble of global land-atmosphere energy fluxes
2019
Although a key driver of Earth’s climate system, global land-atmosphere energy fluxes are poorly constrained. Here we use machine learning to merge energy flux measurements from FLUXNET eddy covariance towers with remote sensing and meteorological data to estimate global gridded net radiation, latent and sensible heat and their uncertainties. The resulting FLUXCOM database comprises 147 products in two setups: (1) 0.0833° resolution using MODIS remote sensing data (RS) and (2) 0.5° resolution using remote sensing and meteorological data (RS + METEO). Within each setup we use a full factorial design across machine learning methods, forcing datasets and energy balance closure corrections. For…
Isotonic regression for metallic microstructure data: estimation and testing under order restrictions
2021
Investigating the main determinants of the mechanical performance of metals is not a simple task. Already known physical inspired qualitative relations between 2D microstructure characteristics and 3D mechanical properties can act as the starting point of the investigation. Isotonic regression allows to take into account ordering relations and leads to more efficient and accurate results when the underlying assumptions actually hold. The main goal in this paper is to test order relations in a model inspired by a materials science application. The statistical estimation procedure is described considering three different scenarios according to the knowledge of the variances: known variance ra…
Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R
2019
Sequence analysis is being more and more widely used for the analysis of social sequences and other multivariate categorical time series data. However, it is often complex to describe, visualize, and compare large sequence data, especially when there are multiple parallel sequences per subject. Hidden (latent) Markov models (HMMs) are able to detect underlying latent structures and they can be used in various longitudinal settings: to account for measurement error, to detect unobservable states, or to compress information across several types of observations. Extending to mixture hidden Markov models (MHMMs) allows clustering data into homogeneous subsets, with or without external covariate…
Nowcasting COVID‐19 incidence indicators during the Italian first outbreak
2020
A novel parametric regression model is proposed to fit incidence data typically collected during epidemics. The proposal is motivated by real-time monitoring and short-term forecasting of the main epidemiological indicators within the first outbreak of COVID-19 in Italy. Accurate short-term predictions, including the potential effect of exogenous or external variables are provided. This ensures to accurately predict important characteristics of the epidemic (e.g., peak time and height), allowing for a better allocation of health resources over time. Parameter estimation is carried out in a maximum likelihood framework. All computational details required to reproduce the approach and replica…
Conditional Bias Robust Estimation of the Total of Curve Data by Sampling in a Finite Population: An Illustration on Electricity Load Curves
2020
Abstract For marketing or power grid management purposes, many studies based on the analysis of total electricity consumption curves of groups of customers are now carried out by electricity companies. Aggregated totals or mean load curves are estimated using individual curves measured at fine time grid and collected according to some sampling design. Due to the skewness of the distribution of electricity consumptions, these samples often contain outlying curves which may have an important impact on the usual estimation procedures. We introduce several robust estimators of the total consumption curve which are not sensitive to such outlying curves. These estimators are based on the conditio…
Imputation Procedures in Surveys Using Nonparametric and Machine Learning Methods: An Empirical Comparison
2020
Abstract Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse, nonparametric and machine learning procedures may thus provide a useful alternative to traditional imputation procedures for deriving a set of imputed values used next for the estimation of study parameters defined as solution of population estimating equation. In this paper, we conduct an extensive empirical investigation that compares a number of imputation procedures in terms of bias and efficiency in a wide variety of settings, including high-dimens…