Search results for "PROBABILITY"
showing 10 items of 3417 documents
ADME Prediction with KNIME: Development and Validation of a Publicly Available Workflow for the Prediction of Human Oral Bioavailability.
2020
In silico prediction of human oral bioavailability is a relevant tool for selecting potential drug candidates and for rejecting molecules with a lower probability of success during the early stages of drug discovery and development. However, the high variability and complexity of oral bioavailability and the limited experimental data in the public domain have mainly restricted the development of reliable in silico models to predict this property from the chemical structure. In this study we present a KNIME automated workflow to predict human oral bioavailability of new drug and drug-like molecules based on five machine learning approaches combined into an ensemble model. Th…
Register data in sample allocations for small-area estimation
2018
Inadequate control of sample sizes in surveys using stratified sampling and area estimation may occur when the overall sample size is small or auxiliary information is insufficiently used. Very small sample sizes are possible for some areas. The proposed allocation based on multi-objective optimization uses a small-area model and estimation method and semi-annually collected empirical data. The assessment of its performance at the area and at the population levels is based on design-based sample simulations. Five previously developed allocations serve as references. The model-based estimator is more accurate than the design-based Horvitz–Thompson estimator and t…
Computational issues in fitting joint frailty models for recurrent events with an associated terminal event.
2020
Background and objective: Joint frailty regression models are intended for the analysis of recurrent event times in the presence of informative drop-outs. They have been proposed for clinical trials to estimate the effect of a treatment on the rate of recurrent heart failure hospitalisations in the presence of drop-outs due to cardiovascular death. Whereas an R software package for fitting joint frailty models is available, some technical issues have to be solved in order to use SAS® software, which is required in the regulatory environment of clinical trials. Methods: First, we demonstrate how to solve these issues by deriving proper likelihood-decompositions, in particular fo…
Adaptive Population Importance Samplers: A General Perspective
2016
Importance sampling (IS) is a well-known Monte Carlo method, widely used to approximate a distribution of interest using a random measure composed of a set of weighted samples generated from another proposal density. Since the performance of the algorithm depends on the mismatch between the target and the proposal densities, a set of proposals is often iteratively adapted in order to reduce the variance of the resulting estimator. In this paper, we review several well-known adaptive population importance samplers, providing a unified common framework and classifying them according to the nature of their estimation and adaptive procedures. Furthermore, we interpret the underlying motivation …
Group Metropolis Sampling
2017
Monte Carlo (MC) methods are widely used for Bayesian inference and optimization in statistics, signal processing and machine learning. Two well-known classes of MC methods are the Importance Sampling (IS) techniques and the Markov Chain Monte Carlo (MCMC) algorithms. In this work, we introduce the Group Importance Sampling (GIS) framework where different sets of weighted samples are properly summarized with one summary particle and one summary weight. GIS facilitates the design of novel efficient MC techniques. For instance, we present the Group Metropolis Sampling (GMS) algorithm which produces a Markov chain of sets of weighted samples. GMS in general outperforms other multiple try schemes…
Recycling Gibbs sampling
2017
Gibbs sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively used in signal processing, machine learning and statistics. The key point for the successful application of the Gibbs sampler is the ability to draw samples from the full-conditional probability density functions efficiently. In the general case this is not possible, so in order to speed up the convergence of the chain, it is required to generate auxiliary samples. However, such intermediate information is finally disregarded. In this work, we show that these auxiliary samples can be recycled within the Gibbs estimators, improving their efficiency with no extra cost. Theoretical and exhaustive numerical co…
Additive noise and multiplicative bias as disclosure limitation techniques for continuous microdata: A simulation study
2004
This paper focuses on a combination of two disclosure limitation techniques, additive noise and multiplicative bias, and studies their efficacy in protecting the confidentiality of continuous microdata. A Bayesian intruder model is extensively simulated in order to assess the performance of these disclosure limitation techniques as a function of key parameters such as the variability amongst profiles in the original data, the amount of the user's prior information, and the amount of bias and noise introduced in the data. The results of the simulation offer insight into the degree of vulnerability of data on continuous random variables and suggest some guidelines for effective protection measures.
Error mitigation using RaptorQ codes in an experimental indoor free space optical link under the influence of turbulence
2015
In free space optical (FSO) communications, several factors can strongly affect the link quality. Among them, optical turbulence is one of the most important impairments that can degrade FSO link quality and reliability, even under clear-sky conditions. In this work, the authors investigate the generation of both weak and moderate turbulence regimes in an indoor environment to assess the FSO link quality. In particular, they show that, due to the presence of the turbulence, the link experiences both erasure errors and packet losses during transmission, and also compare the experimental statistical distribution of samples with the predicted Gamma-Gamma model. Furthermore,…
Path Integral approach via Laplace’s method of integration for nonstationary response of nonlinear systems
2019
In this paper the nonstationary response of a class of nonlinear systems subject to broad-band stochastic excitations is examined. A version of the Path Integral (PI) approach is developed for determining the evolution of the response probability density function (PDF). Specifically, the PI approach, utilized for evaluating the response PDF in short time steps based on the Chapman–Kolmogorov equation, is here employed in conjunction with Laplace’s method of integration. In this manner, an approximate analytical solution of the integral involved in this equation is obtained, thus circumventing the repetitive integrations generally required in the conventional numerical implementation of …
CovSel
2018
Ensemble methods combine the predictions of a set of models to reach a better prediction quality than a single model's prediction. The ensemble process consists of three steps: 1) the generation phase, where the models are created, 2) the selection phase, where a set of possible ensembles is composed and one is selected by a selection method, and 3) the fusion phase, where the individual models' predictions of the selected ensemble are combined into an ensemble estimate. This paper proposes CovSel, a selection approach for regression problems that ranks ensembles based on the coverage of adequately estimated training points and selects the ensemble with the highest coverage to be used in th…