Search results for "STATISTICS"
showing 10 items of 7671 documents
2013
Currently, a growing number of programs are becoming available in statistical software for multiple imputation of missing values. Among others, two algorithms are mainly implemented: Expectation Maximization (EM) and Multiple Imputation by Chained Equations (MICE). They have been shown to work well in large samples or when only small proportions of missing data are to be imputed. However, some researchers have begun to impute large proportions of missing data or to apply the method to small samples. A simulation was performed using MICE on datasets with 50, 100 or 200 cases and four or eleven variables. A varying proportion of data (3% - 63%) was set as missing completely at random and subsequent…
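The "missing completely at random" step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' simulation code: the function name `mask_mcar`, the `seed` parameter, and the choice to blank an exact count of cells (rather than each cell independently) are assumptions made for illustration, and the imputation step itself is not shown.

```python
import numpy as np

def mask_mcar(data, proportion, seed=0):
    """Return a copy of `data` with a given proportion of cells set to NaN.

    Cells are chosen uniformly at random, independent of any values,
    which is what "missing completely at random" (MCAR) means.
    Hypothetical helper, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    masked = np.asarray(data, dtype=float).copy()
    n_missing = int(round(proportion * masked.size))
    # Draw distinct flat cell positions so exactly n_missing cells are blanked.
    flat_idx = rng.choice(masked.size, size=n_missing, replace=False)
    masked.flat[flat_idx] = np.nan
    return masked
```

For example, `mask_mcar(np.ones((50, 4)), 0.3)` blanks 60 of the 200 cells of a dataset with 50 cases and four variables.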
Cluster-based active learning for compact image classification
2010
In this paper, we consider active sampling to label pixels grouped with hierarchical clustering. The objective of the method is to match the data relationships discovered by the clustering algorithm with the user's desired class semantics. The first is represented as a complete tree to be pruned and the second is iteratively provided by the user. The proposed active learning algorithm searches for the pruning of the tree that best matches the labels of the sampled points. By choosing the part of the tree to sample from according to the current pruning's uncertainty, sampling is focused on the most uncertain clusters. This way, large clusters for which the class membership is already fixed are no longer…
Structural difficulty in grammatical evolution versus genetic programming
2013
Genetic programming (GP) has problems with structural difficulty as it is unable to search effectively for solutions requiring very full or very narrow trees. As a result of structural difficulty, GP has a bias towards narrow trees which means it searches effectively for solutions requiring narrow trees. This paper focuses on the structural difficulty of grammatical evolution (GE). In contrast to GP, GE works on variable-length binary strings and uses a grammar in Backus-Naur Form (BNF) to map linear genotypes to phenotype trees. The paper studies whether and how GE is affected by structural difficulty. For the analysis, we perform random walks through the search space and compare the struc…
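The genotype-to-phenotype mapping that this abstract attributes to GE can be sketched as follows. This is a minimal illustration of the commonly described scheme (each codon, taken modulo the number of available productions, selects a rule for the leftmost nonterminal), not the paper's implementation. The toy grammar, the function name `map_genotype`, the `max_wraps` limit, and the use of integer codons (assumed to be already decoded from the binary string) are all assumptions.

```python
# Hypothetical toy grammar in BNF-like form; not taken from the paper.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["y"]],
}

def map_genotype(codons, start="<expr>", max_wraps=2):
    """Map integer codons to a sentence via leftmost derivation.

    Returns None if the derivation is still incomplete after the codon
    list has been reused `max_wraps` times (an invalid individual).
    """
    symbols = [start]   # pending symbols, leftmost first
    output = []         # terminals emitted so far
    used = 0            # number of codons consumed
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:
            output.append(sym)  # terminal symbol: emit it
            continue
        if used >= limit:
            return None
        rules = GRAMMAR[sym]
        # The codon value modulo the rule count picks the production.
        choice = rules[codons[used % len(codons)] % len(rules)]
        used += 1
        symbols = list(choice) + symbols
    return " ".join(output)
```

With this grammar, `map_genotype([0, 1, 0, 1, 1, 1])` yields `"x * y"`, while `map_genotype([0])` keeps expanding `<expr>` and returns `None`.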
Putting molecules in their place.
2014
Each class of microscope is limited to imaging specific aspects of cell structure and/or molecular organization. However, imaging the specimen by complementary microscopes and correlating the data can overcome this limitation. Whilst not a new approach, the field of correlative imaging is currently benefitting from the emergence of new microscope techniques. Here we describe the correlation of cryogenic fluorescence tomography (CFT) with soft X‐ray tomography (SXT). This amalgamation of techniques integrates 3D molecular localization data (CFT) with a high‐resolution, 3D reconstruction of the cell (SXT). Cells are imaged in both modalities in a near‐native, cryopreserved state. Here we…
Bayesian hierarchical models for analysing the spatial distribution of bioclimatic indices
2017
A methodological approach for modelling the spatial distribution of bioclimatic indices is proposed in this paper. The value of the bioclimatic index is modelled with a hierarchical Bayesian model that incorporates both structured and unstructured random effects. Selection of prior distributions is also discussed in order to better incorporate any possible prior knowledge about the parameters that could refer to the particular characteristics of bioclimatic indices. MCMC methods and distributed programming are used to obtain an approximation of the posterior distribution of the parameters and also the posterior predictive distribution of the indices. One main outcome of the proposal is the …
Mapping and determinism of soil microbial community distribution across an agricultural landscape.
2015
Despite the relevance of landscape, regarding the spatial patterning of microbial communities and the relative influence of environmental parameters versus human activities, few investigations have been conducted at this scale. Here, we used a systematic grid to characterize the distribution of soil microbial communities at 278 sites across a monitored agricultural landscape of 13 km². Molecular microbial biomass was estimated by soil DNA recovery and bacterial diversity by 16S rRNA gene pyrosequencing. Geostatistics provided the first maps of microbial community at this scale and revealed a heterogeneous but spatially structured distribution…
Measurement of lean body mass using bioelectrical impedance analysis: a consideration of the pros and cons
2017
The assessment of body composition has important applications in the evaluation of nutritional status and estimating potential health risks. Bioelectrical impedance analysis (BIA) is a valid method for the assessment of body composition. BIA is an alternative to more invasive and expensive methods like dual-energy X-ray absorptiometry, computerized tomography, and magnetic resonance imaging. Bioelectrical impedance analysis is an easy-to-use and low-cost method for the estimation of fat-free mass (FFM) in physiological and pathological conditions. The reliability of BIA measurements is influenced by various factors related to the instrument itself, including electrodes, operator, subject, a…
Biological indices applied to benthic macroinvertebrates at reference conditions of mountain streams in two ecoregions (Poland, the Slovak Republic)
2013
The study was carried out from 2007 to 2010 in two ecoregions: the Carpathians and the Central Highlands. The objectives of our survey were to test the existing biological index metric based on benthic macroinvertebrates at reference conditions in the high- and mid-altitude mountain streams of two ecoregions according to the requirements of the EU WFD and to determine which environmental factors influence the distribution of benthic macroinvertebrates. Our results revealed statistically significant differences in the values of the physical and chemical parameters of water as well as the mean values of metrics between the types of streams at the sampling sites. RDA analysis showed that the t…
Reproducing kernel Hilbert spaces regression methods for genomic assisted prediction of quantitative traits.
2008
Reproducing kernel Hilbert spaces regression procedures for prediction of total genetic value for quantitative traits, which make use of phenotypic and genomic data simultaneously, are discussed from a theoretical perspective. It is argued that a nonparametric treatment may be needed for capturing the multiple and complex interactions potentially arising in whole-genome models, i.e., those based on thousands of single-nucleotide polymorphism (SNP) markers. After a review of reproducing kernel Hilbert spaces regression, it is shown that the statistical specification admits a standard mixed-effects linear model representation, with smoothing parameters treated as variance components.…
Assessment of Granger causality by nonlinear model identification: application to short-term cardiovascular variability.
2007
A method for assessing Granger causal relationships in bivariate time series, based on nonlinear autoregressive (NAR) and nonlinear autoregressive exogenous (NARX) models is presented. The method evaluates bilateral interactions between two time series by quantifying the predictability improvement (PI) of the output time series when the dynamics associated with the input time series are included, i.e., moving from NAR to NARX prediction. The NARX model identification was performed by the optimal parameter search (OPS) algorithm, and its results were compared to the least-squares method to determine the most appropriate method to be used for experimental data. The statistical significance of…
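The idea of quantifying a predictability improvement when the input series is added to the model can be sketched as below. Note the substitutions: this uses linear AR and ARX models fitted by ordinary least squares as a stand-in for the nonlinear NAR/NARX models and the OPS algorithm named in the abstract. The function names, the model order, and the definition of PI as the relative reduction in in-sample mean squared error are all assumptions, not the paper's definitions.

```python
import numpy as np

def lagged(series, order):
    """Columns are the series delayed by 1..order samples, aligned with series[order:]."""
    n = len(series)
    return np.column_stack([series[order - k : n - k] for k in range(1, order + 1)])

def predictability_improvement(x, y, order=2):
    """Relative MSE reduction in predicting y when past values of x are added.

    Linear stand-in: AR model (past y only) versus ARX model (past y and
    past x), both fitted by least squares on the same samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[order:]
    intercept = np.ones((len(target), 1))
    design_ar = np.hstack([intercept, lagged(y, order)])
    design_arx = np.hstack([design_ar, lagged(x, order)])

    def mse(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.mean((target - design @ coef) ** 2)

    mse_ar = mse(design_ar)
    mse_arx = mse(design_arx)
    return (mse_ar - mse_arx) / mse_ar
```

Calling the function in both directions, `predictability_improvement(x, y)` and `predictability_improvement(y, x)`, gives a rough picture of bilateral interaction. Because the fit is in-sample, the value is never meaningfully negative, so a significance test such as the one the abstract goes on to mention would still be needed.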