Search results for "Simulation and Modeling"
showing 10 items of 38 documents
Efficient estimation of generalized linear latent variable models.
2019
Generalized linear latent variable models (GLLVMs) are popular tools for modeling multivariate, correlated responses. Such data are often encountered, for instance, in ecological studies, where presence-absences, counts, or biomass of interacting species are collected from a set of sites. Until very recently, the main challenge in fitting GLLVMs has been the lack of computationally efficient estimation methods. For likelihood-based estimation, several closed-form approximations for the marginal likelihood of GLLVMs have been proposed, but their efficient implementations have been lacking in the literature. To fill this gap, we show in this paper how to obtain computationally convenient estim…
Measuring spectrally-resolved information transfer.
2020
Information transfer, measured by transfer entropy, is a key component of distributed computation. It is therefore important to understand the pattern of information transfer in order to unravel the distributed computational algorithms of a system. Since in many natural systems distributed computation is thought to rely on rhythmic processes, a frequency-resolved measure of information transfer is highly desirable. Here, we present a novel algorithm, and its efficient implementation, to separately identify the frequencies sending and receiving information in a network. Our approach relies on the invertible maximum overlap discrete wavelet transform (MODWT) for the creation of surrogate data in t…
Objective Assessment of Nuclear and Cortical Cataracts through Scheimpflug Images: Agreement with the LOCS III Scale.
2016
Purpose: To assess nuclear and cortical opacities through the objective analysis of Scheimpflug images, and to check the correlation with the Lens Opacity Classification System III (LOCS III). Methods: Nuclear and cortical opacities were graded according to the LOCS III rules after pupil dilation. The maximum and average pixel intensity values along an elliptical mask within the lens nucleus were taken to analyse nuclear cataracts. A new metric based on the percentage of opaque pixels within a region of interest was used to analyse cortical cataracts. The percentage of opaque pixels was also calculated for half, third and quarter areas from the region of interest’s periphery. Results: The maxi…
Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling
2016
Clinical cohorts with time-to-event endpoints are increasingly characterized by measurements of a number of single nucleotide polymorphisms (SNPs) that is an order of magnitude larger than the number of measurements typically considered at the gene level. At the same time, the size of clinical cohorts is often still limited, calling for novel analysis strategies for identifying potentially prognostic SNPs that can help to better characterize disease processes. We propose such a strategy, drawing on univariate testing ideas from epidemiological case-control studies on the one hand, and multivariable regression techniques as developed for gene expression data on the other hand. In particular, we focus on …
Invasion Ability and Disease Dynamics of Environmentally Growing Opportunistic Pathogens under Outside-Host Competition
2014
Most theories of the evolution of virulence concentrate on obligatory host-pathogen relationships. Yet, many pathogens replicate in the outside-host environment, where they compete with non-pathogenic forms. Thus, replication and competition in the outside-host environment may have a profound influence on the evolution of virulence and disease dynamics. These environmentally growing opportunistic pathogens are also a logical step towards obligatory pathogenicity. Efficient treatment methods against these diseases, such as columnaris disease in fishes, are lacking because of their opportunistic nature. We present a novel epidemiological model in which replication and competition in the outside-hos…
Big Data in metagenomics: Apache Spark vs MPI.
2020
The progress of next-generation sequencing has led to the availability of massive data sets used by a wide range of applications in biology and medicine. This has sparked significant interest in using modern Big Data technologies to process this large amount of information in distributed-memory clusters of commodity hardware. Several approaches based on solutions such as Apache Hadoop or Apache Spark have been proposed. These solutions allow developers to focus on the problem while ignoring low-level details, such as data distribution schemes or communication patterns among processing nodes. However, performance and scalability are also of high importance when…
Kernel manifold alignment for domain adaptation
2016
The wealth of sensory data coming from different modalities has opened numerous opportunities for data analysis. The data are of increasing volume, complexity and dimensionality, thus calling for new methodological innovations towards multimodal data processing. However, multimodal architectures must rely on models able to adapt to changes in the data distribution. Differences in the density functions can be due to changes in acquisition conditions (pose, illumination), sensor characteristics (number of channels, resolution) or different views (e.g. street level vs. aerial views of the same building). We call these different acquisition modes domains, and refer to the adaptation proble…
Estimation of confidence limits for descriptive indexes derived from autoregressive analysis of time series: Methods and application to heart rate va…
2017
The growing interest in personalized medicine requires making inferences from descriptive indexes estimated from individual recordings of physiological signals, with statistical analyses focused on individual differences between/within subjects, rather than comparing supposedly homogeneous cohorts. To this end, methods to compute confidence limits of individual estimates of descriptive indexes are needed. This study introduces numerical methods to compute such confidence limits and perform statistical comparisons between indexes derived from autoregressive (AR) modeling of individual time series. Analytical approaches are generally not viable, because the indexes are usually nonlinear funct…
Least-squares community extraction in feature-rich networks using similarity data
2021
We explore a doubly greedy approach to the issue of community detection in feature-rich networks. According to this approach, both the network and feature data are straightforwardly recovered from the underlying unknown non-overlapping communities, each supplied with a center in the feature space and intensity weight(s) over the network. Our least-squares additive criterion allows us to search for communities one by one and to find each community by adding entities one by one. A focus of this paper is that the feature-space data part is converted into a similarity matrix format. The similarity/link values can be used in either of two modes: (a) as measured in the same scale so that one may …
A Bayesian unified framework for risk estimation and cluster identification in small area health data analysis.
2020
Many statistical models have been proposed to analyse small area disease data with the aim of describing spatial variation in disease risk. In this paper, we propose a Bayesian hierarchical model that simultaneously allows for risk estimation and cluster identification. Our model formulation assumes that there is an unknown number of risk classes and small areas are assigned to a risk class by means of independent allocation variables. Therefore, areas within each cluster are assumed to share a common risk but they may be geographically separated. The posterior distribution of the parameter representing the number of risk classes is estimated using a novel procedure that combines its prior …