AUTHOR

Antonino Abbruzzo

Graphical models for estimating network determinants of multi-destination trips in Sicily

This paper proposes a two-step approach for analysing the main determinants of multi-destination trip behaviour. It is based on a combination of graphical models and a multinomial logistic regression model; the aim is to analyse the direct and indirect effects of a wide set of tourist- and trip-related characteristics on multi-destination trip behaviour. Empirical data have been derived from a frontier survey of approximately 4000 incoming tourists in Sicily (Italy) at the end of their trip. Results suggest that multi-destination trips depend directly on the length of stay, the number of previous visits and motivation for the trip, and only indirectly on the interview month, travel …

research product

Determinants of individual tourist expenditure as a network: Empirical findings from Uruguay

This paper introduces the use of graphical models for assessing the determinants of individual tourist spending. These models have the advantage of synthesizing and visualizing the relationships occurring within large sets of random variables through an easy-to-interpret output. To this end, individual data from a large official survey of international tourists in Uruguay are used. Symmetric conditional independence structures are first investigated. Then subgraphs of each expenditure item's neighbourhood are extracted in order to assess the impact of main effects and interactions through proportional ordinal logistic regression. Results highlight the marginal role of socio-demogr…

research product

Determinants of spatial intensity of stop locations on cruise passengers tracking data

This paper analyzes the spatial intensity in the distribution of stop locations of cruise passengers during their visit at the destination through a stochastic point process modelling approach on a linear network. Data collected through the integration of GPS tracking technology and a questionnaire-based survey of cruise passengers visiting the city of Palermo are used to identify the main determinants that characterize their stop location patterns. The spatial intensity of stop locations is estimated through a Gibbs point process model, taking into account individual-related variables, contextual-level information, and spatial interaction among stop points. The Berman…

research product

Model selection procedure for mixture hidden Markov models

This paper proposes a model selection procedure to identify the number of clusters and hidden states in discrete Mixture Hidden Markov models (MHMMs). The model selection is based on a step-wise approach that uses information criteria and an entropy criterion as scores. By means of a simulation study, we show that our procedure performs better than classical model selection methods in identifying the correct number of clusters and hidden states, or an approximation of them.

research product

Operational and financial performance of Italian airport companies: A dynamic graphical model

This paper provides evidence on the relationships within a set of financial and operational indicators for Italian airports over 2008–2014. The limited sample size of national and regional airports suggests applying the penalised RCON(V, E) model, which falls within the class of Gaussian graphical models. It provides both an estimate of the conditional independence structures of the variables and an easy way to visualise them. Moreover, it is particularly suitable for handling longitudinal data where a small number of units and a huge number of variables have been collected. Findings highlight that a qualified concept of size matters in determining good financial performance. Specifically, increasing…

research product

Dynamic factorial graphical models for dynamic networks

Dynamic network models describe a growing number of important scientific processes, from cell biology and epidemiology to sociology and finance. Estimating dynamic networks from noisy time series data is a difficult task since the number of components involved in the system is very large. As a result, the number of parameters to be estimated is typically larger than the number of observations. However, a characteristic of many real-life networks is that they are sparse. For example, the molecular structure of genes makes interactions with other components a highly structured and, therefore, sparse process. Penalized Gaussian graphical models have been used to estimate sparse networks. H…

research product

Socio-economic inequality, interregional mobility and mortality among cancer patients: A mediation analysis approach

This paper investigates the effect of socio-economic status on interregional mobility and mortality among cancer patients. The cohort under analysis comprises patients residing in Sicily (Italy), who were diagnosed with lung and colon cancer between 2010 and 2011. The data was collated from the hospital discharge records of the Sicilian Region and the Regional register of the causes of death, by considering all those patients for whom information relating to socio-economic status was available. First, graphical models were applied to highlight the multivariate structure of association among socio-economic status, interregional mobility and 3-year mortality. Secondly, mediation analysis quan…

research product

Hierarchical Bayesian models for analysing fish biomass data. An application to Parapenaeus longirostris biomass data

The Mediterranean International Trawl Survey (MEDITS) programme provides spatially referenced ecological data. We adopted a hierarchical Bayesian model to analyse Parapenaeus longirostris biomass data. The model comprises three parts, each of which identifies: the variability due to the explanatory variables, the variability due to the spatial domain (seen as a Gaussian Process) and the irregular component modelled as white noise. The estimated parameters show that some seabed characteristics affect biomass quantity and that the estimated behaviour of the Gaussian Process changes over different groups of years.

research product

Identification and modeling of stop activities at the destination from GPS tracking data

This article aims to analyse tourist behaviour at the destination, with a specific focus on the stops made by tourists at the destination. Data from GPS devices collected on a sample of cruise passengers are analysed; from these data it is possible to identify the stops at the destination by means of a suitable algorithm. The effect of socio-demographic and itinerary-related characteristics on the number of stops made is studied using Poisson regression models. The results are of interest both from a methodological point of view, relating to the analysis and synthesis of GPS data, and from the point of view…

research product

Inhomogeneous spatio-temporal point processes on linear networks for visitors’ stops data

We analyse the spatio-temporal distribution of visitors' stops at tourist attractions in Palermo (Italy) using the theory of stochastic point processes living on linear networks. We first propose an inhomogeneous Poisson point process model, with a separable parametric spatio-temporal first-order intensity. We account for the spatial interaction among points on the given network, fitting a Gibbs point process model with mixed effects for the purely spatial component. This allows us to study first-order and second-order properties of the point pattern, accounting both for the spatio-temporal clustering and interaction and for the spatio-temporal scale at which they operate. Due to the strong d…

research product

Dynamic Gaussian Graphical Models for Modelling Genomic Networks

After sequencing the entire DNA for various organisms, the challenge has become understanding the functional interrelatedness of the genome. Only by understanding the pathways for various complex diseases can we begin to make sense of any type of treatment. Unfortunately, deciphering the genomic network structure is an enormous task. Even with a small number of genes, the number of possible networks is very large. This problem becomes even more difficult when we consider dynamical networks. We consider the problem of estimating a sparse dynamic Gaussian graphical model with \(L_1\) penalized maximum likelihood of a structured precision matrix. The structure can consist of specific time dynami…

research product

Italian wines in the new world wine consumers countries: the case of the Russian market

Over the last few decades, the wine market has been affected by a deep structural transformation due to globalization and mounting international competition. In particular, wine demand has shifted geographically, with a fall in the traditional markets and an increase in new markets, including Russia and China. Russia is one of the largest wine markets in the world; knowing which quality attributes are appreciated by Russian consumers is therefore relevant for defining effective business and marketing strategies. A hedonic price model has been used in this work to estimate the implicit price of the main objective attributes of Italian wine sales in the Russia…

research product

Analysing the mediating role of a network: a Bayesian latent space approach

The use of network analysis for the investigation of social structures has recently seen a rise, due both to the high availability of data and to the numerous insights it can provide into different fields. Most analyses focus on the topological characteristics of networks and the estimation of relationships between the nodes. We adopt a different point of view, by considering the whole network as a random variable conveying the effect of an exposure on a response. This point of view represents a classical mediation setting, where the interest lies in the estimation of the indirect effect, that is, the effect propagated through the mediating variable. We introduce a latent space model mappin…

research product

Sparse model-based network inference using Gaussian graphical models

We consider the problem of estimating a sparse dynamic Gaussian graphical model with L1 penalized maximum likelihood of a structured precision matrix. The structure can consist of specific time dynamics, known presence or absence of links in the graphical model or equality constraints on the parameters. The model is defined on the basis of partial correlations, which results in a specific class of precision matrices. A priori, L1 penalized maximum likelihood estimation in this class is extremely difficult because of the above-mentioned constraints and the computational complexity of the L1 constraint alongside the usual positive-definite constraint. The implementation is non-trivial, but we sh…

research product

Hydrological post-processing based on approximate Bayesian computation (ABC)

This study introduces a method to quantify the conditional predictive uncertainty in hydrological post-processing contexts when it is cumbersome to calculate the likelihood (intractable likelihood). Sometimes it can be difficult to calculate the likelihood itself in hydrological modelling, especially when working with complex models or with ungauged catchments. Therefore, we propose the ABC post-processor, which replaces the requirement of calculating the likelihood function with the use of some sufficient summary statistics and synthetic datasets. The aim is to show that the conditional predictive distribution is qualitatively similar whether produced by the exact predictive (MCMC post-processor) or …

research product

Scad-elastic net and the estimation of individual tourism expenditure determinants

This paper introduces the use of scad-elastic net in the assessment of the determinants of individual tourist spending. This technique addresses two estimation-related issues of primary importance. So far, studies in the tourism literature have made wide use of classic regressions, whose results might be affected by multicollinearity. In addition, because of the absence of robust economic theory on tourism behavior, regressor selection is often left to the researcher's choice when not driven by non-optimal automatic criteria. Scad-elastic net is an OLS model that accounts for both these problems by including two types of parameter constraints, namely the smoothly clipped absolute deviation …

research product

An innovative way to highlight the power of each polymorphism on elite athletes phenotype expression

The purpose of this study was to determine the probability of soccer players having the best genetic background that could increase performance, evaluating the polymorphisms that are considered Performance Enhancing Polymorphisms (PEPs) distributed on five genes: PPAR alpha, PPARGC1A, NRF2, ACE and CKMM. In particular, we investigated how each polymorphism works directly or through another polymorphism to distinguish elite athletes from the non-athletic population. Sixty professional soccer players (age 22.5 +/- 2.2) and sixty healthy volunteers (age 21.2 +/- 2.3) were enrolled. Samples of venous blood were used to prepare genomic DNA. The polymorphic sites were scanned using PCR-RFLP protocols with …

research product

Socio-economic deprivation and COVID-19 infection: a Bayesian spatial modelling approach

This article aims to analyse the effect of socio-economic deprivation on COVID-19 incidence at the sub-municipal level. Thanks to the availability of information on monthly COVID-19 incidence rates at the census-tract level for the two municipalities of Palermo and Catania (Italy), the use of a Bayesian spatial model with a zero-inflated binomial distribution is proposed. The results show an association between levels of deprivation and COVID-19 incidence in the two municipalities, controlling for the spatial structure of the areal units considered. In light of the results, health policy actions are needed, focusing interventions…

research product

L1-Penalized Censored Gaussian Graphical Model

Graphical lasso is one of the most widely used estimators for inferring genetic networks. Despite its diffusion, there are several fields in applied research where the limits of detection of modern measurement technologies make the use of this estimator theoretically unfounded, even when the assumption of a multivariate Gaussian distribution is satisfied. Typical examples are data generated by polymerase chain reactions and flow cytometers. The combination of censoring and high-dimensionality makes inference of the underlying genetic networks from these data very challenging. In this article, we propose an $\ell_1$-penalized Gaussian graphical model for censored data and derive two EM-like algorithm…

research product

Probiotics, prebiotics and symbiotics in inflammatory bowel diseases: state-of-the-art and new insights

Inflammatory bowel disease (IBD) consists of two distinct clinical forms, ulcerative colitis (UC) and Crohn's disease (CD), with unknown aetiology, which nevertheless are considered to share almost identical pathophysiological backgrounds. To date, a fully coherent mechanistic explanation for IBD is still lacking, but it is becoming clear that the pathogenesis of IBD involves four fundamental components: the environment, gut microbiota, the immune system and the genome. As a consequence, IBD development might be due to an altered immune response and a disrupted mechanism of host tolerance to the non-pathogenic resident microbiota, leading to an elevated inflammatory response. Consideri…

research product

An innovative way to highlight the power of each polymorphism on the elite athletes phenotype expression

The purpose of this study was to determine the probability of soccer players having the best genetic background that could increase performance, evaluating the polymorphisms that are considered Performance Enhancing Polymorphisms (PEPs) distributed on five genes: PPARα, PPARGC1A, NRF2, ACE and CKMM. In particular, we investigated how each polymorphism works directly or through another polymorphism to distinguish elite athletes from the non-athletic population. Materials and Methods. Sixty professional soccer players (age 22.5 ± 2.2) and sixty healthy volunteers (age 21.2 ± 2.3) were enrolled. Samples of venous blood were used to prepare genomic DNA. The polymorphic sites were scanned using PCR-RFLP pr…

research product

Density-based Algorithm and Network Analysis for GPS Data

The spread of GPS positioning systems offers numerous opportunities for collecting movement data. GPS data present several elements of complexity, which also derive from their high temporal and spatial detail. Many aspects of this type of data can be examined. This study proposes a statistical approach based on the identification of points of attraction and on the study of networks. In particular, a cluster identification algorithm based on point density is proposed; the clusters are then synthesized in a network that summarizes individual behaviour. In a second step, the overall movements are aggregated…

research product

A computationally fast alternative to cross-validation in penalized Gaussian graphical models

We study the problem of selecting the regularization parameter in penalized Gaussian graphical models. When the goal is to obtain a model with good predictive power, cross-validation is the gold standard. We present a new estimator of Kullback-Leibler loss in Gaussian graphical models which provides a computationally fast alternative to cross-validation. The estimator is obtained by approximating leave-one-out cross-validation. Our approach is demonstrated on simulated data sets for various types of graphs. The proposed formula exhibits superior performance, especially in the typical small sample size scenario, compared to other available alternatives to cross-validation, such as Akaike's i…

research product

Spatio-temporal analysis of the Covid-19 spread in Italy by Bayesian hierarchical models

In this paper, we investigate the spatio-temporal spread pattern of the Covid-19 virus in Italy during the first wave of infections, from February to October 2020. We provide a disease mapping of the virus infections by using the Besag-York-Mollié model and its spatio-temporal extensions. Our results confirm the effectiveness of the lockdown action and show that, during the first wave, the virus spread with an inhomogeneous spatial trend and each province was characterised by a specific temporal trend, independent of the temporal evolution of the observed cases in the other provinces.

research product

Sea Surface Temperature Effects on the Mediterranean Marine Ecosystem: a Semiparametric Model Approach

Ocean warming is a worldwide phenomenon. The mean temperature of the catch (MTC) is becoming one of the leading indicators to assess the impact of sea surface temperature on fish communities. In this study, we apply a semiparametric regression approach to the MTC of the catches from the MEDITS bottom trawl program in the Strait of Sicily (Central Mediterranean Sea) for the period 1995 to 2018 to evaluate the effects of climate change on the continental shelf fish community. All covariates included in the model have a significant impact on the MTC level. Notably, the sea surface temperature (SST) effect on the MTC depends on depth, being positive near the surface and negative at the bottom.

research product

The premium price for Italian red wines in new world wine consuming countries: the case of the Russian market

Italian wine is increasingly appreciated in new world consumer countries and, in particular, in Russia where consumers associate its consumption with an Italian lifestyle. In this paper, market value for wine search attributes is measured through the estimation of a hedonic price model using online data from a Wine Searcher website and the information contained in the labels of wines marketed in Russia. Results show a premium price for wines from Piedmont and Tuscany, and in particular for non-native varieties and for Indicazione Geografica Tipica and Protected Geographical Indication wines. Additionally, vintage and higher alcohol content have a significant positive impact on the prices th…

research product

Spatio-Temporal Spread Pattern of COVID-19 in Italy

This paper investigates the spatio-temporal spread pattern of COVID-19 in Italy during the first wave of infections, from February to October 2020. Disease mappings of the virus infections are provided using the Besag–York–Mollié model and some spatio-temporal extensions. This modeling framework, which includes a temporal component, allows the study of the time evolution of the spread pattern among the 107 Italian provinces. The focus is on the effect of citizens’ mobility patterns, represented here by the three distinct phases of the Italian virus first wave, identified by the Italian government, also characterized by the lockdown period. Results show the effectiveness of the lockdo…

research product

Inferring slowly changing dynamic gene-regulatory networks

Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between the random variables. By interpreting the random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experi…

research product

Penalized linear discriminant analysis and Discrete AdaBoost to distinguish human hair metal profiles: The case of adolescents residing near Mt. Etna

The research focus of the present paper was twofold. First, we tried to document that human intake of trace elements is influenced by geological factors of the place of residence. Second, we showed that the elemental composition of human hair is a useful screening tool for assessing people's exposure to potentially toxic substances. For this purpose, we used samples of human hair from adolescents and applied two robust statistical approaches. Samples from two distinct geological and environmental sites were collected: the first one was characterized by the presence of the active volcano Mt. Etna (ETNA group) and the second one lithologically made up of sedimentary rocks (SIC group). Chemica…

research product

Identification of points of attraction and network analysis for GPS tracking data

Global Positioning System (GPS) data provide accurate information on units’ movements from both the temporal and the spatial perspective. Several aspects of these movements can be analyzed according to the aim of interest. In this study, we focus on statistical methods for the identification of points of interest and for the analysis of the network of movements for GPS data. A density-based cluster algorithm is applied to summarize the vast amount of information and to find the most relevant points of attraction. A directed network synthesizes the individual unit’s path by using the latter information. Finally, we aggregate the units’ paths in a weighted directed network which is studied through n…

research product

Selecting the tuning parameter in penalized Gaussian graphical models

Penalized inference of Gaussian graphical models is a way to assess the conditional independence structure in multivariate problems. In this setting, the conditional independence structure, corresponding to a graph, is related to the choice of the tuning parameter, which determines the model complexity or degrees of freedom. There has been little research on the degrees of freedom for penalized Gaussian graphical models. In this paper, we propose an estimator of the degrees of freedom in $\ell_1$-penalized Gaussian graphical models. Specifically, we derive an estimator inspired by the generalized information criterion and propose to use this estimator as the bias term for two informatio…

research product

Analysis of clickstream data with mixture hidden markov models

Clickstream data are an important source of information for e-commerce, although they are not easy to manage, and converting this information into a real competitive advantage is not a trivial task. In this article, we consider the application of mixture hidden Markov models to clickstream data extracted from the e-commerce portal of a tourism services company. Clusters relating to users' browsing behaviour and geographical location were identified, providing important indications for the development of new business strategies.

research product

Graphical models for estimating dynamic networks

Determining dynamic networks from data is an active area of research, particularly in systems biology. Estimating the structure of a network involves determining the presence or absence of a relationship between the vertices of the graph. Graphical models define these relationships via conditional dependence. In Gaussian graphical models (GGMs), the vertices are assumed to follow a normal distribution. This has major advantages because of the computational tractability of GGMs. Standard GGMs, however, cannot be used to study large networks, i.e. when the number of observations is smaller than the number of vertices of the graph. Recently…

research product

Inferring networks from high-dimensional data with mixed variables

We present two methodologies to deal with high-dimensional data with mixed variables: the strongly decomposable graphical model and the regression-type graphical model. The first model is used to infer conditional independence graphs. The latter model is applied to compute the relative importance or contribution of each predictor to the response variables. Recently, penalized likelihood approaches have also been proposed to estimate graph structures. In a simulation study, we compare the performance of the strongly decomposable graphical model and the graphical lasso in terms of graph recovery. Five different graph structures are used to simulate the data: the banded graph, the cluster gr…

research product

Inferring slowly-changing dynamic gene-regulatory networks

Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experi…

research product

Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks.

Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order (some entries of the precision matrix are a priori zeros) or equal dependency strengths across time lags (some entries of the precision matrix are assumed to be equal). The precision matrix is then estimated by $\ell_1$-penalized maximum likelihood, imposing a further constraint on the absolute value…

research product

A pre-processing and network analysis of GPS tracking data

Global Positioning System (GPS) devices afford the opportunity to collect accurate data on unit movements from temporal and spatial perspectives. With a special focus on GPS technology in travel surveys, this paper proposes: (1) two algorithms for the pre-processing of GPS data in order to deal with outlier identification and missing data imputation; (2) a clustering approach to recover the main points of interest from GPS trajectories; and (3) a weighted-directed network, which incorporates the most relevant characteristics of the GPS trajectories at an aggregate level. A simulation study shows the goodness-of-fit of the imputation data algorithm and the robustness of the clustering algori…

research product

Integrating functional traits into correlative species distribution models to investigate the vulnerability of marine human activities to climate change

Climate change, and particularly warming, is significantly impacting marine ecosystems and the services they provide. Temperature, as the main factor driving all biological processes, may influence ectotherm metabolism, thermal tolerance limits and species distribution patterns. The joint action of climate change and local stressors (including the increasing human use of the sea) may facilitate the spread of non-indigenous and native outbreak-forming species, leading to associated economic consequences for marine coastal economies. Marine aquaculture is one of the most economically relevant anthropogenic activities threatened by multiple stressors and in turn, by increasing hard artificial substrates …

research product

Spatio-Temporal Linear Network Point Processes for GPS Data Analysis

This work analyzes the spatio-temporal intensity in the distribution of stop locations of cruise passengers during their visit at the destination. Data are collected through the integration of GPS tracking technology and a questionnaire-based survey on a sample of cruise passengers visiting the city of Palermo (Italy), and they are used to identify the main determinants that characterize cruise passengers’ stop location patterns. The spatio-temporal distribution of visitors' stops is analysed by means of the theory of stochastic point processes occurring on linear networks, in order to consider the street configuration of the city and the location of the main attractions. First, an i…

research product

Asthma control, severity and lung function impairment through network analysis in children

Background: Achieving and maintaining asthma control in children is the primary goal recommended by current guidelines. Aim: To identify risk factors associated with asthma control and severity, as well as their relative weight. Methods: Within a consecutive series of outpatients visited in a three-year period at the IBIM pediatric clinic, we selected 128 persistent asthmatics. A standardized medical interview was carried out to collect information on environmental risk factors, symptoms and comorbidities. Spirometry was performed using Pony FX, Cosmed, Italy; spirometric values were expressed as %pred using the GLI-2012 equation. Statistical analyses were performed by using R. Results: The ide…

research product

Tourism Statistics

How a nation collects and shares data on tourism in the country is influenced by that country’s historical, cultural, and political background, as well as its geographical characteristics. This means there are a wide variety of ways tourism data are collected, evaluated, and disseminated. Although standard statistical mechanisms for tracking tourism have been improved, to be of greatest use, statistical sources should be further enhanced and be comparable across locations and time. For this reason, several international organizations have the responsibility of harmonizing definitions of, and methodologies for, collecting data. These organizations are also a relevant supplier of data sources…

research product

Spatial seismic point pattern analysis with Integrated Nested Laplace Approximation

This paper proposes the use of Integrated Nested Laplace Approximation (Rue et al., 2009) to describe the spatial displacement of earthquake data. Specifying a hierarchical structure of the data and parameters, an inhomogeneous log-Gaussian Cox process model is applied to describe seismic events that occurred in Greece, an area of seismic hazard. In this way, the dependence of the spatial point process on external covariates can be taken into account, as well as the interaction among points, through the estimation of the parameters of the covariance of the Gaussian random field, with a computationally efficient approach.

research product

Generalized information criterion for model selection in penalized graphical models

This paper introduces an estimator of the relative directed distance between an estimated model and the true model, based on the Kullback-Leibler divergence and motivated by the generalized information criterion proposed by Konishi and Kitagawa. This estimator can be used to select a model in penalized Gaussian copula graphical models. Direct use of this estimator is not feasible in high-dimensional cases; however, we derive an efficient way to compute it that is feasible for this class of problems. Moreover, this estimator is generally appropriate for several penalties, such as the lasso, the adaptive lasso and the smoothly clipped absolute deviation penalty. Simulations show that th…

research product

Estimating the Bayesian posterior distribution of indirect effects in causal longitudinal mediation analysis

Many research studies aim to unveil the causal mechanism underlying a particular phenomenon; mediation analysis is increasingly used for this purpose, and longitudinal data are particularly suited for mediation since they ensure the correct temporal order among variables and the time span allows the causal effects to unfold. This explains the rise of interest in the topic of longitudinal mediation analysis. Many approaches have been proposed to cope with longitudinal mediation (Fosen et al., 2005; Lin et al., 2017), including mixed-effect modelling. In a recent paper, Bind et al. (Biostatistics, 2016) made use of generalised mixed effect models and provided conditions for the identifiab…

research product

Bayesian causal mediation analysis through linear mixed-effect models

In mediational settings, the main focus is on the estimation of the indirect effect of an exposure on an outcome through a third variable called mediator. The traditional maximum likelihood estimation method presents several problems in the estimation of the standard error and the confidence interval of the indirect effect. In this paper, we propose a Bayesian approach to obtain the posterior distribution of the indirect effect through MCMC, in the context of mediational mixed models for longitudinal data. A simulation study shows that our method outperforms the traditional maximum likelihood approach in terms of bias and coverage rates.

research product

Factorial graphical models for dynamic networks

Abstract Dynamic network models describe many important scientific processes, from cell biology and epidemiology to sociology and finance. Estimating dynamic networks from noisy time series data is a difficult task since the number of components involved in the system is very large. As a result, the number of parameters to be estimated is typically larger than the number of observations. However, a characteristic of many real-life networks is that they are sparse. For example, the molecular structure of genes makes interactions with other components a highly structured and, therefore, sparse process. Until now, the literature has focused on static networks, which lack specific temporal inte…

research product

INFERRING GENE NETWORKS FROM MICROARRAY WITH GRAPHICAL MODELS

ABSTRACT. Microarray technology makes it possible to collect a large amount of genetic data, such as gene expression data. The activity of the genes is coordinated by a complex network that regulates their expression, controlling common functions such as the formation of a transcriptional complex or the availability of a signalling pathway. Understanding this organization is crucial to explain normal cell physiology as well as to analyse complex pathological phenotypes. Graphical models are a class of statistical models that can be used to infer gene regulatory networks. In this paper, we examine a class of graphical models: the strongly decomposable graphical models for mixed variables. Among oth- e…

research product

Networks as mediating variables: a Bayesian latent space approach

Abstract The use of network analysis to investigate social structures has recently seen a rise due to the high availability of data and the numerous insights it can provide into different fields. Most analyses focus on the topological characteristics of networks and the estimation of relationships between the nodes. We adopt a different perspective by considering the whole network as a random variable conveying the effect of an exposure on a response. This point of view represents a classical mediation setting, where the interest lies in estimating the indirect effect, that is, the effect propagated through the mediating variable. We introduce a latent space model mapping the network into a …

research product