
AUTHOR

Fabio Divino

Bayesian Modeling and MCMC Computation in Linear Logistic Regression for Presence-only Data

Presence-only data refer to situations in which, given a censoring mechanism, a binary response can be observed only with respect to one outcome, usually called presence. In this work we present a Bayesian approach to the problem of presence-only data based on a two-level scheme. A probability law and a case-control design are combined to handle the double source of uncertainty: one due to the censoring and one due to the sampling. We propose a new formalization for the logistic model with presence-only data that allows further insight into inferential issues related to the model. We concentrate on the case of the linear logistic regression and, in order to make inference on…

research product

Covid-19 in Italy: Modelling, Communications, and Collaborations

Abstract When Covid-19 arrived in Italy in early 2020, a group of statisticians came together to provide tools to make sense of the unfolding epidemic and to counter misleading media narratives. Here, members of StatGroup-19 reflect on their work to date.

research product

Estimating COVID-19-induced Excess Mortality in Lombardy

Abstract We compare the expected all-cause mortality with the observed one for different age classes during the pandemic in Lombardy, which was the epicenter of the epidemic in Italy and still is the region most affected by the pandemic. A generalized linear mixed model is introduced to model weekly mortality from 2011 to 2019, taking into account seasonal patterns and year-specific trends. Based on the 2019 year-specific conditional best linear unbiased predictions, a significant excess of mortality is estimated in 2020, leading to approximately 35000 more deaths than expected, mainly arising during the first wave. In 2021, instead, the excess mortality is not significantly different from z…

research product

Spatio-temporal modelling of COVID-19 incident cases using Richards’ curve: An application to the Italian regions

Abstract We introduce an extended generalised logistic growth model for discrete outcomes, in which spatial and temporal dependence are handled through the specification of a network structure within an auto-regressive approach. A major challenge concerns the specification of the network structure, which is crucial to consistently estimate the canonical parameters of the generalised logistic curve, e.g. peak time and height. We compared a network based on geographic proximity and one built on historical data of transport exchanges between regions. Parameters are estimated under the Bayesian framework, using the Stan probabilistic programming language. The proposed approach is motivated by the analysis of bot…
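As a rough illustration of the growth curve named in the abstract above, the sketch below evaluates one common textbook parametrisation of the generalised logistic (Richards) curve and takes first differences to obtain incident counts. The parameter names (`lower`, `upper`, `growth`, `nu`, `q`) and the helper `incident_counts` are assumptions made for this example; the paper's own parametrisation, its network structure, and its auto-regressive component are not reproduced here.

```python
import math


def richards(t, lower, upper, growth, nu, q=1.0):
    """Cumulative generalised logistic curve at time t.

    Assumed form: lower + (upper - lower) / (1 + q * exp(-growth * t)) ** (1 / nu).
    With nu = 1 and q = 1 this reduces to the ordinary logistic curve.
    """
    return lower + (upper - lower) / (1.0 + q * math.exp(-growth * t)) ** (1.0 / nu)


def incident_counts(times, **params):
    """Expected new cases between consecutive time points (hypothetical helper)."""
    cumulative = [richards(t, **params) for t in times]
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# Illustrative values only, not estimates from any data set.
increments = incident_counts(range(-5, 6), lower=0.0, upper=100.0, growth=1.0, nu=1.0)
peak_index = max(range(len(increments)), key=increments.__getitem__)
```

In this simplified setting the position and size of the largest increment play the role of the peak time and height mentioned in the abstract.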

research product

Empirical Bayes improves assessments of diversity and similarity when overdispersion prevails in taxonomic counts with no covariates

Abstract The assessment of diversity and similarity is relevant in monitoring the status of ecosystems. The respective indicators are based on the taxonomic composition of biological communities of interest, currently estimated through the proportions computed from sampling multivariate counts. In this work we present a novel method to estimate the taxonomic composition able to work even with a single sample and no covariates, when data are affected by overdispersion. The presence of overdispersion in taxonomic counts may be the result of significant environmental factors which are often unobservable but influence communities. Following the empirical Bayes approach, we combine a Bayesian mo…

research product

Unreliable predictions about COVID‐19 infections and hospitalizations make people worry: The case of Italy

research product

Estimating COVID-19-induced Excess Mortality in Lombardy, Italy.

We compare the expected all-cause mortality with the observed one for different age classes during the pandemic in Lombardy, which was the epicenter of the epidemic in Italy. The first case in Italy was found in Lombardy in early 2020, and the first wave was mainly centered in Lombardy. The other three waves, in Autumn 2020, March 2021 and Summer 2021, are also characterized by a high number of cases in absolute terms. A generalized linear mixed model is introduced to model weekly mortality from 2011 to 2019, taking into account seasonal patterns and year-specific trends. Based on the 2019 year-specific conditional best linear unbiased predictions, a significant excess of mortality is estima…

research product

Spatial Bayesian Modeling of Presence-only Data

research product

An ensemble approach to short-term forecast of COVID-19 intensive care occupancy in Italian Regions

Abstract The availability of intensive care beds during the COVID‐19 epidemic is crucial to guarantee the best possible treatment to severely affected patients. In this work we show a simple strategy for short‐term prediction of COVID‐19 intensive care unit (ICU) beds, which has proved very effective during the Italian outbreak in February to May 2020. Our approach is based on an optimal ensemble of two simple methods: a generalized linear mixed regression model, which pools information over different areas, and an area‐specific nonstationary integer autoregressive methodology. Optimal weights are estimated using a leave‐last‐out rationale. The approach has been set up and validated during t…
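The weighting idea in the abstract above can be sketched as follows, under simplifying assumptions: each of the two methods is refitted without the most recent observation, both predict that held-out value for every area, and a single shared weight in [0, 1] is chosen to minimise the squared error of the combined prediction. The squared-error criterion, the shared weight, and the function names are assumptions of this example and may differ from what the paper does.

```python
def leave_last_out_weight(pred_a, pred_b, observed):
    """Weight w in [0, 1] minimising sum((w * a + (1 - w) * b - y) ** 2).

    pred_a, pred_b: held-out predictions of the two methods, one per area.
    observed: the held-out (last) observed values, one per area.
    """
    numerator = sum((y - b) * (a - b) for a, b, y in zip(pred_a, pred_b, observed))
    denominator = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if denominator == 0:
        # The two methods agree everywhere, so any weight gives the same forecast.
        return 0.5
    # The objective is convex in w, so clipping the unconstrained optimum
    # to [0, 1] gives the constrained optimum.
    return min(1.0, max(0.0, numerator / denominator))


def ensemble(pred_a, pred_b, w):
    """Convex combination of the two sets of forecasts."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]


# Made-up numbers for two areas, for illustration only.
w = leave_last_out_weight([10.0, 20.0], [14.0, 28.0], [12.0, 24.0])
combined = ensemble([10.0, 20.0], [14.0, 28.0], w)
```

The estimated weight would then be applied to the two methods' forecasts for the next, still unobserved, time point.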

research product

Nowcasting COVID‐19 incidence indicators during the Italian first outbreak

A novel parametric regression model is proposed to fit incidence data typically collected during epidemics. The proposal is motivated by real-time monitoring and short-term forecasting of the main epidemiological indicators within the first outbreak of COVID-19 in Italy. Accurate short-term predictions, including the potential effect of exogenous or external variables, are provided. This makes it possible to accurately predict important characteristics of the epidemic (e.g., peak time and height), allowing for a better allocation of health resources over time. Parameter estimation is carried out in a maximum likelihood framework. All computational details required to reproduce the approach and replica…

research product

Data Augmentation Approach in Bayesian Modelling of Presence-only Data

Abstract Ecologists are interested in predicting the potential distribution of species in suitable areas, which is essential for planning conservation and management strategies. Unfortunately, the only information available in such studies is often the true presence of the species at a few locations of the study area and the associated environmental covariates over the entire area, referred to as presence-only data. We propose a Bayesian approach to estimate logistic linear regressions adapted to presence-only data through the introduction of a random approximation of the correction factor in the adjusted logistic model that allows us to overcome the need to know a priori the prevalence of the species.

research product