AUTHOR
Antti Penttinen
Discussion of "modern statistics of spatial point processes"
The paper ‘Modern statistics for spatial point processes’ by Jesper Møller and Rasmus P. Waagepetersen is based on a special invited lecture given by the authors at the 21st Nordic Conference on Mathematical Statistics, held at Rebild, Denmark, in June 2006. At the conference, Antti Penttinen and Eva B. Vedel Jensen were invited to discuss the paper. We here present the comments from the two invited discussants and from a number of other scholars, as well as the authors' responses to these comments. Below, Figure 1, Figure 2, etc., refer to figures in the paper under discussion, while Figure A, Figure B, etc., refer to figures in the current discussion. All numbered sections and formulas ref…
Estimation of orientation characteristic of fibrous material
A new statistical method for estimating the orientation distribution of fibres in a fibre process is suggested where the process is observed in the form of a degraded digital greyscale image. The method is based on line transect sampling of the image in a few fixed directions. A well-known method based on stereology is available if the intersections between the transects and fibres can be counted. We extend this to the case where, instead of the intersection points, only scaled variograms of grey levels along the transects are observed. The nonlinear estimation equations for a parametric orientation distribution as well as a numerical algorithm are given. The method is illustrated by a real…
Bayesian Modeling and MCMC Computation in Linear Logistic Regression for Presence-only Data
Presence-only data refer to situations in which, given a censoring mechanism, a binary response can be observed only with respect to one outcome, usually called presence. In this work we present a Bayesian approach to the problem of presence-only data based on a two-level scheme. A probability law and a case-control design are combined to handle the double source of uncertainty: one due to the censoring and one due to the sampling. We propose a new formalization for the logistic model with presence-only data that allows further insight into inferential issues related to the model. We concentrate on the case of the linear logistic regression and, in order to make inference on…
Bayesian modeling of the evolution of male height in 18th century Finland from incomplete data.
Data on army recruits’ height are frequently available and can be used to analyze the economics and welfare of the population in different periods of history. However, such data are not a random sample from the whole population at the time of interest but are instead skewed, since short men were less likely to be recruited. In statistical terms this means that the data are left-truncated. Although truncation is well understood in statistics, a further complication is that the truncation threshold is not known, may vary from time to time, and auxiliary information on the threshold is not at our disposal. The advantage of the fully Bayesian approach presented here is that both the …
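As an illustration of what left truncation does to a likelihood, the following minimal sketch evaluates the log-likelihood of normally distributed heights observed only above a threshold. It is not the model of the paper: the normal distribution, the known and constant threshold, and the function names are all assumptions made for this example, whereas the abstract stresses that the threshold is unknown and may vary.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def left_truncated_normal_loglik(heights, mu, sigma, threshold):
    """Log-likelihood of heights from N(mu, sigma^2) observed only if >= threshold.

    Assumption for this sketch: the threshold is known and constant.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Probability that an individual exceeds the threshold and is recorded at all.
    log_survival = math.log(1.0 - normal_cdf((threshold - mu) / sigma))
    total = 0.0
    for x in heights:
        if x < threshold:
            raise ValueError("observation below the truncation threshold")
        z = (x - mu) / sigma
        log_density = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
        # Each density is renormalised by the probability of being observed.
        total += log_density - log_survival
    return total
```

Ignoring the renormalising term would bias the estimated mean height upwards, which is the skew the abstract describes.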
Deducing self-interaction in eye movement data using sequential spatial point processes
Eye movement data are outputs of an analyser tracking the gaze when a person is inspecting a scene. This kind of data is of increasing importance in scientific research as well as in applications, e.g. in marketing and man-machine interface planning. Thus the new areas of application call for advanced analysis tools. Our research objective is to suggest statistical modelling of eye movement sequences using sequential spatial point processes, which decompose the variation in data into interpretable structural components. We consider three elements of an eye movement sequence: heterogeneity of the target space, contextuality between subsequent movements, and time-dependent behaviou…
Statistics in Practice
Spatiotemporal Structure of Host‐Pathogen Interactions in a Metapopulation
The ecological and evolutionary dynamics of species are influenced by spatiotemporal variation in population size. Unfortunately, we are usually limited in our ability to investigate the numerical dynamics of natural populations across large spatial scales and over long periods of time. Here we combine mechanistic and statistical approaches to reconstruct continuous-time infection dynamics of an obligate fungal pathogen on the basis of discrete-time occurrence data. The pathogen, Podosphaera plantaginis, infects its host plant, Plantago lanceolata, in a metapopulation setting where the presence of the pathogen has been recorded annually for 6 years in approximately 4,00…
The Homogeneous Poisson Point Process
Plant colonization of a bare peat surface: population changes and spatial patterns
Changes in size and spatial arrangement of plant populations established on an initially bare peat surface were described over a period of 5 yr by following plant individuals on a 1-cm grid in an area of 10 m x 25 m. The spatial pattern of populations and association between species was analyzed statistically. The study site was very slowly colonized by 14 perennial plant species. The early successional stage was dominated by Carex rostrata, with a clumped spatial distribution, and the homogeneously distributed Eriophorum vaginatum and Pinus sylvestris. Both the growth in size of populations and changes in their spatial distribution were interpreted as a result of species dispersal abilit…
Finite Point Processes
Bayesian Mapping of Lichens Growing on Trees
The suitability of trees as hosts for epiphytic lichens is studied in a forest stand of size 25 ha. Suitability is measured as occupation probabilities, which are modelled using a hierarchical Bayesian approach. These probabilities are useful for an ecologist. They give a smoothed spatial distribution map of suitability for each of the species and can be used in detecting high- and low-probability areas. In addition, suitability is explained by tree-level covariates. Spatial dependence, which is due to unobserved spatially structured covariates, is modelled through an unobserved Markov random field. The Markov chain Monte Carlo method has been applied in Bayesian computation. The extensive spatial data …
Bayesian Smoothing in the Estimation of the Pair Potential Function of Gibbs Point Processes
A flexible Bayesian method is suggested for the pair potential estimation with a high-dimensional parameter space. The method is based on a Bayesian smoothing technique, commonly applied in statistical image analysis. For the calculation of the posterior mode estimator a new Monte Carlo algorithm is developed. The method is illustrated through examples with both real and simulated data, and its extension into truly nonparametric pair potential estimation is discussed.
Poisson Regression with Change-Point Prior in the Modelling of Disease Risk around a Point Source
Bayesian estimation of the risk of a disease around a known point source of exposure is considered. The minimal requirements for data are that cases and populations at risk are known for a fixed set of concentric annuli around the point source, and each annulus has a uniquely defined distance from the source. The conventional Poisson likelihood is assumed for the counts of disease cases in each annular zone with zone-specific relative risk parameters and, conditional on the risks, the counts are considered to be independent. The prior for the relative risk parameters is assumed to be piecewise constant in the distance, having a known number of components. This prior is the well-known cha…
Determination of Fibre Orientation Distribution from Images of Fibre Networks
We recall two categories of algorithms for estimating fibre orientation distribution from an image of a spatial fibre system. In the first algorithm, the estimate is a magnitude-weighted distribution from angles perpendicular to the directions of the gradients in the image. The second algorithm is based on the scaled variogram of grey values scanned along a sampling line and its relation to the fibre orientation distribution. Using lines in several directions and assuming a parametric model for the orientation distribution, the orientation parameters are estimated numerically from a least-squares type procedure. Two versions of variogram-based methods are used in this work. We compare the p…
Empirical Bayes improves assessments of diversity and similarity when overdispersion prevails in taxonomic counts with no covariates
The assessment of diversity and similarity is relevant in monitoring the status of ecosystems. The respective indicators are based on the taxonomic composition of biological communities of interest, currently estimated through the proportions computed from sampling multivariate counts. In this work we present a novel method to estimate the taxonomic composition able to work even with a single sample and no covariates, when data are affected by overdispersion. The presence of overdispersion in taxonomic counts may be the result of significant environmental factors which are often unobservable but influence communities. Following the empirical Bayes approach, we combine a Bayesian mo…
Fitting and Testing Point Process Models
Spatial pattern of the threatened epiphytic bryophyte Neckera pennata at two scales in a fragmented boreal forest
The spatial pattern and occurrence of a threatened bryophyte, Neckera pennata, were studied in relation to the abundance and pattern of suitable substrate trees at two spatial scales: 1) in a 4 x 4 km fraction of fragmented, mostly managed southern boreal forest landscape, and 2) in an old-growth forest stand within this landscape, with abundant occurrence of suitable habitats. To explore in detail the spatial clustering of N. pennata at the forest stand scale, we applied a second-order point process analysis based on Ripley's K-function for binary point patterns. Neckera pennata proved to be a rare species in the studied landscape: it was found only on 31 Populus tremula trees. At the …
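For readers unfamiliar with Ripley's K-function, here is a minimal sketch of its simplest empirical estimator for a single, unmarked point pattern. It is an illustration only: it omits edge correction and does not implement the binary (two-type) version used in the study, and the function name and arguments are hypothetical.

```python
import math


def ripley_k_naive(points, r, area):
    """Naive estimate of Ripley's K at distance r, without edge correction.

    points: list of (x, y) tuples; area: area of the observation window.
    K(r) is estimated as area / (n * (n - 1)) times the number of ordered
    pairs of distinct points lying within distance r of each other.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= r:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))
```

Under complete spatial randomness K(r) is close to πr², so estimates clearly above that value suggest clustering at scale r, subject to the bias caused by ignoring edge effects.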
On statistical inference for the random set generated Cox process with set-marking.
The Cox point process is a process class for hierarchical modelling of systems of non-interacting points in ℝ^d under environmental heterogeneity which is modelled through a random intensity function. In this work a class of Cox processes is suggested where the random intensity is generated by a random closed set. Such heterogeneity appears for example in forestry where silvicultural treatments like harvesting and site-preparation create geometrical patterns for tree density variation in two different phases. In this paper the second order property, important both in data analysis and in the context of spatial sampling, is derived. The usefulness of the random set generated Cox process is highly…
A nonstationary cylinder-based model describing group dispersal in a fragmented habitat
A doubly nonstationary cylinder-based model is built to describe the dispersal of a population from a point source. In this model, each cylinder represents a fraction of the population, i.e., a group. Two contexts are considered: the dispersal can occur in a uniform habitat or in a fragmented habitat described by a conditional Boolean model. After the construction of the models, we investigate their properties: the first and second order moments, the probability that the population vanishes, and the distribution of the spatial extent of the population.
Spatial Bayesian Modeling of Presence-only Data
Modeling Forest Tree Data Using Sequential Spatial Point Processes
The spatial structure of a forest stand is typically modeled by spatial point process models. Motivated by aerial forest inventories and forest dynamics in general, we propose a sequential spatial approach for modeling forest data. Such an approach is better justified than a static point process model in describing the long-term dependence among the spatial location of trees in a forest and the locations of detected trees in aerial forest inventories. Tree size can be used as a surrogate for the unknown tree age when determining the order in which trees have emerged or are observed on an aerial image. Sequential spatial point processes differ from spatial point processes in that the…
Appendix C: Fundamentals of geostatistics
A hierarchical Bayesian birth cohort analysis from incomplete registry data: evaluating the trends in the age of onset of insulin-dependent diabetes mellitus (T1DM).
Childhood diabetes is one of the major non-communicable diseases in children under 15 years of age. It requires life-long insulin treatment and may lead to serious complications. Along with the worldwide increase in incidence, several countries have recently reported a decreasing trend in the age of onset of the disease. The aim of this study is to analyse long-term data on the incidence of childhood diabetes in Finland from a birth-cohort perspective. The annual incidence data were available for the period 1965–1996, which translates into the 1951–1996 birth cohorts. Hence the data consist of completely and partially observed cohorts. Bayesian modelling was employed in the analysi…
An evaluation of the microscopical counting methods of the tape in Hirst-Burkard pollen and spore trap
Three different sampling units in current use and different sampling strategies were tested. Randomly placed microscope fields are good for estimating the daily mean concentration, but a very large sample size is needed. Traverses across the slide in systematic order are best for estimating the short-term concentrations and diurnal variation. A formula for the estimation of the error in one transverse traverse is given. Twelve transverse traverses in systematic order are also enough to estimate the daily mean concentration. One or two traverses along the length of the slide often give unreliable estimates because of the irregularities in the transverse variation of the particle concentration…
Appendix B: Geometrical Characteristics of Sets
Estimation of forest stand characteristics using individual tree detection, stochastic geometry and a sequential spatial point process model
Airborne Laser Scanning (ALS) results in point-wise measurements of canopy height, which can further be used for Individual Tree Detection (ITD). However, ITD cannot find all trees because small trees can hide below larger tree crowns. Here we discuss methods where the plot totals and means of tree-level characteristics are estimated in such context. The starting point is a previously presented Horvitz–Thompson-like (HT-like) estimator, where the detectability is based on the larger tree crowns and a tuning parameter that models the detection condition. We propose a new method which is based on modeling the spatial pattern of hidden tree locations using a sequential spatial point process mo…
Probabilistic small area risk assessment using GIS-based data: a case study on Finnish childhood diabetes
A Bayesian hierarchical spatial model is constructed to describe the regional incidence of insulin dependent diabetes mellitus (IDDM) among the under 15-year-olds in Finland. The model exploits aggregated pixel-wise locations for both the cases and the population at risk. Typically such data arise from combining geographic information systems (GIS) with large databases. The dates of diagnosis and locations of the cases are observed from 1987 to 1996. The population at risk counts are available for every second year during the same period. A hierarchical model is suggested for the pixel wise case counts, including a population model to account for the uncertainty of the population at risk ov…
Stationary Point Processes
Recent applications of point process methods in forestry statistics
Forestry statistics is an important field of applied statistics with a long tradition. Many forestry problems can be solved by means of point processes or marked point processes. There, the "points" are tree locations and the "marks" are tree characteristics such as diameter at breast height or degree of damage by environmental factors. Point process characteristics are valuable tools for exploratory data analysis in forestry, for describing the variability of forest stands and for understanding and quantifying ecological relationships. Models of point processes are also an important basis of modern single-tree modeling, that gives simulation tools for the investigation of forest stru…
Bayesian analysis of a Gibbs hard-core point pattern model with varying repulsion range
A Bayesian solution is suggested for the modelling of spatial point patterns with inhomogeneous hard-core radius using Gaussian processes in the regularization. The key observation is that a straightforward use of the finite Gibbs hard-core process likelihood together with a log-Gaussian random field prior does not work without penalisation towards high local packing density. Instead, a nearest neighbour Gibbs process likelihood is used. This approach to hard-core inhomogeneity is an alternative to the transformation inhomogeneous hard-core modelling. The computations are based on recent Markovian approximation results for Gaussian fields. As an application, data on the nest locations of Sa…
Stationary Marked Point Processes
Data Augmentation Approach in Bayesian Modelling of Presence-only Data
Ecologists are interested in predicting the potential distribution of species in suitable areas, which is essential for planning conservation and management strategies. Unfortunately, often the only available information in such studies is the true presence of the species at a few locations of the study area and the associated environmental covariates over the entire area, referred to as presence-only data. We propose a Bayesian approach to estimate logistic linear regressions adapted to presence-only data through the introduction of a random approximation of the correction factor in the adjusted logistic model that allows us to overcome the need to know a priori the prevalence of the species.
Spatial Mark-Recapture Method in the Estimation of Crayfish Population Size
The mark-recapture method is considered for estimation of the population size of slowly moving animals like crayfish. The Petersen type estimator for a closed population is generalized for situations where recaptures are spatially dependent between the capture sites, and its variance approximation is derived using point processes as models for the population. The method of quadratic forms is suggested for use as a variance estimator. Finally, a trapping design is proposed where one trap at recapture is replaced by four adjacent traps. A simulation experiment is performed to examine the robustness of the new trapping design against movements of animals.
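The classical, non-spatial Petersen estimator that the abstract takes as its starting point can be sketched as follows. This is only the textbook closed-population estimator together with Chapman's bias-reduced variant; the spatially dependent generalization and the variance approximation of the paper are not reproduced, and the function and argument names are hypothetical.

```python
def petersen_estimate(marked, caught, recaptured):
    """Classical Petersen estimate of closed-population size.

    marked: animals marked and released in the first session
    caught: animals caught in the second session
    recaptured: marked animals among those caught in the second session
    """
    if recaptured <= 0:
        raise ValueError("at least one recapture is needed")
    return marked * caught / recaptured


def chapman_estimate(marked, caught, recaptured):
    """Chapman's variant, which stays finite even when there are no recaptures."""
    return (marked + 1) * (caught + 1) / (recaptured + 1) - 1
```

For example, with 50 marked animals, 40 caught at recapture and 10 of those marked, `petersen_estimate(50, 40, 10)` returns 200.0. Both estimators assume that every animal is equally likely to be recaptured, which is the assumption that spatial dependence between capture sites calls into question.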
Modelling and Simulation of Stationary Point Processes
Appendix A: Fundamentals of statistics
Cancer Risk Near a Polluted River in Finland
The River Kymijoki in southern Finland is heavily polluted with polychlorinated dibenzo-p-dioxins and dibenzofurans and may pose a health threat to local residents, especially farmers. In this study we investigated cancer risk in people living near the river (less than 20.0 km) in 1980. We used a geographic information system, which stores registry data, in 500 m × 500 m grid squares, from the Population Register Centre, Statistics Finland, and Finnish Cancer Registry. From 1981 to 2000, cancer incidence in all people (N = 188884) and in farmers (n = 11132) residing in the study area was at the level expected based on national rates. Relative risks for total cancer and 27 cancer subtype…
Intensity estimation for inhomogeneous Gibbs point process with covariates-dependent chemical activity
Recent development of intensity estimation for inhomogeneous spatial point processes with covariates suggests that kerneling in the covariate space is a competitive intensity estimation method for inhomogeneous Poisson processes. It is not known whether this advantageous performance is still valid when the points interact. In the simplest common case, this happens, for example, when the objects presented as points have a spatial dimension. In this paper, kerneling in the covariate space is extended to Gibbs processes with covariates-dependent chemical activity and inhibitive interactions, and the performance of the approach is studied through extensive simulation experiments. It is demonstr…
Conditionally heteroscedastic intensity-dependent marking of log Gaussian Cox processes
Spatial marked point processes are models for systems of points which are randomly distributed in space and provided with measured quantities called marks. This study deals with marking, that is methods of constructing marked point processes from unmarked ones. The focus is density-dependent marking where the local point intensity affects the mark distribution. This study develops new markings for log Gaussian Cox processes. In these markings, both the mean and variance of the mark distribution depend on the local intensity. The mean, variance and mark correlation properties are presented for the new markings, and a Bayesian estimation procedure is suggested for statistical inference. The p…