Search results for "statistical"
showing 10 items of 4960 documents
Exploring topics in LDA models through Statistically Validated Networks: directed and undirected approaches
2022
Probabilistic topic models are machine learning tools for processing and understanding large text document collections. Among the different models in the literature, Latent Dirichlet Allocation (LDA) has turned out to be the benchmark of the topic modelling community. The key idea is to represent text documents as random mixtures over latent semantic structures called topics. Each topic follows a multinomial distribution over the vocabulary words. In order to understand the result of a topic model, researchers usually select the top-n words (the essential words) with the highest probability given a topic and look for meaningful and interpretable semantic themes. This work proposes a new method …
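As a rough illustration of the top-n selection mentioned in this abstract, the sketch below ranks the words of a single topic by probability. It is not the authors' code: the function name `top_n_words` and the toy vocabulary are hypothetical, and the topic is simply assumed to be given as a word-to-probability mapping.

```python
from typing import Dict, List, Tuple


def top_n_words(topic: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n words with the highest probability under one topic."""
    # Sort by descending probability; ties are broken alphabetically so the
    # result is deterministic.
    ranked = sorted(topic.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


# Hypothetical topic: a multinomial distribution over a toy vocabulary.
toy_topic = {
    "market": 0.30,
    "trading": 0.25,
    "price": 0.20,
    "walk": 0.15,
    "noise": 0.10,
}

top3 = top_n_words(toy_topic, 3)
```

Here `top3` holds the three most probable words together with their probabilities; in practice the mapping would come from the fitted topic-word matrix of whatever LDA implementation is in use.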
Ranking coherence in topic models using statistically validated networks
2023
Probabilistic topic models have become one of the most widespread machine learning techniques in textual analysis. Topic discovery is an unsupervised process that does not guarantee the interpretability of its output. Hence, the automatic evaluation of topic coherence has attracted the interest of many researchers over the last decade, and it is an open research area. This article offers a new quality evaluation method based on statistically validated networks (SVNs). The proposed probabilistic approach consists of representing each topic as a weighted network of its most probable words. The presence of a link between each pair of words is assessed by statistically validating their co-oc…
High-frequency trading and networked markets
2021
Financial markets have undergone a deep reorganization during the last 20 years. A mixture of technological innovation and regulatory constraints has promoted the diffusion of market fragmentation and high-frequency trading. The new stock market has changed the traditional ecology of market participants and market professionals, and financial markets have evolved into complex sociotechnical institutions characterized by a great heterogeneity in the time scales of market members’ interactions that cover more than eight orders of magnitude. We analyze three different datasets for two highly studied market venues recorded in 2004 to 2006, 2010 to 2011, and 2018. Using methods of complex network th…
Second‐order analysis of marked inhomogeneous spatiotemporal point processes: Applications to earthquake data
2018
To analyse interactions in marked spatio-temporal point processes (MSTPPs), we introduce marked second-order reduced moment measures and K-functions for inhomogeneous second-order intensity reweigh ...
One-dimensional random walks with self-blocking immigration
2017
We consider a system of independent one-dimensional random walkers where new particles are added at the origin at fixed rate whenever there is no older particle present at the origin. A Poisson ansatz leads to a semi-linear lattice heat equation and predicts that starting from the empty configuration the total number of particles grows as $c \sqrt{t} \log t$. We confirm this prediction and also describe the asymptotic macroscopic profile of the particle configuration.
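The model in this abstract can be explored numerically. The sketch below is a crude discrete-time approximation and not the paper's construction: the time discretization, the per-step immigration probability `p_add`, and the function name `simulate` are all assumptions made for illustration.

```python
import random
from typing import List


def simulate(steps: int, p_add: float, seed: int = 0) -> List[int]:
    """Return the particle count after each step of a discretized model.

    Assumption: time is discretized. In each step every particle jumps to
    a neighbouring site, then a new particle is added at the origin with
    probability p_add, but only if no particle currently sits there.
    """
    rng = random.Random(seed)
    positions: List[int] = []
    counts: List[int] = []
    for _ in range(steps):
        # Independent simple random walk step for every existing particle.
        positions = [x + rng.choice((-1, 1)) for x in positions]
        # Self-blocking immigration: the origin must be empty.
        if 0 not in positions and rng.random() < p_add:
            positions.append(0)
        counts.append(len(positions))
    return counts


counts = simulate(steps=200, p_add=0.5, seed=1)
```

Plotting `counts` against the step index for long runs gives a qualitative picture of how the population grows; the sketch makes no claim to reproduce the constant $c$ or the exact asymptotics stated in the abstract.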
Statistics of nonlinear stochastic dynamical systems under Lévy noises by a convolution quadrature approach
2010
This paper describes a novel numerical approach to find the statistics of the non-stationary response of scalar non-linear systems excited by Lévy white noises. The proposed numerical procedure relies on the introduction of an integral transform of Wiener-Hopf type into the equation governing the characteristic function. Once this equation is rewritten as a partial integro-differential equation, it is then solved by applying the method of convolution quadrature originally proposed by Lubich, here extended to deal with this particular integral transform. The proposed approach is relevant for two reasons: 1) Statistics of systems with several different drift terms can be handled in an efficie…
Random walks in dynamic random environments and ancestry under local population regulation
2015
We consider random walks in dynamic random environments, with an environment generated by the time-reversal of a Markov process from the oriented percolation universality class. If the influence of the random medium on the walk is small in space-time regions where the medium is typical, we obtain a law of large numbers and an averaged central limit theorem for the walk via a regeneration construction under suitable coarse-graining. Such random walks occur naturally as spatial embeddings of ancestral lineages in spatial population models with local regulation. We verify that our assumptions hold for logistic branching random walks when the population density is sufficiently high.
Detection of spatial disease clusters with LISA functions.
2011
Detection of disease clusters is an important tool in epidemiology that can help to identify risk factors associated with the disease and to understand its etiology. In this article we propose a method for the detection of spatial clusters where the locations of a set of cases and a set of controls are available. The method is based on local indicators of spatial association functions (LISA functions), particularly on the development of a local version of the product density, which is a second-order characteristic of spatial point processes. The behavior of the method is evaluated and compared with Kulldorff's spatial scan statistic by means of a simulation study. It is shown that the LI…
Wronskian and Casorati determinant representations for Darboux–Pöschl–Teller potentials and their difference extensions
2009
We consider some special reductions of generic Darboux–Crum dressing formulae and of their difference versions. As a matter of fact, we obtain some new formulae for Darboux–Pöschl–Teller (DPT) potentials by means of Wronskian determinants. For their difference deformations (called DDPT-I and DDPT-II potentials) and the related eigenfunctions, we obtain new formulae described by the ratios of Casorati determinants given by the functional difference generalization of the Darboux–Crum dressing formula.
A Bayesian Sequential Look at u-Control Charts
2005
We extend the usual implementation of u-control charts (uCCs) in two ways. First, we overcome the restrictive (and often inadequate) assumptions of the Poisson model; next, we eliminate the need for the questionable base period by using a sequential procedure. We use empirical Bayes (EB) and Bayes methods and compare them with the traditional frequentist implementation. EB methods are somewhat easy to implement, and they deal nicely with extra-Poisson variability (and, at the same time, informally check the adequacy of the Poisson assumption). However, they still need the base period. The sequential, full Bayes approach, on the other hand, also avoids this drawback of traditional u-charts. T…
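For context, the traditional frequentist u-chart that this abstract takes as its baseline can be sketched as follows. This is the textbook three-sigma construction under the Poisson assumption, computed from base-period data; it is not the empirical Bayes or sequential Bayes procedure of the paper, and the function name `u_chart_limits` and the sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def u_chart_limits(
    counts: Sequence[int], sizes: Sequence[float]
) -> Tuple[float, List[Tuple[float, float]]]:
    """Return the centre line and per-sample (LCL, UCL) of a classic u-chart.

    counts[i] is the number of defects in sample i and sizes[i] is the
    number of inspection units in that sample (base-period data).
    """
    # Centre line: pooled defects per unit over the base period.
    u_bar = sum(counts) / sum(sizes)
    limits = []
    for n in sizes:
        # Under the Poisson model the variance of u_i is u_bar / n_i.
        half_width = 3.0 * math.sqrt(u_bar / n)
        # A defect rate cannot be negative, so the lower limit is floored at 0.
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits


# Hypothetical base period: four samples of ten units each.
u_bar, limits = u_chart_limits(counts=[4, 6, 5, 5], sizes=[10, 10, 10, 10])
```

A new sample rate `c / n` falling outside its `(LCL, UCL)` pair would be flagged as out of control. The dependence on a fixed base period and on the Poisson variance is exactly what the abstract describes as the limitations of this traditional approach.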