Search results for "statistically validated network"

showing 8 items of 18 documents

Statistically Validated Networks for assessing topic quality in LDA models

2022

Probabilistic topic models have become one of the most widespread machine learning technique for textual analysis purpose. In this framework, Latent Dirichlet Allocation (LDA) (Blei et al., 2003) gained more and more popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution overwords characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned from the model may be only characterized by a set of irrelevant or unchained words, being useless for the interpretation. Although many topic-quality metrics were proposed (Newman et al., 2009; Alet…

Settore SECS-S/06 -Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.Settore SECS-S/01 - StatisticaTopic Model Topic Coherence LDA Statistically Validated Networks
researchProduct

Statistically Validated Networks for evaluating coherence in topic models

2022

Probabilistic topic models have become one of the most widespread machine learning technique for textual analysis purpose. In this framework, Latent Dirichlet Allocation (LDA) gained more and more popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned from the model may be characterized by a set of irrelevant or unchained words, being useless for the interpretation. In the framework of topic quality evaluation, the pairwise semantic cohesion among the top-N most pr…

Settore SECS-S/06 -Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.Text Mining Probabilistic Topic Models Topic coherence Statistically Validated NetworksSettore SECS-S/01 - Statistica
researchProduct

MEASURING TOPIC COHERENCE THROUGH STATISTICALLY VALIDATED NETWORKS

2020

Topic models arise from the need of understanding and exploring large text document collections and predicting their underlying structure. Latent Dirichlet Allocation (LDA) (Blei et al., 2003) has quickly become one of the most popular text modelling techniques. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models give no guaranty on the interpretability of their outputs. The topics learned from texts may be characterized by a set of irrelevant or unchained words. Therefore, topic models require validation of the coherence of estimated topics. However, the automatic evaluation …

Settore SECS-S/06 -Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.topic model topic coherence LDA statistically validated networks.Settore SECS-S/01 - Statistica
researchProduct

STRANIERI, MERIDIONALI O PROVINCIALI? I CONSUMI NEL TEMPO LIBERO DELLE SECONDE GENERAZIONI

2022

In this paper, we analyze consumption patterns of leisure time among young people belonging to the so-called “second generation” of immigrants in Italy. Leisure time consumption describes how young immigrants use cultural products and services. We analyze data collected by the ISTAT through the survey on the “second generations” (2015). A comparison of leisure consumption patterns between second-generation immigrants and their Italian peers does not show significant differences. Rather, differences in consumption styles are associated to gender (male/female), geographic area of residence (North/South), and size of the municipality (large municipality/small municipality) of residence.

Settore SPS/07 - Sociologia GeneraleYoung immigrants Leisure time consumption Social integration Statistically validated networks.
researchProduct

Exploring topics in LDA models through Statistically Validated Networks: directed and undirected approaches

2022

Probabilistic topic models are machine learning tools for processing and understanding large text document collections. Among the different models in the literature, Latent Dirichlet Allocation (LDA) has turned out to be the benchmark of the topic modelling community. The key idea is to represent text documents as random mixtures over latent semantic structures called topics. Each topic follows a multinomial distribution over the vocabulary words. In order to understand the result of a topic model, researchers usually select the top-n (essential words) words with the highest probability given a topic and look for meaningful and interpretable semantic themes. This work proposes a new method …

Statistically Validated NetworkLDATopic Model
researchProduct

Ranking coherence in topic models using statistically validated networks

2023

Probabilistic topic models have become one of the most widespread machine learning techniques in textual analysis. Topic discovering is an unsupervised process that does not guarantee the interpretability of its output. Hence, the automatic evaluation of topic coherence has attracted the interest of many researchers over the last decade, and it is an open research area. This article offers a new quality evaluation method based on statistically validated networks (SVNs). The proposed probabilistic approach consists of representing each topic as a weighted network of its most probable words. The presence of a link between each pair of words is assessed by statistically validating their co-oc…

Statistically Validated NetworksTopic coherenceText MiningProbabilistic Topic modelLibrary and Information SciencesInformation SystemsJournal of Information Science
researchProduct

High-frequency trading and networked markets

2021

Financial markets have undergone a deep reorganization during the last 20 y. A mixture of technological innovation and regulatory constraints has promoted the diffusion of market fragmentation and high-frequency trading. The new stock market has changed the traditional ecology of market participants and market professionals, and financial markets have evolved into complex sociotechnical institutions characterized by a great heterogeneity in the time scales of market members’ interactions that cover more than eight orders of magnitude. We analyze three different datasets for two highly studied market venues recorded in 2004 to 2006, 2010 to 2011, and 2018. Using methods of complex network th…

Statistically validated networks050208 financeMultidisciplinarySociotechnical systemFinancial markets05 social sciencesFinancial marketEvolutionary Models of Financial Markets Special FeatureComplex networksMonetary economicsComplex networkSettore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin)Market liquidity0502 economics and businessPortfolioStock marketBusiness050207 economicsHigh-frequency tradingHigh-frequency tradingStock (geology)Proceedings of the National Academy of Sciences
researchProduct

A primer on statistically validated networks

2019

In this contribution we discuss some approaches of network analysis providing information about single links or single nodes with respect to a null hypothesis taking into account the heterogeneity of the system empirically observed. With this approach, a selection of nodes and links is feasible when the null hypothesis is statistically rejected. We focus our discussion on approaches using i) the so-called disparity filter and ii) statistically validated network in bipartite networks. For both methods we discuss the importance of using multiple hypothesis test correction. Specific applications of statistically validated networks are discussed. We also discuss how statistically validated netw…

complex networks statistically validated networksSettore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin)
researchProduct