Search results for "Twitter"
showing 10 items of 207 documents
Real-time detection of twitter social events from the user's perspective
2015
Over the last 40 years, automatic solutions for analyzing collections of text documents have been one of the most attractive challenges in the field of information retrieval. More recently, the focus has moved towards dynamic, distributed environments, where documents are continuously created by the users of a virtual community, i.e., a social network. In the case of Twitter, such documents, called tweets, are usually related to events which involve many people in different parts of the world. In this work we present a system for real-time Twitter data analysis which makes it possible to follow a generic event from the user's point of view. The topic detection algorithm we propose is an improved version of…
A framework for real-time Twitter data analysis
2016
A framework for real-time Twitter data analysis. We propose improvements to the Soft Frequent Pattern Mining (SFPM) algorithm. The stream of tweets is organized in dynamic windows whose size depends both on the volume of tweets and time. The set of keywords used to query Twitter is progressively refined to highlight the user's point of view. Comparisons with two state of the art systems. Twitter is a popular social network which allows millions of users to share their opinions on what happens all over the world. In this work we present a system for real-time Twitter data analysis in order to follow popular events from the user's perspective. The method we propose extends and improves the Soft Freque…
T100: A modern classic ensemble to profile irony and stereotype spreaders
2022
In this work we propose a novel ensemble model based on deep learning and non-deep learning classifiers. The proposed model was developed by our team to participate in the Profiling Irony and Stereotype Spreaders (ISSs) task hosted at PAN@CLEF2022. Our ensemble (named T100) includes a Logistic Regressor (LR) that classifies an author as ISS or not (nISS) considering the predictions provided by a first stage of classifiers. All these classifiers are able to reach state-of-the-art results on several text classification tasks. These classifiers (namely, the voters) are a Convolutional Neural Network (CNN), a Support Vector Machine (SVM), a Decision Tree (DT) and a Naive Bayes (NB) classifie…
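The two-stage idea in this abstract (a logistic regressor deciding from the predictions of first-stage voters) can be illustrated with a minimal sketch. This is not the T100 implementation: the voter outputs, labels, learning rate, and helper names below are all hypothetical, and the logistic regression is a plain stochastic gradient descent written with the standard library only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (ISS) or 0 (nISS) for one author's vector of voter predictions."""
    score = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
    return 1 if score >= 0.5 else 0

# Hypothetical first-stage outputs: one row per author, one column per voter
# (assumed order: CNN, SVM, DT, NB). Values and labels are made up.
voter_predictions = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
labels = [1, 1, 0, 0, 1, 0]

weights, bias = train_logistic(voter_predictions, labels)
fitted = [predict(weights, bias, row) for row in voter_predictions]
```

In this toy data the second-stage model learns how much to trust each voter, rather than counting votes equally as a majority vote would.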
Twitter spam account detection by effective labeling
2019
In recent years, the widespread diffusion of Online Social Networks (OSNs) has enabled new forms of communication that make it easier for people to interact remotely. Unfortunately, one of the first consequences of such popularity is the increasing number of malicious users who sign up and use OSNs for illegitimate activities. In this paper we focus on spam detection, and present some preliminary results of a system that aims at speeding up the creation of a large-scale annotated dataset for spam account detection on Twitter. To this aim, two different algorithms capable of capturing spammer behaviors, i.e., sharing malicious URLs and recurrent contents, are exploited. Experimental r…
Fake News Spreaders Detection: Sometimes Attention Is Not All You Need
2022
Guided by a corpus linguistics approach, in this article we present a comparative evaluation of State-of-the-Art (SotA) models, with a special focus on Transformers, to address the task of Fake News Spreaders (i.e., users that share Fake News) detection. First, we explore the reference multilingual dataset for the considered task, exploiting corpus linguistics techniques, such as chi-square test, keywords and Word Sketch. Second, we perform experiments on several models for Natural Language Processing. Third, we perform a comparative evaluation using the most recent Transformer-based models (RoBERTa, DistilBERT, BERT, XLNet, ELECTRA, Longformer) and other deep and non-deep SotA models (CNN,…
COURAGE at CheckThat! 2022: Harmful Tweet Detection using Graph Neural Networks and ELECTRA
2022
In this paper we propose a deep learning model based on graph machine learning (i.e. Graph Attention Convolution) and a pretrained transformer language model (i.e. ELECTRA). Our model was developed to detect harmful tweets about COVID-19 and was used to tackle subtask 1C (harmful tweet detection) at the CheckThat!Lab shared task organized as part of CLEF 2022. In this binary classification task, our proposed model reaches a binary F1 score (positive class label, i.e. harmful tweet) of 0.28 on the test set. We demonstrate that our approach outperforms the official baseline by 8% and describe our model as well as the experimental setup and results in detail. We also refer to limitations of th…
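For readers unfamiliar with the metric quoted in this abstract, the binary F1 score for the positive class is the harmonic mean of precision and recall on that class. A minimal sketch follows; the function name and the toy labels are hypothetical and unrelated to the CheckThat! data.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 score computed for the positive class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels: 1 = harmful tweet, 0 = not harmful (made-up values).
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
score = binary_f1(y_true, y_pred)
```

Because true negatives do not enter the formula, this score can be low on imbalanced data even when overall accuracy looks acceptable.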
«IN ALTO I CUORI / L’ITALIA CAMBIA VERSO». DISCORSO POLITICO E INTERAZIONE NEI SOCIAL NETWORK
2016
The aim of this paper is to study the presence of Italian politicians on the two social networks with the largest number of users (Facebook and Twitter). In the debate that has developed in recent years, it has been pointed out that the shift from web 1.0 to web 2.0 (and by now almost 3.0) has brought politicians and users/voters/citizens (in that order) closer together. In the more general context of the “disintermediation of communication” (Bentivegna 2002), it was thought (or perhaps it was an illusion) that overcoming the filter represented by traditional media would create a direct relationship between politicians and citizens, almost of the “from producer to consumer” kind. Recent works (for example Spina …
Discorsi geneticamente modificati nella democrazia dello 'streaming'. Il nuovo ordine del discorso politico nell'Italia post-berlusconiana
2016
In this paper I discuss some features of Italian political discourse after the end of the so-called ‘berlusconismo’ (that is, the specific communicative style of Silvio Berlusconi, the leader of the Italian right wing for almost two decades). In particular, I examine the figure of Matteo Renzi, the current Italian Prime Minister and Secretary of PD, the most important Italian political party. In my work, I try to pinpoint the peculiarity of Renzi’s discourse strategies. In this vein, I suggest using the notion of ‘disphasic discourse’, in order to underline the sharp decrease in the variability of styles, genres and textual types characterizing his ways of communicating. I focus the attention on …
Marked Hawkes processes for Twitter data
2022
In this paper, we propose to model retweet event sequences using a marked Hawkes process, which is a self-exciting point process where the occurrence of previous events in time increases the probability of further events. The aim is to analyse Twitter data combining temporal point processes theory and textual analysis. Since each retweet event carries a set of properties, we mark the process by different characteristics drawn from the textual analysis, finding that the tone of the description of the Twitter user is a good predictor of the number of retweets in a single cascade.
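The self-exciting behaviour described in this abstract can be sketched through the conditional intensity of a marked Hawkes process. The example below assumes an exponential decay kernel and a mark that scales each event's contribution linearly; both choices, as well as the parameter values and the function name, are illustrative assumptions and not necessarily the specification used in the paper.

```python
import math

def hawkes_intensity(t, events, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity at time t.

    events: list of (event_time, mark) pairs; only events before t contribute.
    mu:     baseline rate (assumed constant).
    alpha:  excitation added per unit of mark (assumed linear in the mark).
    beta:   exponential decay rate of each event's influence (assumed kernel).
    """
    excitation = sum(
        alpha * mark * math.exp(-beta * (t - t_i))
        for t_i, mark in events
        if t_i < t
    )
    return mu + excitation

# Made-up retweet times with a hypothetical mark attached to each event.
retweets = [(0.0, 1.0), (0.5, 2.0), (0.8, 1.0)]

baseline_only = hawkes_intensity(0.0, retweets)  # no earlier events
after_burst = hawkes_intensity(1.0, retweets)    # shortly after three events
much_later = hawkes_intensity(50.0, retweets)    # excitation has decayed
```

Each past retweet raises the intensity right after it occurs, and the effect fades over time, so the rate returns towards the baseline `mu` when no new events arrive.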
Markov Model for Tweets Geographic Distribution Characterization
2015
In this paper we continue our research on e-Business and e-Government modeling on Social Media presented in (Stoica, Pitic, & Mihaescu, 2013). To the message and user parameters we add a new parameter that describes the geographical dispersion of Twitter messages. This new parameter characterizes the way a set of messages spreads through the Social Graph from the physical world point of view. The first model, presented as “A Novel Model for E-Business and E-Government Processes on Social”, will be extended with the geographical parameter PG. We define and describe the Markov Model used to organize the messages gathered from social media. The main idea of…