RESEARCH PRODUCT

Methodological Approach for Messages Classification on Twitter Within E-Government Area

Eduard Alexandru Stoica, Antoniu Gabriel Pitic, Esra Kahya Özyirmidokuz, Kumru Uyar

subject

Text corpus, Focus (computing), Computer science, Romanian, Subject (documents), World Wide Web, Naive Bayes classifier, Constant (computer programming), Social media, Conversation

description

The number of Social Media users has grown steadily over the past few years. Companies, governments and researchers focus on extracting useful data from Social Media. One of the most important things we can extract from the messages users send to one another is the sentiment (positive, negative or neutral) regarding the subject of the conversation. There are many studies on how to classify these messages, but all of them need a large amount of already classified data for training, and such data is not available for Romanian-language texts. We present a case study in which a Naive Bayes classifier trained on a corpus of short English texts is applied to several thousand Romanian texts. We use Google Translate to translate the Romanian texts into English, and we validate the results by manually classifying some of them.

https://doi.org/10.1007/978-3-030-01878-8_30