AUTHOR

Irene Carola Spera

Supervised vs Unsupervised Latent Dirichlet Allocation: topic detection in lyrics.

Topic modeling is a type of statistical modeling for discovering the abstract "topics" that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of a topic model and is used to assign the text in a document to a particular topic. It builds a fixed number of topics from the words in each document, modeled according to a Dirichlet distribution. In this work we apply LDA to a set of songs by four famous Italian songwriters and split them into topics. This work studies the use of themes in lyrics, using statistical analysis to detect topics. The aim of the work is to underline the main limits of the standard unsupervised LDA and to propose a supervised…

research product

A model-based approach to Spotify data analysis: a Beta GLMM

Digital music distribution is increasingly powered by automated mechanisms that continuously capture, sort and analyze large amounts of Web-based data. This paper deals with the management of songs' audio features from a statistical point of view. In particular, it explores the data-capture mechanisms enabled by the Spotify Web API and suggests statistical tools for analyzing these data. Special attention is devoted to song popularity, and a Beta model including random effects is proposed to give a first answer to questions such as: what are the determinants of popularity? The identification of a model able to describe this relationship, the determination within the set of char…

research product

Impact of the COVID-19 pandemic on music: a method for clustering sentiments

The outbreak of coronavirus disease 2019 (COVID-19) was highly stressful for people. In general, fear and anxiety about a disease can be overwhelming and cause strong emotions in adults and children. One way to cope with this stress is to listen to music. The aim of this work is to understand whether the music listened to during the lockdown reflects the emotions the pandemic generated in each of us. The primary goal is therefore to build two indices measuring the anger and joy levels of the songs most streamed by Italian Spotify users (during the SARS-CoV-2 pandemic), and to study their evolution over time. A Hierarchical Cluster Analysis has been applied in order to identify groups o…

research product

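Ho perso le parole: come ritrovarle con la sentiment analysis. Metodi statistici per l'analisi della produzione discografica di Luciano Ligabue. (I Lost the Words: How to Find Them Again with Sentiment Analysis. Statistical Methods for Analyzing the Discography of Luciano Ligabue.)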

This book was born from the meeting of two people who share the same two passions in life: statistics and Ligabue. And who says that a passion for music cannot be combined with everyday working life? Karl Pearson said «Statistics is the grammar of Science», and who could fail to define music as a science? Music meeting science gives rise to creativity and beauty, and this book goes in search of that connection. Using sophisticated statistical techniques, we will travel through the musical characteristics of Luciano Ligabue's songs, studying how they change over time and the elements of greatest interest to those who listen to him but also to those who critic…

research product