AUTHOR
Annabelle Gillet
A high-performance platform for exploiting massive data: application to social network data
International audience
Boundaries of polarized communities: application to the study of conspiracy theories about vaccines
Social network data are increasingly used to extract value in fields such as marketing, politics, and sociology. They can be represented as graphs, modelling interactions precisely through directed and weighted links. In the analysis of social network data, the study of communities is an essential step. However, for a fine-grained interpretation of phenomena, it is also necessary to study how communities interact and to be able to detect traces of polarization. We propose a method that makes it possible to evaluate the antagonism of communities and to identify their boundaries in networks…
Modelling and development of a generic observatory to harvest and analyze big data
Big Data fascinate, both because the value they hold can provide a significant advantage in decision-making, and because of the challenges that their exploitation poses. These challenges arise at several levels of analytics workflows. At the level of software architecture design, volume and velocity demand at least enough performance to handle the ingestion and storage of data. Data variety also has an impact, as several new storage systems have emerged, each one addressing a specific need. Polystores are systems that integrate this diversity to gain flexibility compared with data warehouses, which are now too rigid. However, this diversification…
Lambda Architecture for high-performance analysis of social network data
In this article, we show how a Lambda Architecture can contribute to the development of a platform for collecting and analyzing data from Twitter in real time. After presenting the context, detailing the needs and identifying the expected specificities, we compare the Lambda and Kappa architectures and describe the state of the art on the use of the Lambda Architecture in different domains. We propose an adaptation of the Lambda Architecture that allows data to be stored in a polystore and takes into account the different types of analysis to be carried out to answer research questions in the social sciences and communication sciences. In these projects the objectives are to study the structure of communica…
Detection of antagonism and polarization on social media through community boundaries
Analysis of discourse on Twitter in a crisis situation: a study of the incident at the Lubrizol plant in Rouen
International audience
Lambda+ Architecture - A high-performance architecture pattern for Big Data processing
International audience
Lambda+, the renewal of the Lambda Architecture: Category Theory to the rescue
Designing software architectures for Big Data is a complex task that has to take multiple parameters into consideration, such as the expected functionalities, the properties that cannot be traded off, or the suitable technologies. Patterns are abstractions that guide the design of architectures to meet the requirements. One well-known pattern is the Lambda Architecture, which proposes real-time computations with correctness and fault-tolerance guarantees. But the Lambda Architecture has also been heavily criticized, mostly because of its complexity and because the real-time and correctness properties each hold in a different layer but not in the overall architecture. Furthermore, its use cases a…
ERIS: An Approach Based on Community Boundaries to Assess Polarization in Online Social Networks
Detection and characterization of polarization are of major interest in Social Network Analysis, especially to identify contentious topics that drive the interactions between users. As gatekeepers of their community, users at the boundaries contribute significantly to its polarization. We propose ERIS, a formal graph approach relying on community boundaries and users' interactions to compute two metrics: the community antagonism and the porosity of boundaries. These values assess the degree of opposition between communities and their aversion to external exposure, allowing an understanding of the overall polarization through the behaviors of the different communities. We also present an i…
The Tucker tensor decomposition for data analysis: capabilities and advantages
Tensors are powerful multi-dimensional mathematical objects that easily embed various data models, such as relational, graph, and time series. Furthermore, tensor decomposition operators are of great utility in revealing hidden patterns and complex relationships in data. In this article, we propose to study the analytical capabilities of the Tucker decomposition, as well as the differences brought by its major algorithms. We demonstrate these differences through practical examples on several datasets with a ground truth. It is preliminary work towards adding the Tucker decomposition to the Tensor Data Model, a model that aims to make tensors data-centric and to optimize operators in order to enable…