Une plateforme haute performance pour l’exploitation des données massives - Application aux données des réseaux sociaux
International audience
Influence Assessment in Twitter Multi-relational Network
International audience; Influence on Twitter has recently become a hot research topic, since this micro-blogging service is widely used to share and disseminate information. Some users are better able than others to influence and persuade their peers. Identifying the most influential users therefore makes large-scale information diffusion possible, which is very useful in marketing or political campaigns. In this paper, we propose a new approach to influence assessment on the Twitter network. It is based on a modified version of the conjunctive combination rule from belief function theory and combines different influence markers such as retweets, mentions and replies. We experiment the proposed meth…
A comparison of community-aware centrality measures in online social networks
Do we need metamodels AND ontologies for engineering platforms?
In this paper we show how the joint use of metamodeling and ontologies makes it possible to describe domain knowledge for a complex domain. Ontologies are used as stabilized descriptions of a business domain, while metamodels allow a fine-grained description of the domain (to be constructed in the initial phases of modeling). We propose to use an ontology for early categorization, i.e., as a "natural" complement of the formal system that is induced by the metamodel.
Access and Annotation of Archaeological Corpus via a Semantic Wiki
Semantic wikis have shown their ability to support knowledge management and collaborative authoring. They are particularly appropriate for scientific collaboration. This paper details the main concepts and the architecture of WikiBridge, a semantic wiki, and its application in the archaeological domain. Archaeologists' work is primarily document-centric. Adding meta-information in the form of annotations has proved useful to enhance search. WikiBridge combines models and ontologies to increase data consistency within the wiki. Moreover, it allows several types of annotations: simple annotations, n-ary relations and recursive annotations. The consistency of these annotations is checked synch…
Investigating the Relationship Between Community-aware and Classical Centrality Measures
International audience
A Modularity Backbone Extraction Method for Weighted Complex Networks
The exponential growth in the size of real-world networks is a major barrier to analyzing their structure and dynamics. Thus, reducing the network's size while maintaining its topological features is highly significant. As community structure is one of the fundamental fingerprints of real-world networks, this work proposes a new node-filtering backbone extraction method to preserve the network's community structure.
A la recherche des mini-publics : un problème de communautés, de singularités et de sémantique
International audience
An empirical study on classical and community-aware centrality measures in complex networks
Community structure is a ubiquitous feature in natural and artificial systems. Identifying key nodes is a fundamental task to speed up or mitigate any diffusive processes in these systems. Centrality measures aim to do so by selecting a small set of critical nodes. Classical centrality measures are agnostic to community structure, while community-aware centrality measures exploit this property. Several works study the relationship between classical centrality measures, but the relationship between classical and community-aware centrality measures is almost unexplored. In this work [1], we answer two questions: (1) How do classical and community-aware centrality measures relate? (2) What is …
Modèle de réseaux multiplexe pour l'étude de l'influence sur Twitter.
International audience
Development Platforms as a Niche for Software Companies in Open Source Software
As long as information systems do not become overly large and while they address a well-known domain, they can be controlled by engineering staff. Nevertheless, when dealing with large-scale, complex, or innovative information systems, it can be difficult to separate design issues and to formulate a meaningful information system proposal. In such a context, platforms for software engineering appear to be a promising approach. In this paper, we propose to view development platforms as a major opportunity for Open Source Software and Open Formats.
A typology of algorithms for social network analysis: results from experiments - towards a toolbox for social scientist
International audience
AMUN: An Object Oriented Model For Cooperative Spatial Information Systems
International audience; The diversity of spatial information systems promotes the need to integrate heterogeneous spatial or geographic information systems (GIS) in a cooperative environment. We present an ongoing research project, called ISIS (Interoperable Spatial Information System), which aims to build an environment to support interoperability of GIS by interconnecting spatial data repositories and spatial processing resources. Our solution combines techniques from traditional interoperable information systems, spatial data modeling and distributed object-oriented databases. While object oriented data modeling impact has been studied in spatial databases, research in model for distribu…
Scientific collaborations: Principles of wikibridge design
Semantic wikis, wikis enhanced with Semantic Web technologies, are appropriate systems for community-authored knowledge models. They are particularly suitable for scientific collaboration. This paper details the design principles of WikiBridge, a semantic wiki.
Structured Wiki with Annotation for Knowledge Management: an Application to Cultural Heritage
International audience; In this paper, we highlight how semantic wikis can be relevant solutions for building cooperative data-driven applications in domains characterized by a rapid evolution of knowledge. We point out the semantic capabilities of annotated databases and structured wikis to provide better quality of content, to support complex queries and finally to accommodate different types of users. Then we compare database application development with wikis for domains that encompass evolving knowledge. We detail the architecture of WikiBridge, a semantic wiki, which integrates template forms and allows complex annotations as well as consistency checking. We describe the archaeologic…
Towards A Twitter Observatory: A Multi-Paradigm Framework For Collecting, Storing And Analysing Tweets
International audience; In this article we show how a multi-paradigm framework can fulfil the requirements of tweet analysis and reduce the waiting time for researchers who use computational resources and storage systems to support large-scale data analysis. The originality of our approach is to combine concerns about data harvesting, data storage, data analysis and data visualisation into a framework that supports inductive reasoning in multidisciplinary scientific research. Our main contribution is a polyglot storage system with a generic data model to support logical data independence and a set of tools that can provide a suitable solution for mixing different types of algorithms in or…
Détection de précurseurs d'évènements basés sur les motifs dans les réseaux sociaux
Social network data attract the interest of researchers, who develop algorithms and machine learning models to analyse user interactions and behaviour. These methods rely on the network topology to represent structural changes and to detect notable precursors that generally precede major events. The study presented in this article investigates whether certain graphlets (specific patterns) can be considered precursors of events. We test the proposed method on three social network datasets. We also study the role played in the gr…
On the definition of generic multi-layered ontologies for urban applications
Cooperation of information systems is essential for providing decision support for urban management applications. This involves sharing data across collections of the heterogeneous information systems that are used to manage large urban infrastructures. The objective of this work is to define a spatial ontology to describe key features of urban applications, providing a foundation for semantic reconciliation among heterogeneous spatial information sources. We propose a multi-layered ontologies definition framework consisting of ontology layers which are composed of a generic functional structure and one or more domain ontologies. The functional structure embodies general ontological concept…
Analyzing the Correlation of Classical and Community-aware Centrality Measures in Complex Networks
International audience; Identifying influential nodes in social networks is a fundamental issue. Indeed, it has many applications, such as inhibiting epidemic spreading, accelerating information diffusion, preventing terrorist attacks, and much more. Classically, centrality measures quantify the node's importance based on various topological properties of the network, such as Degree and Betweenness. Nonetheless, these measures are agnostic of the community structure, although it is a ubiquitous characteristic encountered in many real-world networks. To overcome this drawback, there is a growing trend to design so-called community-aware centrality measures. Although several works investigate…
Évaluation de l’influence dans un réseau multi-relationnel : le cas de Twitter
International audience
Adding Semantic Extension to Wikis for Enhancing Cultural Heritage Applications
International audience; Wikis are appropriate systems for community-authored content. In the past few years, they have proved particularly suitable for collaborative work in cultural heritage. In this paper, we highlight how wikis can be relevant solutions for building cooperative applications in domains characterized by a rapid evolution of knowledge. We point out the capabilities of semantic extensions to provide better quality of content, to improve searching, to support complex queries and finally to accommodate different types of users. We describe the CARE project and explain the conceptual modeling approach. We detail the architecture of WikiBridge, a semantic wiki which allows …
How Correlated Are Community-Aware and Classical Centrality Measures in Complex Networks?
Unlike classical centrality measures, recently developed community-aware centrality measures use a network’s community structure to identify influential nodes in complex networks. This paper investigates their relationship on a set of fifty real-world networks originating from various domains. Results show that classical and community-aware centrality measures generally exhibit low to medium correlation values. These results are consistent across networks. Transitivity and efficiency are the most influential macroscopic network features driving the correlation variation between classical and community-aware centrality measures. Additionally, the mixing parameter, the modularity, and the Max…
An Empirical Comparison of Centrality and Hierarchy Measures in Complex Networks
International audience
SNFreezer: a Platform for Harvesting and Storing Tweets in a Big Data Context
International audience
Lambda Architecture pour une analyse à haute performance des données des réseaux sociaux
In this article, we show how a Lambda Architecture can contribute to the development of a platform for collecting and analyzing Twitter data in real time. After presenting the context, detailing the needs and identifying the expected specificities, we compare the Lambda and Kappa architectures and describe the state of the art on the use of the Lambda Architecture in different domains. We propose an adaptation of the Lambda Architecture that allows data to be stored in a polystore and takes into account the different types of analysis needed to address research questions in the social and communication sciences. In these projects the objectives are to study the structure of communica…
Modèle tensoriel pour l'entreposage et l'analyse des données des réseaux sociaux Application à l'étude de la viralité sur Twitter
International audience; In this article, we show how the notion of tensor makes it possible to build a multi-paradigm model for warehousing social data. From an architectural point of view, this approach links different storage systems (polystore) and limits the impact of the ETL tools that perform model transformations to feed different analysis algorithms. The proposed model thus ensures logical independence between the data and the programs implementing the analysis algorithms. With a concrete case drawn from a study of virality on Twitter during the period between the two rounds of the French presidential election of …
The EU Election on Twitter
In this chapter the political implications of social media and their affordances for political discourse are examined. The focus is on candidates’ Twitter usage during the 2014 EU elections in France and in Germany. Quantitative and qualitative analyses of tweets collected during a period of four weeks have been carried out on the basis of the functional operator model of Twitter. The model serves as a framework for assessing users’ tweeting styles, which can range between personal-interactive and topical-informative. The comparison of French and German top candidates’ tweeting styles, which mainly appear to be “personal-interactive”, however questions the candidates’ alleged efforts to ent…
Object Clustering Methods and a Query Decomposition Strategy for Distributed Object-Based Information Systems
Emerging developments and advances in distributed processing have created a need for tools and methods to partition and distribute information systems across interconnected processors. In particular, distribution approaches which take into account the key characteristics of OO concepts are required to extend traditional fragmentation results to object oriented database systems. To fulfill the above requirements, we propose a methodology for the distribution design of object-based information systems. The underlying approach consists of techniques and heuristics that can be used to create clusters of inter-related object classes that can be fragmented interdependently, producing distribution…
Évaluation de l’influence polarisée dans un réseau multi-relationnel : application à twitter
International audience
Temporal density of complex networks and ego-community dynamics
International audience
Detection of antagonism and polarization on social media through community boundaries
Traitement des variabilités métier dans les Systèmes d'Information biologiques
International audience; Scientific information systems require features that support two major types of variability: inter-actor variability and inter-study variability. We handle inter-actor variability with a data import system that guarantees data quality, and inter-study variability with an annotation mechanism coupled to the persistence mechanism. To control data quality when data are imported from different actors or when they are annotated, we propose an approach based on two levels of knowledge: 1) knowledge about the applications of the IS is repres…
ISIS: a semantic mediation model and an agent based architecture for GIS interoperability
The diversity of spatial information systems promotes the need to integrate heterogeneous spatial or geographic information systems (GIS) in a cooperative environment. The paper describes the research project ISIS (Interoperable Spatial Information System) which is a semantic mediation approach to support GIS interoperability. Its key characteristic is a dynamic resolution of semantic conflicts which is adequate for achieving autonomy, flexibility and extensibility. We propose a spatial OO data model and a mediation architecture based on multi-agent paradigm to support GIS interoperability.
A Top-Down Approach Based on Business Patterns for Web Information Systems Design
International audience; In this paper we develop an approach based on a top-down strategy for realizing transactional web services. Our approach highlights non-functional properties (e.g., traceability, security) which are essential to preserving an application's quality. It is implemented in three steps. The first step breaks the application down according to the business domain involved. The goal of this step is to obtain sets of actors and activity patterns, defined as an activity workflow, that support the architecture of the application. The next step develops a mapping of the activity pattern onto this architecture. The aim of this step is to identify the ris…
Viral Tweets, Fake News and Social Bots in Post-Factual Politics
Purpose: In the wake of Brexit and the 2016 US Presidential Elections, “post-factual” society has been heralded as a new era of political communications, where the digital public sphere plays a central role in spreading “viral” contents and “fake news”, with the help of automated accounts or “social bots”. This paper seeks to define these terms and the methods by which the phenomena they commonly designate might be studied, in order to characterise the dynamics of political deliberation during the 2017 French Presidential Elections on Twitter, the online platform most commonly used for political communication in France. It thus aims to better understand the mechanisms by which information b…
Assessing the Relationship Between Centrality and Hierarchy in Complex Networks
International audience
Investigating Centrality Measures in Social Networks with Community Structure
Centrality measures are crucial in quantifying the influence of the members of a social network. Although there has been a great deal of work dealing with this issue, the vast majority of classical centrality measures are agnostic of the community structure characterizing many social networks. Recent works have developed community-aware centrality measures that exploit features of the community structure information encountered in most real-world complex networks. In this paper, we investigate the interactions between 5 popular classical centrality measures and 5 community-aware centrality measures using 8 real-world online networks. Correlation as well as similarity measures between both t…
Classical versus Community-aware Centrality Measures: An Empirical Study
International audience
Analyse des discours sur Twitter dans une situation de crise:Étude de l’incident à l’usine Lubrizol de Rouen
International audience
An Approach to Clinical Proteomics Data Quality Control and Import
International audience; The biomedical domain, and proteomics in particular, faces an increasing volume of data. The heterogeneity of data sources implies heterogeneity in the representation and in the content of data. Data may also be incorrect or contain errors, which can compromise the analysis of experimental results. Our approach aims to ensure the initial quality of data during import into an information system dedicated to proteomics. It is based on the joint use of models, which represent the system sources, and ontologies, which are used as mediators between them. The controls we propose ensure the validity of values, semantics and data consistency during the import process.
Les systèmes d'information coopératifs : le projet DECA
International audience; This work addresses the need for uniform and cooperative access to several heterogeneous information systems. We describe a framework for cooperative information systems and present the DECA project, which proposes a context mediation approach. The main feature of the proposed solution is the dynamic resolution of schematic and semantic data conflicts. This is achieved through the AMUN model, an object-oriented model extended to manage system cooperation. A loosely coupled architecture based on the multi-agent paradigm is presented.
UML-Based Metamodeling for Information System Engineering and Evolution
In modelers’ practice, metamodels have become the core of UML-based metamodeling environments: metamodels form the basis of application domain descriptions, and they are instantiated into models. In the context of information system engineering and interoperability, we have developed two operations on metamodels: metamodel integration and the measurement of semantic distance between metamodels. In this paper, we explore the application of these operations to information systems’ evolution.
Évaluation de l’influence dans un réseau multi-relationnel
Influence on Twitter has become an important research topic. Some users show a greater ability than others to influence the people they are connected to. Finding the most influential users can therefore enable efficient large-scale information diffusion, which is very useful in marketing or political campaigns. In this article, we propose a new approach for assessing influence in multi-relational networks such as Twitter. Our method is based on the conjunctive combination rule of belief function theory, which makes it possible to merge different types of relationships. We experim…
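As an illustration of the combination step mentioned in this abstract, the following is a minimal sketch of the standard unnormalized conjunctive rule from belief function theory, m(A) = Σ m1(B)·m2(C) over all B ∩ C = A. The two-element frame, the relation names and every mass value are illustrative assumptions, and the code does not claim to reproduce the authors' actual mass assignment or any variant of the rule they may use.

```python
from itertools import product


def conjunctive_combination(m1, m2):
    """Unnormalized conjunctive rule: m(A) = sum of m1(B) * m2(C) over B & C == A."""
    combined = {}
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c  # intersection of the two focal elements
        combined[a] = combined.get(a, 0.0) + mass_b * mass_c
    return combined


# Hypothetical frame of discernment: a user is either influential or passive.
INFLUENTIAL = frozenset({"influential"})
PASSIVE = frozenset({"passive"})
FRAME = INFLUENTIAL | PASSIVE  # total ignorance

# Hypothetical mass functions derived from two relation types (values are made up).
m_retweets = {INFLUENTIAL: 0.6, FRAME: 0.4}
m_mentions = {INFLUENTIAL: 0.5, PASSIVE: 0.2, FRAME: 0.3}

m_combined = conjunctive_combination(m_retweets, m_mentions)

# The mass assigned to the empty set measures the conflict between the sources.
conflict = m_combined.get(frozenset(), 0.0)
```

Focal elements are stored as `frozenset` keys so that set intersection directly yields the focal element of the combined mass. Dividing the non-empty masses by `1 - conflict` would give Dempster's normalized rule instead.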
Constraint Management in Engineering of Complex Information Systems
We propose to build an engineering environment for information systems by using metamodels, OCL and symbolic model checkers to manage constraints. Our proposal is based on a definition of constraints as 3D spaces with dimensions corresponding to UML diagrams, constructs, and abstraction levels. We show how such environments can help with engineering high-quality complex systems by allowing part of the constraint verification to be lifted up.
Lambda+ Architecture - Un patron d'architecture haute performance pour le traitement des Big Data
International audience
Apports des réseaux sociaux pour la gestion de la relation client
National audience; In recent years, the Web has turned into an exchange platform. Customer relationship management must evolve to take advantage of the data available on social networks and to place the company at the heart of these exchanges. In this article, we propose a generic approach for detecting communities of a company's customers, based on their explicit and implicit behaviour and integrating data from various sources. We define a similarity measure between a user and a tag that takes into account the rating and viewing of resources and the user's social network. We validate this approach on a sample database using …
Lambda+, the renewal of the Lambda Architecture: Category Theory to the rescue
Designing software architectures for Big Data is a complex task that has to take into consideration multiple parameters, such as the expected functionalities, the properties that are untradeable, or the suitable technologies. Patterns are abstractions that guide the design of architectures to reach the requirements. One of the famous patterns is the Lambda Architecture, which proposes real-time computations with correctness and fault-tolerance guarantees. But the Lambda has also been highly criticized, mostly because of its complexity and because the real-time and correctness properties are each effective in a different layer but not in the overall architecture. Furthermore, its use cases a…
Le projet COCKTAIL : une approche pluridisciplinaire de la circulation des discours sur Twitter
International audience
Hierarchy and Centrality: Two Sides of The Same Coin?
International audience
Un observatoire pour la modélisation et l’analyse des réseaux multi-relationnels. Une application à l'étude du discours politique sur Twitter
International audience
Using Social Networks to Enhance Customer Relationship Management
International audience; In recent years, the Web has evolved into an exchange platform. Customer Relationship Management (CRM) must follow this evolution and connect CRM tools to social networks in order to place companies at the center of all the exchanges. We propose, in this article, a community detection approach that identifies clusters of a company's customers using their explicit and implicit behaviour. Our contribution is the definition of a composite profile that integrates various information gathered from different applications, such as the information system of the company, the existing CRM, or Twitter. We define a similarity measure between a user and a tag that takes into ac…
A web application for event detection and exploratory data analysis for Twitter data
International audience
Domain knowledge integration and semantical quality management -A biology case study
International audience; The management of semantical quality is a major challenge in the context of knowledge integration. In this paper, we describe a new approach to constraint management that emphasizes constraint traceability when moving from the semantical level to the operational one. Our strategy for management of semantical quality is related to a metamodeling-based approach to knowledge integration. We carry out knowledge integration “on the fly” by using transformations applied to models belonging to our metamodeling architecture. The resulting integrated models access available resources through web services whose input and output parameters are guarded by constraints. Integrated…
Enhancing scientific information systems with semantic annotations
International audience; Scientific Information Systems aim to produce or improve knowledge on a subject through activities of research and development. The management of scientific data requires some essential properties. We propose SemLab, an architecture that supports interoperability, data quality and extensibility through a unique paradigm: semantic annotation. We present two applications that validate our architecture.
The Tucker tensor decomposition for data analysis: capabilities and advantages
Tensors are powerful multi-dimensional mathematical objects, that easily embed various data models such as relational, graph, time series, etc. Furthermore, tensor decomposition operators are of great utility to reveal hidden patterns and complex relationships in data. In this article, we propose to study the analytical capabilities of the Tucker decomposition, as well as the differences brought by its major algorithms. We demonstrate these differences through practical examples on several datasets having a ground truth. It is a preliminary work to add the Tucker decomposition to the Tensor Data Model, a model aiming to make tensors data-centric, and to optimize operators in order to enable…
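To make the idea of a Tucker decomposition concrete, here is a minimal NumPy sketch of the truncated higher-order SVD (HOSVD), one commonly used way to compute a core tensor and one factor matrix per mode. The function names, the tensor shape and the chosen ranks are illustrative assumptions. This is not the Tensor Data Model's API, and the abstract does not say which algorithms the authors compare.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize the tensor: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Mode-n product: multiply `matrix` (J x I_n) along axis `mode` of the tensor."""
    moved = np.moveaxis(tensor, mode, -1)   # bring the mode to the last axis
    result = moved @ matrix.T               # contract it with the matrix
    return np.moveaxis(result, -1, mode)    # restore the axis order


def hosvd(tensor, ranks):
    """Truncated HOSVD: leading left singular vectors of each unfolding, then the core."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each factor basis
    return core, factors


def reconstruct(core, factors):
    """Rebuild an approximation of the original tensor from core and factors."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Hypothetical 3-way tensor, e.g. users x hashtags x time slices (random data).
rng = np.random.default_rng(0)
data = rng.random((4, 5, 3))

core, factors = hosvd(data, ranks=(2, 3, 2))   # compressed representation
approx = reconstruct(core, factors)            # same shape as `data`
```

With ranks equal to the full dimensions the reconstruction is exact up to floating-point error. Smaller ranks trade accuracy for a more compact core whose entries describe how the per-mode components interact.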