Nadine Cullot
Une plateforme haute performance pour l’exploitation des données massives - Application aux données des réseaux sociaux
Context Comparison for Object Fusion
We propose a solution that supports the integration of heterogeneous sources by fusing objects according to their context. New requirements for information exchange have emerged with the developments around the Internet. Information consumers want to access and combine data from remote and heterogeneous sources in a transparent and dynamic way. Achieving this level of interoperation remains a real challenge. We present a model that defines local data as informative objects with an associated contextual representation. A semantic context comparison mechanism, based on a semantic distance, reconciles the contexts of applications and constructs virtual objects in which rules make the fus…
Towards A Twitter Observatory: A Multi-Paradigm Framework For Collecting, Storing And Analysing Tweets
In this article we show how a multi-paradigm framework can fulfil the requirements of tweet analysis and reduce the waiting time for researchers who use computational resources and storage systems to support large-scale data analysis. The originality of our approach is to combine concerns about data harvesting, data storage, data analysis and data visualisation into a framework that supports inductive reasoning in multidisciplinary scientific research. Our main contribution is a polyglot storage system with a generic data model to support logical data independence and a set of tools that can provide a suitable solution for mixing different types of algorithms in or…
Similarity estimation module for OWSCIS
Semantic interoperability based on ontologies is becoming a major challenge. We propose an architecture for semantic-based cooperation called OWSCIS (Ontology and Web Service based Cooperation of Information Sources). It maps various information sources using ontologies and answers users' queries over the cooperation. In this paper we focus on the similarity estimation service of the OWSCIS architecture, which discovers mappings between different ontologies. It relies on various mapping methods that are combined and refined to semi-automatically generate mappings between a local ontology and a reference ontology.
The design and implementation of Neuma, a collaborative Digital Scores Library - Requirements, architecture, and models
This paper presents the design and implementation of the Neuma platform, a digital library devoted to the preservation and dissemination of symbolic music content (scores). Neuma is open to musicologists, musicians, and music publishers. It consists of a repository dedicated to the storage of large collections of digital scores, where users/applications can upload their documents. It also proposes services to publish, annotate, query, transform, and analyze scores. The long-term goal of the project is to enable an open and collaborative space where musician communities will be able to share music in symbolic notation. The project is organized around the French IRPMF institute (BnF–CNRS) whi…
The NEUMA Project: Towards Cooperative On-line Music Score Libraries
Building Ontologies from XML Data Sources
In this paper, we present a tool called X2OWL that builds an OWL ontology from an XML data source. The method uses the XML schema to automatically generate the ontology structure as well as a set of mapping bridges. It also includes a refinement step for cleaning the mapping bridges and, where needed, restructuring the generated ontology.
Lambda Architecture pour une analyse à haute performance des données des réseaux sociaux
In this article, we show how a Lambda Architecture can contribute to the development of a platform for collecting and analyzing data from Twitter in real time. After presenting the context, detailing the needs and identifying the expected specificities, we compare the Lambda and Kappa architectures and describe the state of the art on the use of the Lambda Architecture in different domains. We propose an adaptation of the Lambda Architecture that stores data in a polystore and takes into account the different types of analysis needed to address research questions in social sciences and communication sciences. In these projects the objectives are to study the structure of communica…
QUEXME: A Query Expansion Method Applied to Water Information System
The aim of the paper is to present and apply a QUery EXpansion MEthod called QUEXME while querying the Euro-Mediterranean Information System (EMWIS) on know-how in the Water sector. EMWIS provides a strategic tool for exchanging information and knowledge in the water sector between and within the Euro Mediterranean partnership countries (www.emwis.net). Information retrieval on the web or through some cooperation of information sources or some general knowledge bases is a complex process and a great challenge with the emergence of the semantic web. The aim of the query expansion method is to help and guide users to build their requests giving them some usually related terms close to their q…
Fingering Watermarking in Symbolic Digital Scores.
Representing and Reasoning for Spatiotemporal Ontology Integration
The World-Wide Web hosts many autonomous and heterogeneous information sources. In the near future each source may be described by its own ontology. The distributed nature of ontology development will lead to a large number of local ontologies covering overlapping domains. Ontology integration will then become an essential capability for effective interoperability and information sharing. Integration is known to be a hard problem, whose complexity increases particularly in the presence of spatiotemporal information. Space and time entail additional problems such as the heterogeneity of granularity used in representing spatial and temporal features. Spatio-temporal ob…
On Using Conceptual Modeling for Ontologies
Are database concepts and techniques suitable for ontology design and management? The question has been open for some time. It gains new emphasis today, given the focus on ontologies and ontology services that follows from the spread of web services as a new paradigm for information management. This paper analyzes some of the arguments relevant to the debate, in particular the question of whether conceptual data models would adequately support the design and use of ontologies. It concludes by suggesting a hybrid approach that combines databases and logic-based services.
Spatio-temporal Schema Integration with Validation: A Practical Approach
We propose to enhance a schema integration process with a validation phase employing logic-based data models. In our methodology, we validate the source schemas against the data model; the inter-schema mappings are validated against the semantics of the data model and the syntax of the correspondence language. In this paper, we focus on how to employ a reasoning engine to validate spatio-temporal schemas and describe where the reasoning engine is plugged into our integration methodology. The validation phase distinguishes our integration methodology from other approaches. We shift the emphasis on automation from the a priori discovery to the a posteriori checking of the inter-schema mapping…
Ontology mapping specification in description logics for cooperative systems
The rapid development of the Semantic Web is tied to the specification of more and more ontologies. These ontologies model knowledge agreed upon by communities of people about specific domains or tasks. The same domain described by two distinct communities will be modelled differently. Cooperative systems aim to make information from different sources available despite their divergences. To do so, they must align, merge, or integrate these ontologies. Mapping discovery is a key step in efficiently resolving heterogeneities between ontologies. We develop an architecture that connects syst…
Semantic Mappings in Description Logics for Spatio-temporal Database Schema Integration
The interoperability problem arises in heterogeneous systems where different data sources coexist and there is a need for meaningful information sharing. One of the most representative realms of diversity in data representation is the spatio-temporal domain. Spatio-temporal data are most often described according to multiple and greatly diverse perceptions or viewpoints, using different terms and heterogeneous levels of detail. Reconciling this heterogeneity to build a fully integrated database is known to be a complex and currently unresolved problem, and few formal approaches exist for the integration of spatio-temporal databases. The paper discusses the interope…
Lambda+ Architecture - Un patron d'architecture haute performance pour le traitement des Big Data
Apports des réseaux sociaux pour la gestion de la relation client
In recent years, the Web has turned into an exchange platform. Customer relationship management must evolve to take advantage of the data available on social networks and to place the company at the heart of these exchanges. In this article, we propose a generic approach for detecting communities of a company's customers, based on their explicit and implicit behaviour and integrating data from various sources. We define a similarity measure between a user and a tag that takes into account the rating and viewing of resources as well as the user's social network. We validate this approach on a sample database using …
Lambda+, the renewal of the Lambda Architecture: Category Theory to the rescue
Designing software architectures for Big Data is a complex task that has to take into consideration multiple parameters, such as the expected functionalities, the properties that cannot be traded off, or the suitable technologies. Patterns are abstractions that guide the design of architectures to meet the requirements. One well-known pattern is the Lambda Architecture, which proposes real-time computations with correctness and fault-tolerance guarantees. But the Lambda Architecture has also been heavily criticized, mostly because of its complexity and because the real-time and correctness properties each hold in a different layer but not in the overall architecture. Furthermore, its use cases a…
Un observatoire pour la modélisation et l’analyse des réseaux multi-relationnels. Une application à l'étude du discours politique sur Twitter
Using Social Networks to Enhance Customer Relationship Management
In recent years, the Web has evolved into an exchange platform. Customer Relationship Management (CRM) must follow this evolution and connect CRM tools to social networks in order to place companies at the center of all the exchanges. We propose, in this article, a community detection approach that identifies clusters of customers of a company using their explicit and implicit behaviour. Our contribution is the definition of a composite profile that integrates various information gathered from different applications, such as the information system of the company, the existing CRM, or Twitter. We define a similarity measure, between a user and a tag, that takes into ac…
Temporal Semantic Centrality for the Analysis of Communication Networks
Nowadays, understanding online communities is becoming a major issue for the Web. In this article we propose a new measure, the Semantic Propagation Probability (SPP), which characterizes a user's ability to propagate a semantic concept to other users in a fast and targeted way. The semantics of messages is analysed according to a given ontology. We use this measure to obtain the Temporal Semantic Centrality (TSC) of a user in a community. We propose and evaluate an experiment with this measure, using an ontology and real data from the Web.
ERIS: An Approach Based on Community Boundaries to Assess Polarization in Online Social Networks
Detection and characterization of polarization are of major interest in Social Network Analysis, especially to identify conflictual topics that animate the interactions between users. As gatekeepers of their community, users in the boundaries significantly contribute to its polarization. We propose ERIS, a formal graph approach relying on community boundaries and users' interactions to compute two metrics: the community antagonism and the porosity of boundaries. These values assess the degree of opposition between communities and their aversion to external exposure, allowing an understanding of the overall polarization through the behaviors of the different communities. We also present an i…