0000000000470173
AUTHOR
Aurélie Bertaux
Predictive and Evolutive Cross-Referencing for Web Textual Sources
International audience; One of the main challenges in the domain of competitive intelligence is to harness large volumes of information from the web and extract the most valuable pieces of information. As the information available on the web grows rapidly and is very heterogeneous, this process becomes overwhelming for experts. To address this challenge, this paper presents a vision for a novel process that performs cross-referencing at web scale. This process uses a focused crawler and a semantic-based classifier to cross-reference textual items without expert intervention, based on Big Data and Semantic Web technologies. The system is described thoroughly, and interests of…
Evaluation de la pertinence dans un système de recommandation sémantique de nouvelles économiques
Today in the commercial and financial sectors, staying informed about economic news is crucial and involves targeting the right articles to read because of the huge amount of information. To address this problem, we propose an innovative article recommendation system based on the integration of a semantic description of articles and an ontological knowledge model. We base our recommendation system on an intrinsically efficient vector model that we have refined to overcome the confusion, found in existing models, between the concepts of similarity and relevancy, which does not take into account the effects of the difference in precision of the semantic descriptions between profiles and a…
De la science à l’expérience, la pluralité des savoirs en viticulture : le cas des maladies.
International audience
De la scène de crime aux connaissances : représentation d'évènements et peuplement d'ontologie appliqués au domaine de la criminalistique informatique
International audience; With the democratisation of technologies, computer forensics investigations involve ever larger and more heterogeneous volumes of data. To ease the investigators' task, our work aims to automatically reconstruct the events related to a digital incident while respecting legal requirements. This requires introducing a knowledge representation model that structures the information gathered at a crime scene in order to facilitate the use of automated analysis processes. This paper presents a state of the art of event representation models for the …
An Ontology-Based Approach for the Reconstruction and Analysis of Digital Incidents Timelines
International audience; Due to the democratisation of new technologies, computer forensics investigators have to deal with volumes of data which are becoming increasingly large and heterogeneous. Indeed, on a single machine, hundreds of events occur per minute, produced and logged by the operating system and various software. Therefore, the identification of evidence and, more generally, the reconstruction of past events are tedious and time-consuming tasks for investigators. Our work aims at automatically reconstructing and analysing the events related to a digital incident, while respecting legal requirements. To tackle those three main problems (volume, heterogeneity and legal require…
Event Reconstruction
Event reconstruction is one of the most important steps in digital forensic investigations. It gives investigators a clear view of the events that have occurred over time. Event reconstruction is a complex task that requires exploring a large number of events, owing to the pervasiveness of new technologies. Any evidence produced at the end of the investigative process must also meet the requirements of the courts, such as reproducibility, verifiability, validation, etc. After defining the most important concepts of event reconstruction, this chapter surveys the challenges of the field and the solutions proposed so far. Irish Research Council Science Foundati…
k-Partite Graphs as Contexts
International audience; In formal concept analysis, 2-dimensional formal contexts are bipartite graphs. In this work, we generalise the notions of context and concept to graphs that are not bipartite. We then study the complexity of the enumeration and identify the structure of the set of such concepts.
Extraction de la Valeur des données du Big Data par classification multi-label hiérarchique sémantique
International audience; This article presents an ontology-centred solution for the automatic multi-label classification of information required by an economic information recommendation system.
An Ontology-Based Monitoring System in Vineyards of the Burgundy Region
Given France's rich wine heritage and its position as the world's second-largest wine producer, the production of high-quality wines is of primary importance. The recent development of the IoT and of efficient big data processing has been shown to usefully support permanent monitoring during the entire wine-making process. In line with this trend, we introduce in this paper an intelligent system for vineyard monitoring in the Burgundy region. The main thrust of the proposed system is the use of SWRL rules in the WineCloud ontology. The design of the ontology is mainly based on information gathered from interviews with wine growers. In addition, sensor d…
Une ontologie de la culture de la vigne : des savoirs académiques aux savoirs d'expérience
As part of an FUI project launched in October 2016 (the WineCloud project) that aims to build a traceability and predictive tool for the vine and wine cycle, work on the collection and nature of knowledge was needed in order to design an ontological system that comes as close as possible to the reasoning of the business domain. This article aims more specifically to study the life cycle of the vine. We show that the academic knowledge found in theoretical and scientific sources is adjusted and updated in light of the experiential knowledge of wine growers. This work also seeks to analyse …
Bag-of-word based brand recognition using Markov Clustering Algorithm for codebook generation
International audience; In order to address the issue of counterfeiting online, it is necessary to use automatic tools that analyze the large amount of information available over the Internet. Analysis methods that extract information about the content of the images are very promising for this purpose. In this paper, a method that automatically extracts the brand of objects in images is proposed. The method does not explicitly search for text or logos. This information is implicitly included in the Bag-of-Words representation. In the Bag-of-Words paradigm, visual features are clustered to create the visual words. Despite its shortcomings, k-means is the most widely used algorithm. With k-mea…
Towards events ontology based on data sensors network for viticulture domain
International audience; The Wine Cloud project is the first "Big Data" platform for the French viticulture value chain. The aim of this platform is to provide complete traceability of the life cycle of the wine, from the wine-grower to the consumer. In particular, Wine Cloud may qualify as an agricultural decision platform that will be used for vine life cycle management in order to predict the occurrence of major risks (vine diseases, grape vine pests, physiological risks, fermentation stoppage, oxidation of vine, etc.). It also aims to make wine production more rational by offering winegrowers a set of recommendations regarding their production development strategies. The proposed platform "Wine …
Preventing Overlaps in Agglomerative Hierarchical Conceptual Clustering
Hierarchical Clustering is an unsupervised learning task, which seeks to build a set of clusters ordered by the inclusion relation. It is usually assumed that the result is a tree-like structure with no overlapping clusters, i.e., where clusters are either disjoint or nested. In Hierarchical Conceptual Clustering (HCC), each cluster is provided with a conceptual description which belongs to a predefined set called the pattern language. Depending on the application domain, the elements in the pattern language can be of different nature: logical formulas, graphs, tests on the attributes, etc. In this paper, we tackle the issue of overlapping concepts in the agglomerative approach of HCC. We …
Analyse Sémantique du Big Data par Classification Hiérarchique Multi-Label
International audience
Automatic Timeline Construction and Analysis For Computer Forensics Purposes
International audience; To determine the circumstances of an incident, investigators need to reconstruct events that occurred in the past. The large amount of data spread across the crime scene makes this task very tedious and complex. In particular, the analysis of the reconstructed timeline, due to the huge quantity of events that occurred on a digital system, is almost impossible and leads to cognitive overload. Therefore, it becomes more and more necessary to develop automatic tools to help or even replace investigators in some parts of the investigation. This paper introduces a multi-layered architecture designed to assist the investigative team in the extraction of information left in…
WINECLOUD: Une ontologie d'événements pour la modélisation sémantique des données de capteurs hétérogènes
International audience
PROFILE REFINEMENT IN ONTOLOGY-BASED RECOMMANDER SYSTEMS FOR ECONOMICAL E-NEWS
International audience; This paper addresses a recommender system for economic news articles. More precisely, it focuses on the automatic refinement of customer profiles, an important task over time, by taking into account user logs concerning, in particular, his/her actions, reading time, and domain-specific knowledge. In our approach, ontologies are used to describe and automatically refine these profiles. This work focuses on one particular type of recommender system, namely content-based recommenders. The aim of these recommender systems is to build a user profile and to improve its precision over time. Several improvements that have been made to these recommender systems ov…
Semantic HMC for Business Intelligence using Cross-Referencing
International audience
A Survey on how to Cross-Reference Web Information Sources
International audience; The goal of giving information a well-defined meaning is currently shared by different research communities. Once information has a well-defined meaning, it can be searched and retrieved more effectively. This paper therefore surveys the methods that compare different textual information sources in order to determine whether or not they address similar information. Improving the studied methods will eventually increase the efficiency of documentary research. In order to achieve this goal, the first category of methods focuses on semantic measure definitions. A second category of methods focuses on paraphrase identification techniques, an…
Toward Artificial Intuition
AN ONTOLOGY-BASED RECOMMENDER SYSTEM USING HIERARCHICAL MULTICLASSIFICATION FOR ECONOMICAL E-NEWS
International audience; This paper focuses on a recommender system for economic news articles. Its objectives are threefold: (i) automatically multi-classify new economic articles, (ii) recommend articles by comparing user profiles with the multi-classification of articles, and (iii) manage the vocabulary of the economic news domain to improve the system through the seamless intervention of documentalists. In this paper we focus on the automatic multi-classification of the articles, managed by the inference process of ontologies, and on the enrichment of the documentalist-oriented ontology, which provides the DL reasoner with the capabilities needed for automatic multi-classification.
Intelligent Cloud Storage Management for Layered Tiers
Today, the cloud offers a large array of storage possibilities, but with this flexibility also comes complexity. This complexity stems from the variety of storage media, such as blob storage or NoSQL tables, and from the different cost tiers within these systems. Strategic thinking to navigate this complex cloud storage landscape is important, not only for cost saving but also for prioritizing information; this prioritization has wider implications in other domains such as Big Data, especially for governance and efficiency. In this paper we propose a strategy centered around a probabilistic graphical model (PGM), this heuristic oriented management and organizational strate…