Search results for "retrieval"
showing 10 items of 1176 documents
Entity Recommendation for Everyday Digital Tasks
2021
| openaire: EC/H2020/826266/EU//CO-ADAPT Recommender systems can support everyday digital tasks by retrieving and recommending useful information contextually. This is becoming increasingly relevant in services and operating systems. Previous research often focuses on specific recommendation tasks with data captured from interactions with an individual application. The quality of recommendations is also often evaluated addressing only computational measures of accuracy, without investigating the usefulness of recommendations in realistic tasks. The aim of this work is to synthesize the research in this area through a novel approach by (1) demonstrating comprehensive digital activity monitor…
Automatic image representation for content-based access to personal photo album
2007
The proposed work exploits methods and techniques for automatic characterization of images for content-based access to personal photo libraries. Several techniques, even if not reliable enough to address the general problem of content-based image retrieval, have been proven quite robust in a limited domain such as the one of personal photo album. In particular, starting from the observation that most personal photos depict a usually small number of people in a relatively small number of different contexts (e.g. Beach, Public Garden, Indoor, Nature, Snow, City, etc...) we propose the use of automatic techniques borrowed from the fields of computer vision and pattern recognition to index imag…
A Neural Turing~Machine for Conditional Transition Graph Modeling
2019
Graphs are an essential part of many machine learning problems such as analysis of parse trees, social networks, knowledge graphs, transportation systems, and molecular structures. Applying machine learning in these areas typically involves learning the graph structure and the relationship between the nodes of the graph. However, learning the graph structure is often complex, particularly when the graph is cyclic, and the transitions from one node to another are conditioned such as graphs used to represent a finite state machine. To solve this problem, we propose to extend the memory based Neural Turing Machine (NTM) with two novel additions. We allow for transitions between nodes to be inf…
ASR performance prediction on unseen broadcast programs using convolutional neural networks
2018
In this paper, we address a relatively new task: prediction of ASR performance on unseen broadcast programs. We first propose an heterogenous French corpus dedicated to this task. Two prediction approaches are compared: a state-of-the-art performance prediction based on regression (engineered features) and a new strategy based on convolutional neural networks (learnt features). We particularly focus on the combination of both textual (ASR transcription) and signal inputs. While the joint use of textual and signal features did not work for the regression baseline, the combination of inputs for CNNs leads to the best WER prediction performance. We also show that our CNN prediction remarkably …
Multilingual Clustering of Streaming News
2018
Clustering news across languages enables efficient media monitoring by aggregating articles from multilingual sources into coherent stories. Doing so in an online setting allows scalable processing of massive news streams. To this end, we describe a novel method for clustering an incoming stream of multilingual documents into monolingual and crosslingual story clusters. Unlike typical clustering approaches that consider a small and known number of labels, we tackle the problem of discovering an ever growing number of cluster labels in an online fashion, using real news datasets in multiple languages. Our method is simple to implement, computationally efficient and produces state-of-the-art …
Investigating label suggestions for opinion mining in German Covid-19 social media
2021
This work investigates the use of interactively updated label suggestions to improve upon the efficiency of gathering annotations on the task of opinion mining in German Covid-19 social media data. We develop guidelines to conduct a controlled annotation study with social science students and find that suggestions from a model trained on a small, expert-annotated dataset already lead to a substantial improvement - in terms of inter-annotator agreement(+.14 Fleiss' $\kappa$) and annotation quality - compared to students that do not receive any label suggestions. We further find that label suggestions from interactively trained models do not lead to an improvement over suggestions from a stat…
Untrue.News: A New Search Engine For Fake Stories
2020
In this paper, we demonstrate Untrue News, a new search engine for fake stories. Untrue News is easy to use and offers useful features such as: a) a multi-language option combining fake stories from different countries and languages around the same subject or person; b) an user privacy protector, avoiding the filter bubble by employing a bias-free ranking scheme; and c) a collaborative platform that fosters the development of new tools for fighting disinformation. Untrue News relies on Elasticsearch, a new scalable analytic search engine based on the Lucene library that provides near real-time results. We demonstrate two key scenarios: the first related to a politician - looking how the cat…
Focusing Knowledge-based Graph Argument Mining via Topic Modeling
2021
Decision-making usually takes five steps: identifying the problem, collecting data, extracting evidence, identifying pro and con arguments, and making decisions. Focusing on extracting evidence, this paper presents a hybrid model that combines latent Dirichlet allocation and word embeddings to obtain external knowledge from structured and unstructured data. We study the task of sentence-level argument mining, as arguments mostly require some degree of world knowledge to be identified and understood. Given a topic and a sentence, the goal is to classify whether a sentence represents an argument in regard to the topic. We use a topic model to extract topic- and sentence-specific evidence from…
Combining a Context Aware Neural Network with a Denoising Autoencoder for Measuring String Similarities
2018
Measuring similarities between strings is central for many established and fast growing research areas including information retrieval, biology, and natural language processing. The traditional approach for string similarity measurements is to define a metric over a word space that quantifies and sums up the differences between characters in two strings. The state-of-the-art in the area has, surprisingly, not evolved much during the last few decades. The majority of the metrics are based on a simple comparison between character and character distributions without consideration for the context of the words. This paper proposes a string metric that encompasses similarities between strings bas…
Transfer Learning with Convolutional Networks for Atmospheric Parameter Retrieval
2018
The Infrared Atmospheric Sounding Interferometer (IASI) on board the MetOp satellite series provides important measurements for Numerical Weather Prediction (NWP). Retrieving accurate atmospheric parameters from the raw data provided by IASI is a large challenge, but necessary in order to use the data in NWP models. Statistical models performance is compromised because of the extremely high spectral dimensionality and the high number of variables to be predicted simultaneously across the atmospheric column. All this poses a challenge for selecting and studying optimal models and processing schemes. Earlier work has shown non-linear models such as kernel methods and neural networks perform w…