Search results for "Information Retrieval"
showing 10 items of 924 documents
Untrue.News: A New Search Engine For Fake Stories
2020
In this paper, we demonstrate Untrue News, a new search engine for fake stories. Untrue News is easy to use and offers useful features such as: a) a multi-language option combining fake stories from different countries and languages around the same subject or person; b) an user privacy protector, avoiding the filter bubble by employing a bias-free ranking scheme; and c) a collaborative platform that fosters the development of new tools for fighting disinformation. Untrue News relies on Elasticsearch, a new scalable analytic search engine based on the Lucene library that provides near real-time results. We demonstrate two key scenarios: the first related to a politician - looking how the cat…
Focusing Knowledge-based Graph Argument Mining via Topic Modeling
2021
Decision-making usually takes five steps: identifying the problem, collecting data, extracting evidence, identifying pro and con arguments, and making decisions. Focusing on extracting evidence, this paper presents a hybrid model that combines latent Dirichlet allocation and word embeddings to obtain external knowledge from structured and unstructured data. We study the task of sentence-level argument mining, as arguments mostly require some degree of world knowledge to be identified and understood. Given a topic and a sentence, the goal is to classify whether a sentence represents an argument in regard to the topic. We use a topic model to extract topic- and sentence-specific evidence from…
Combining a Context Aware Neural Network with a Denoising Autoencoder for Measuring String Similarities
2018
Measuring similarities between strings is central for many established and fast growing research areas including information retrieval, biology, and natural language processing. The traditional approach for string similarity measurements is to define a metric over a word space that quantifies and sums up the differences between characters in two strings. The state-of-the-art in the area has, surprisingly, not evolved much during the last few decades. The majority of the metrics are based on a simple comparison between character and character distributions without consideration for the context of the words. This paper proposes a string metric that encompasses similarities between strings bas…
Open Data Quality Evaluation: A Comparative Analysis of Open Data in Latvia
2020
Nowadays open data is entering the mainstream - it is free available for every stakeholder and is often used in business decision-making. It is important to be sure data is trustable and error-free as its quality problems can lead to huge losses. The research discusses how (open) data quality could be assessed. It also covers main points which should be considered developing a data quality management solution. One specific approach is applied to several Latvian open data sets. The research provides a step-by-step open data sets analysis guide and summarizes its results. It is also shown there could exist differences in data quality depending on data supplier (centralized and decentralized d…
Weakly Supervised Object Detection in Artworks
2018
We propose a method for the weakly supervised detection of objects in paintings. At training time, only image-level annotations are needed. This, combined with the efficiency of our multiple-instance learning method, enables one to learn new classes on-the-fly from globally annotated databases, avoiding the tedious task of manually marking objects. We show on several databases that dropping the instance-level annotations only yields mild performance losses. We also introduce a new database, IconArt, on which we perform detection experiments on classes that could not be learned on photographs, such as Jesus Child or Saint Sebastian. To the best of our knowledge, these are the first experimen…
Multiscale Information Decomposition: Exact Computation for Multivariate Gaussian Processes
2017
Exploiting the theory of state space models, we derive the exact expressions of the information transfer, as well as redundant and synergistic transfer, for coupled Gaussian processes observed at multiple temporal scales. All of the terms, constituting the frameworks known as interaction information decomposition and partial information decomposition, can thus be analytically obtained for different time scales from the parameters of the VAR model that fits the processes. We report the application of the proposed methodology firstly to benchmark Gaussian systems, showing that this class of systems may generate patterns of information decomposition characterized by prevalently redundant or sy…
Exact quantum algorithms have advantage for almost all Boolean functions
2014
It has been proved that almost all $n$-bit Boolean functions have exact classical query complexity $n$. However, the situation seemed to be very different when we deal with exact quantum query complexity. In this paper, we prove that almost all $n$-bit Boolean functions can be computed by an exact quantum algorithm with less than $n$ queries. More exactly, we prove that ${AND}_n$ is the only $n$-bit Boolean function, up to isomorphism, that requires $n$ queries.
Parameterized Quantum Query Complexity of Graph Collision
2013
We present three new quantum algorithms in the quantum query model for \textsc{graph-collision} problem: \begin{itemize} \item an algorithm based on tree decomposition that uses $O\left(\sqrt{n}t^{\sfrac{1}{6}}\right)$ queries where $t$ is the treewidth of the graph; \item an algorithm constructed on a span program that improves a result by Gavinsky and Ito. The algorithm uses $O(\sqrt{n}+\sqrt{\alpha^{**}})$ queries, where $\alpha^{**}(G)$ is a graph parameter defined by \[\alpha^{**}(G):=\min_{VC\text{-- vertex cover of}G}{\max_{\substack{I\subseteq VC\\I\text{-- independent set}}}{\sum_{v\in I}{\deg{v}}}};\] \item an algorithm for a subclass of circulant graphs that uses $O(\sqrt{n})$ qu…
At Your Service: Coffee Beans Recommendation From a Robot Assistant
2020
With advances in the field of machine learning, precisely algorithms for recommendation systems, robot assistants are envisioned to become more present in the hospitality industry. Additionally, the COVID-19 pandemic has also highlighted the need to have more service robots in our everyday lives, to minimise the risk of human to-human transmission. One such example would be coffee shops, which have become intrinsic to our everyday lives. However, serving an excellent cup of coffee is not a trivial feat as a coffee blend typically comprises rich aromas, indulgent and unique flavours and a lingering aftertaste. Our work addresses this by proposing a computational model which recommends optima…
Tag2Risk: Harnessing social music tags for characterizing depression risk
2020
Musical preferences have been considered a mirror of the self. In this age of Big Data, online music streaming services allow us to capture ecologically valid music listening behavior and provide a rich source of information to identify several user-specific aspects. Studies have shown musical engagement to be an indirect representation of internal states including internalized symptomatology and depression. The current study aims at unearthing patterns and trends in the individuals at risk for depression as it manifests in naturally occurring music listening behavior. Mental well-being scores, musical engagement measures, and listening histories of Last.fm users (N=541) were acquired. Soci…