Search results for "Artificial intelligence"
showing 10 items of 6122 documents
FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy
2021
Background: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. For the same reasons of abundance in data production, the use of Big Data technologies is seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop. Indeed, their deployment there is not exactly immediate. Such a State of the Art is problematic. Results: We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy …
SOCIAL NETWORKS, BIG DATA AND TRANSPORT PLANNING
2016
The characteristics of people who are related or tied to each individual affect her activity-travel behavior. That influence is especially associated with social and recreational activities, which are increasingly important. Collecting high quality data from those social networks is very difficult using traditional travel surveys, because respondents are asked about their general social life, which is more demanding to remember than specific facts. On the other hand, currently there are different potential sources of transport data, which are characterized by the huge amount of information available, the velocity with which it is obtained and the variety of formats in which it is presented. This s…
Proposed use of a conversational agent for patient empowerment
2021
Empowerment is a process through which people acquire the necessary knowledge and self-awareness to understand their conditions and treatment options, make informed choices and self-manage their health conditions in daily life, in collaboration with medical professionals. Conversational Agents in healthcare could play an important role in the process of empowering a person but, so far, they have seldom been used for this purpose. This paper presents the basic principles and preliminary implementation of a conversational health agent for patient empowerment. It dialogues with the user in a "natural" way, collects health data from heterogeneous sources and provides the user wit…
Deep learning and process understanding for data-driven Earth system science
2017
Machine learning approaches are increasingly used to extract patterns and insights from the ever-increasing stream of geospatial data, but current approaches may not be optimal when system behaviour is dominated by spatial or temporal context. Here, rather than amending classical machine learning, we argue that these contextual cues should be used as part of deep learning (an approach that is able to extract spatio-temporal features automatically) to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales, for example. The next step will be a hybri…
Choosing Optimal Seed Nodes in Competitive Contagion.
2019
In recent years there has been a growing interest in simulating competitive markets to find out the efficient ways to advertise a product or spread an ideology. Along this line, we consider a binary competitive contagion process where two infections, A and B, interact with each other and diffuse simultaneously in a network. We investigate which is the best centrality measure to find out the seed nodes a company should adopt in the presence of rivals so that it can maximize its influence. These nodes can be used as the initial spreaders or advertisers by firms when two firms compete with each other. Each node is assigned a price tag to become an initial advertiser whi…
Mining customer requirements from online reviews: A product improvement perspective
2016
We propose a filtering model to predict helpfulness of reviews for product design. We provide a way to use the KANO model based on online reviews. We explore how to obtain insights from Big Data through knowledge-based view. Big data commerce has become an e-commerce trend. Learning how to extract valuable and real time insights from big data to drive smarter and more profitable business decisions is a main task of big data commerce. Using online reviews as an example, manufacturers have come to value how to select helpful online reviews and what can be learned from online reviews for new product development. In this research, we first proposed an automatic filtering model to predict the help…
Computation of the area in the discrete plane: Green’s theorem revisited
2017
The detection of the contour of a binary object is a common problem; however, the area of a region, and its moments, can be a significant parameter. In several metrology applications, the area of planar objects must be measured. The area is obtained by counting the pixels inside the contour or using a discrete version of Green's formula. Unfortunately, we obtain the area enclosed by the polygonal line passing through the centers of the pixels along the contour. We present a modified version of Green's theorem in the discrete plane, which allows for the computation of the exact area of a two-dimensional region in the class of polyominoes. Penalties are introduced and …
Fast Algorithms for Pseudoarboricity
2015
The densest subgraph problem, which asks for a subgraph with the maximum edges-to-vertices ratio d∗, is solvable in polynomial time. We discuss algorithms for this problem and the computation of a graph orientation with the lowest maximum indegree, which is equal to ⌈d∗⌉. This value also equals the pseudoarboricity of the graph. We show that it can be computed in O(|E| √ log log d∗) time, and that better estimates can be given for graph classes where d∗ satisfies certain asymptotic bounds. These runtimes are achieved by accelerating a binary search with an approximation scheme, and a runtime analysis of Dinitz’s algorithm on flow networks where all arcs, except the source and sink arcs, hav…
On the Non-uniform Redundancy in Grammatical Evolution
2016
This paper investigates the redundancy of representation in grammatical evolution (GE) for binary trees. We analyze the entire GE solution space by creating all binary genotypes of predefined length and mapping them to phenotype trees, which are then characterized by their size, depth and shape. We find that the GE representation is strongly non-uniformly redundant. There are huge differences in the number of genotypes that encode one particular phenotype. Thus, it is difficult for GE to solve problems where the optimal tree solutions are underrepresented. In general, the GE mapping process is biased towards short tree structures, which implies high GE performance if the optimal solution requir…
Cluster-based active learning for compact image classification
2010
In this paper, we consider active sampling to label pixels grouped with hierarchical clustering. The objective of the method is to match the data relationships discovered by the clustering algorithm with the user's desired class semantics. The first is represented as a complete tree to be pruned and the second is iteratively provided by the user. The proposed active learning algorithm searches for the pruning of the tree that best matches the labels of the sampled points. By choosing the part of the tree to sample from according to the current pruning's uncertainty, sampling is focused on the most uncertain clusters. This way, large clusters for which the class membership is already fixed are no longer…