Search results for "ComputingMethodologies_PATTERNRECOGNITION"
showing 10 items of 296 documents
Trading off accuracy for efficiency by randomized greedy warping
2016
Dynamic Time Warping (DTW) is a widely used distance measure for time series data mining. Its quadratic complexity requires the application of various techniques (e.g. warping constraints, lower-bounds) for deployment in real-time scenarios. In this paper we propose a randomized greedy warping algorithm for finding similarity between time series instances. We show that the proposed algorithm outperforms the simple greedy approach and also provides very good time series similarity approximation consistently, as compared to DTW. We show that the Randomized Time Warping (RTW) can be used in place of DTW as a fast similarity approximation technique by trading some classification accuracy for ve…
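For orientation, the quadratic cost mentioned in this abstract comes from the dynamic-programming table that classic DTW fills in. The block below is a minimal sketch of that textbook DTW only, assuming one-dimensional series and an absolute-difference local cost. It is not the randomized greedy warping (RTW) algorithm the paper proposes, and the function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences using an O(len(a) * len(b)) table."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost: absolute difference
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to an already-used b[j-1]
                cost[i][j - 1],      # b[j-1] is matched to an already-used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Every cell of the table is visited once, which is why warping constraints and lower bounds are commonly used to skip work in time-critical settings.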
Pathway network inference from gene expression data
2014
Background: The development of high-throughput omics technologies enabled genome-wide measurements of the activity of cellular elements and provides the analytical resources for the progress of the Systems Biology discipline. Analysis and interpretation of gene expression data has evolved from the gene to the pathway and interaction level, i.e. from the detection of differentially expressed genes, to the establishment of gene interaction networks and the identification of enriched functional categories. Still, the understanding of biological systems requires a further level of analysis that addresses the characterization of the interaction between functional modules. Results: We presen…
Gabor filters in industrial inspection: a review. Application to semiconductor industry
2005
This paper reviews recent work on the use of Gabor filters in industrial applications. After a brief review of Gabor filter basics, the two usual uses of Gabor filters are recalled: the filter bank approach and the filter design approach. The third part presents recently published work domain by domain. The fourth part presents our own work with Gabor filters for defect detection on semiconductors. A short conclusion summarizes the paper.
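As a rough illustration of the filter bank approach named above, the sketch below builds the real part of a 2-D Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and collects several orientations into a bank. The parameterization (`sigma`, `theta`, `lambd`, `gamma`, `psi`) follows a common textbook form and is an assumption here; the function names are hypothetical and nothing in it is taken from the reviewed paper.

```python
import math


def gabor_kernel(size, sigma, theta, lambd, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor kernel on a size x size grid (size assumed odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size, sigma, lambd, n_orientations):
    """A small filter bank: one kernel per evenly spaced orientation in [0, pi)."""
    return [
        gabor_kernel(size, sigma, k * math.pi / n_orientations, lambd)
        for k in range(n_orientations)
    ]


if __name__ == "__main__":
    bank = gabor_bank(size=5, sigma=2.0, lambd=4.0, n_orientations=4)
    print(len(bank), len(bank[0]), len(bank[0][0]))
```

In a filter bank setting each kernel would be convolved with the image and the responses combined into a feature vector; the filter design approach instead tunes one or a few kernels to a specific texture or defect.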
Detecting RNA modifications in the epitranscriptome: predict and validate
2017
RNA modifications are emerging players in the field of post-transcriptional regulation of gene expression, and are attracting a comparable degree of research interest to DNA and histone modifications in the field of epigenetics. We now know of more than 150 RNA modifications and the true potential of a few of these is currently emerging as the consequence of a leap in detection technology, principally associated with high-throughput sequencing. This Review outlines the major developments in this field through a structured discussion of detection principles, lays out advantages and drawbacks of new high-throughput methods and presents conventional biophysical identification of modifications …
Ethics of Artificial Intelligence : Research Challenges and Potential Solutions
2020
Artificial Intelligence (AI) is a rapidly emerging paradigm with many applications in healthcare, industries, and smart cities. However, this rise of global interest in AI has fueled a renewed interest from the public sector and global policymakers. As AI networks (e.g., chatbots, automation systems, and helping agents) are paving their way as interactive household items, a critically important research issue is understanding the ethical impact of these autonomous agents. What is the explanation of the AI decision-making process? What are the legal, societal, and moral consequences of these decisions and actions? Should these AI systems be allowed to make decisions for human beings and to w…
Data Mining Algorithms for Knowledge Extraction
2020
In this paper, we study the methods, techniques, and algorithms used in data mining, with an emphasis on clustering algorithms and, more precisely, on the K-means algorithm. This algorithm was first studied using the Euclidean distance, and then with the distance between the clusters modified to use the Mahalanobis and Canberra distances. After implementing the algorithms in C/C++, we compared the clustering of the three algorithms, after which we modified them and studied the distance between the clusters.
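The idea of swapping the distance inside K-means can be sketched as follows. This is a minimal illustration with a pluggable distance function, written in Python rather than the C/C++ the authors used, and it is not their implementation. It covers only the Euclidean and Canberra distances; the Mahalanobis distance is left out because it additionally needs an inverse covariance matrix. All function names are hypothetical, and keeping the mean as the centroid update under the Canberra distance is a simplifying assumption.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def canberra(p, q):
    """Sum of |a - b| / (|a| + |b|); terms where both values are 0 are skipped."""
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def assign(points, centroids, dist):
    """Label each point with the index of its nearest centroid under dist."""
    return [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; empty clusters stay put."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centroids.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centroids.append(list(old))
    return new_centroids


def kmeans(points, centroids, dist, iters=10):
    """Run a fixed number of assign/update rounds (iters assumed >= 1)."""
    for _ in range(iters):
        labels = assign(points, centroids, dist)
        centroids = update(points, labels, centroids)
    return labels, centroids


if __name__ == "__main__":
    pts = [[1, 1], [1, 2], [10, 10], [11, 10]]
    for dist in (euclidean, canberra):
        print(dist.__name__, kmeans(pts, [[1, 1], [10, 10]], dist)[0])
```

Because only `assign` depends on the distance, comparing variants amounts to passing a different function.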
Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases
2019
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotatio…
Multilingual Clustering of Streaming News
2018
Clustering news across languages enables efficient media monitoring by aggregating articles from multilingual sources into coherent stories. Doing so in an online setting allows scalable processing of massive news streams. To this end, we describe a novel method for clustering an incoming stream of multilingual documents into monolingual and crosslingual story clusters. Unlike typical clustering approaches that consider a small and known number of labels, we tackle the problem of discovering an ever growing number of cluster labels in an online fashion, using real news datasets in multiple languages. Our method is simple to implement, computationally efficient and produces state-of-the-art …
Ensembles of Randomized Time Series Shapelets Provide Improved Accuracy while Reducing Computational Costs
2017
Shapelets are discriminative time series subsequences that allow generation of interpretable classification models, which provide faster and generally better classification than the nearest neighbor approach. However, the shapelet discovery process requires the evaluation of all possible subsequences of all time series in the training set, making it extremely computation intensive. Consequently, shapelet discovery for large time series datasets quickly becomes intractable. A number of improvements have been proposed to reduce the training time. These techniques use approximation or discretization and often lead to reduced classification accuracy compared to the exact method. We are proposin…
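To make the cost of exhaustive shapelet discovery concrete, the sketch below shows the basic building block: the distance from a candidate shapelet to a time series, taken as the minimum Euclidean distance over all windows of the same length, together with a plain enumeration of every candidate subsequence. It illustrates the exhaustive baseline the abstract describes, not the randomized ensemble method the paper proposes; the function names are hypothetical and no normalization is applied.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between shapelet and any equal-length window."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        sq = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best = min(best, sq)
    return math.sqrt(best)


def all_candidates(dataset, min_len, max_len):
    """Enumerate every subsequence of every series with length in [min_len, max_len]."""
    for series in dataset:
        for length in range(min_len, max_len + 1):
            for start in range(len(series) - length + 1):
                yield series[start:start + length]


if __name__ == "__main__":
    data = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
    print(sum(1 for _ in all_candidates(data, 2, 3)))
    print(shapelet_distance(data[0], [2, 3]))
```

Each candidate must then be compared against every training series, which is where the computational burden described in the abstract comes from.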
Towards Responsible AI for Financial Transactions
2020
Author's accepted manuscript. © 2020 IEEE. The application of AI in finance is increasingly dependent on the principles of responsible AI. These principles (explainability, fairness, privacy, accountability, transparency and soundness) form the basis for trust in future AI systems. In this empirical study, we address the first p…