Search results for " processing"
showing 10 items of 7549 documents
Optimized Kernel Entropy Components
2016
This work addresses two main issues of the standard Kernel Entropy Component Analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of by variance as in Kernel Principal Components Analysis. In this work, we propose an extension of the KECA method, named Optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular…
Perceptually Optimized Image Rendering
2017
We develop a framework for rendering photographic images by directly optimizing their perceptual similarity to the original visual scene. Specifically, over the set of all images that can be rendered on a given display, we minimize the normalized Laplacian pyramid distance (NLPD), a measure of perceptual dissimilarity that is derived from a simple model of the early stages of the human visual system. When rendering images acquired with a higher dynamic range than that of the display, we find that the optimization boosts the contrast of low-contrast features without introducing significant artifacts, yielding results of comparable visual quality to current state-of-the-art methods, but witho…
ASR performance prediction on unseen broadcast programs using convolutional neural networks
2018
In this paper, we address a relatively new task: prediction of ASR performance on unseen broadcast programs. We first propose an heterogenous French corpus dedicated to this task. Two prediction approaches are compared: a state-of-the-art performance prediction based on regression (engineered features) and a new strategy based on convolutional neural networks (learnt features). We particularly focus on the combination of both textual (ASR transcription) and signal inputs. While the joint use of textual and signal features did not work for the regression baseline, the combination of inputs for CNNs leads to the best WER prediction performance. We also show that our CNN prediction remarkably …
Analyzing Learned Representations of a Deep ASR Performance Prediction Model
2018
This paper addresses a relatively new task: prediction of ASR performance on unseen broadcast programs. In a previous paper, we presented an ASR performance prediction system using CNNs that encode both text (ASR transcript) and speech, in order to predict word error rate. This work is dedicated to the analysis of speech signal embeddings and text embeddings learnt by the CNN while training our prediction model. We try to better understand which information is captured by the deep model and its relation with different conditioning factors. It is shown that hidden layers convey a clear signal about speech style, accent and broadcast type. We then try to leverage these 3 types of information …
Multilingual Clustering of Streaming News
2018
Clustering news across languages enables efficient media monitoring by aggregating articles from multilingual sources into coherent stories. Doing so in an online setting allows scalable processing of massive news streams. To this end, we describe a novel method for clustering an incoming stream of multilingual documents into monolingual and crosslingual story clusters. Unlike typical clustering approaches that consider a small and known number of labels, we tackle the problem of discovering an ever growing number of cluster labels in an online fashion, using real news datasets in multiple languages. Our method is simple to implement, computationally efficient and produces state-of-the-art …
Polysemy in Controlled Natural Language Texts
2015
Computational semantics and logic-based controlled natural languages (CNL) do not address systematically the word sense disambiguation problem of content words, i.e., they tend to interpret only some functional words that are crucial for construction of discourse representation structures. We show that micro-ontologies and multi-word units allow integration of the rich and polysemous multi-domain background knowledge into CNL thus providing interpretation for the content words. The proposed approach is demonstrated by extending the Attempto Controlled English (ACE) with polysemous and procedural constructs resulting in a more natural CNL named PAO covering narrative multi-domain texts.
Towards the evaluation of automatic simultaneous speech translation from a communicative perspective
2021
In recent years, automatic speech-to-speech and speech-to-text translation has gained momentum thanks to advances in artificial intelligence, especially in the domains of speech recognition and machine translation. The quality of such applications is commonly tested with automatic metrics, such as BLEU, primarily with the goal of assessing improvements of releases or in the context of evaluation campaigns. However, little is known about how the output of such systems is perceived by end users or how they compare to human performances in similar communicative tasks. In this paper, we present the results of an experiment aimed at evaluating the quality of a real-time speech translation engine…
Hybrid blind robust image watermarking technique based on DFT-DCT and Arnold transform
2018
In this paper, a robust blind image watermarking method is proposed for copyright protection of digital images. This hybrid method relies on combining two well-known transforms that are the discrete Fourier transform (DFT) and the discrete cosine transform (DCT). The motivation behind this combination is to enhance the imperceptibility and the robustness. The imperceptibility requirement is achieved by using magnitudes of DFT coefficients while the robustness improvement is ensured by applying DCT to the DFT coefficients magnitude. The watermark is embedded by modifying the coefficients of the middle band of the DCT using a secret key. The security of the proposed method is enhanced by appl…
A Robust Blind 3-D Mesh Watermarking Technique Based on SCS Quantization and Mesh Saliency for Copyright Protection
2019
Due to the recent demand of 3-D meshes in a wide range of applications such as video games, medical imaging, film special effect making, computer-aided design (CAD), among others, the necessity of implementing 3-D mesh watermarking schemes aiming to protect copyright has increased in the last decade. Nowadays, the majority of robust 3-D watermarking approaches have mainly focused on the robustness against attacks while the imperceptibility of these techniques is still a serious challenge. In this context, a blind robust 3-D mesh watermarking method based on mesh saliency and scalar Costa scheme (SCS) for Copyright protection is proposed. The watermark is embedded by quantifying the vertex n…
A blind Robust Image Watermarking Approach exploiting the DFT Magnitude
2019
Due to the current progress in Internet, digital contents (video, audio and images) are widely used. Distribution of multimedia contents is now faster and it allows for easy unauthorized reproduction of information. Digital watermarking came up while trying to solve this problem. Its main idea is to embed a watermark into a host digital content without affecting its quality. Moreover, watermarking can be used in several applications such as authentication, copy control, indexation, Copyright protection, etc. In this paper, we propose a blind robust image watermarking approach as a solution to the problem of copyright protection of digital images. The underlying concept of our method is to a…