Search results for "VISION"
showing 10 items of 5066 documents
Fully automatic face recognition system using a combined audio-visual approach
2005
This paper presents a novel audio and video information fusion approach that greatly improves automatic recognition of people in video sequences. To that end, audio and video information is first used independently to obtain confidence values that indicate the likelihood that a specific person appears in a video shot. Finally, a post-classifier is applied to fuse audio and visual confidence values. The system has been tested on several news sequences and the results indicate that a significant improvement in the recognition rate can be achieved when both modalities are used together.
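Illustrative sketch (not taken from the paper): the abstract does not specify the post-classifier, so the example below substitutes a plain weighted sum as a stand-in for late fusion of per-person confidence values. The function names, the weight, and the sample scores are all hypothetical.

```python
def fuse_confidences(audio, video, audio_weight=0.5):
    """Combine per-person confidences from two modalities with a weighted sum.

    This is a simple stand-in for a learned post-classifier (an assumption).
    A person missing from one modality gets confidence 0.0 for that modality.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be in [0, 1]")
    people = set(audio) | set(video)
    return {
        p: audio_weight * audio.get(p, 0.0)
        + (1.0 - audio_weight) * video.get(p, 0.0)
        for p in people
    }


def most_likely_person(fused):
    """Return the person with the highest fused confidence."""
    return max(fused, key=fused.get)


# Hypothetical confidences for one video shot.
audio_conf = {"anchor": 0.9, "reporter": 0.2}
video_conf = {"anchor": 0.4, "reporter": 0.5}

fused = fuse_confidences(audio_conf, video_conf)
best = most_likely_person(fused)
```

With these made-up scores, video alone would favour "reporter", while the fused scores favour "anchor", which is the kind of disagreement a fusion stage is meant to resolve.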
2015
Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that de…
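Illustrative sketch (not taken from the paper): one way frame differencing could feed a position-to-sound mapping. The threshold, the frequency range, and the single "pan" value standing in for interaural level difference are assumptions made for this example; interaural time difference is left out for brevity.

```python
def moving_pixels(prev, curr, threshold=30):
    """Frame differencing: return (row, col) of pixels whose grey level
    changed by more than `threshold` between two frames."""
    return [
        (r, c)
        for r, row in enumerate(curr)
        for c, value in enumerate(row)
        if abs(value - prev[r][c]) > threshold
    ]


def centroid(pixels):
    """Mean (row, col) position of a non-empty list of pixels."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def to_audio_cue(row, col, height, width, f_low=200.0, f_high=2000.0):
    """Map image position to (pitch in Hz, pan in [-1, 1]).

    Hypothetical mapping: higher in the image gives a higher pitch on a
    logarithmic scale; left/right position gives a pan value that a
    renderer could turn into a level difference between the ears.
    """
    v = 1.0 - row / (height - 1)          # 1.0 at the top row, 0.0 at the bottom
    pitch = f_low * (f_high / f_low) ** v
    pan = 2.0 * col / (width - 1) - 1.0   # -1.0 far left, +1.0 far right
    return pitch, pan


# Two tiny 4x4 greyscale frames: one pixel lights up in the top-right corner.
prev_frame = [[0] * 4 for _ in range(4)]
curr_frame = [[0] * 4 for _ in range(4)]
curr_frame[0][3] = 255

changed = moving_pixels(prev_frame, curr_frame)
row, col = centroid(changed)
pitch, pan = to_audio_cue(row, col, height=4, width=4)
```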
Eye tracking analysis of minor details in films for audio description
2012
This article focuses on the many instances when minute details found in feature films may have direct implications upon the development of both the visual and plot narratives. The main question we would like to ask examines whether very subtle details which may easily go unnoticed by the viewer should be audio described. To assess the visual consciousness of such minute details, a perception experiment was conducted using eye-tracking technology and questionnaires. Though the result is not conclusive, it shows a clear methodological approach in the field of the audio description of visual details, and does give some indication as to what should be taken into consideration in future studies …
Augmented Mirror: Interactive Augmented Reality System Based on Kinect
2011
Part 1: Long and Short Papers; International audience; In this paper we present a virtual character controlled by an actor in real time, who talks with an audience through an augmented mirror. The application, which integrates video images, the avatar and other virtual objects within an Augmented Reality system, has been implemented using a mixture of technologies: two Kinect systems for motion capture, depth map and real images, a gyroscope to detect head movements, and control algorithms to manage avatar emotions.
Fingerprint Quality Evaluation in a Novel Embedded Authentication System for Mobile Users
2015
The way people access resources, data and services is changing radically with modern mobile technologies. In this scenario, biometry is a good solution for security issues, even if its performance is influenced by the quality of the acquired data. In this paper, a novel embedded automatic fingerprint authentication system (AFAS) for mobile users is described. The goal of the proposed system is to improve the performance of a standard embedded AFAS in order to enable its employment in mobile device architectures. The system is focused on the quality evaluation of the raw acquired fingerprint, identifying areas of poor quality. Using this approach, no image enhancement process is needed after the …
A Comparative Study on Fuzzy-Clustering-Based Lip Region Segmentation Methods
2011
As the first step of many lip-reading or visual speaker authentication systems, lip region segmentation is of vital importance, and fuzzy-clustering-based methods have been widely used for it. In this paper, four fuzzy-clustering-based lip segmentation methods are elaborated with their underlying rationale. Experiments have been carried out to evaluate their performance comparatively. According to the experimental results, SFCM has the best efficiency and FCMST has the best segmentation accuracy.
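Illustrative sketch (not taken from the paper): the standard fuzzy c-means update on one-dimensional values, shown only to make the underlying clustering idea concrete. It is plain FCM, not the SFCM or FCMST variants compared in the study; the sample values (imagined as a normalised colour feature per pixel) and the initial centres are hypothetical.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Plain fuzzy c-means on scalar data.

    values  : list of floats to cluster
    centers : initial cluster centres (one float per cluster)
    m       : fuzzifier (> 1); larger values give softer memberships
    Returns the final centres and the membership rows computed in the
    last iteration (one row per value, one entry per cluster).
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a centre: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(len(centers))]
            else:
                # u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1))
                row = [
                    1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                    for d_k in dists
                ]
            memberships.append(row)
        # Each centre becomes the mean of the data weighted by membership^m.
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(len(centers))
        ]
    return centers, memberships


# Hypothetical feature values: three "skin-like" and three "lip-like" pixels.
pixels = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
centers, memberships = fuzzy_c_means(pixels, centers=[0.0, 1.0])
```

Each pixel ends up with a degree of membership in every cluster rather than a hard label; a segmentation would then threshold or take the maximum of these memberships.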
Optical security and encryption with totally incoherent light
2001
We present a method for securing and encrypting information optically by use of totally incoherent illumination. Encryption is performed with a multichannel optical processor working under natural (both temporally and spatially incoherent) light. In this way, the information that is to be secured can be codified by use of color signals and self-luminous displays. The encryption key is a phase-only mask, providing high security from counterfeiting. Output encrypted information is recorded as an intensity image that can be easily stored and transmitted optically or electrically. Decryption or authentication can also be performed optically or digitally. Experimental results are presented.
Sequential Lip Region Segmentation Using Fuzzy Clustering with Spatial and Temporal Information
2012
For many visual speech recognition and visual speaker authentication systems, lip region extraction is of vital importance. In order to segment the lip region accurately and robustly from a lip sequence, a new fuzzy-clustering-based algorithm is proposed. In the proposed method, a new dissimilarity measure is introduced to take color, spatial and temporal information into consideration. An iterative optimization method is employed to derive the optimal lip region membership map and the final segmentation result. The experimental results show that the proposed algorithm can provide superior results compared with other traditional methods.
Automated Content Analysis of Destination Image: a Case Study
2020
Automated content analysis has become one of the most used approaches to extract “hidden” dimensions from text corpora in recent years. One of the data analysis techniques belonging to this approach is topic modeling, which can be fruitfully used to analyse complex phenomena like tourist destination image. With this aim in mind, this paper discusses the use of topic modeling to identify the main components of the image of cruise holidays spread through a specific type of visual text, i.e. the television commercial. In order to achieve this goal, the paper presents the methodology and main results of a study carried out over a sample of TV commercials, which have recently been broadcast …
LogDet divergence-based metric learning with triplet constraints and its applications
2014
How to select and weigh features has always been a difficult problem in many image processing and pattern recognition applications. A data-dependent distance measure can address this problem to a certain extent, and therefore accurate and efficient metric learning becomes necessary. In this paper, we propose a LogDet divergence-based metric learning with triplet constraints (LDMLT) approach, which can learn a Mahalanobis distance metric accurately and efficiently. First of all, we demonstrate the good properties of triplet constraints and apply them in a LogDet divergence-based metric learning model. Then, to deal with high-dimensional data, we apply a compressed representation method to learn…
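Illustrative sketch (not taken from the paper): what a triplet constraint means under a Mahalanobis metric. It only evaluates the squared distance (x - y)^T M (x - y) and checks one constraint; it does not perform the LogDet-based update. The margin formulation, the matrices, and the sample points are assumptions chosen for the example.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a square matrix M
    given as a list of rows."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    return sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))


def triplet_satisfied(anchor, similar, dissimilar, M, margin=1.0):
    """True if the dissimilar point is farther from the anchor than the
    similar point by at least `margin` (in squared distance) under M."""
    gap = mahalanobis_sq(anchor, dissimilar, M) - mahalanobis_sq(anchor, similar, M)
    return gap >= margin


# Hypothetical 2-D triplet: the second feature is noisy, the first is informative.
anchor, similar, dissimilar = (0.0, 0.0), (0.0, 2.0), (1.0, 0.0)

identity = [[1.0, 0.0], [0.0, 1.0]]     # plain Euclidean metric
reweighted = [[4.0, 0.0], [0.0, 0.25]]  # stresses feature 1, damps feature 2

euclidean_ok = triplet_satisfied(anchor, similar, dissimilar, identity)
reweighted_ok = triplet_satisfied(anchor, similar, dissimilar, reweighted)
```

Under the Euclidean metric the constraint is violated, while the reweighted matrix satisfies it, which is the kind of feature weighting a learned metric is meant to discover.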