Search results for "computer vision"
Showing 10 of 2,353 documents
A Kinect-Based Gesture Acquisition and Reproduction System for Humanoid Robots
2020
The paper presents a system that endows a humanoid robot with the capability to mimic the motion of a human user in real time, serving as a basis for further gesture-based human-robot interaction. The described approach uses the Microsoft Kinect as a low-cost alternative to expensive motion-capture devices.
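The core of retargeting a human pose onto a humanoid is recovering joint angles from the Kinect's 3D skeleton. A minimal sketch of that step, with hypothetical joint positions (not the paper's code):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (radians) formed by 3D points a-b-c,
    e.g. shoulder-elbow-wrist from a Kinect skeleton frame."""
    u = np.asarray(a, float) - np.asarray(b, float)
    v = np.asarray(c, float) - np.asarray(b, float)
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cosang, -1.0, 1.0)))

# A fully extended arm (shoulder, elbow, wrist collinear) gives pi radians.
angle = joint_angle([0, 0, 0], [0.3, 0, 0], [0.6, 0, 0])
```

The resulting angles would then be clamped to the robot's joint limits before being sent to its controllers.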
Directive local color transfer based on dynamic look-up table
2019
Color transfer in image processing usually suffers from misleading color mapping and loss of detail. This paper presents a novel directive local color transfer method based on a dynamic look-up table (D-DLT) that solves these problems in two steps. First, a directive mapping between the source and the reference image is established from saliency detection and color clusters to capture the intended color transfer. Then, dynamic look-up tables are created according to the color clusters to preserve detail, which suppresses pseudo-contours and avoids detail loss. Subjective and objective assessments are presented to verify the feasibility and effectiveness of the…
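The look-up-table idea can be illustrated with a simplified, global histogram-matching LUT (a stand-in for the paper's per-cluster dynamic LUTs; all names here are illustrative):

```python
import numpy as np

def match_lut(source, reference, bins=256):
    """Build a 256-entry look-up table that remaps source intensities so
    their CDF matches the reference's (global histogram matching)."""
    s_hist, _ = np.histogram(source, bins=bins, range=(0, 256))
    r_hist, _ = np.histogram(reference, bins=bins, range=(0, 256))
    s_cdf = np.cumsum(s_hist) / max(source.size, 1)
    r_cdf = np.cumsum(r_hist) / max(reference.size, 1)
    # For each source level, pick the reference level with the closest CDF.
    return np.searchsorted(r_cdf, s_cdf).clip(0, 255).astype(np.uint8)

src = np.random.default_rng(0).integers(0, 128, size=(64, 64))   # dark source
ref = np.random.default_rng(1).integers(128, 256, size=(64, 64)) # bright reference
out = match_lut(src, ref)[src]   # transferred intensities follow the reference range
```

The paper's contribution is to build such tables per color cluster and update them dynamically, which is what preserves local detail.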
Development of a fast panoramic face mosaicking and recognition system
2005
We present development results on a system that performs mosaicking of panoramic faces. Our objective is to study the feasibility of panoramic face construction in real time. To do so, we built a simple acquisition system composed of five standard cameras, which together capture five views of a face simultaneously at different angles. We then chose an easily hardware-achievable algorithm, consisting of successive linear transformations, to compose a panoramic face from these five views. The method has been tested on a large number of faces. To validate our system, we also conducted a preliminary study on panoramic face recognition, based on the principal-compone…
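The recognition stage rests on principal-component analysis of flattened face images. A toy sketch of that stage (random vectors stand in for real face data; not the authors' implementation):

```python
import numpy as np

def pca_basis(faces, k):
    """Mean face and top-k principal directions from flattened face
    vectors (one row per image), via SVD of the centered data."""
    mean = faces.mean(axis=0)
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:k]

def project(face, mean, basis):
    """Low-dimensional code used for nearest-neighbour face matching."""
    return (face - mean) @ basis.T

rng = np.random.default_rng(0)
faces = rng.normal(size=(20, 100))   # 20 hypothetical flattened face images
mean, basis = pca_basis(faces, k=5)
codes = project(faces, mean, basis)  # 5-D codes per face
```

Recognition then reduces to comparing a probe's code against the gallery's codes.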
Camera-LiDAR Data Fusion for Autonomous Mooring Operation
2020
Author's accepted manuscript. © 2020 IEEE. The use of camera and LiDAR sensors to sense the environment has gained increasing popularity in robotics. Individual sensors, such as cameras and LiDARs, fail to meet the growing challenges in complex autonomous systems. One such scenario is autonomous mooring, where the ship has to …
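The geometric half of camera-LiDAR fusion is projecting LiDAR points into the camera image so the two modalities can be associated. A minimal pinhole-projection sketch, assuming the points have already been transformed into the camera frame (intrinsics here are made up):

```python
import numpy as np

def project_points(points_cam, K):
    """Project 3D points in the camera frame onto the image plane
    with pinhole intrinsics K; drops points behind the camera."""
    pts = np.asarray(points_cam, float)
    pts = pts[pts[:, 2] > 0]            # keep only points in front of the camera
    uvw = pts @ K.T
    return uvw[:, :2] / uvw[:, 2:3]     # perspective divide -> pixel coordinates

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pix = project_points([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]], K)
```

With the points in pixel coordinates, each LiDAR return can be paired with the image features detected around its projection.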
Head Pose Estimation for Sign Language Video
2013
We address the problem of estimating three head pose angles in sign language video using the Pointing04 data set as training data. The proposed model employs facial landmark points and Support Vector Regression learned from the training set to identify yaw and pitch angles independently. A simple geometric approach is used for the roll angle. As a novel development, we propose to use the detected skin tone areas within the face bounding box as additional features for head pose estimation. The accuracy level of the estimators we obtain compares favorably with published results on the same data, but the smaller number of pose angles in our setup may explain some of the observed advantage.
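The "simple geometric approach" for roll is presumably the tilt of the line through the eyes; a sketch under that assumption, with hypothetical landmark coordinates:

```python
import math

def roll_from_eyes(left_eye, right_eye):
    """Head roll angle (degrees) from the line through the two eye
    centres in image coordinates (y grows downward)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

level = roll_from_eyes((100, 120), (160, 120))   # level eyes -> 0 degrees
tilted = roll_from_eyes((100, 120), (160, 180))  # right eye 60 px lower
```

Yaw and pitch, by contrast, are regressed with Support Vector Regression from landmark and skin-tone features, as the abstract describes.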
Intelligent eye
2010
This paper describes Intelligent Eye, a mobile-phone interactive leisure guide that offers location-based multimedia information. The information offered is related to the user's position, so the main goal of this work is the development of an efficient system to detect what the user is pointing the camera at by means of a content-based image retrieval (CBIR) algorithm. The CBIR procedure uses color histograms in the HS color space extracted from images, and employs the Kullback-Leibler divergence as the similarity measure. Intelligent Eye can be used on a wide range of camera-equipped mobile phones; however, efficiency improves if GPS data is available. In order to outperform other sys…
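The retrieval step described here is concrete enough to sketch: an HS histogram per image and Kullback-Leibler divergence between histograms (random arrays stand in for real HSV images; the exact bin counts are assumptions):

```python
import numpy as np

def hs_histogram(hsv, h_bins=8, s_bins=8):
    """Normalised 2-D histogram over the H and S channels of an HSV
    image with channels scaled to [0, 1]."""
    hist, _, _ = np.histogram2d(hsv[..., 0].ravel(), hsv[..., 1].ravel(),
                                bins=(h_bins, s_bins), range=((0, 1), (0, 1)))
    hist = hist + 1e-9                  # smooth to avoid log(0)
    return hist / hist.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two histograms."""
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
query = rng.random((32, 32, 3))
other = rng.random((32, 32, 3)) * 0.3   # hypothetical low-saturation scene
d_same = kl_divergence(hs_histogram(query), hs_histogram(query))
d_other = kl_divergence(hs_histogram(query), hs_histogram(other))
```

Retrieval then returns the database image whose histogram minimises the divergence from the query's.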
2D-3D Camera Fusion for Visual Odometry in Outdoor Environments
2014
Accurate estimation of camera motion is very important for many robotics applications involving SfM and visual SLAM. Such accuracy is pursued by refining the estimated motion through nonlinear optimization. As many modern robots are equipped with both 2D and 3D cameras, it is both highly desirable and challenging to exploit data acquired from both modalities to achieve better localization. Existing refinement methods, such as bundle adjustment and loop closing, may be employed only when precise 2D-to-3D correspondences across frames are available. In this paper, we propose a framework for robot localization that benefits from both 2D and 3D information without re…
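The nonlinear refinement mentioned here minimises reprojection error. A sketch of the residual such an optimiser would drive to zero (intrinsics and observations are made up, and this is only the error term, not the paper's framework):

```python
import numpy as np

def reprojection_residuals(points3d, observed_uv, K):
    """2D residuals between pinhole projections of camera-frame 3D
    points and their observed pixel locations."""
    uvw = np.asarray(points3d, float) @ np.asarray(K, float).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    return (uv - np.asarray(observed_uv, float)).ravel()

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
res = reprojection_residuals([[0.0, 0.0, 2.0]], [[321.0, 240.0]], K)
```

A bundle-adjustment-style solver stacks these residuals over all frames and points and minimises their squared sum over the camera poses.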
Learning Flow-Based Feature Warping for Face Frontalization with Illumination Inconsistent Supervision
2020
Despite recent advances in deep learning-based face frontalization methods, photo-realistic and illumination preserving frontal face synthesis is still challenging due to large pose and illumination discrepancy during training. We propose a novel Flow-based Feature Warping Model (FFWM) which can learn to synthesize photo-realistic and illumination preserving frontal images with illumination inconsistent supervision. Specifically, an Illumination Preserving Module (IPM) is proposed to learn illumination preserving image synthesis from illumination inconsistent image pairs. IPM includes two pathways which collaborate to ensure the synthesized frontal images are illumination preserving and wit…
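The "feature warping" at the heart of FFWM means resampling a feature map along a dense flow field. A toy nearest-neighbour version (the model uses learned flows and differentiable sampling; everything here is illustrative):

```python
import numpy as np

def warp_with_flow(feat, flow):
    """Warp a 2D feature map with a dense flow field: output pixel (y, x)
    samples the input at (y + flow_y, x + flow_x), nearest-neighbour."""
    h, w = feat.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return feat[src_y, src_x]

feat = np.arange(16.0).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0                       # sample one pixel to the right
warped = warp_with_flow(feat, flow)
```

In the paper the flow field is predicted by a network, and bilinear rather than nearest-neighbour sampling keeps the operation differentiable.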
Multiscale Attention-Based Prototypical Network For Few-Shot Semantic Segmentation
2021
Deep learning-based image understanding techniques require a large number of labeled images for training. Few-shot semantic segmentation, on the contrary, aims at generalizing the segmentation ability of the model to new categories given only a few labeled samples. To tackle this problem, we propose a novel prototypical network (MAPnet) with multiscale feature attention. To fully exploit the representative features of target classes, we first extract rich contextual information of labeled support images via a multiscale feature enhancement module. The learned prototypes from support features provide further semantic guidance on the query image. Then we adaptively i…
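The prototypical-network backbone of such methods is simple: each class's prototype is the mean of its support embeddings, and queries are assigned to the nearest prototype. A sketch with synthetic embeddings (not MAPnet itself, which adds multiscale attention on top):

```python
import numpy as np

def prototypes(support_feats, labels, n_classes):
    """Class prototypes: the mean embedding of each class's support set."""
    return np.stack([support_feats[labels == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query_feats, protos):
    """Assign each query embedding to its nearest prototype (Euclidean)."""
    d = np.linalg.norm(query_feats[:, None, :] - protos[None, :, :], axis=-1)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
support = np.concatenate([rng.normal(0, 0.1, (5, 8)),   # 5 shots of class 0
                          rng.normal(3, 0.1, (5, 8))])  # 5 shots of class 1
labels = np.array([0] * 5 + [1] * 5)
protos = prototypes(support, labels, 2)
pred = classify(rng.normal(3, 0.1, (4, 8)), protos)     # queries near class 1
```

In the segmentation setting the same comparison runs per pixel, matching each query-pixel feature against the class prototypes.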
No-reference mesh visual quality assessment via ensemble of convolutional neural networks and compact multi-linear pooling
2020
Blind, or no-reference, quality evaluation is a challenging problem since it is done without access to the original content. In this work, we propose a deep-learning-based method for no-reference mesh visual quality assessment. For a given 3D model, we first compute its mesh saliency. Then, we extract views from the 3D mesh and the corresponding mesh saliency. The views are split into small patches that are filtered using a saliency threshold: only the salient patches are selected and used as input data. Three pre-trained deep convolutional neural networks are then employed for feature learning: VGG, AlexNet, and ResNet. Each network is fine-tuned and pro…
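The patch-selection step is easy to sketch: tile a rendered view, keep only patches whose mean saliency clears a threshold. Patch size and threshold below are assumptions, and random arrays stand in for real renders:

```python
import numpy as np

def salient_patches(view, saliency, patch=8, thresh=0.5):
    """Split a rendered view into non-overlapping patches and keep those
    whose mean saliency exceeds the threshold."""
    kept = []
    h, w = view.shape
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if saliency[y:y + patch, x:x + patch].mean() > thresh:
                kept.append(view[y:y + patch, x:x + patch])
    return kept

view = np.random.default_rng(0).random((32, 32))
saliency = np.zeros((32, 32))
saliency[:16, :] = 1.0                   # top half of the view is salient
patches = salient_patches(view, saliency)
```

The surviving patches are what gets fed to the fine-tuned VGG, AlexNet, and ResNet feature extractors.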