AUTHOR
Cédric Demonceaux
Study of a hybrid stereo-vision system
We consider in this work a fixed hybrid vision system composed of a fisheye camera and a PTZ camera in a rigid environment. We wish to be able to orient the mechanized camera toward a target visible in the omnidirectional image, so as to obtain a well-defined image of the object of interest from the PTZ camera. In this article, we propose to use the spherical modelling of the images together with the properties of epipolar geometry to initialize the localization of the target in the PTZ camera.
High Quality Reconstruction of Dynamic Objects using 2D-3D Camera Fusion
In this paper, we propose a complete pipeline for high quality reconstruction of dynamic objects using a 2D-3D camera setup attached to a moving vehicle. Starting from the segmented motion trajectories of individual objects, we compute their precise motion parameters, register multiple sparse point clouds to increase the density, and develop a smooth and textured surface from the dense (but scattered) point cloud. The success of our method relies on the proposed optimization framework for accurate motion estimation between two sparse point clouds. Our formulation for fusing closest-point and consensus-based motion estimations, respectively in the absence and pres…
Optical flow estimation from multichannel spherical image decomposition
The problem of optical flow estimation is widely discussed in the computer vision domain for perspective images. It has also been shown that, in terms of optical flow analysis from these images, it is difficult to distinguish between some motion fields obtained with small camera motions. Omnidirectional cameras provide images with a large field of view. These images contain global information about motion and make it possible to remove the ambiguity present in the perspective case. Nevertheless, these images contain significant radial distortions that must be taken into account when processing them to estimate motion. In this paper, we describe a new way to comp…
Adapted Approach for Omnidirectional Egomotion Estimation
Egomotion estimation is based principally on the estimation of the optical flow in the image. Recent research has shown that the use of omnidirectional systems with large fields of view makes it possible to overcome the limitations of planar-projection imagery when addressing the problem of motion analysis. For omnidirectional images, the 2D motion is often estimated using methods developed for perspective images. This paper instead computes the motion field using an adapted method which takes into account the distortions existing in the omnidirectional image. This 2D motion field is then used as input to the egomotion estimation process using a spherical representation of the motion equation. Expe…
Self-calibration of a PTZ Camera Using New LMI Constraints
In this paper, we propose a very reliable and flexible method for self-calibrating rotating and zooming cameras - generally referred to as PTZ (Pan-Tilt-Zoom) cameras. The proposed method employs a Linear Matrix Inequality (LMI) resolution approach and allows extra tunable constraints on the intrinsic parameters to be taken into account during the process of estimating these parameters. Furthermore, the considered constraints are simultaneously enforced in all views rather than in a single reference view. The results of our experiments show that the proposed approach allows for significant improvement in terms of accuracy and robustness when compared against state of the art methods.
QaQ: Robust 6D Pose Estimation via Quality-Assessed RGB-D Fusion
RGB-D 6D pose estimation has recently drawn great research attention thanks to the complementary depth information. However, the depth and color images are often noisy in real industrial scenarios. This makes things challenging for many existing methods that fuse RGB and depth features equally. In this paper, we present a novel fusion design to adaptively merge RGB-D cues. Specifically, we create a quality-assessment block that estimates the global quality of the input modalities. This quality, represented as an α parameter, is then used to reinforce the fusion. We have thus found a simple and effective way to improve the robustness to low-quality inputs in terms of depth and RGB. Exte…
Modality-Guided Subnetwork for Salient Object Detection
Recent RGBD-based models for saliency detection have attracted research attention. Depth clues such as boundary clues, surface normals, shape attributes, etc., contribute to the identification of salient objects in complicated scenarios. However, most RGBD networks require multiple modalities on the input side and feed them separately through a two-stream design, which inevitably results in extra costs for depth sensors and computation. To tackle these inconveniences, we present in this paper a novel fusion design named modality-guided subnetwork (MGSnet). It has the following superior designs: 1) Our model works for both RGB and RGBD data, dynamically estimating depth if not availabl…
RGB-Event Fusion for Moving Object Detection in Autonomous Driving
Moving Object Detection (MOD) is a critical vision task for successfully achieving safe autonomous driving. Despite the plausible results of deep learning methods, most existing approaches are only frame-based and may fail to reach reasonable performance when dealing with dynamic traffic participants. Recent advances in sensor technologies, especially the event camera, can naturally complement the conventional camera approach to better model moving objects. However, event-based works often adopt a pre-defined time window for event representation, and simply integrate it to estimate image intensities from events, neglecting much of the rich temporal information from the available asynchronous ev…
Attitude estimation from polarimetric cameras
In the robotic field, navigation and path planning applications benefit from a wide range of visual systems (e.g. perspective cameras, depth cameras, catadioptric cameras, etc.). In outdoor conditions, these systems capture information in which sky regions cover a major segment of the acquired images. However, sky regions are usually discarded and not considered as a visual cue in vision applications. In this paper, we propose to estimate the attitude of an Unmanned Aerial Vehicle (UAV) from sky information using a polarimetric camera. Theoretically, we provide a framework estimating the attitude from the skylight polarized patterns. We showcase this formulation on both simulate…
Homography based egomotion estimation with a common direction
In this paper, we explore the different minimal solutions for homography-based egomotion estimation of a camera with a known gravity vector, between calibrated images. These solutions depend on the prior knowledge about the reference plane used by the homography. We then demonstrate that the number of matched points can vary from two to three, and that a direct closed-form solution or a Gröbner-basis-based solution can be derived according to this plane. Many experimental results on synthetic and real sequences in indoor and outdoor environments show the efficiency and robustness of our approach compared to standard methods.
Incomplete 3D motion trajectory segmentation and 2D-to-3D label transfer for dynamic scene analysis
The knowledge of the static scene parts and the moving objects in a dynamic scene plays a vital role in scene modelling, understanding, and landmark-based robot navigation. The key information for these tasks lies in the semantic labels of the scene parts and the motion trajectories of the dynamic objects. In this work, we propose a method that segments 3D feature trajectories based on their motion behaviours and assigns them semantic labels using 2D-to-3D label transfer. These feature trajectories are constructed using the proposed trajectory recovery algorithm, which takes the loss of feature tracking into account. We introduce a complete framework for static-m…
Summarizing Large Scale 3D Point Cloud for Navigation Tasks
The democratization of 3D sensor devices makes building 3D maps easier, especially for long-term mapping and autonomous navigation. In this paper we present a new method for summarizing a 3D map (a dense cloud of 3D points). This method aims to extract a summary map facilitating the use of the map by navigation systems with limited resources (smartphones, cars, robots...). This vision-based summarizing process is applied fully automatically using the photometric, geometric and semantic information of the studied environment.
Efficient Dense Disparity Map Reconstruction using Sparse Measurements
In this paper, we propose a new stereo matching algorithm able to efficiently reconstruct a dense disparity map from a few sparse disparity measurements. The algorithm is initialized by sampling the reference image using the Simple Linear Iterative Clustering (SLIC) superpixel method. Then, a sparse disparity map is generated only for the obtained boundary pixels. The reconstruction of the entire disparity map is obtained through the scanline propagation method. Outliers are effectively removed using an adaptive vertical median filter. Experimental results conducted on the standard and the new Middlebury datasets show that the proposed method produces high-quali…
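The scanline propagation idea can be illustrated with a toy sketch, a hypothetical simplification that assigns every pixel the disparity of the nearest valid measurement on its row (the paper's full pipeline also uses SLIC boundary sampling and an adaptive vertical median filter):

```python
import numpy as np

def scanline_propagate(sparse, invalid=-1):
    """Densify a sparse disparity map row by row: every pixel takes the
    disparity of the nearest valid measurement on its scanline."""
    dense = sparse.astype(float).copy()
    cols = np.arange(dense.shape[1])
    for row in dense:
        valid = np.where(row != invalid)[0]
        if valid.size == 0:
            continue  # no measurement on this scanline
        # index of the nearest valid column, for every column
        nearest = valid[np.argmin(np.abs(cols[:, None] - valid[None, :]), axis=1)]
        row[:] = row[nearest]
    return dense

sparse = np.array([[-1, 5, -1, -1, 9],
                   [ 7, -1, -1, 3, -1]])
print(scanline_propagate(sparse))
```

Real implementations propagate along scanlines with a smoothness cost rather than a pure nearest-neighbour fill; this sketch only shows the densification principle.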
Time to Contact Estimation on Paracatadioptric Cameras
Time to contact or time to collision (TTC) is the time available to a robot before reaching an object. In this paper, we propose to estimate this time using a catadioptric camera embedded on the robot. Indeed, whereas many works have shown the utility of this kind of camera in robotic applications (monitoring, localisation, motion, ...), few works deal with the problem of time to contact estimation on it. Thus, in this paper, we propose a new work which allows defining and estimating the TTC on a catadioptric camera. The method is validated on simulated and real data.
OmniFlowNet: a Perspective Neural Network Adaptation for Optical Flow Estimation in Omnidirectional Images
Spherical cameras and the latest image processing techniques open up new horizons. In particular, methods based on Convolutional Neural Networks (CNNs) now give excellent results for optical flow estimation on perspective images. However, these approaches are highly dependent on their architectures and training datasets. This paper proposes to benefit from years of improvement in optical flow estimation on perspective images and to apply it to omnidirectional ones without training on new datasets. Our network, OmniFlowNet, is built on a CNN specialized in perspective images. Its convolution operation is adapted to be consistent with the equirectangular projection. Teste…
Unsupervised learning of category-specific symmetric 3D keypoints from point sets
Lecture Notes in Computer Science, 12370
LMI-based 2D-3D Registration: from Uncalibrated Images to Euclidean Scene
This paper investigates the problem of registering a scanned scene, represented by 3D Euclidean point coordinates, and two or more uncalibrated cameras. An unknown subset of the scanned points have their image projections detected and matched across images. The proposed approach assumes the cameras are only known in some arbitrary projective frame, and no calibration or autocalibration is required. The devised solution is based on a Linear Matrix Inequality (LMI) framework that allows simultaneously estimating the projective transformation relating the cameras to the scene and establishing 2D-3D correspondences without triangulating image points. The proposed LMI framewo…
Visual Servoing Based on Shifted Moments
Over the past decade, image moments have been exploited in several visual servoing schemes for their ability to represent object regions, objects defined by contours, or sets of discrete points. Moments have also been useful for achieving control decoupling properties and for choosing a minimal number of features to control all the degrees of freedom (DOFs) of a camera. However, the choice of moment-based features to control the rotational motions around the $x$-axis and $y$-axis simultaneously with the translational motions along the same axes remains a key issue. In this paper, we introduce new visual features computed from low-order “shifted moment invariants.” Importantly, they allow us 1…
A homography formulation to the 3pt plus a common direction relative pose problem
In this paper we present an alternative formulation for the minimal solution to the 3pt plus a common direction relative pose problem. Instead of the commonly used epipolar constraint, we use the homography constraint to derive a novel formulation for the 3pt problem. This formulation allows the computation of the normal vector of the plane defined by the three input points without any computation in addition to the standard motion parameters of the camera. We show the working of the method on synthetic and real data sets and compare it to the standard 3pt method and the 5pt method for relative pose estimation. In addition we analyze the degenerate condi…
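For background, the homography constraint used here relates two views of coplanar points by x' ~ H x. A minimal numpy sketch of the standard 4-point DLT estimation (generic background, not the paper's 3pt-plus-common-direction solver):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (up to scale) from >= 4 point correspondences via DLT:
    stack two linear constraints per correspondence and take the SVD nullspace."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    return vt[-1].reshape(3, 3)

# ground-truth homography and four coplanar points (a square)
H_true = np.array([[1.0, 0.2, 3.0], [0.1, 0.9, -1.0], [0.001, 0.002, 1.0]])
src = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
pts = np.c_[src, np.ones(4)] @ H_true.T
dst = pts[:, :2] / pts[:, 2:]

H_est = homography_dlt(src, dst)
H_est /= H_est[2, 2]   # fix the scale for comparison
print(np.round(H_est, 3))
```

The 3pt formulation in the paper additionally exploits a known common direction (e.g. gravity) to cut the number of required correspondences from four to three while also recovering the plane normal.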
A robust evolutionary algorithm for the recovery of rational Gielis curves
Gielis curves (GC) can represent a wide range of shapes and patterns, ranging from star shapes to symmetric and asymmetric polygons, and even self-intersecting curves. Such patterns appear in natural objects or phenomena, such as flowers, crystals, pollen structures, animals, or even wave propagation. Gielis curves and surfaces are an extension of Lamé curves and surfaces (superquadrics), which have benefited over the last two decades from extensive research into retrieving their parameters from various data types, such as range images, 2D and 3D point clouds, etc. Unfortunately, the most efficient techniques for superquadric recovery, based on deterministic methods, cannot…
Learning side modalities for vision-based localization
In this paper we present a new training-with-side-modality framework to enhance image-based localization. In order to learn side modality information, we train a fully convolutional decoder network that transfers meaningful information from one modality to another. We validate our approach on a challenging urban dataset. Experiments show that our system is able to enhance a purely image-based system by properly learning the appearance of a side modality. Compared to state-of-the-art methods, the proposed network is lighter and faster to train, while producing comparable results.
Tracking Moving Objects With a Catadioptric Sensor Using Particle Filter
Visual tracking in video sequences is a widely developed topic in computer vision applications. However, the emergence of panoramic vision using catadioptric sensors has created the need for new approaches to track an object in this type of image. Indeed, the non-linear resolution and the geometric distortions due to the insertion of the mirror make tracking in catadioptric images a very challenging task. This paper describes a particle filter for tracking a moving object over time using a catadioptric sensor. In this work, different problems due to the specificities of catadioptric systems, such as their geometry, are considered. The obtained results demonstrate an…
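The particle filter machinery itself can be sketched on a toy 1D state (a generic bootstrap filter under assumed random-walk motion and Gaussian observation models, not the catadioptric tracker, whose likelihood would live on the distorted image):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_pf(observations, n=1000, motion_std=1.0, obs_std=1.0):
    """Toy 1D bootstrap particle filter: predict with a random-walk motion
    model, weight by a Gaussian observation likelihood, then resample."""
    particles = rng.normal(observations[0], obs_std, n)
    estimates = []
    for z in observations:
        particles = particles + rng.normal(0.0, motion_std, n)    # predict
        w = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)       # weight
        w /= w.sum()
        particles = rng.choice(particles, size=n, p=w)            # resample
        estimates.append(particles.mean())
    return np.array(estimates)

truth = np.linspace(0.0, 20.0, 40)               # target moving at constant speed
obs = truth + rng.normal(0.0, 0.5, truth.size)   # noisy measurements
est = bootstrap_pf(obs)
print(float(np.abs(est - truth).mean()))
```

In the catadioptric setting the same predict/weight/resample loop applies, but the motion model and the appearance likelihood must account for the mirror's non-uniform resolution.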
Local path planning in a complex environment for self-driving car
This paper introduces a local path planning algorithm for a self-driving car in a complex environment. The proposed algorithm is composed of three parts: a novel path representation, collision detection, and path modification using a Voronoi cell. The novel path representation makes it convenient to check for collisions and to modify the path, and provides a continuous control input for the steering wheel rather than way-point navigation. The proposed algorithm was applied to the self-driving car EureCar (KAIST), and its applicability and feasibility for real-time use were validated.
Gradient-based time to contact on paracatadioptric camera
The problem of time to contact or time to collision (TTC) estimation is widely discussed for perspective images. However, few works have dealt with images from catadioptric sensors despite their utility in robotics applications. The objective of this paper is to develop a novel model for estimating TTC with catadioptric images relative to a planar surface, and to demonstrate that TTC can be estimated using only brightness derivatives and image coordinates. This model, called "gradient-based time to contact", does not need heavy processing such as explicit estimation of optical flow or feature detection and tracking. The proposed method allows estimating TTC and give…
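To see the gradient-based idea in its simplest (perspective, fronto-parallel, axial-approach) form, assumed here for illustration only: brightness constancy with the radial flow (u, v) = (x, y)/tau gives G/tau = -I_t with G = x I_x + y I_y, solved in least squares as tau = -sum(G^2) / sum(G I_t), with no optical flow or feature tracking. A numpy sketch on a synthetic expanding pattern:

```python
import numpy as np

# synthetic expanding pattern: a Gaussian blob whose scale s(t) grows
xs = np.arange(-32, 32, dtype=float)
X, Y = np.meshgrid(xs, xs)

def frame(s):
    return np.exp(-0.5 * ((X / s) ** 2 + (Y / s) ** 2))

s0, s1 = 20.0, 20.4                      # ground-truth TTC ~ s/ds ~ 50.5 frames
I0, I1 = frame(s0), frame(s1)

It = I1 - I0                             # temporal derivative (1-frame step)
Iy, Ix = np.gradient((I0 + I1) / 2.0)    # spatial derivatives at the mid-frame
G = X * Ix + Y * Iy                      # radial gradient term x*Ix + y*Iy
ttc = -np.sum(G * G) / np.sum(G * It)    # least-squares TTC, in frames
print(round(float(ttc), 1))
```

The paper's contribution is the catadioptric counterpart of this closed form, where the image coordinates and gradients must follow the mirror geometry.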
Perception for Robotics: Part II
“What is around me?” Knowing how to answer this question is essential for a robot to navigate safely in its environment. In this part, we try to find some answers by introducing tools that help the robot perceive the world. Thus, we present different sensors (lasers, radars, cameras, RGBD sensors) which make it possible to build a 3D map of the scene where the robot is navigating. Then, we mainly focus on cameras, since these sensors give rich information about the scene. We see how features can be extracted from the image in order to provide a 3D perception of the world. This chapter presents the main concepts required to start in computer vision applied to robotics. It is…
Visual tracking with omnidirectional cameras: an efficient approach
An effective technique for applying visual tracking algorithms to omnidirectional image sequences is presented. The method is based on a spherical image representation which takes into account the distortions and nonlinear resolution of omnidirectional images. Experimental results show that both deterministic and probabilistic tracking methods can effectively be adapted in order to robustly track an object with an omnidirectional camera.
Accurate Dense Stereo Matching for Road Scenes
The stereo matching task is at the core of applications linked to intelligent vehicles. In this paper, we present a new variant of the Census Transform (CT) cost function which is more robust against radiometric changes in real road scenes. We demonstrate that the proposed cost function outperforms conventional cost functions on the KITTI benchmark. The cost aggregation method is also updated to take edge information into account. This significantly improves the aggregated costs, especially within homogeneous regions. The Winner-Takes-All (WTA) strategy is used to compute disparity values. To further eliminate the remaining matching ambiguities, a post…
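The Census Transform that the proposed cost builds on can be sketched as follows (a generic 3x3 census, not the paper's variant): each pixel is encoded by one bit per neighbour, set when the neighbour is darker than the centre, and costs are Hamming distances between codes. Its robustness to radiometric changes comes from the fact that only the ordering of intensities matters:

```python
import numpy as np

def census_transform(img, r=1):
    """Per-pixel census code: one bit per neighbour in a (2r+1)^2 window,
    set when the neighbour is darker than the centre pixel."""
    h, w = img.shape
    codes = np.zeros((h - 2 * r, w - 2 * r), dtype=np.uint32)
    centre = img[r:h - r, r:w - r]
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            neigh = img[r + dy:h - r + dy, r + dx:w - r + dx]
            codes = (codes << 1) | (neigh < centre).astype(np.uint32)
    return codes

def hamming(c1, c2):
    """Matching cost: number of differing census bits."""
    x = np.bitwise_xor(c1, c2)
    return sum(bin(v).count("1") for v in x.ravel())

img = np.random.default_rng(1).random((10, 10))
same = census_transform(img)
bright = census_transform(3.0 * img + 0.5)   # monotonic radiometric change
print(hamming(same, bright))                 # 0: census codes are unchanged
```

Any strictly increasing intensity mapping leaves the codes untouched, which is exactly the property exploited for real road scenes with exposure and illumination changes.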
Scale invariant line matching on the sphere
This paper proposes a novel approach to line matching across images captured by different types of cameras, from perspective to omnidirectional ones. Based on the spherical mapping, this method utilizes spherical SIFT point features to boost line matching and searches for line correspondences using an affine-invariant measure of similarity. It makes it possible to unify the commonest cameras and to process heterogeneous images with the least distortion of visual information.
From Nowhere to Everywhere
This paper presents a synthetic view of a variety of projects built upon an Erasmus Mundus Master Course. It highlights double degree programs, European credit transfer, joint PhDs, and research collaborations, as well as a few other related European projects, ranging from Thematic Networks to another Erasmus Mundus Course.
3D reconstruction of dynamic scenes by motion segmentation
The goal of this work is to reconstruct the static and dynamic parts of a 3D scene using a mobile robot equipped with a 3D sensor. This reconstruction requires classifying the 3D points acquired over time as fixed or mobile, independently of the robot's motion. Our segmentation method works directly on the 3D data and studies the motion of the objects in the scene without prior assumptions. We develop a complete algorithm that reconstructs the fixed parts of the scene at each acquisition, using a RANSAC that requires only 3 points to register the point clouds. The method was tested on …
Spray control method, and corresponding device and program
Multimodal 2D Image to 3D Model Registration via a Mutual Alignment of Sparse and Dense Visual Features
Many fields of application could benefit from an accurate registration of measurements of different modalities over a known 3D model. However, aligning a 2D image to a 3D model is a challenging task, and is even more complex when the two have different modalities. Most 2D/3D registration methods are based on either geometric or dense visual features. Both have their own advantages and drawbacks. We propose, in this paper, to mutually exploit the advantages of one feature type to reduce the drawbacks of the other. For this, a hybrid registration framework has been designed to mutually align geometric and dense visual features in order to obtain …
Omnidirectional vision for UAV: applications to attitude, motion and altitude estimation for day and night conditions
This paper presents the combined applications of omnidirectional vision in aerial robotics. Omnidirectional vision is first used to compute the attitude, altitude and motion, not only in rural environments but also in urban spaces. Secondly, a combination of omnidirectional and perspective cameras makes it possible to estimate the altitude. Finally, we present a stereo system consisting of an omnidirectional camera with a laser pattern projector, which enables calculating the altitude and attitude in poorly illuminated to dark environments. We demonstrate that an omnidirectional camera in conjunction with other sensors is a suitable cho…
Vision-Based Localization: on the heterogeneity of approaches and data
Nowadays, we have access to a great diversity of data about the places that surround us. These data can be of very different natures: a collection of images, a 3D model, a colorized point cloud, etc. When GPS is unavailable, this information can be very useful for localizing an agent in its environment, provided the agent can itself acquire information from a vision system. This is known as Vision-Based Localization (VBL). Owing to the great heterogeneity of the data acquired and known about the environment, there are many works addressing this problem. This article reviews the different methods…
Static and Dynamic Objects Analysis as a 3D Vector Field
In the context of scene modelling, understanding, and landmark-based robot navigation, the knowledge of static scene parts and moving objects with their motion behaviours plays a vital role. We present a complete framework to detect and extract moving objects in order to reconstruct a high quality static map. For a moving 3D camera setup, we propose a novel 3D Flow Field Analysis approach which accurately detects moving objects using only 3D point cloud information. Further, we introduce a Sparse Flow Clustering approach to effectively and robustly group motion flow vectors. Experiments show that the proposed Flow Field Analysis algorithm and Sparse Flow Clusterin…
Localization of 2D Cameras in a Known Environment Using Direct 2D-3D Registration
In this paper we propose a robust and direct 2D-to-3D registration method for localizing 2D cameras in a known 3D environment. Although the 3D environment is known, localizing the cameras remains a challenging problem, particularly undermined by the unknown 2D-3D correspondences, outliers, scale ambiguities and occlusions. Once the cameras are localized, the Structure-from-Motion reconstruction obtained from image correspondences is refined by means of a constrained nonlinear optimization that benefits from the knowledge of the scene. We also propose a common optimization framework for both the localization and refinement steps, in which projection errors in one v…
An efficient visual tracking approach for catadioptric cameras
In this article, we propose an efficient method for applying visual tracking algorithms to catadioptric images. This method is based on a spherical representation of the image which takes into account the distortions and the non-uniform resolution of catadioptric images. The proposed experimental results demonstrate that probabilistic and deterministic methods can be adapted so as to accurately track an object in a sequence of catadioptric images.
Efficient Pruning LMI Conditions for Branch-and-Prune Rank and Chirality-Constrained Estimation of the Dual Absolute Quadric
We present a new globally optimal algorithm for self-calibrating a moving camera with constant parameters. Our method aims at estimating the Dual Absolute Quadric (DAQ) under the rank-3 and, optionally, camera-center chirality constraints. We employ the Branch-and-Prune paradigm and explore the space of only 5 parameters. Pruning in our method relies on solving Linear Matrix Inequality (LMI) feasibility and Generalized Eigenvalue (GEV) problems that solely depend upon the entries of the DAQ. These LMI and GEV problems are used to rule out branches in the search tree in which a quadric not satisfying the rank and chirality conditions on camera centers is guarantee…
OLF: RGB-D Adaptive Late Fusion for Robust 6D Pose Estimation
RGB-D 6D pose estimation has recently gained significant research attention due to the complementary information provided by depth data. However, in real-world scenarios, and especially in industrial applications, the depth and color images are often noisy. Existing methods typically employ fusion designs that equally average RGB and depth features, which may not be optimal. In this paper, we propose a novel fusion design that adaptively merges RGB-D cues. Our approach involves assigning two learnable weights α1 and α2 to adjust the RGB and depth contributions with respect to the network depth. This enables us to improve the robustness against low-quality depth input in a simple yet effec…
A Geometrical Approach for Vision Based Attitude and Altitude Estimation for UAVs in Dark Environments
This paper presents a single camera and laser system dedicated to the real-time estimation of attitude and altitude for unmanned aerial vehicles (UAVs) in low illumination to dark environments. The fisheye camera covers a large field of view (FOV). The approach, close to structured light systems, uses the geometric information obtained by the projection of a laser circle onto the ground plane and perceived by the camera. We propose some experiments based on simulated data and real sequences. The results show good agreement with the ground truth values from commercial sensors in terms of accuracy and correctness. The results also prove i…
Stratified Autocalibration of Cameras with Euclidean Image Plane
This paper tackles the problem of stratified autocalibration of a moving camera with a Euclidean image plane (i.e. zero skew and unit aspect ratio) and constant intrinsic parameters. We show that, with these assumptions, in addition to the polynomial derived from the so-called modulus constraint, each image pair provides a new quartic polynomial in the unknown plane at infinity. For three or more images, the plane at infinity estimation is stated as a constrained polynomial optimization problem that can efficiently be solved using Lasserre's hierarchy of semidefinite relaxations. The calibration parameters, and thus a metric reconstruction, are subsequently obtained by so…
Motion estimation of a UAV from a hybrid stereo sensor
Motion and velocity are two of the most important parameters for an Unmanned Aerial Vehicle (UAV) to know, especially during critical maneuvers such as landing or steady flight. In this paper, we present a mixed stereoscopic vision system made of a fisheye camera and a perspective camera for motion estimation. Contrary to classical stereoscopic systems based on feature matching between cameras, we propose an algorithm which tracks and exploits points in each camera independently. The omnidirectional view estimates the orientation of the motion, while the perspective view contributes to estimating the scale of the translation and brings accuracy. By fusing points tracked in each camera and kno…
MOISST: Multimodal Optimization of Implicit Scene for SpatioTemporal calibration
With the recent advances in autonomous driving and the decreasing cost of LiDARs, the use of multimodal sensor systems is on the rise. However, in order to make use of the information provided by a variety of complementary sensors, it is necessary to calibrate them accurately. We take advantage of recent advances in computer graphics and implicit volumetric scene representation to tackle the problem of multi-sensor spatial and temporal calibration. Thanks to a new formulation of Neural Radiance Field (NeRF) optimization, we are able to jointly optimize calibration parameters along with scene representation based on radiometric and geometric measurements. Our method enables accurate and …
Structure from motion using a hybrid stereo-vision system
This paper is dedicated to robotic navigation using an original hybrid-vision setup combining the advantages offered by two different types of camera. This couple of cameras is composed of one perspective camera associated with one fisheye camera. This kind of configuration, also known as a foveated vision system since it is inspired by the human visual system, allows both a wide field of view and a detailed front view of the scene. Here, we propose a generic and robust approach for SFM which is compatible with a very broad spectrum of multi-camera vision systems, suitable for perspective and omnidirectional cameras, with or without overlapping fi…
Camera pose estimation in a known environment from a 2D-3D registration
We propose a direct and robust 2D-3D registration method for localizing a camera in a known 3D environment. This problem is made particularly difficult by the absence of correspondences between the 3D points of the cloud and the 2D points. An additional difficulty is the scale difference between the known 3D cloud and the 3D cloud reconstructed from the images, which may moreover contain outliers and occlusions. Our method consists in iteratively optimizing a functional in two steps: estimating the camera pose, and establishing 2D-3D correspondences. We thus obtain a joint estimation method …
Computer vision-based approach for rite decryption in old societies
This paper presents an approach to determine the spatial arrangement of horse bones in an excavation site and to perform a 3D reconstruction of the scene. The relative 3D positioning of the bones was computed by exploiting the information in images acquired at different levels, and was used to relocate the provided 3D models of the bones. A novel semi-supervised approach was proposed to generate dense point clouds of the bones from sparse features. The point clouds were later matched with the given models using Iterative Closest Point (ICP).
Analysis of Low-Altitude Aerial Sequences for Road Traffic Diagnosis using Graph Partitioning and Markov Hierarchical Models
International audience; This article presents an original approach to processing low-altitude aerial sequences of road traffic taken from a helicopter (or drone). The proposed system extracts vehicles from the acquired sequences. Our approach begins by detecting the primitives of the sequence images. During this segmentation step, the system computes the dominant motion for each pair of images, using wavelet analysis of the optical flow equation together with robust techniques. Areas of interest (areas not affected by the dominant motion) are detected thanks to a Markov hierarchical model. Primitives stemming from segmentation and interesting a…
Fast Earth Mover's Distance Computation for Catadioptric Image Sequences
International audience; The Earth Mover's Distance is one of the most effective metrics for comparing histograms in various image retrieval applications. Its main drawback is its computational complexity, which hinders its usage in many comparison tasks. We propose a fast Earth Mover's Distance computation that provides a better initialization to the transportation simplex algorithm. The new approach enables faster EMD computation in Visual Memory (VM) compared to state-of-the-art methods, and it computes the distance without compromising accuracy.
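As background on the metric itself: for one-dimensional histograms of equal mass and unit ground distance, the EMD has a closed form as the L1 distance between cumulative histograms, so no simplex solver is needed in that special case. A minimal sketch (the paper's contribution, the simplex initialization for the general case, is not reproduced here):

```python
def emd_1d(h1, h2):
    """Earth Mover's Distance between two 1D histograms of equal total
    mass, with ground distance |i - j|: sum of |CDF1 - CDF2|."""
    assert len(h1) == len(h2)
    c1 = c2 = 0.0
    total = 0.0
    for a, b in zip(h1, h2):
        c1 += a
        c2 += b
        total += abs(c1 - c2)
    return total
```

For example, moving one unit of mass across two bins costs 2, matching the transportation interpretation of the metric.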
Central catadioptric image processing with geodesic metric
International audience; Because of the distortions produced by the insertion of a mirror, catadioptric images cannot be processed like classical perspective images. Although the equivalence between such images and spherical images is well known, the use of spherical harmonic analysis often leads to image processing methods that are difficult to implement. In this paper, we propose to define catadioptric image processing from the geodesic metric on the unit sphere. We show that this definition makes it very simple to adapt classical image processing methods. We focus more particularly on image gradient estimation, interest point detection, and matching. More generally, th…
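The underlying idea can be illustrated with a small sketch: once pixels are lifted to the unit sphere, the Euclidean pixel distance in a filter kernel is replaced by the great-circle (geodesic) distance, here inside a Gaussian weight. This is an illustrative adaptation, not the paper's exact operators:

```python
import numpy as np

def geodesic_dist(p, q):
    """Great-circle distance between two points on the unit sphere."""
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))

def geodesic_gaussian_weight(p, q, sigma):
    """Neighborhood weight for sphere-adapted filtering: a Gaussian of
    the geodesic distance instead of the Euclidean pixel distance."""
    d = geodesic_dist(p, q)
    return np.exp(-d * d / (2.0 * sigma * sigma))
```

With such weights, a classical convolution-style operator (smoothing, gradient estimation) becomes uniform over the sphere rather than distorted by the catadioptric projection.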
Vision based attitude and altitude estimation for UAVs in dark environments
This paper presents a system dedicated to the real-time estimation of attitude and altitude for unmanned aerial vehicles (UAVs) in low-light and dark environments. The system consists of a fisheye camera, which covers a large field of view (FOV), and a laser circle projector mounted on a fixed baseline. The approach, close to structured-light systems, uses the geometric information obtained by projecting the laser circle onto the ground plane and perceiving it with the camera. We present a theoretical study of the system in which the camera is modelled as a sphere, and show that estimating a conic on this sphere yields the attitude and the altitude of the robot…
Rotation estimation and vanishing point extraction by omnidirectional vision in urban environment
International audience; Rotation estimation is a fundamental step for various robotic applications such as automatic control of ground/aerial vehicles, motion estimation and 3D reconstruction. However, it is now well established that traditional navigation equipment, such as global positioning systems (GPSs) or inertial measurement units (IMUs), suffers from several disadvantages. Hence, some vision-based works have been proposed recently. While interesting results can be obtained, the existing methods have non-negligible limitations, such as difficult feature matching (e.g. repeated textures, blur or illumination changes) and a high computational cost (e.g. analysis in the frequency domai…
Depth-Adapted CNN for RGB-D cameras
Conventional 2D Convolutional Neural Networks (CNN) extract features from an input image by applying linear filters. These filters compute spatial coherence by weighting the photometric information on a fixed neighborhood, without taking the geometric information into account. We tackle the problem of improving classical RGB CNN methods by using the depth information provided by RGB-D cameras. State-of-the-art approaches use depth as an additional channel or image (HHA), or move from 2D CNN to 3D CNN. This paper proposes a novel and generic procedure to articulate both photometric and geometric information in CNN architecture. The depth data is represented as a 2D offset to adapt …
Robust RGB-D Fusion for Saliency Detection
Efficiently exploiting multi-modal inputs for accurate RGB-D saliency detection is a topic of high interest. Most existing works leverage cross-modal interactions to fuse the two streams of RGB-D in order to enhance intermediate features. In this process, a practical concern, the low quality of the available depths, has not yet been fully considered. In this work, we aim for RGB-D saliency detection that is robust to low-quality depths, which primarily appear in two forms: inaccuracy due to noise, and misalignment to RGB. To this end, we propose a robust RGB-D fusion method that benefits from (1) layer-wise, and (2) trident spatial, attention mechanisms. On the one hand, layer-wise atten…
Estimation de mouvement d'un système stéréoscopique hybride à partir des droites
We present a motion estimation approach for hybrid stereo rigs using line images. The proposed method can be applied to a hybrid system built from any single-viewpoint (SVP) cameras, such as perspective, central catadioptric and fisheye cameras. Such a configuration combines the advantageous characteristics of different camera types. Images captured by SVP imaging devices can be mapped to spherical images using the unified projection model. The camera orientations can be recovered using the vanishing points of parallel line sets. We then estimate the translations from the known rotations and the line images on the spheres. The algorithm has been validated on simulated data and real images tak…
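The unified projection model referred to here lifts an image point to the unit sphere. A sketch of the standard inverse mapping in the Geyer/Barreto formulation, where `xi` is the mirror parameter (xi = 0 is the perspective case, xi = 1 the para-catadioptric case):

```python
import numpy as np

def lift_to_sphere(x, y, xi):
    """Inverse unified projection model: lift a normalized image point
    (x, y) to the unit sphere, for mirror parameter xi."""
    r2 = x * x + y * y
    # eta solves ||(eta*x, eta*y, eta - xi)|| = 1
    eta = (xi + np.sqrt(1.0 + r2 * (1.0 - xi * xi))) / (r2 + 1.0)
    return np.array([eta * x, eta * y, eta - xi])
```

Once all cameras in the rig are expressed as spheres this way, line images become great circles, which is what makes a common motion-estimation framework possible across heterogeneous cameras.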
Estimation des Cartes du Temps de Collision (TTC) en Vision Para-catadioptrique
National audience; Time to contact, or time to collision (TTC), is important information for navigation and obstacle avoidance. Its estimation has been widely studied for perspective cameras. By contrast, very little work has addressed this topic for catadioptric cameras, even though they are very useful, notably for mobile robot navigation. The goal of this article is to propose a new TTC estimation model for para-catadioptric cameras based on optical flow, by adapting the one developed for perspective cameras. Computing the TTC at each pixel yields the map of collision times. We have validated …
2D-3D Camera Fusion for Visual Odometry in Outdoor Environments
International audience; Accurate estimation of camera motion is very important for many robotics applications involving SfM and visual SLAM. Such accuracy is pursued by refining the estimated motion through nonlinear optimization. As many modern robots are equipped with both 2D and 3D cameras, it is both highly desirable and challenging to exploit data acquired from both modalities to achieve better localization. Existing refinement methods, such as bundle adjustment and loop closing, can be employed only when precise 2D-to-3D correspondences across frames are available. In this paper, we propose a framework for robot localization that benefits from both 2D and 3D information without re…
Real time UAV altitude, attitude and motion estimation from hybrid stereovision
International audience; Knowledge of altitude, attitude and motion is essential for an Unmanned Aerial Vehicle during critical maneuvers such as landing and take-off. In this paper we present a hybrid stereoscopic rig composed of a fisheye and a perspective camera for vision-based navigation. In contrast to classical stereoscopic systems based on feature matching, we propose methods which avoid matching between hybrid views. A plane-sweeping approach is proposed for estimating altitude and detecting the ground plane. Rotation and translation are then estimated by decoupling: the fisheye camera contributes to evaluating attitude, while the perspective camera contributes to estimating t…
Dynamic 3D Scene Reconstruction and Enhancement
International audience; In this paper, we present a 3D reconstruction and enhancement approach for high-quality dynamic city scene reconstruction. We first detect and segment the moving objects using a 3D motion segmentation approach that exploits the behaviour of feature trajectories. Given the segmentation of both the dynamic and the static scene parts, we propose an efficient point cloud registration approach that takes advantage of 3-point RANSAC and Iterative Closest Point algorithms to produce precise point cloud alignment. Furthermore, we propose a point cloud smoothing and texture mapping framework to enhance the reconstruction results for both the static a…
Extrinsic calibration of heterogeneous cameras by line images
International audience; Extrinsic calibration refers to determining the relative pose of cameras. Most approaches for cameras with non-overlapping fields of view (FOV) are based on mirror reflection, object tracking or the rigidity constraint of stereo systems, whereas cameras with overlapping FOV can be calibrated using structure-from-motion solutions. We propose an extrinsic calibration method within a structure-from-motion framework for cameras with overlapping FOV, and its extension to cameras with partially non-overlapping FOV. Recently, omnidirectional vision has become a popular topic in computer vision, as an omnidirectional camera can cover a large FOV in one image. Combining the g…
Perspective-n-Learned-Point: Pose Estimation from Relative Depth
International audience; In this paper we present an online camera pose estimation method that combines Content-Based Image Retrieval (CBIR) and pose refinement based on a learned representation of the scene geometry extracted from monocular images. Our pose estimation method is two-step: we first retrieve an initial 6 Degrees of Freedom (DoF) location of an unknown-pose query by retrieving the most similar candidate in a pool of geo-referenced images. In a second step, we refine the query pose with a Perspective-n-Point (PnP) algorithm, where the 3D points are obtained thanks to a depth map generated from the retrieved image candidate. We make our method fast and lightweight by using a commo…
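The depth-to-3D step feeding the PnP solver can be sketched as the usual pinhole back-projection, with hypothetical intrinsics `K` standing in for the retrieved camera's calibration (the paper's depth network and PnP solver are not reproduced here):

```python
import numpy as np

def backproject(us, vs, depths, K):
    """Back-project pixels (us, vs) with given depths through intrinsics
    K to 3D points in the camera frame: X = z * K^{-1} [u, v, 1]^T."""
    Kinv = np.linalg.inv(K)
    pix = np.stack([us, vs, np.ones_like(us)], axis=0).astype(float)
    rays = Kinv @ pix            # (3, N) rays at unit depth
    return (rays * depths).T     # (N, 3) points in the camera frame
```

The resulting 3D points, paired with the query's 2D keypoints, are exactly the 2D-3D correspondences a PnP solver consumes.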
OMNI-DRL: Learning to Fly in Forests with Omnidirectional Images
Perception is crucial for drone obstacle avoidance in complex, static, and unstructured outdoor environments. However, most navigation solutions based on Deep Reinforcement Learning (DRL) use limited Field-Of-View (FOV) images as input. In this paper, we demonstrate that omnidirectional images improve these methods. We provide a comparative benchmark of several visual modalities for navigation: ground-truth depth, ground-truth semantic segmentation, and RGB images. These exhaustive comparisons reveal that an omnidirectional camera is superior for navigation with classical DRL methods. Finally, we show in two different virtual forest environments that adapting the convolution to…
N-QGN: Navigation Map from a Monocular Camera using Quadtree Generating Networks
Monocular depth estimation has been a popular area of research for several years, especially since self-supervised networks have shown increasingly good results in bridging the gap with supervised and stereo methods. However, these approaches focus on dense 3D reconstruction, and sometimes on tiny details that are superfluous for autonomous navigation. In this paper, we propose to address this issue by estimating the navigation map under a quadtree representation. The objective is an adaptive depth map prediction that extracts only the details essential for obstacle avoidance. Other parts of the 3D space, which leave ample room for navigation, are provided with approxi…
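A quadtree depth representation of this kind can be sketched as a recursive split driven by the depth variation inside each cell; cells with little variation are summarized by a single value. The split criterion below is an illustrative choice, not the network's learned one:

```python
import numpy as np

def quadtree_cells(depth, thr, x=0, y=0, size=None):
    """Summarize a square depth map as quadtree leaves: a cell is split
    while the depth range inside it exceeds thr, otherwise it stores its
    mean. Returns a list of (x, y, size, mean_depth) leaves."""
    if size is None:
        size = depth.shape[0]
    block = depth[y:y + size, x:x + size]
    if size == 1 or block.max() - block.min() <= thr:
        return [(x, y, size, float(block.mean()))]
    h = size // 2
    leaves = []
    for dy in (0, h):
        for dx in (0, h):
            leaves += quadtree_cells(depth, thr, x + dx, y + dy, h)
    return leaves
```

Flat regions (open space) collapse to a few large cells, while cells near obstacles stay small, which is the memory/detail trade-off the abstract describes.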
Etude de caméras sphériques : du traitement des images aux applications en robotique
Omnidirectional vision perceives the environment over 360°. This is a considerable asset for a mobile robot, since it can exploit global information about the scene at any time. The first research in this direction dates back to 1990, with the work of Yasushi Yagi, who proposed a device named COPIS combining a perspective camera and a conical mirror to obtain a panoramic view of the scene. In 1995, Mouaddib and Pégard presented a similar setup to localize a robot in its environment. Although this system has advantages for localization, it proves ill-suited for reconstructing …
Extraction d'un graphe de navigabilité à partir d'un nuage de points 3D enrichis
International audience; This work falls within the general framework of the ANR pLaTINUM project on autonomous navigation, and more specifically on map generation for perception-based navigation. It develops a new method for summarizing a 3D map (a dense 3D point cloud) and extracting a navigability graph, making the map easier to use for navigation systems with limited hardware resources (smartphones, cars, robots, ...). The method extracts the most salient regions of the studied environment in order to build a summary map. This vision-based map summarization process is applied in a …
Globally Optimal Line Clustering and Vanishing Point Estimation in Manhattan World
The projections of world parallel lines in an image intersect at a single point called the vanishing point (VP). VPs are a key ingredient for various vision tasks including rotation estimation and 3D reconstruction. Urban environments generally exhibit some dominant orthogonal VPs. Given a set of lines extracted from a calibrated image, this paper aims to (1) determine the line clustering, i.e. find which line belongs to which VP, and (2) estimate the associated orthogonal VPs. None of the existing methods is fully satisfactory because of the inherent difficulties of the problem, such as the local minima and the chicken-and-egg aspect. In this paper, we present a new algorithm that solves t…
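The line-to-VP consistency such methods rely on can be sketched on the sphere: a line whose interpretation plane has unit normal n is consistent with a vanishing direction v when n·v ≈ 0. Below is a simple nearest-assignment sketch with an outlier threshold, not the paper's globally optimal solver:

```python
import numpy as np

def cluster_lines(normals, vps, thr=0.05):
    """Assign each line (given by its interpretation-plane unit normal)
    to the vanishing direction it is most consistent with: a line
    belongs to VP v when |n . v| ~ 0. Lines whose best residual exceeds
    thr are labelled as outliers (-1)."""
    normals = np.asarray(normals, float)
    vps = np.asarray(vps, float)
    costs = np.abs(normals @ vps.T)      # (n_lines, n_vps) residuals
    labels = costs.argmin(axis=1)
    labels[costs.min(axis=1) > thr] = -1
    return labels
```

The chicken-and-egg aspect mentioned in the abstract is visible here: this assignment assumes the VPs are known, while estimating the VPs assumes the assignment is known.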
Summarizing Large Scale 3D Mesh
International audience; Recent progress in 3D sensor devices and in semantic mapping allows building very rich HD 3D maps that are very useful for autonomous navigation and localization. However, these maps are particularly large and require substantial memory as well as computational resources. In this paper, we propose a new method for summarizing a 3D map (mesh) as a set of compact spheres, in order to facilitate its use by systems with limited resources (smartphones, robots, UAVs, ...). This vision-based summarizing process is applied in a fully automatic way, jointly using photometric, geometric and semantic information about the studied environment. The main contribution of this research is
Line reconstruction using prior knowledge in single non-central view
International audience; Line projections in non-central systems contain more geometric information than in central systems. The four degrees of freedom of the 3D line are mapped to the line-image, and the 3D line can in theory be recovered from 4 projecting rays (i.e. line-image points) from a single non-central view. In practice, extracting line-images is considerably more difficult, and the resulting reconstruction is imprecise and sensitive to noise. In this paper we present a minimal solution to recover the geometry of the 3D line from only three line-image points when the line is parallel to a given plane. A second minimal solution recovers the 3D line from two points w…
3D Reconstruction of Dynamic Vehicles using Sparse 3D-Laser-Scanner and 2D Image Fusion
International audience; Map building has become one of the most active research topics in computer vision. To acquire accurate large-scale 3D scene reconstructions, 3D laser scanners have recently been developed and are now widely used. They produce accurate but sparse 3D point clouds of the environment. However, 3D reconstruction of rigidly moving objects alongside large-scale 3D scene reconstruction has received little attention. To achieve a detailed object-level 3D reconstruction, a single scan of the point cloud is insufficient due to its sparsity. For example, the traditional Iterative Closest Point (ICP) registration technique or its variants are not accurate and rob…
Visual contact with catadioptric cameras
Time to contact, or time to collision (TTC), is information of the utmost importance for animals as well as for mobile robots, because it enables them to avoid obstacles; it is a convenient way to analyze the surrounding environment. The problem of TTC estimation has been largely discussed for perspective images. Although many works have shown the interest of omnidirectional cameras for robotic applications such as localization, motion estimation and monitoring, few works use omnidirectional images to compute the TTC. In this paper, we show that the TTC can also be estimated on catadioptric images. We present two approaches for TTC estimation using, directly or indirectly, the optical flow, based on a de-rotation strat…
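For the perspective case these catadioptric approaches build on, the TTC of a pure approach toward a fronto-parallel surface follows from the divergence of the optical flow: the flow is radial, v(x, y) = (x, y)/TTC, so div(v) = 2/TTC. A sketch of that relation on a dense flow grid, under exactly that assumption:

```python
import numpy as np

def ttc_from_flow(u, v, dx=1.0):
    """TTC map from a dense optical-flow field (u, v) on a regular grid
    with spacing dx, assuming pure approach toward a fronto-parallel
    surface: TTC = 2 / div(flow)."""
    du_dx = np.gradient(u, dx, axis=1)   # u varies along columns (x)
    dv_dy = np.gradient(v, dx, axis=0)   # v varies along rows (y)
    div = du_dx + dv_dy
    return 2.0 / div
```

On catadioptric images, this is precisely the step that must be reworked: the flow divergence on the distorted image no longer relates to the TTC so simply, hence the de-rotation and mirror-aware models discussed in the paper.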