0000000000801483

AUTHOR

Zongwei Wu

Showing 7 related works from this author.

Modality-Guided Subnetwork for Salient Object Detection

2021

Recent RGBD-based models for saliency detection have attracted research attention. Depth cues such as boundary information, surface normals, and shape attributes contribute to identifying salient objects in complicated scenes. However, most RGBD networks require multi-modal inputs and feed them separately through a two-stream design, which inevitably incurs extra costs in depth sensors and computation. To address these inconveniences, we present in this paper a novel fusion design named the modality-guided subnetwork (MGSnet). It has the following superior designs: 1) our model works for both RGB and RGBD data, dynamically estimating depth if not availabl…

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV); [INFO.INFO-RB] Computer Science [cs]/Robotics [cs.RO]; Computer Science - Computer Vision and Pattern Recognition
researchProduct

Depth-Adapted CNN for RGB-D cameras

2020

Conventional 2D Convolutional Neural Networks (CNNs) extract features from an input image by applying linear filters. These filters compute spatial coherence by weighting photometric information over a fixed neighborhood, without taking geometric information into account. We tackle the problem of improving classical RGB CNN methods by using the depth information provided by RGB-D cameras. State-of-the-art approaches use depth as an additional channel or image (HHA), or move from 2D CNNs to 3D CNNs. This paper proposes a novel and generic procedure to articulate both photometric and geometric information in a CNN architecture. The depth data is represented as a 2D offset to adapt …
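The abstract's core idea, using depth to displace the sampling grid of an ordinary convolution, can be sketched in NumPy. The offset rule below (neighborhood shrinking with distance, nearest-neighbor gathering) is an illustrative assumption, not the paper's exact formulation.

```python
import numpy as np

def depth_adapted_sample(image, depth, scale=2.0):
    """Shift each pixel's 3x3 sampling neighborhood by a depth-derived
    2D offset (nearest-neighbor gather for simplicity).

    image: (H, W) single-channel input.
    depth: (H, W) depth map; larger depth -> smaller offset, mimicking
           perspective foreshortening (an illustrative choice).
    Returns: (H, W, 9) depth-adapted neighborhoods.
    """
    H, W = image.shape
    # Offset magnitude shrinks with depth (assumption for illustration).
    mag = scale / (depth + 1e-6)
    ys, xs = np.mgrid[0:H, 0:W]
    out = np.zeros((H, W, 9), dtype=image.dtype)
    k = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            # Depth-adapted neighbor positions, rounded and clipped.
            ny = np.clip(np.round(ys + dy * mag).astype(int), 0, H - 1)
            nx = np.clip(np.round(xs + dx * mag).astype(int), 0, W - 1)
            out[..., k] = image[ny, nx]
            k += 1
    return out
```

A learned filter applied over the last axis would then mix these geometry-aware samples; at very large depth the offsets vanish and the operation degenerates to a plain fixed-neighborhood convolution.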

FOS: Computer and information sciences; Offset (computer science); Computer science; Computer Vision and Pattern Recognition (cs.CV); Coordinate system; Computer Science::Neural and Evolutionary Computation; Computer Science - Computer Vision and Pattern Recognition; Convolutional neural network; [INFO.INFO-RB] Computer Science [cs]/Robotics [cs.RO]; Computer vision; Invariant (mathematics); Weighting; Spatial coherence; Computer Science::Computer Vision and Pattern Recognition; RGB color model; Artificial intelligence; Linear filter

Robust RGB-D Fusion for Saliency Detection

2022

Efficiently exploiting multi-modal inputs for accurate RGB-D saliency detection is a topic of high interest. Most existing works leverage cross-modal interactions to fuse the two RGB-D streams for intermediate feature enhancement. In this process, a practical issue, the low quality of the available depth maps, has not yet been fully considered. In this work, we aim for RGB-D saliency detection that is robust to low-quality depth, which primarily appears in two forms: inaccuracy due to noise, and misalignment with the RGB image. To this end, we propose a robust RGB-D fusion method that benefits from (1) layer-wise and (2) trident spatial attention mechanisms. On the one hand, layer-wise atten…
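The attention mechanisms themselves are only named in this excerpt. As a rough sketch of the layer-wise idea, a learned scalar gate per layer deciding how much of the depth stream to trust, one might write (the additive gating form is an assumption, not the paper's design):

```python
import numpy as np

def layerwise_fuse(rgb_feats, depth_feats, gate_logits):
    """Fuse per-layer RGB/depth feature maps with a learned scalar
    gate per layer: fused_l = rgb_l + sigmoid(g_l) * depth_l.
    A low gate suppresses an unreliable depth stream at that layer.

    rgb_feats, depth_feats: lists of (C, H, W) arrays, one per layer.
    gate_logits: (L,) learnable logits, one per layer.
    """
    gates = 1.0 / (1.0 + np.exp(-np.asarray(gate_logits)))
    return [r + g * d for r, d, g in zip(rgb_feats, depth_feats, gates)]
```

With very negative logits the fusion collapses to the RGB stream alone, which is the behavior one would want when the depth of a whole layer is unreliable.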

FOS: Computer and information sciences; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition

Depth Attention for Scene Understanding

2022

Deep learning models can nowadays teach a machine to perform a number of tasks, sometimes with better precision than human beings. Among all the modules of an intelligent machine, perception is the most essential: without it, the action modules struggle to carry out the target task safely and precisely in complex scenes. Conventional perception systems are based on RGB images, which provide rich texture information about the 3D scene. However, the quality of RGB images depends heavily on environmental factors, which in turn influence the performance of deep learning models. Therefore, in this thesis, we aim to improve the performance and robustness of RGB models with comple…

Multi-modal fusion; Deep learning; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Deep Learning for Computer Vision; Computer vision; RGB-D fusion; Computer Vision and Artificial Intelligence; [INFO] Computer Science [cs]

QaQ: Robust 6D Pose Estimation via Quality-Assessed RGB-D Fusion

2023

RGB-D 6D pose estimation has recently drawn great research attention thanks to the complementary depth information. However, the depth and color images are often noisy in real industrial scenarios, which makes things challenging for the many existing methods that fuse RGB and depth features equally. In this paper, we present a novel fusion design to adaptively merge RGB-D cues. Specifically, we create a quality-assessment block that estimates the global quality of the input modalities. This quality, represented as an α parameter, is then used to reinforce the fusion. We have thus found a simple and effective way to improve robustness to low-quality inputs in terms of depth and RGB. Exte…
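The α-reinforced merge described above can be sketched as a convex combination driven by an estimated depth-quality score. The quality proxy below (inverse of local depth roughness) is a hypothetical stand-in for the paper's learned quality-assessment block, chosen only so the sketch is self-contained:

```python
import numpy as np

def quality_weighted_fusion(rgb_feat, depth_feat, depth_map):
    """Merge RGB and depth features with a global quality weight alpha.

    Quality proxy (an assumption): noisy depth -> high gradient
    energy -> small alpha -> the fusion leans back on RGB.
    """
    # Mean absolute spatial gradient as a crude depth-noise proxy.
    gy = np.abs(np.diff(depth_map, axis=0)).mean()
    gx = np.abs(np.diff(depth_map, axis=1)).mean()
    alpha = 1.0 / (1.0 + gx + gy)          # in (0, 1]
    # Alpha-reinforced fusion: trust depth only as far as its quality.
    return (1.0 - alpha) * rgb_feat + alpha * depth_feat, alpha
```

A perfectly smooth depth map yields alpha = 1 and the fusion trusts depth fully, while heavy depth noise drives alpha down and the output toward the RGB features.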

[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]

RGB-Event Fusion for Moving Object Detection in Autonomous Driving

2022

Moving Object Detection (MOD) is a critical vision task for achieving safe autonomous driving. Despite the plausible results of deep learning methods, most existing approaches are frame-based only and may fail to reach reasonable performance when dealing with dynamic traffic participants. Recent advances in sensor technology, especially the event camera, can naturally complement the conventional camera approach to better model moving objects. However, event-based works often adopt a pre-defined time window for the event representation and simply integrate it to estimate image intensities from events, neglecting much of the rich temporal information in the available asynchronous ev…

FOS: Computer and information sciences; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; Computer Science - Robotics; [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; Robotics (cs.RO)

OLF: RGB-D Adaptive Late Fusion for Robust 6D Pose Estimation

2023

RGB-D 6D pose estimation has recently gained significant research attention due to the complementary information provided by depth data. However, in real-world scenarios, and especially in industrial applications, the depth and color images are often noisy. Existing methods typically employ fusion designs that average RGB and depth features equally, which may not be optimal. In this paper, we propose a novel fusion design that adaptively merges RGB-D cues. Our approach assigns two learnable weights, α1 and α2, to adjust the RGB and depth contributions with respect to the network depth. This enables us to improve robustness against low-quality depth input in a simple yet effec…
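The two-weight late fusion above can be sketched as a weighted sum of the branch predictions. Softmax-normalizing the weights so the combination stays convex is a design assumption here, not necessarily the paper's exact scheme:

```python
import numpy as np

def adaptive_late_fusion(rgb_logits, depth_logits, a1, a2):
    """Late-fuse two branch predictions with learnable weights a1, a2.

    Softmax normalization keeps the combination convex regardless of
    the raw weight values (an assumed design choice). During training,
    a1 and a2 would be optimized jointly with the network, letting the
    model down-weight whichever branch is less reliable.
    """
    w = np.exp([a1, a2])
    w = w / w.sum()
    return w[0] * rgb_logits + w[1] * depth_logits
```

Equal weights recover a plain average of the two branches; pushing a1 far above a2 makes the output collapse onto the RGB prediction, which is the desired behavior when the depth input is degraded.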

Late fusion; [SPI] Engineering Sciences [physics]; PSNR; Deep learning; Self-optimized parameter