6533b86ffe1ef96bd12cce92

RESEARCH PRODUCT

Deep learning techniques for visual object tracking

Giorgio Cruciata

subject

Settore ING-INF/05 - Sistemi Di Elaborazione Delle InformazioniDeep LearningVisual Object TrackingFast LearningReinforcement LearningCNN

description

Visual object tracking plays a crucial role in various vision systems, including biometric analysis, medical imaging, smart traffic systems, and video surveillance. Despite notable advancements in visual object tracking over the past few decades, many tracking algorithms still face challenges due to factors like illumination changes, deformation, and scale variations. This thesis is divided into three parts. The first part introduces the visual object tracking problem and discusses the traditional approaches that have been used to study it. We then propose a novel method called Tracking by Iterative Multi-Refinements, which addresses the issue of locating the target by redefining the search for the ideal bounding box. This method utilizes an iterative process to forecast a sequence of bounding box adjustments, enabling the tracking algorithm to handle multiple non-conflicting transformations simultaneously. As a result, it achieves faster tracking and can handle a higher number of composite transformations. In the second part of this thesis we explore the application of reinforcement learning (RL) to visual tracking. Presenting a general RL framework applicable to problems that require a sequence of decisions. We discuss various families of popular RL approaches, including value-based methods, policy gradient approaches, and Actor-Critic Methods. Furthermore, we delve into the application of RL to visual tracking, where an RL agent predicts the target's location, selects hyperparameters, correlation filters, or target appearance. A comprehensive comparison of these approaches is provided, along with a taxonomy of state-of-the-art methods. The third part presents a novel method that addresses the need for online tuning of offline-trained tracking models. Typically, offline-trained models, whether through supervised learning or reinforcement learning, require additional tuning during online tracking to achieve optimal performance. The duration of this tuning process depends on the number of layers that need training for the new target. However, our thesis proposes a pioneering approach that expedites the training of convolutional neural networks (CNNs) while preserving their high performance levels. In summary, this thesis extensively explores the area of visual object tracking and its related domains, covering traditional approaches, novel methodologies like Tracking by Iterative Multi-Refinements, the application of reinforcement learning, and a pioneering method for accelerating CNN training. By addressing the challenges faced by existing tracking algorithms, this research aims to advance the field of visual object tracking and contributes to the development of more robust and efficient tracking systems.

https://hdl.handle.net/10447/595173