Search results for "video"
showing 10 items of 1348 documents
Deep Motion Model for Pedestrian Tracking in 360 Degrees Videos
2019
This paper proposes a deep convolutional neural network (CNN) for pedestrian tracking in 360° videos based on the target’s motion. The tracking algorithm takes advantage of a virtual Pan-Tilt-Zoom (vPTZ) camera simulated by means of the 360° video. The CNN takes as input a motion image, i.e. the difference of two images taken with the vPTZ camera at different times using the same pan, tilt and zoom parameters. The CNN predicts the vPTZ camera parameter adjustments required to keep the target at the center of the vPTZ camera view. Experiments on a publicly available dataset performed in cross-validation demonstrate that the learned motion model generalizes, and that the proposed tracking algo…
Design of an Adaptive Bayesian System for Sensor Data Fusion
2014
Many artificial intelligence systems exploit a wide set of sensor devices to monitor the environment. When the sensors employed are low-cost, off-the-shelf devices, such as Wireless Sensor Networks (WSN), the data gathered through the sensory infrastructure may be affected by noise, and thus only partially correlated to the phenomenon of interest. One way of overcoming these limitations might be to adopt a high-level method to perform multi-sensor data fusion. Bayesian Networks (BNs) represent a suitable tool for performing refined artificial reasoning on heterogeneous sensory data, and for dealing with the intrinsic uncertainty of such data. However, the configuration of the sensory infrast…
Real-Time Hand Pose Recognition Based on a Neural Network Using Microsoft Kinect
2013
The Microsoft Kinect sensor is widely used to detect and recognize body gestures and layout with sufficient reliability, accuracy and precision in a fairly simple way. However, the relatively low resolution of the optical sensors does not allow the device to detect gestures of body parts, such as the fingers of a hand, as easily. Given the clear application of this technology to user interaction within immersive multimedia environments, there is a real need for a reliable and effective method to detect the pose of some body parts. In this paper we propose a method based on a neural network to detect in real time the hand pose, to recognize whether it…
Automatic Video Database Indexing and Retrieval
1997
The increasing development of advanced multimedia applications requires new technologies for organizing databases of still digital images or digital video sequences and retrieving them by content. To this end, image and image-sequence contents must be described and adequately coded. In this paper we describe a system allowing content-based annotation and querying in video databases. No user action is required during the database population step. The system automatically splits a video into a sequence of shots, extracts a few representative frames (called r-frames) from each shot and computes r-frame descriptors based on color, texture and motion. Queries based on one or more features are possible. …
Video object recognition and modeling by SIFT matching optimization
2014
In this paper we present a novel technique for object modeling and object recognition in video. Given a set of videos containing 360-degree views of objects, we compute a model for each object, then we analyze short videos to determine if the object depicted in the video is one of the modeled objects. The object model is built from a video spanning a 360-degree view of the object taken against a uniform background. In order to create the object model, the proposed technique selects a few representative frames from each video and local features of such frames. Object recognition is performed by selecting a few frames from the query video, extracting local features from each frame and looki…
Real-time estimation of geometrical transformation between views in distributed smart-cameras systems
2008
In this paper, we present a method to automatically estimate the geometric relations among the different views of cameras with partially overlapping fields of view in a wireless video-surveillance system. The method uses the locations of the detected moving objects visible at the same time in two or more views. The correspondences among objects are found by comparing their appearance models based on dominant colour descriptors, while the geometric transformations are computed iteratively and may be used to solve the consistent labelling problem. As a significant part of the processing is performed on the smart cameras, the method has been conceived by taking into account the limited resources…
A tool to support the creation of datasets of tampered videos
2015
Digital Video Forensics is attracting growing interest from the Multimedia research community, as the need for methods to validate the authenticity of video content is increasing with the number of videos freely available to digital users. Unlike Digital Image Forensics, to our knowledge, there are no standard datasets to test video forgery detection techniques. In this paper we present a new tool to support users in creating datasets of tampered videos. We furthermore present our own dataset and discuss how to create forgeries that are difficult for an observer to detect with the naked eye.
Integrating computer vision techniques and wireless sensor networks in video surveillance systems
2008
Nowadays video-surveillance systems are essential tools to monitor sites and to guarantee the safety of people: automatic detection of moving objects in the scene and recognition of dangerous events are particularly interesting. Our project aims to develop tools and techniques for video surveillance systems in outdoor environments to detect people automatically and in real time, without the direct control of a human operator. The reference framework consists of distributed stationary cameras coordinated with sensor networks. In particular, wireless sensors are used to sense characteristic quantities of the monitored site, such as variations in temperature, humidity, noise, vibrations, and so o…
Keyword Based Keyframe Extraction in Online Video Collections
2015
Keyframe extraction methods aim to find in a video sequence the most significant frames, according to specific criteria. In this paper we propose a new method to search, in a video database, for frames that are related to a given keyword, and to extract the best ones, according to a proposed quality factor. We first exploit a speech-to-text algorithm to extract automatic captions from all the videos in a domain-specific database. Then we select only those sequences (clips) whose captions include a given keyword, thus discarding a lot of information that is useless for our purposes. Each retrieved clip is then divided into shots, using a video segmentation method that is based on the SURF d…
Object Matching in Distributed Video Surveillance Systems by LDA-Based Appearance Descriptors
2009
Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the…