Filippo Vella

Sign Languages Recognition Based on Neural Network Architecture

In recent years, great progress has been made in speech and natural language recognition, and many virtual assistants have been developed, such as Apple’s Siri, Google Now and Microsoft Cortana. Unfortunately, not everyone can use their voice to communicate with other people and digital devices. Our system is a first step towards extending the use of virtual assistants to speech-impaired people by providing artificial sign language recognition based on a neural network architecture.

research product

Mean shift clustering for personal photo album organization

In this paper we propose a probabilistic approach for the automatic organization of pictures in personal photo albums. Images are analyzed in terms of faces and low-level visual features of the background. The description of the background is based on an RGB color histogram and on Gabor filter energy accounting for texture information. The face descriptor is obtained by projecting detected and rectified faces onto a common low-dimensional eigenspace. Vectors representing faces and background are clustered in an unsupervised fashion using a mean shift clustering technique. We observed that, given the peculiarity of the domain of personal photo libraries where most of the pictures contain fa…
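As an illustration of the clustering step, the following is a minimal flat-kernel mean shift sketch in plain Python. It is not the authors' implementation: the function name `mean_shift`, the `bandwidth` parameter and the rule for merging nearby modes are assumptions made for this example, and the input is any list of numeric feature vectors (for instance, concatenated face and background descriptors).

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: shift each point to the mean of its neighbours
    until it stops moving, then group points that converge to the same mode."""
    modes = []
    for p in points:
        current = list(p)
        for _ in range(max_iter):
            # Neighbours are all data points within `bandwidth` of the estimate.
            neighbours = [q for q in points if math.dist(current, q) <= bandwidth]
            new = [sum(c) / len(neighbours) for c in zip(*neighbours)]
            shift = math.dist(current, new)
            current = new
            if shift < tol:
                break
        modes.append(current)

    # Merge modes that lie closer than half a bandwidth (an assumed heuristic).
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels
```

The number of clusters is not fixed in advance; it follows from the data and the chosen bandwidth, which is what makes the technique attractive for unsupervised photo organization.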

research product

A Quantum Planner for Robot Motion

The possibility of integrating quantum computation in a traditional system appears to be a viable route to drastically improve the performance of systems endowed with artificial intelligence. An example of such processing consists of implementing a teleo-reactive system employing quantum computing. In this work, we considered the navigation of a robot in an environment where its decisions are drawn from a quantum algorithm. In particular, the behavior of a robot is formalized through a production system. It is used to describe the world, the actions it can perform, and the conditions of the robot’s behavior. According to the production rules, the planning of the robot activities is processe…

research product

Automatic image representation for content-based access to personal photo album

The proposed work exploits methods and techniques for the automatic characterization of images for content-based access to personal photo libraries. Several techniques, even if not reliable enough to address the general problem of content-based image retrieval, have proven quite robust in a limited domain such as that of personal photo albums. In particular, starting from the observation that most personal photos depict a small number of people in a relatively small number of different contexts (e.g. Beach, Public Garden, Indoor, Nature, Snow, City, etc.), we propose the use of automatic techniques borrowed from the fields of computer vision and pattern recognition to index imag…

research product

Fast Volumetric Reconstruction of Human Body through Superquadrics

This paper describes a technique to reconstruct the volumes of the human body. For this purpose, mathematical objects able to represent 3D shapes, called superquadrics, are introduced. These objects are positioned in space according to the captures made by a Microsoft Kinect device and are composed to represent the volumes of the human body. The use of quaternions provides a relevant speedup for the rotation of the volumes and makes it possible to follow human movements in real time at reduced computational cost.
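The quaternion rotation mentioned in the abstract can be sketched as follows. This is a generic illustration of rotating a 3D point with a unit quaternion, not code from the paper; the function names `quat_from_axis_angle`, `quat_mul` and `rotate` are hypothetical.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), ax * s, ay * s, az * s)

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(v, q):
    """Rotate the 3D vector `v` by the unit quaternion `q` via q * v * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)
```

Composing two rotations is a single `quat_mul` call, which is cheaper than multiplying 3x3 matrices and avoids accumulating non-orthogonality, a plausible reason for the speedup the abstract reports.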

research product

Latent Semantic Description of Iconic Scenes

An approach is proposed for the automatic description of scenes using an LSA-like technique. The described scenes are composed of a set of elements that can be geometric forms or iconic representations of objects. Every icon is characterized by a set of attributes such as shape, colour and position. Each scene is related to a set of sentences describing its content. The proposed approach builds a data-driven vector semantic space where the scenes and the sentences are mapped. A new scene can be mapped into this space according to a suitable metric. Preliminary experimental results show the effectiveness of the procedure.

research product

Deep Metric Learning for Transparent Classification of Covid-19 X-Ray Images

This work proposes an interpretable classifier for automatic Covid-19 classification using chest X-ray images. It is based on a deep learning model, in particular a triplet network, devoted to finding an effective image embedding. Such an embedding is a non-linear projection of the images into a space of reduced dimension, where the homogeneity and separation of the classes, measured by a predefined metric, are improved. A K-Nearest Neighbor classifier is the interpretable model used for the final classification. Results on public datasets show that the proposed methodology can reach results comparable with the state of the art in terms of accuracy, with the advantage of providing interpretability to t…
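The two ingredients named in the abstract, a triplet objective and a K-Nearest Neighbor decision in the embedding space, can be sketched as below. This is a simplified illustration on already-computed embedding vectors, not the paper's network: the Euclidean distance, the `margin` value and the function names are assumptions.

```python
import math
from collections import Counter

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least `margin`
    farther from the anchor than the positive."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)

def knn_predict(query, embeddings, labels, k=3):
    """Majority vote among the k nearest embeddings.
    Also returns the neighbours, which act as the visible evidence for the decision."""
    ranked = sorted(zip(embeddings, labels), key=lambda pair: math.dist(query, pair[0]))
    neighbours = ranked[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], neighbours
```

Returning the neighbours alongside the label reflects the interpretability argument: a reader can inspect the training images closest to the query rather than trusting an opaque score.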

research product

An automatic system for humanoid dance creation

The paper describes a novel approach to allow a robot to dance following musical rhythm. The proposed system generates a dance for a humanoid robot through the combination of basic movements synchronized with the music. The system is made up of three parts: the extraction of features from the audio file, the estimation of movements through Hidden Markov Models and, finally, the generation of the dance. Starting from a set of given movements, the robot chooses a sequence of movements using a suitable Hidden Markov Model and synchronizes them by processing the musical input. The proposed approach has the advantage that movement execution probabilities could be changed according to the evaluation of the dance executi…

research product

Clustering techniques for personal photo album management

In this work we propose a novel approach for the automatic representation of pictures aiming at a more effective organization of personal photo albums. Images are analyzed and described in multiple representation spaces, namely, faces, background and time of capture. Faces are automatically detected, rectified and represented by projecting the face itself onto a common low-dimensional eigenspace. Backgrounds are represented with low-level visual features based on an RGB histogram and a Gabor filter bank. The face, time and background information of each image in the collection is automatically organized using a mean-shift clustering technique. Given the particular domain of personal photo libraries, wh…

research product

Creation and cognition for humanoid live dancing

Computational creativity in dancing is a recent and challenging research field in Artificial Intelligence and Robotics. We present a cognitive architecture embodied in a humanoid robot capable of creating and performing dances driven by the perception of music. The humanoid robot is able to move suitably, to react to fellow human dancers and to generate novel and appropriate sequences of movements. The approach is based on a cognitive architecture that integrates Hidden Markov Models and Genetic Algorithms. The system has been implemented on a NAO robot and tested in public by setting up live performances, obtaining positive feedback from the audience.

research product

Automatic Image Annotation Using Random Projection in a Conceptual Space Induced from Data

The main drawback of a detailed representation of visual content, whatever its origin, is that significant features are very high dimensional. To keep the problem tractable while preserving the semantic content, a dimensionality reduction of the data is needed. We propose Random Projection techniques to reduce the dimensionality. Even though this technique is sub-optimal with respect to Singular Value Decomposition, its much lower computational cost makes it more suitable for this problem, in particular when computational resources are limited, such as in mobile terminals. In this paper we present the use of a "conceptual" space, automatically induced from data, to perform automa…
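A minimal sketch of Gaussian random projection follows, to make the cost argument concrete: the projection matrix is drawn once at random and needs no decomposition of the data. The Gaussian entries, the 1/sqrt(d_out) scaling and the function names are assumptions for illustration; the paper may use a different random matrix.

```python
import math
import random

def random_projection_matrix(d_in, d_out, seed=0):
    """d_out x d_in matrix with i.i.d. Gaussian entries, scaled so that
    squared lengths are preserved in expectation."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)] for _ in range(d_out)]

def project(matrix, x):
    """Map a d_in-dimensional feature vector to d_out dimensions."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]
```

Building the matrix costs O(d_in * d_out) and is independent of the dataset, whereas Singular Value Decomposition must process the whole data matrix; this is the trade-off between optimality and cost that the abstract describes.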

research product

Unsupervised Clustering in Personal Photo Collections

In this paper we propose a probabilistic approach for the automatic organization of collected pictures aiming at a more effective representation in personal photo albums. Images are analyzed and described in two representation spaces, namely, faces and background. Faces are automatically detected, rectified and represented by projecting the face itself onto a common low-dimensional eigenspace. Backgrounds are represented with low-level visual features based on an RGB histogram and Gabor filter energy. The face and background information of each image in the collection is automatically organized by a mean-shift clustering technique. Given the particular domain of personal photo libraries, where most of the …

research product

A Kinect-Based Gesture Acquisition and Reproduction System for Humanoid Robots

The paper illustrates a system that endows a humanoid robot with the capability to mimic the motion of a human user in real time, serving as a basis for further gesture-based human-robot interactions. The described approach uses the Microsoft Kinect as a low-cost alternative to expensive motion capture devices.

research product

Automatic Dictionary Creation by Sub-symbolic Encoding of Words

This paper describes a technique for the automatic creation of dictionaries using a sub-symbolic representation of words in a cross-language context. The semantic relationship among words of two languages is extracted from aligned bilingual text corpora. This feature is obtained by applying the Latent Semantic Analysis technique to the matrices representing term co-occurrences in aligned text fragments. The technique makes it possible to find the “best translation” according to a properly defined geometric distance in an automatically created semantic space. Experiments show a correctness of 95% in the best case.
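The "best translation" lookup can be illustrated with a nearest-neighbour search in a shared vector space. The abstract does not specify the distance, so cosine similarity is an assumption here, as are the function names; the word vectors are taken as already computed (for example by Latent Semantic Analysis).

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_translation(source_vec, target_vectors):
    """Return the target-language word whose vector is closest to `source_vec`.
    `target_vectors` maps each candidate word to its vector in the shared space."""
    return max(target_vectors, key=lambda word: cosine_similarity(source_vec, target_vectors[word]))
```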

research product

Metric Learning in Histopathological Image Classification: Opening the Black Box

The application of machine learning techniques to histopathology images enables advances in the field, providing valuable tools that can speed up and facilitate the diagnosis process. The classification of these images is a relevant aid for physicians who have to process a large number of images in long and repetitive tasks. This work proposes the adoption of metric learning that, beyond the task of classifying images, can provide additional information able to support the decision of the classification system. In particular, triplet networks have been employed to create a representation in the embedding space that gathers together images of the same class while tending to separate images w…

research product

An Artificial Soft Somatosensory System for a Cognitive Robot

The paper proposes an artificial somatosensory system loosely inspired by human beings' biology and embedded in a cognitive architecture (CA). It enables a robot to receive stimulation from its embodiment and to use these sensations, which we call roboceptions, to behave according to both the external environment and the internal robot status. In this way, a robot that is aware of its body and able to interpret physical sensations can be more effective in the task while maintaining its well-being. The robot's physiological urges are tightly bound to the specific physical state of the robot. Positive and negative physical information can, therefore, be processed and let the robot behave in a mo…

research product

Quantum planning for swarm robotics

Computational resources of quantum computing can enhance robotic motion, decision making, and path planning. While the quantum paradigm is being applied to individual robots, its approach to swarms of simple and interacting robots remains largely unexplored. In this paper, we attempt to bridge the gap between swarm robotics and quantum computing, in the framework of a search and rescue mission. We focus on a decision-making and path-planning collective task. Thus, we present a quantum-based path-planning algorithm for a swarm of robots. Quantization enters position and reward information (measured as a robot’s proximity to the target) and path-planning decisions. Pairwise information-exchan…

research product

Image Segmentation through a Hierarchy of Minimum Spanning Trees

Many approaches have been adopted to solve the problem of image segmentation. Among them, a notable share is based on graph theory, casting the pixels as nodes in a graph. This paper proposes an algorithm to select clusters in the images (relevant segments in the image) corresponding to the areas induced in the images through the search of the Minimum Spanning Tree (MST). In particular, it is based on a clustering algorithm that extracts clusters by computing a hierarchy of Minimum Spanning Trees. The main drawback of this previous algorithm is that the dimension of the cluster is not predictable and a relevant portion of the found clusters can be composed of micro-clusters that ar…
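To make the graph formulation concrete, the sketch below segments a grayscale image by running Kruskal's algorithm on the 4-connected pixel graph and stopping at edges heavier than a threshold, which is equivalent to building the MST and cutting its heavy edges. It is a single-level simplification, not the hierarchy of MSTs proposed in the paper; the function name, the intensity-difference edge weight and the `threshold` parameter are assumptions.

```python
def mst_segments(values, width, height, threshold):
    """Label the pixels of a row-major grayscale image.
    Pixels joined by MST edges of weight <= threshold share a label."""
    parent = list(range(width * height))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges between horizontal and vertical neighbours, weighted by intensity difference.
    edges = []
    for y in range(height):
        for x in range(width):
            i = y * width + x
            if x + 1 < width:
                edges.append((abs(values[i] - values[i + 1]), i, i + 1))
            if y + 1 < height:
                edges.append((abs(values[i] - values[i + width]), i, i + width))

    # Kruskal: add the lightest edges first, skipping those that would close a cycle.
    for weight, a, b in sorted(edges):
        if weight > threshold:
            break
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
    return [find(i) for i in range(width * height)]
```

With a single global threshold, cluster sizes are hard to control and isolated noisy pixels end up as tiny segments, which is the kind of micro-cluster problem the abstract raises.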

research product

Composition of SIFT features for robust image representation

In this paper we propose a novel feature based on the SIFT (Scale Invariant Feature Transform) algorithm for the robust representation of local visual content. SIFT features have raised much interest for their descriptive power: they characterize point-wise information robustly against variations of luminance and changes of viewpoint, and they are very useful for capturing local information. For a single image, hundreds of keypoints are found, and they are particularly suitable for tasks dealing with image registration or image matching. In this work we stretched the spatial coverage of descriptors, creating a novel feature as a composition of keypoints present in an image region while maintaining…

research product

A Robotic Humanoid for Information Exploration on Cultural Heritage Domain

The work presented here illustrates a humanoid robot capable of interacting with a human user within the Cultural Heritage domain. Two different and complementary AI approaches, namely sub-symbolic and symbolic, have been implemented and combined to design the framework of a robot having both rational and intuitive capabilities. Furthermore, the robot is capable of providing information expressively and of adapting its behavior according to the emotional content of the artwork descriptions. This could make the robot more effective in providing information and entertaining the users.

research product

Roboception and adaptation in a cognitive robot

In robotics, perception is usually oriented towards understanding what is happening in the external world, while few works pay attention to what is occurring in the robot’s body. In this work, we propose an artificial somatosensory system, embedded in a cognitive architecture, that enables a robot to perceive the sensations from its embodiment while executing a task. We call these perceptions roboceptions, and they let the robot act according to its own physical needs in addition to the task demands. Physical information is processed by the robot to behave in a balanced way, determining the most appropriate trade-off between the achievement of the task and its well-being. The experiments show …

research product

Automatic image representation and clustering on mobile devices.

In this paper a novel approach for the automatic representation of pictures on mobile devices is proposed. With the wide diffusion of mobile digital image acquisition devices, the need to manage a large number of digital images is quickly increasing. In fact, the storage capacity of such devices allows users to store hundreds, or even thousands, of pictures that, without a proper organization, become useless. Users may be interested in using (e.g., browsing, saving, printing and so on) a subset of the stored data according to some particular picture properties. A content-based description of each picture is needed to perform on-board image indexing. In our work the images are analyzed and descri…

research product

A Geometric Approach to Automatic Description of Iconic Scenes

A step is proposed towards the automatic description of scenes with a geometric approach. The scenes considered are composed of a set of elements that can be geometric forms or iconic representations of objects. Every icon is characterized by a set of attributes such as shape, colour, position and orientation. Each scene is related to a set of sentences describing its content. The proposed approach builds a data-driven vector semantic space where the scenes and the sentences are mapped. Sentences and scenes with the same meaning are mapped to nearby vectors, and distance criteria allow semantic relations to be retrieved.

research product

Three-domain image representation for personal photo album management

In this paper we present a novel approach for personal photo album management. Pictures are analyzed and described in three representation spaces, namely, faces, background and time of capture. Faces are automatically detected and rectified using a probabilistic feature extraction technique. Face representation is then produced by computing PCA (Principal Component Analysis). Backgrounds are represented with low-level visual features based on RGB histogram and Gabor filter bank. Temporal data is obtained through the extraction of EXIF (Exchangeable image file format) data. Each image in the collection is then automatically organized using a mean-shift clustering technique. While many system…

research product

An ACT-R Based Humanoid Social Robot to Manage Storytelling Activities

This paper describes an interactive storytelling system, accessible through the SoftBank robotic platforms NAO and Pepper. The main contribution consists of the interpretation of the story characters by humanoid robots, obtained through the definition of appropriate cognitive models, relying on the ACT-R cognitive architecture. The reasoning processes leading to the story evolution are based on the represented knowledge and the suggestions of the listener in critical points of the story. They are disclosed during the narration, to make clear the dynamics of the story and the feelings of the characters. We analyzed the impact of such externalization of the internal status of the characters t…

research product

Recognition of Human Actions Through Deep Neural Networks for Multimedia Systems Interaction

Nowadays, interactive multimedia systems are part of everyday life. The most common way to interact with and control these devices is through remote controls or some sort of touch panel. In recent years, due to the introduction of reliable low-cost Kinect-like sensing technology, more and more attention has been dedicated to touchless interfaces. A Kinect-like device can be positioned on top of a multimedia system, detect a person in front of the system and process skeletal data, optionally with RGBd data, to determine user gestures. The gestures of the person can then be used to control, for example, a media device. Even though there is a lot of interest in this area, currently, no consumer sy…

research product

Deep Metric Learning for Histopathological Image Classification

Neural networks have proven effective in multiple classification tasks, with performance similar to human capabilities. Nevertheless, the viability of applying this kind of tool in real cases depends on the possibility of interpreting the provided results and letting the human operator make a decision according to the information provided. This aspect is even more evident when the field of application is bound to people's health, as in biomedical image classification. For the classification of histopathological images, we propose a convolutional neural network that, through metric learning, learns a representation that gathers in homogeneous clusters the …

research product

A Conceptual Probabilistic Model for the Induction of Image Semantics

In this paper we propose a model based on a conceptual space automatically induced from data. The model is inspired by a well-founded robotics cognitive architecture which is organized in three computational areas: sub-conceptual, linguistic and conceptual. Images are objects in the sub-conceptual area that become "knoxels" in the conceptual area. The application of the framework enables the automatic emergence of image semantics in the linguistic area. The core of the model is a conceptual space induced automatically from a set of annotated images that exploits and mixes different information concerning the set of images. Multiple low-level features are extracted to represent images and…

research product