AUTHOR
Antonio Gentile
Investigating Avatar Influence on Perceived Cognitive Load and Bimanual Interactions with Touchless Interfaces
In recent years, touchless-enabling technologies have been increasingly adopted to provide public displays with gestural interactivity. This has led to the need for novel visual interfaces aimed at solving issues such as communicating interactivity to users, as well as supporting immediate usability and "natural" interactions. In this paper, we focus our investigation on a visual interface based only on the use of in-air direct manipulations. Our study aims at evaluating whether and how the presence of an Avatar that replays the user’s movements may decrease the perceived cognitive workload during interactions. Moreover, we conducted a brief evaluation of the relationship between the presen…
XPL, a Presentation Language based on User Interface Design Pattern
The great diversity of presentations in software applications involves building various types of graphic interface constructions related to different programming languages. Moreover, in the Internet era HTML pages play a main role because of the increasing complexity of Web applications. In Software Engineering, the use of design patterns has proven remarkable for the design and reuse of software components. Visual Design Patterns (ViDP) are useful to define interaction schemas between user and computer. By the same token, visual design patterns are useful to incorporate common interfaces of interaction, schemas between user and computer. This paper describes the eXtensible Present…
Image Processing Chain For Digital Still Cameras Based On The Simpil Architecture
The new generation of wireless devices heralds the development of products for integrated portable image and video communication, whose image and video applications require high computing performance. Portable MultiMedia Supercomputers (PMMS), a new class of architectures, combine the high computational performance needed by multimedia applications with the high energy efficiency needed by portable devices. Among PMMS, the SIMPil (SIMD processor pixel) architecture satisfies the above requirements, especially for video and digital image processing tasks. In this paper, we exploit the SIMPil computation and throughput efficiency to implement the whole image processing chain of a digital s…
MLP Neural Network Implementation on a SIMD Architecture
An Automatic Road Sign Recognition System {A(RS)2} is aimed at detection and recognition of one or more road signs from real-world color images. The authors have proposed an A(RS)2 able to detect and extract sign regions from real-world scenes on the basis of their color and shape features. Classification is then performed on extracted candidate regions using Multi-Layer Perceptron neural networks. Although system performance is good in terms of both sign detection and classification rates, the entire process requires a large computational time, so real-time applications are not feasible. In this paper we present the implementation of the neural layer on the Georgia Institute of Technology …
Touchless gestural interfaces for networked public displays
In the near future, we can easily imagine a significant increase in the diffusion of networked public displays, as well as novel interaction modalities used in their applications. In the following, we present two of the main challenges related to networked displays we are dealing with, with a particular focus on touchless gestural interfaces: overcoming interaction blindness (i.e. enabling users to immediately guess the interactivity of the display, and the gestural nature of it) and performing evaluations in-the-wild (i.e. outside any controlled environment).
Exploring usability and accessibility of avatar-based touchless gestural interfaces for autistic people
Many prior works investigated the potential of pervasive technologies and interactive applications to increase access capabilities to digital content for people with disability, particularly Neuro-Developmental Disorders (NDDs). In this paper, we present an exploratory study aimed at understanding if an avatar-based touchless gestural interface is able to foster interest towards digital representations of artworks, e.g. paintings or sculptures usually exhibited in museums, and to make them more accessible for such people. In particular, the study involved three autistic people and a therapist, and allowed us to report the potential of an avatar to communicate the interactivity and stimulate…
Human-Human Interaction: Trends and Challenges for Pervasive and Mobile Computing
Human-to-human interaction (HHI) is a challenging new domain where networked information systems and intelligent environments surrounding people converge for the purpose of better satisfaction of users' requirements and anticipation of their needs. Ubiquitous and pervasive computing are currently considered mature domains that implement and employ networked information systems on many important applications. Ambient intelligence involves the deployment of sensors and devices into the environment to
CliffoSor: A Parallel Embedded Architecture for Geometric Algebra and Computer Graphics
The representation of geometric objects and their transformations are the two key aspects in computer graphics applications. Traditionally, compute-intensive matrix calculations are involved to model and render 3D scenery. Geometric algebra (a.k.a. Clifford algebra) is gaining growing attention for its natural way to model geometric facts coupled with its being a powerful analytical tool for symbolic calculations. In this paper, the architecture of CliffoSor (Clifford Processor) is introduced. CliffoSor is an embedded parallel coprocessing core that offers direct hardware support to Clifford algebra operators. A prototype implementation on an FPGA board is detailed. Initial test results show more …
GAPPCO: An Easy to Configure Geometric Algebra Coprocessor Based on GAPP Programs
Because of the high numeric complexity of Geometric Algebra, its use in engineering applications relies heavily on tools and devices for efficient implementations. In this article, we present a novel hardware design for a Geometric Algebra coprocessor, called GAPPCO, which is based on Geometric Algebra Parallelism Programs (GAPP). GAPPCO is a design for a coprocessor combining the advantages of optimizing software with configurable hardware able to implement arbitrary Geometric Algebra algorithms. The idea is to have fixed hardware that can be configured quickly and easily for different algorithms. We describe the new hardware design together with the complete tool chain for its configuration.
KIND-DAMA: A modular middleware for Kinect-like device data management
In the last decades, we have witnessed a growing interest toward touchless gestural user interfaces. Among other reasons, this is due to the large availability of different low-cost gesture acquisition hardware (the so-called “Kinect-like devices”). As a consequence, there is a growing need for solutions that allow to easily integrate such devices within actual systems. In this paper, we present KIND-DAMA, an open and modular middleware that helps in the development of interactive applications based on gestural input. We first review the existing middlewares for gestural data management. Then, we describe the proposed architecture and compare its features against the existing similar so…
Design Space Exploration of Parallel Embedded Architectures for Native Clifford Algebra Operations
In the past few decades, Geometric or Clifford algebra (CA) has received a growing attention in many research fields, such as robotics, machine vision and computer graphics, as a natural and intuitive way to model geometric objects and their transformations. At the same time, the high dimensionality of Clifford algebra and its computational complexity demand specialized hardware architectures for the direct support of Clifford data types and operators. This paper presents the design space exploration of parallel embedded architectures for native execution of four-dimensional (4D) and five-dimensional (5D) Clifford algebra operations. The design space exploration has been described along wit…
Real-Time Hand Pose Recognition Based on a Neural Network Using Microsoft Kinect
The Microsoft Kinect sensor is largely used to detect and recognize body gestures and layout with enough reliability, accuracy and precision in a quite simple way. However, the pretty low resolution of the optical sensors does not allow the device to detect gestures of body parts, such as the fingers of a hand, with the same straightforwardness. Given the clear application of this technology to the field of the user interaction within immersive multimedia environments, there is the actual need to have a reliable and effective method to detect the pose of some body parts. In this paper we propose a method based on a neural network to detect in real time the hand pose, to recognize whether it…
Accelerating Clifford Algebra Operations using GPUs and an OpenCL Code Generator
Clifford Algebra (CA) is a powerful mathematical language that allows for a simple and intuitive representation of geometric objects and their transformations. It has important applications in many research fields, such as computer graphics, robotics, and machine vision. Direct hardware support of Clifford data types and operators is needed to accelerate applications based on Clifford Algebra. This paper proposes a mixed software-hardware system that exploits the computational power of Graphics Processing Units (GPUs) to accelerate Clifford operations. A code generator, namely OpenCLifford, is presented that automatically generates Java and C libraries for the direct support of Clifford ele…
The eXtensible Dynamic Presentation Manager for content adaptation
Human Computer Interaction studies deal with systems and tools that are able to improve user experience during interaction with computers. For this purpose, modern web applications are expected to supply multimodal and multi-channel access, adaptivity and transcoding features. We will present in this work the eXtensible Dynamic Presentation Manager (XDPM), which is a set of innovative tools that support the eXtensible Presentation Language (XPL) in the adaptation of contents to different working contexts. The adaptation is performed according to the delivery context information, which has been formalized by means of a profiler system. A profile holds information about the specific access devi…
Embedded Knowledge-based Speech Detectors for Real-Time Recognition Tasks
Speech recognition has become common in many application domains, from dictation systems for professional practices to vocal user interfaces for people with disabilities or hands-free system control. However, so far the performance of automatic speech recognition (ASR) systems is comparable to human speech recognition (HSR) only under very strict working conditions, and in general much lower. Incorporating acoustic-phonetic knowledge into ASR design has been proven a viable approach to raise ASR accuracy. Manner of articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as de…
A Multimodal Guide for Virtual 3D Models of Cultural Heritage Artifacts
The area of cultural heritage preservation and fruition has drawn ever-growing attention from artificial intelligence and human-computer interaction research in the last decades. The common aim is to develop systems that can interact with the user in a variety of modes and in the most natural way. In this paper, a multimodal guide for virtual 3D environment navigation is presented. The proposed system integrates an X3D environment with a multimodal interface. The application scenario is to provide visitors with assistance and guidance during the visit of one of the halls in the historical Palazzo Steri, the headquarters of the University of Palermo.
VEBO: Validation of E-R diagrams through ontologies and WordNet
In the semantic web vision, ontologies are building blocks for providing applications with a high level description of the operating environment in support of interoperability and semantic capabilities. The importance of ontologies in this respect is clearly stated in many works. Another crucial issue to increase the semantic aspect of web is to enrich the level of expressivity of database related data. Nowadays, databases are the primary source of information for dynamical web sites. The linguistic data used to build the database structure could be relevant for extracting meaningful information. In most cases, this type of information is not used for information retrieval. The work present…
Adaptive voice interaction for 3D representation of Cultural Heritage site
In the area of cultural heritage preservation and fruition the development of electronics and information technologies has opened new scenarios of research in the field of survey, representation and communication of Cultural Heritage. The aim is thus to make the fruition of works of art available to as many users as possible by using survey techniques and multimodal interaction. In this paper we propose a multimodal fruition of a virtual representation of a medieval ceiling, built in the XIV century, which covers the “Sala Magna” in the Steri of Palermo. The research deals with the design of an intelligent relational agent which interacts with the user in a natural way. To address this very…
Exploiting multimodality for intelligent mobile access to pervasive services in cultural heritage sites
In this chapter the role of multimodality in intelligent, mobile guides for cultural heritage environments is discussed. Multimodal access to information contents enables the creation of systems with a higher degree of accessibility and usability. A multimodal interaction may involve several human interaction modes, such as sight, touch and voice to navigate contents, or gestures to activate controls. We first start our discussion by presenting a timeline of cultural heritage system evolution, spanning from 2001 to 2008, which highlights design issues such as intelligence and context-awareness in providing information. Then, multimodal access to contents is discussed, along with problems an…
An Intelligent Multimodal Site-guide for the "Parco Archeologico della Valle dei Templi di Agrigento"
Driver Assistance Systems: a Real-Time Road Signs Recognizer
Exploitation of Mobile Access to Context-Based Information in Cultural Heritage Fruition
More than one billion smartphone users are estimated by 2014. With this in mind, visiting cultural heritage sites and exhibits may offer a new level of engagement and entertainment just by reaching down into our pockets. With orders of magnitude more computational horsepower than a five-year-old desktop machine, and stuffed with all sorts of sensors, these modern gizmos have a largely untapped potential to give us access to personalized and on-demand information wherever it is needed. This paper is exactly about this, exploring with several case studies how these devices may become part of a memorable experience during a visit that one may want to share with friends and relatives. Specifically, the p…
Information Assurance and Advanced Human-Computer Interfaces
Impulse noise removal on an embedded, low memory SIMD processor
Vector median filters efficiently reduce noise while preserving image details. However, their high computational complexity for color images makes them impractical for real-time systems. We propose new computationally efficient filtering algorithms, called index mapping filters (IMF). These filtering algorithms are accelerated by implementing them on a massively data parallel processor array. In addition to greater computational efficiency, these algorithms result in robust noise reduction of corrupted color images. Analyses of mean square error, signal-to-noise-ratio, and visual comparison metrics indicate that IMF are competitive with the vector median filter (VMF) in their ability to cor…
Human-to-human interfaces: emerging trends and challenges
We present a new research domain, human-to-human interaction (HHI), that describes how today's human interaction is largely indirect and mediated by a wide variety of technologies and devices. We show how this new and exciting field of design originates from the convergence of a few well-established research areas, such as traditional graphical user interfaces (GUIs), tangible user interfaces (TUIs), touchless gesture user interfaces (TGUIs), voice user interfaces (VUIs), and brain computer interfaces (BCIs). We analyse and describe current research in those areas and offer a first-hand view and presentation of its salient aspects for the human-to-human interaction domain.
Efficient rapid prototyping of image and video processing algorithms
For real-time execution, image and video processing tasks are often confined to large workstations or expensive custom-designed hardware. The current availability of mature reconfigurable hardware, like Field Programmable Gate Arrays (FPGAs), coupled with the usage of hardware programming languages offers a good path for porting such applications to portable devices. This paper explores the rapid prototyping of a real-time road sign recognition system on an FPGA, using an algorithmic-like hardware programming language: the Handel-C language. We investigate the relationship between efficient Handel-C data, structures, constructs and the related high-level C data, structures, constructs.…
A Dual-Core Coprocessor with Native 4D Clifford Algebra Support
Geometric or Clifford Algebra (CA) is a powerful mathematical tool that is attracting a growing attention in many research fields such as computer graphics, computer vision, robotics and medical imaging for its natural and intuitive way to represent geometric objects and their transformations. This paper introduces the architecture of CliffordCoreDuo, an embedded dual-core coprocessor that offers direct hardware support to four-dimensional (4D) Clifford algebra operations. A prototype implementation on an FPGA board is detailed. Experimental results show a 1.6× average speedup of CliffordCoreDuo in comparison with the baseline mono-core architecture. A potential cycle speedup of about 40× o…
An FPGA Implementation of a Quadruple-Based Multiplier for 4D Clifford Algebra
Geometric or Clifford algebra is an interesting paradigm for geometric modeling in fields such as computer graphics, machine vision and robotics. In these areas the research effort is currently aimed at finding an efficient implementation of geometric algebra. The best way to exploit the symbolic computing power of geometric algebra is to support its data types and operators directly in hardware. However, the natural representation of the algebra elements as variable-length objects causes some problems in the case of a hardware implementation. This paper proposes a 4D Clifford algebra in which the variable-length elements are mapped into fixed-length elements (quadruples). This choice leads to a s…
A Multimodal Guide for the Augmented Campus
The use of Personal Digital Assistants (PDAs) with ad-hoc built-in information retrieval and auto-localization functionalities can help people navigate an environment in a more natural manner compared to traditional audio/visual pre-recorded guides. In this work we propose and discuss a user-friendly, multi-modal guide system for pervasive context-aware service provision within augmented environments. The proposed system is adaptable to the user's mobility needs within a given environment; it is usable on different mobile devices and in particular on PDAs, which are used as advanced adaptive HEI (human-environment interaction) interfaces. An information retrieval service is provided that…
An Intelligent Multimodal Site-guide for the “Parco Archeologico della Valle dei Templi” in Agrigento
Message from the SEC 2007 Symposium Chairs
An Embedded, FPGA-based Computer Graphics Coprocessor with Native Geometric Algebra Support
The representation of geometric objects and their transformations are the two key aspects in computer graphics applications. Traditionally, compute-intensive matrix calculations are involved in modeling and rendering three-dimensional (3D) scenery. Geometric algebra (aka Clifford algebra) is attracting attention as a natural way to model geometric facts and as a powerful analytical tool for symbolic calculations. In this paper, the architecture of Clifford coprocessor (CliffoSor) is introduced. CliffoSor is an embedded parallel coprocessing core that offers direct hardware support to Clifford algebra operators. A prototype implementation on a field-programmable gate array (FPGA) board is detailed…
XPL and the Synchronization of Multimodal User Interfaces based on Design Pattern
The great diversity of presentations in software applications deals with fulfilment of various types of user interface constructions related to different programming languages. Furthermore, the growing interest for multimodal applications entails that their user interfaces have to support multiple access channels within a single development framework. User Interfaces Design Patterns (UIDPs) are helpful to define interaction schemas between user and computer and they provide remarkable tools for the design and reuse of software components. This paper describes the eXtensible Presentation architecture and Language (XPL), a framework aimed at streamlining multi-channel interface design process…
Knowledge Discovery and Digital Cartography for the ALS (Linguistic Atlas of Sicily) Project
In this paper the latest developments of the ALS (Linguistic Atlas of Sicily) project are presented. The ALS project aims to define methodologies and tools to support research in the socio-linguistic field. Different types of variables (both quantitative and qualitative) are involved. The whole framework is based on the definition of ontology-based applications for the creation, retrieval, manipulation and browsing of related data. To this aim, some mapping processes have been defined. The framework eventually shows the result in many ways including spatial maps. The on-going collaboration process is a perfect example of a domain hybridizing process, enabling the training on-the-fie…
Neural Classification of HEP Experimental Data
High Energy Physics (HEP) experiments require discrimination of a few interesting events among a huge number of background events generated during an experiment. Hierarchical triggering hardware architectures are needed to perform these tasks in real time. In this paper, three neural network models are studied as possible candidates for such systems. A modified Multi-Layer Perceptron (MLP) architecture and an E alpha Net architecture are compared against a traditional MLP. Test error below 25% is achieved by all architectures in two different simulation strategies. E alpha Net performance is 1 to 2% better on test error with respect to the other two architectures using the smaller network topol…
Biologically Inspired Vision Architectures: a Software/Hardware Perspective
Even though the field of computer vision has seen huge improvement in the last few decades, computer vision systems still lack, in most cases, the efficiency of biological vision systems. In fact, biological vision systems routinely accomplish complex visual tasks such as object recognition, obstacle avoidance, and target tracking, which continue to challenge artificial systems. The study of biological vision systems remains a strong cue for the design of devices exhibiting intelligent behaviour in visually sensed environments, but current artificial systems are vastly different from biological ones for various reasons. First of all, biologically inspired vision architectures, which are continu…
Lessico e e-learning
Portable Video Supercomputing
As inexpensive imaging chips and wireless telecommunications are incorporated into an increasing array of portable products, the need for high efficiency, high throughput embedded processing will become an important challenge in computer architecture. Videocentric applications, such as wireless videoconferencing, real-time video enhancement and analysis, and new, immersive modes of distance education, will exceed the computational capabilities of current microprocessor and digital signal processor (DSP) architectures. A new class of embedded computers, portable video supercomputers, will combine supercomputer performance with the energy efficiency required for deployment in portable systems. …
A Multimodal Interaction Guide for Pervasive Services Access
A pervasive, multimodal virtual guide for a cultural heritage site tour is illustrated. The guide is based on the integration of different technologies such as conversational agents, commonsense reasoning knowledge bases, multimodal interfaces and self-location detection systems. The aim of the work is to offer a more natural, context sensitive access to information with respect to traditional audio/visual pre-recorded guides. A prototype has been developed and implemented on a Qtek 9090 with Windows Mobile 2003 in order to deal with the "Museo Archeologico Regionale di Agrigento" domain.
Embedded Coprocessors for Native Execution of Geometric Algebra Operations
Clifford algebra or geometric algebra (GA) is a simple and intuitive way to model geometric objects and their transformations. Operating in high-dimensional vector spaces with significant computational costs, the practical use of GA requires dedicated software and/or hardware architectures to directly support Clifford data types and operators. In this paper, a family of embedded coprocessors for the native execution of GA operations is presented. The paper shows the evolution of the coprocessor family focusing on the latest two architectures that offer direct hardware support to up to five-dimensional Clifford operations. The proposed coprocessors exploit hardware-oriented representations o…
Exploiting Correlation between Body Gestures and Spoken Sentences for Real-time Emotion Recognition
Humans communicate their affective states through different media, both verbal and non-verbal, often used at the same time. The knowledge of the emotional state plays a key role to provide personalized and context-related information and services. This is the main reason why several algorithms have been proposed in the last few years for the automatic emotion recognition. In this work we exploit the correlation between one's affective state and the simultaneous body expressions in terms of speech and gestures. Here we propose a system for real-time emotion recognition from gestures. In a first step, the system builds a trusted dataset of association pairs (motion data -> emotion pattern), a…
A Specialized Architecture for Color Image Edge Detection Based on Clifford Algebra
Edge detection of color images is usually performed by applying the traditional techniques for gray-scale images to the three color channels separately. However, human visual perception does not differentiate colors and processes the image as a whole. Recently, new methods have been proposed that treat RGB color triples as vectors and color images as vector fields. In these approaches, edge detection is obtained extending the classical pattern matching and convolution techniques to vector fields. This paper proposes a hardware implementation of an edge detection method for color images that exploits the definition of geometric product of vectors given in the Clifford algebra framework to ex…
XPL the Extensible Presentation Language
The last decade has witnessed a growing interest in the development of web interfaces enabling both multiple ways to access contents and, at the same time, fruition by multiple modalities of interaction (point-and-click, contents reading, voice commands, gestures, etc.). In this paper we describe a framework aimed at streamlining the design process of multi-channel, multimodal interfaces enabling full reuse of software components. This framework is called the eXtensible Presentation architecture and Language (XPL), a presentation language based on the design pattern paradigm that keeps the presentation layer separate from the underlying programming logic. The language supplies a methodology to…
An Evaluation of HCI and CMC in Information Systems within Highly Crowded Large Events
Pervasive systems are composed of a large variety of networked smart devices that supposedly enrich the environment they are deployed in. The access to services provided by a pervasive system should be as natural and “unconscious” as possible. In a large number of cases, the available interaction modality seems to be more oriented towards showing off technological wonders rather than to the actual usability of the interface. In this paper we evaluate and compare two different versions of an information provision system deployed in two editions of a large fair. In particular, we will focus on the Human-Computer Interaction (HCI) and Computer-Mediated-Communication (CMC) points of view. The a…
Clifford Algebra based Edge Detector for Color Images
Edge detection is one of the most used methods for feature extraction in computer vision applications. Feature extraction is traditionally founded on pattern recognition methods exploiting the basic concepts of convolution and Fourier transform. For color image edge detection the traditional methods used for gray-scale images are usually extended and applied to the three color channels separately. This leads to increased computational requirements and long execution times. In this paper we propose a new, enhanced version of an edge detection algorithm that treats color value triples as vectors and exploits the geometric product of vectors defined in the Clifford algebra framework to extend …
SIMPil-K: a SIMD Reconfigurable Platform Processor for Real-Time Image Processing
SISTEMA DI NAVIGAZIONE VIRTUALE MULTIMODALE
A Multimodal Fruition Model for Graphical Contents in Ancient Books
One of the most common and efficient ways to preserve ancient books is to digitize them and make their content browsable in some way. However, this process often does not take into account the fruition point of view. Exploiting the currently available technologies allows for new ways of content fruition, to attract more people and help them to better understand the available information. In this paper, we present a solution for an innovative and multimodal exploration of ancient book contents, using both touch and touchless gestures. This allows both near- and far-distance interaction, which results in different interaction models.
Body Gestures and Spoken Sentences: A Novel Approach for Revealing User’s Emotions
In the last decade, there has been a growing interest in emotion analysis research, which has been applied in several areas of computer science. Many authors have contributed to the development of emotion recognition algorithms, considering textual or non verbal data as input, such as facial expressions, gestures or, in the case of multi-modal emotion recognition, a combination of them. In this paper, we describe a method to detect emotions from gestures using the skeletal data obtained from Kinect-like devices as input, as well as a textual description of their meaning. The experimental results show that the correlation existing between body movements and spoken user sentence(s) can be u…
Touchless Interfaces For Public Displays
Public displays have lately become ubiquitous thanks to the decreasing cost of such technology and public policies supporting the development of smart cities. Depending on form factor, those displays might use touchless gestural interfaces, which are therefore increasingly the subject of public and private research. In this paper, we focus on touchless interactions with situated public displays, and introduce a pilot study comparing two interfaces: an interface based on the Microsoft Human Interface Guidelines (HIG), a de facto standard in the field, and a novel interface, designed by us. Unlike the HIG-based one, our interface displays an avatar, which does not require an…
A touchless gestural system for extended information access within a campus
In the last two decades, we have witnessed a growing spread of touchless interfaces, facilitated by the higher performance of computational systems, as well as the increased availability of cheaper sensors and devices. Focusing on gestural input, several researchers and designers have used Kinect-like devices to implement touchless gestural interfaces. The latter extend the possible deployments and usage of public interactive displays. For example, wall-sized displays may become interactive even if they are unreachable by touch. Moreover, billboard-sized displays may be placed in safe cases to avoid vandalism, while still maintaining their interactivity. Finally, people with temporary or …
Real-Time Body Gestures Recognition Using Training Set Constrained Reduction
Gesture recognition is an emerging cross-discipline research field, which aims at interpreting human gestures and associating them with a well-defined meaning. It has been used as a means of supporting human-to-machine interaction in several applications of robotics, artificial intelligence, and machine learning. In this paper, we propose a system able to recognize human body gestures which implements a constrained training set reduction technique. This allows the system to run in real time. The system has been tested on a publicly available dataset of 7,000 gestures, and experimental results have highlighted that at the cost of a little decrease in the maximum achievable recognition ac…
Multimodal and Agent-Based Human–Computer Interaction in Cultural Heritage Applications: an Overview
One of the most recent and interesting applications of human–computer interaction technologies is the provision of advanced information services within public places, such as cultural heritage sites or schools and university campuses. In such contexts, concurrent technologies used in smart mobile devices can be exploited to satisfy the mobility needs of users, allowing them to access relevant resources in a context-dependent manner. Of course, most of the constraints to be taken into account when designing a pervasive information providing system are given by the actual domain where it is deployed.
Human-to-Human Interaction: The Killer Application of Ubiquitous Computing?
Twenty-five years after Weiser’s vision of Ubiquitous Computing, there is still no clear understanding of what is or is not a pervasive system. Due to the loose boundaries of this paradigm, almost any kind of remotely accessible networked system is classified as a pervasive system. We think that this is mainly due to the lack of killer applications that could make this vision clearer. Actually, we think that the most promising killer application is already here and perfectly fits Weiser’s vision, but we are so used to it that we do not see it: Human-to-Human Interaction mediated by computers.
Continuous hand openness detection using a Kinect-like device
This paper presents a novel method to reproduce in real time the opening and closing gestures of a human hand, animating a three-dimensional model of it. In other works, this result is achieved by mapping a set of significant points of a real hand onto the corresponding points of the model to animate. We propose an alternative way to produce the same effect without mapping points, but using a level-based estimation of the degree of opening of the hand. The experiments have been executed using Microsoft Kinect™, but the method would work on any other Kinect-like device (as defined herein). The results obtained are particularly encouraging and demonstrate real-time performance of the syst…
ConformalALU: A Conformal Geometric Algebra Coprocessor for Medical Image Processing
Medical imaging involves important computational geometric problems, such as image segmentation and analysis, shape approximation, three-dimensional (3D) modeling, and registration of volumetric data. In the last few years, Conformal Geometric Algebra (CGA), based on five-dimensional (5D) Clifford Algebra, has been emerging as a new paradigm that offers simple and universal operators for the representation and solution of complex geometric problems. However, the widespread use of CGA has so far been hindered by its high dimensionality and computational complexity. This paper proposes a simplified formulation of the conformal geometric operations (reflections, rotations, translations, and uniform …
A Study of Perceptron Mapping Capability to Design Speech Event Detectors
Event detection is a fundamental yet critical component in automatic speech recognition (ASR) systems that attempt to extract knowledge-based features at the front-end level. In this context, it is common practice to design the detectors inside well-known frameworks based on artificial neural networks (ANNs) or support vector machines (SVMs). In the case of ANNs, speech scientists often design their detector architecture relying on a conventional feed-forward multi-layer perceptron (MLP) with a sigmoidal activation function. The aim of this paper is to introduce other ANN architectures in the context of detection-based ASR. In particular, a bank of feed-forward MLPs using sinusoidal activation f…
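To make the contrast between sigmoidal and sinusoidal hidden units concrete, here is a minimal sketch of a one-hidden-layer feed-forward pass whose hidden activation can be swapped. The function name `mlp_forward`, the layer sizes and all weights are hypothetical and chosen only for illustration; the paper's actual detector architecture is not reproduced here.

```python
import math


def sigmoid(z):
    """Conventional logistic activation."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b, activation=math.sin):
    """One-hidden-layer MLP with a pluggable hidden activation.

    x        : list of input features
    hidden_w : one weight row per hidden unit
    hidden_b : one bias per hidden unit
    out_w    : weights of the single output unit
    out_b    : bias of the single output unit
    The output passes through a logistic so it can be read as a detection score.
    """
    hidden = [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    z = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return sigmoid(z)


# Same (made-up) weights, two different hidden activations.
x = [0.5, -1.0]
hidden_w = [[1.0, 0.5], [-0.5, 2.0]]
hidden_b = [0.0, 0.1]
score_sin = mlp_forward(x, hidden_w, hidden_b, [1.0, -1.0], 0.0, activation=math.sin)
score_sig = mlp_forward(x, hidden_w, hidden_b, [1.0, -1.0], 0.0, activation=sigmoid)
```

Keeping the activation as a parameter makes it easy to compare the two families of hidden units on identical weights and inputs.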
CliffoSor: a Parallel embedded architecture for Geometric Algebra and Computer Graphics
Collective Reasoning over Shared Concepts for the Linguistic Atlas of Sicily
In this chapter, collective intelligence principles are applied in the context of the Linguistic Atlas of Sicily (ALS - Atlante Linguistico Siciliano), an interdisciplinary research focusing on the study of the Italian language as it is spoken in Sicily, and its correlation with the Sicilian dialect and other regional varieties spoken in Sicily. The project has been developed over the past two decades and includes a complex information system supporting linguistic research; recently it has grown to allow research scientists to cooperate in an integrated environment to produce significant scientific advances in the field of ethnologic and sociolinguistic research. An interoperable infrastruc…
A New Min-Max Optimisation Approach for Fast Learning Convergence of Feed-Forward Neural Networks
One of the most critical aspects for a wide use of neural networks in real-world problems is related to the learning process, which is known to be computationally expensive and time-consuming.
Investigating how user avatar in touchless interfaces affects perceived cognitive load and two-handed interactions
In recent years, touchless-enabling technologies have been more and more adopted for providing public displays with gestural interactivity. This has led to the need for novel visual interfaces aimed at solving issues such as communicating interactivity to users, as well as supporting immediate usability and "natural" interactions. In this paper, we focus our investigation on a visual interface based only on the use of in-air direct manipulations. Our study aims at evaluating whether and how the presence of an Avatar that replays user's movements may decrease the perceived cognitive workload during interactions. Moreover, we conducted a brief evaluation of the relationship between the presen…
Design and Implementation of an Efficient Fingerprint Features Extractor
Biometric recognition systems are rapidly evolving technologies and their use in embedded devices for accessing and managing data and resources is a very challenging issue. Usually, they are composed of three main modules: Acquisition, Features Extraction and Matching. In this paper the hardware design and implementation of an efficient fingerprint features extractor for embedded devices is described. The proposed architecture, designed for different acquisition sensors, is composed of four blocks: Image Pre-processor, Macro-Features Extractor, Micro-Features Extractor and Master Controller. The Image Pre-processor block increases the quality level of the input raw image and performs an a…
A Family of Embedded Coprocessors with Native Geometric Algebra Support
Clifford Algebra or Geometric Algebra (GA) is a simple and intuitive way to model geometric objects and their transformations. Operating in high-dimensional vector spaces with significant computational costs, the practical use of GA requires, however, dedicated software and/or hardware architectures to directly support Clifford data types and operators. In this paper, a family of embedded coprocessors for the native execution of GA operations is presented. The paper shows the evolution of the coprocessor family focusing on the latest two architectures that offer direct hardware support to up to five-dimensional Clifford operations. The proposed coprocessors exploit hardware-oriented represe…
Designing Touchless Gestural Interactions for Public Displays In-the-Wild
Public displays, typically equipped with touchscreens, are used for interactions in public spaces, such as streets or fairs. Currently, low-cost visual sensing technologies, such as Kinect-like devices and high-quality cameras, make it easy to implement touchless interfaces. Nevertheless, the arising interactions have not yet been fully investigated for public displays in-the-wild (i.e. in appropriate social contexts where public displays are typically deployed). Different audiences, cultures and social settings strongly affect users and their interactions. Besides, gestures for public displays must be guessable to be easy to use for a wide audience. Issues like these could be solved with use…
MAGA: A Mobile Archaeological Guide at Agrigento
A New Embedded Coprocessor for Clifford Algebra based Software Intensive Systems
Computer graphics applications require efficient tools to model geometric objects and their transformations. Clifford algebra (also known as geometric algebra) is receiving growing attention in many research fields, such as computer graphics, machine vision and robotics, as a new, interesting computational paradigm that offers a natural and intuitive way to perform geometric calculations. At the same time, compute-intensive graphics algorithms require the execution of millions of Clifford operations. Clifford algebra based software intensive systems therefore need the support of specialized hardware architectures capable of accelerating the execution of Clifford operations. In this paper the archite…
The ALSWEB Framework: A Web-based Framework for the Linguistic Atlas of Sicily Project
In this work the ALSWEB framework is presented. ALSWEB is a virtual laboratory for linguistic research developed as a web application. The purpose of the framework is to model the entire process, covering the different steps of data acquisition, data transformation, information acquisition from different data, and verification of research hypotheses in the ALS (Linguistic Atlas of Sicily) project. The nature of the ALS research involves different types of data. The socio-linguistic researcher, who is the main actor of the proposed framework, has to acquire information in many formats: multimedia data, audio data, and textual question-answer data from particular questionnaires. In this wor…
Surveying, modeling and communication techniques for the documentation of medieval wooden painted ceilings in the Mediterranean area
Wooden painted ceilings of the Mediterranean area in the Middle Ages have their origin in Islamic culture and then spread to the countries under Arab dominion; some of the surviving ceilings are now located in Sicily and Spain. In the historic centre of Palermo, two well-preserved medieval ceilings still survive: the first, built in the XII century, is located in the Palatine Chapel; the second, built in the XIV century, covers the “Sala Magna” in the Steri of Palermo. The research, focused on the ceiling in the Steri, deals with the definition of a process for the integration of surveying techniques (photogrammetry, laser scanning), modelling processes and commu…
Interacting with Augmented Environments
Pervasive systems augment environments by integrating information processing into everyday objects and activities. They consist of two parts: a visible part populated by animate (visitors, operators) or inanimate (AI) entities interacting with the environment through digital devices, and an invisible part composed of software objects performing specific tasks in an underlying framework. This paper presents ongoing work from the University of Palermo's Department of Computer Science and Engineering that addresses two issues related to simplifying and broadening augmented environment access.
Internet of things: why we are not there yet
Twenty-one years have passed since Weiser's vision of ubiquitous computing (UbiComp) was written, and it is yet to be fully realized, despite almost all the needed technologies being already available. Still, the widespread interest in UbiComp and the results in some of its fields pose a question: why are we not there yet? It seems we are missing the 'octopus' head. In this paper, we try to depict the reasons why we are not there yet, from three different points of view: interaction media, device integration and applications.
A Dynamic System for Personal Communications: The Opportunistic Chat
Most currently available inter-personal instant messaging systems are client-server based. Users of such systems need to be connected to some centralized entity whose goal is to supply them with the information needed to make communication possible, such as the list of connected users, their status, and the address of their devices. Our proposed system for an “Opportunistic Chat” allows people to exchange written messages over Bluetooth and (if necessary) TCP/IP connections, with no need for any kind of centralized entity, by using ad-hoc procedures that are automatically selected and operated. Such a system can be accessed from almost any kind of device, either mobile or not, ranging from personal…
Enabling Multimodal Interaction in XPL – the eXtensible Presentation Language
This paper introduces the multimodal extension of the eXtensible Presentation architecture and Language (XPL), a framework aimed at streamlining the multi-channel interface design process and enabling full component reuse. XPL incorporates a presentation language based on the design pattern paradigm, which supplies a clear distinction between the presentation layer and the corresponding programming logic, promoting contents aggregation and a variety of event handlers described without relying on a (procedural) scripting language. In this paper, the design pattern concept is extended to voice-based interaction, and two verbal design patterns (VeDPs) are introduced along with their visual counterparts. T…
Platforms for Human-Human Interaction in Large Social Events
In this paper we present the evolution of QRouteMe, an information system built to provide people with rich user experiences when attending museums or exhibits. QRouteMe is a platform for indirect, mediated, and facilitated interactions among humans during large social events, by means of a wide variety of concurrent technologies and devices. The system evolution is analyzed according to the new human-to-human interaction (HHI) research domain. In particular, we show how QRouteMe has been adapted for the different events in which it has been used. We analyze and describe social interaction aspects during the events and we discuss some data and results. Finally we outline some possible futur…
Multimodal virtual navigation of a cultural heritage site: The medieval ceiling of Steri in Palermo
The advance of information technology has in recent years enabled new fruition scenarios for cultural heritage sites. Multidisciplinary approaches integrate survey techniques with multimodal interfaces to allow enhanced fruition for larger groups of users. In this paper we propose a multimodal interface to a virtual representation of a medieval ceiling, built in the XIV century, which covers the “Sala Magna” of Steri, the historical headquarters of the University of Palermo, in Italy. This research deals with the definition of a process for the integration of surveying techniques, modelling processes and communication technologies for the documentation of such artifacts. This is a two-stage …
Multimodal Mean Adaptive Backgrounding for Embedded Real-Time Video Surveillance
Automated video surveillance applications require accurate separation of foreground and background image content. Cost-sensitive embedded platforms place real-time performance and efficiency demands on techniques to accomplish this task. In this paper we evaluate pixel-level foreground extraction techniques for a low-cost integrated surveillance system. We introduce a new adaptive technique, multimodal mean (MM), which balances accuracy, performance, and efficiency to meet embedded system requirements. Our evaluation compares several pixel-level foreground extraction techniques in terms of their computation and storage requirements, and functional accuracy for three representative video sequ…
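For readers unfamiliar with pixel-level foreground extraction, the sketch below shows a single-mode running-average background model on grayscale pixel values. It is deliberately simpler than the multimodal mean (MM) technique named in the abstract and is not a reproduction of it; the function name `update_background` and the parameters `alpha` and `threshold` are hypothetical.

```python
def update_background(background, frame, alpha=0.05, threshold=30):
    """Classify each pixel and adapt a running-average background.

    background : list of per-pixel background estimates (floats)
    frame      : list of per-pixel intensities for the current frame
    alpha      : learning rate of the running average (assumed value)
    threshold  : intensity difference above which a pixel is foreground
    Returns (new_background, foreground_mask).
    """
    new_background = []
    foreground_mask = []
    for bg, px in zip(background, frame):
        is_foreground = abs(px - bg) > threshold
        foreground_mask.append(is_foreground)
        if is_foreground:
            # Do not absorb foreground pixels into the background estimate.
            new_background.append(bg)
        else:
            # Blend the current value into the background estimate.
            new_background.append((1.0 - alpha) * bg + alpha * px)
    return new_background, foreground_mask


# First pixel changes slightly (background), second changes a lot (foreground).
bg, mask = update_background([100.0, 100.0], [102, 200])
```

The per-pixel cost is a handful of arithmetic operations and one stored value, which illustrates why the computation and storage requirements of such models matter on embedded platforms.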
Experiences with CiceRobot, a Museum Guide Cognitive Robot
The paper describes CiceRobot, a robot based on a cognitive architecture for robot vision and action. The aim of the architecture is to integrate visual perception and actions with knowledge representation, in order to let the robot generate a deep inner understanding of its environment. The principled integration of perception, action and symbolic knowledge is based on the introduction of an intermediate representation based on Gärdenfors' conceptual spaces. The architecture has been tested on an RWI B21 autonomous robot on tasks related to guided tours in the Archaeological Museum of Agrigento. Experimental results are presented.
A real-time network architecture for biometric data delivery in Ambient Intelligence
Ambient Intelligent applications involve the deployment of sensors and hardware devices into an intelligent environment surrounding people, meeting users’ requirements and anticipating their needs (Ambient Intelligence, AmI). Biometrics plays a key role in surveillance and security applications. Fingerprint, iris and voice/speech traits can be acquired by contact, contact-less, and at-a-distance sensors embedded in the environment. The transmission and delivery of biometric traits is very critical and needs a real-time transmission network with guaranteed performance and QoS. Wireless networks become suitable for AmI if they are able to satisfy real-time communication and security system requi…
Modular Middleware for Gestural Data and Devices Management
In the last few years, the use of gestural data has become a key enabler for human-computer interaction (HCI) applications. The growing diffusion of low-cost acquisition devices has thus led to the development of a class of middleware aimed at ensuring a fast and easy integration of such devices within actual HCI applications. The purpose of this paper is to present a modular middleware for gestural data and devices management. First, we give a brief review of the state of the art of similar middleware. Then, we discuss the proposed architecture and the motivation behind its design choices. Finally, we present a use case aimed at demonstrating the potential uses as well as the limit…
An RFID framework for multimodal service provision
In recent years there has been a growing interest in the development of pervasive and context-aware services, and RFID technology has played a relevant role in the context sensing task. We propose the use of RFID technology together with a conversational agent in order to implement a multimodal information retrieval service we call SensorMesh. The information acquired from RFID tags about the nearest point of interest is processed by the conversational agent, which carries out a more natural interaction with the user, also exploiting a common sense ontology. The service is accessible using a multimodal browser on Personal Digital Assistants (PDAs); the browser allows the user to interact with the…
GAPP Compiler for Hardware Accelerated Geometric Algebra Computing
Because of the high numeric complexity of Geometric Algebra, its use in engineering applications relies heavily on tools for efficient implementations. In this article, we introduce a new quality of Geometric Algebra Computing solutions based on a new compiler for Geometric Algebra Parallelism Programs (GAPP). These programs are already optimized in the sense that only the really needed computations are left. The GAPP compiler is able to generate two output formats leading to advanced hardware accelerated Geometric Algebra Computing. On one hand, there is the direct generation of HSAIL code, in order to more efficiently support the solutions of the broad range of heterogeneous computing architectur…
A neural network based automatic road signs recognizer
Automatic road sign recognition systems are aimed at detection and recognition of one or more road signs from real-world color images. In this research, road signs are detected and extracted from real world scenes on the basis of their color and shape features. A dynamic region growing technique is adopted to enhance color segmentation results obtained in the HSV color space. The technique is based on a dynamic threshold that reduces the effect of hue instability in real scenes due to external brightness variation. Classification is then performed on extracted candidate regions using multilayer perceptron neural networks. The obtained results show good detection and recognition rates of the…
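The idea of segmenting by color in HSV space can be sketched as follows. This minimal example uses a fixed hue window and a saturation floor; it does not reproduce the dynamic threshold or the region growing step described in the abstract, and the function name `hue_mask` and the parameters `target_hue`, `hue_tol` and `min_sat` are hypothetical.

```python
import colorsys


def hue_mask(pixels, target_hue=0.0, hue_tol=0.05, min_sat=0.4):
    """Mark pixels whose hue is close to target_hue.

    pixels     : iterable of (r, g, b) tuples with 8-bit channels
    target_hue : hue in [0, 1) as used by colorsys (0.0 is red)
    hue_tol    : half-width of the accepted hue window (assumed value)
    min_sat    : minimum saturation, to reject gray pixels (assumed value)
    """
    mask = []
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Hue is circular, so measure the distance around the wheel.
        distance = abs(h - target_hue)
        distance = min(distance, 1.0 - distance)
        mask.append(distance <= hue_tol and s >= min_sat)
    return mask


# Bright red, dark red, blue and gray pixels.
result = hue_mask([(255, 0, 0), (120, 10, 10), (0, 0, 255), (128, 128, 128)])
```

Both the bright and the dark red pixel pass the test because hue is largely independent of brightness, which is the usual motivation for working in HSV rather than RGB.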
A Virtual Shopper Customer Assistant in Pervasive Environments
In this work we propose a smart, human-like PDA-based personal shopper assistant. The system is able to understand the user needs through a spoken natural language interaction and then stores the preferences of the potential customer. Subsequently the personal shopper suggests the most suitable items and shops that match the user profile. The interaction is given by automatic speech recognition and text-to-speech technologies; localization is allowed by the use of Wireless technologies, while the interaction is performed by an Alice-based chat-bot endowed with reasoning capabilities. Besides, being implemented on a PDA, the personal shopper satisfies the user needs of mobility and it is als…
Recent Advances in Mobile and Multimedia Applications
Information Organization and Visualization for the ALS Project
This work presents the adopted methodology and the tools developed for data visualization in the Linguistic Atlas of Sicily (ALS) Project. The ALS Project aims to discover new trends and variables to track the linguistic evolution over time and space in Sicily. The project is focused on how linguistic variables are related to social and economic aspects of the evolution of population and vice-versa. The visualization tools are a relevant aspect of the project to support decisions and disseminate results. One of the major factors is the geographic dependence of data evolution. To this aim, a complex set of tools has been developed: some tools are stand-alone while others are collabor…
XML-based Knowledge Discovery for Linguistic Atlas of Sicily (ALS) Project
The identification of new useful patterns in data is a core process for intelligent systems. Information overflow is directly related to this problem. In this work we propose a knowledge discovery methodology to retrieve useful and novel information from raw data stored in a DBMS. We used ALSDB, a database purpose-built to access structured information obtained from the questionnaires produced in the Linguistic Atlas of Sicily (ALS) project. The ALS project is a decade-long joint effort led by researchers at the Dipartimento di Scienze Filologiche e Linguistiche of the University of Palermo that aims to track and study the geo-linguistic and lexicographic processes ab…
Novel Human-to-Human Interactions from the Evolution of HCI
The modes of interaction made available by the evolution of human-computer interfaces have led to novel Human-to-Human Interaction (HHI) modes, enabling people to cooperate on almost any task, anytime and anywhere. HHI nowadays is largely indirect and mediated by a wide variety of technologies and devices. This new and exciting field of design originates from the convergence of a few well-established research fields within the HCI area, such as traditional Graphical User Interfaces (GUI), Tangible User Interfaces (TUI), Touchless Gesture User Interface (TGUI), Voice User Interfaces (VUI), and Brain Computer Interfaces (BCI). We analyze and describe the evolution of the HCI in those fields, an…
Multimodal fruition of 3D virtual environment for Cultural Heritage sites
Fixed-size Quadruples for a New, Hardware-Oriented Representation of the 4D Clifford Algebra
Clifford algebra (geometric algebra) offers a natural and intuitive way to model geometry in fields such as robotics, machine vision and computer graphics. This paper proposes a new representation based on fixed-size elements (quadruples) of 4D Clifford algebra and demonstrates that this choice leads to an algorithmic simplification which in turn leads to a simpler and more compact hardware implementation of the algebraic operations. In order to prove the advantages of the new, quadruple-based representation over the classical representation based on homogeneous elements, a coprocessing core supporting the new fixed-size Clifford operands, namely Quad-CliffoSor (Quadruple-based Clifford coproces…
Application of EαNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition
Speech recognition has become common in many application domains. Incorporating acoustic-phonetic knowledge into Automatic Speech Recognition (ASR) systems design has been proven a viable approach to raise ASR accuracy. Manner of articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as detectors for manner of articulation attributes starting from representations of speech signal frames. In this paper, a set of six detectors for the above-mentioned attributes is designed based on the E-αNet model of neural networks. This model was chosen for its capability to learn hidden acti…
Midground Object Detection in Real World Video Scenes
Traditional video scene analysis depends on accurate background modeling to identify salient foreground objects. However, in many important surveillance applications, saliency is defined by the appearance of a new non-ephemeral object that is between the foreground and background. This midground realm is defined by a temporal window following the object's appearance; but it also depends on adaptive background modeling to allow detection with scene variations (e.g., occlusion, small illumination changes). The human visual system is ill-suited for midground detection. For example, when surveying a busy airline terminal, it is difficult (but important) to detect an unattended bag which appears…
The impact of grain size on the efficiency of embedded SIMD image processing architectures
The pixel-per-processing-element (PPE) ratio, the amount of image data directly mapped to each processing element, has a significant impact on the area and energy efficiency of embedded SIMD architectures for image processing applications. This paper quantitatively evaluates the impact of PPE ratio on system performance and efficiency for focal-plane SIMD image processing architectures by comparing throughput, area efficiency, and energy efficiency for a range of common application kernels using architectural and workload simulation. While the impact of grain size is affected by the mix of executed instructions within an application program, the most efficient PPE ratio often does not occur at PE…
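As a small worked example of the PPE ratio defined in this abstract, the sketch below divides the number of image pixels by the number of processing elements. The function name `ppe_ratio` and the image and array sizes are hypothetical and are not taken from the paper's experiments.

```python
def ppe_ratio(image_width, image_height, pe_columns, pe_rows):
    """Pixels mapped to each processing element of a SIMD array."""
    processing_elements = pe_columns * pe_rows
    if processing_elements <= 0:
        raise ValueError("the PE array must contain at least one element")
    return (image_width * image_height) / processing_elements


# A 256x256 image on a 64x64 PE array: each PE handles a 4x4 block of pixels.
coarse = ppe_ratio(256, 256, 64, 64)
# The same image on a 256x256 PE array: one pixel per PE (finest grain).
fine = ppe_ratio(256, 256, 256, 256)
```

A larger ratio means fewer, busier processing elements; a ratio of 1 is the finest grain, with one element per pixel.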
A Java-based Wrapper for Wireless Communications
The increasing number of new applications for mobile devices in pervasive environments does not cope well with changes in wireless communications. Developers of such applications have to deal with problems arising from the wireless connections available in the given environment. Middleware is a solution that helps overcome some of these problems. It provides applications with a set of functions that facilitate their development. In this paper we present a Java-based communication wrapper, called SmartTraffic, which allows programmers to seamlessly use TCP or UDP protocols over Bluetooth or any IP-based wireless network. Developers can use SmartTraffic within their Java applications, t…
Design and implementation of an embedded coprocessor with native support for 5D, quadruple-based Clifford algebra
Geometric or Clifford algebra (CA) is a powerful mathematical tool that offers a natural and intuitive way to model geometric facts in a number of research fields, such as robotics, machine vision, and computer graphics. Operating in higher dimensional spaces, its practical use is hindered, however, by a significant computational cost, only partially addressed by dedicated software libraries and hardware/software codesigns. For low-dimensional algebras, several dedicated hardware accelerators and coprocessing architectures have already been proposed in the literature. This paper introduces the architecture of CliffordALU5, an embedded coprocessing core conceived for native execution of up t…
QRouteMe: A Multichannel Information System to Ensure Rich User-Experience in Exhibits and Museums
In this article the QRouteMe system is presented. QRouteMe is a multichannel information system built to ensure rich user experiences in exhibits and museums. The system starts from basic information about a particular exhibit or museum while delivering a wide user experience based on different distribution channels. The organization of the system’s components allows different solutions to be built and delivered simultaneously on different media. A wide range of media, from touch-screen installations to portable devices like smartphones, has been used. The devices can communicate with each other to increase usability and improve the user experience for visitors. Another important featur…
Application of EαNets to Feature Recognition of Articulation Manner in Knowledge-based Automatic Speech Recognition
A Multichannel Information System to Build and Deliver Rich User-Experiences in Exhibits and Museums
In this article a multichannel information system to build and deliver rich user experiences in exhibits and museums is presented. The system was designed to use information about a particular exhibit or museum while delivering a wide user experience based on different distribution channels. The overall information is used to build different solutions that can be delivered simultaneously on different media, from touch-screen installations to portable devices like smartphones. Moreover, all the devices signed in to the environment are able to communicate with each other to increase the usability of the system. A case study and analysis of experimental results are also provided.
LAVORARE CON I LINGUISTI: ESPERIENZA SUL CAMPO DI UN INGEGNERE INFORMATICO (Working with Linguists: Field Experience of a Computer Engineer)
Efficient FPGA Implementation of a Knowledge-Based Automatic Speech Classifier
Speech recognition has become common in many application domains, from dictation systems for professional practices to vocal user interfaces for people with disabilities or hands-free system control. However, so far the performance of Automatic Speech Recognition (ASR) systems is comparable to Human Speech Recognition (HSR) only under very strict working conditions, and in general is far lower. Incorporating acoustic-phonetic knowledge into ASR design has been proven a viable approach to raise ASR accuracy. Manner of articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as dete…
Embedded Real-Time Surveillance Using Multimodal Mean Background Modeling
Automated video surveillance applications require accurate separation of foreground and background image content. Cost-sensitive embedded platforms place real-time performance and efficiency demands on techniques to accomplish this task. In this chapter, we evaluate pixel-level foreground extraction techniques for a low-cost integrated surveillance system. We introduce a new adaptive background modeling technique, multimodal mean (MM), which balances accuracy, performance, and efficiency to meet embedded system requirements. Our evaluation compares several pixel-level foreground extraction techniques in terms of their computation and storage requirements, and functional accuracy for three r…