
AUTHOR

Giosué Lo Bosco

On the Suitability of Neural Networks as Building Blocks for the Design of Efficient Learned Indexes

With the aim of obtaining time/space improvements in classic Data Structures, an emerging trend is to combine Machine Learning techniques with those proper to Data Structures. This new area goes under the name of Learned Data Structures. The motivation for its study is a perceived paradigm shift in Computer Architectures that would favour the use of Graphics Processing Units and Tensor Processing Units over conventional Central Processing Units. In turn, that would favour the use of Neural Networks as building blocks of Classic Data Structures. Indeed, Learned Bloom Filters, which are one of the main pillars of Learned Data Structures, make extensive use of Neural Networks to improve…

research product

Learned Sorted Table Search and Static Indexes in Small-Space Data Models

Machine-learning techniques, properly combined with data structures, have resulted in Learned Static Indexes, innovative and powerful tools that speed up Binary Searches by using additional space relative to the table being searched. Such space is devoted to the machine-learning models. Although in their infancy, these indexes are methodologically and practically important, due to the pervasiveness of Sorted Table Search procedures. In modern applications, model space is a key factor, and a major open question in this area is to assess to what extent one can enjoy the speed-up of Binary Searches achieved by Learned Indexes while using constant or nearly constant-space mod…

research product

Standard Vs Uniform Binary Search and Their Variants in Learned Static Indexing: The Case of the Searching on Sorted Data Benchmarking Software Platform

Learned Indexes are a novel approach to searching in a sorted table. A model is used to predict an interval in which to search, and a Binary Search routine is used to finalize the search. They are quite effective. For the final stage, the lower_bound routine of the Standard C++ library is usually used, although this is more a natural choice than a requirement. However, recent studies that do not use Machine Learning predictions indicate that other implementations of Binary Search or variants, namely k-ary Search, are better suited to take advantage of the features offered by modern computer architectures. With the use of the Searching on Sorted Data (SOSD) Learned Indexing bench…
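
The two-stage scheme described in this abstract (a model predicts an interval, then a Binary Search finishes inside it) can be illustrated with a minimal sketch. This is not the authors' implementation: the least-squares linear model, the function names `fit_linear_model` and `learned_lower_bound`, and the use of Python's `bisect_left` in place of the C++ `lower_bound` routine are all assumptions made for illustration, and the table is assumed to be sorted and non-empty.

```python
from bisect import bisect_left


def fit_linear_model(keys):
    """Fit position ~ slope * key + intercept on a sorted, non-empty table.

    Returns (slope, intercept, eps), where eps is the largest distance
    between a key's true position and its rounded predicted position.
    """
    n = len(keys)
    mean_x = sum(keys) / n
    mean_y = (n - 1) / 2
    var_x = sum((x - mean_x) ** 2 for x in keys)
    if var_x == 0:
        slope = 0.0  # all keys equal: fall back to a constant model
    else:
        cov = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(keys))
        slope = cov / var_x
    intercept = mean_y - slope * mean_x
    eps = max(abs(i - round(slope * x + intercept)) for i, x in enumerate(keys))
    return slope, intercept, eps


def learned_lower_bound(keys, model, query):
    """Return the first index i with keys[i] >= query (len(keys) if none)."""
    slope, intercept, eps = model
    n = len(keys)
    guess = round(slope * query + intercept)
    # Stage 1: the model narrows the search to [lo, hi).
    lo = min(max(guess - eps, 0), n)
    hi = min(max(guess + eps + 1, 0), n)
    # Safety net for queries not in the table: widen the interval
    # whenever it does not bracket the answer.
    if lo > 0 and keys[lo - 1] >= query:
        lo = 0
    if hi < n and keys[hi] < query:
        hi = n
    # Stage 2: a standard binary search finishes inside the interval.
    return bisect_left(keys, query, lo, hi)
```

In this sketch the final stage is interchangeable: any routine with lower-bound semantics over the range `[lo, hi)`, such as a uniform or k-ary search, could replace the `bisect_left` call without touching the prediction stage.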

research product

Automatic classification of acoustically detected krill aggregations: A case study from Southern Ocean

Acoustic surveys represent the standard methodology to assess the spatial distribution and abundance of pelagic organisms characterized by aggregative behaviour. The species identification of acoustically observed aggregations is usually performed by taking into account the biological sampling and according to expert-based knowledge. The precision of survey estimates, such as total abundance and spatial distribution, strongly depends on the efficiency of acoustic and biological sampling as well as on the species identification. In this context, the automatic identification of specific groups based on energetic and morphological features could improve the species identification process, allo…

research product

DeepEva: A deep neural network architecture for assessing sentence complexity in Italian and English languages

Automatic Text Complexity Evaluation (ATE) is a research field that aims at creating new methodologies to automate the process of text complexity evaluation, that is, the study of text-linguistic features (e.g., lexical, syntactical, morphological) to measure the degree of comprehensibility of a text. ATE can positively affect several different contexts such as Finance, Health, and Education. Moreover, it can support the research on Automatic Text Simplification (ATS), a research area that deals with the study of new methods for transforming a text by changing its lexicon and structure to meet specific reader needs. In this paper, we illustrate an ATE approach named De…

research product

Learning from Data to Speed-up Sorted Table Search Procedures: Methodology and Practical Guidelines

Sorted Table Search Procedures are the quintessential query-answering tool, with widespread usage that now also includes Web Applications, e.g., Search Engines (Google Chrome) and ad Bidding Systems (AppNexus). Speeding them up, at very little cost in space, is still a quite significant achievement. Here we study to what extent Machine Learning Techniques can contribute to obtaining such a speed-up via a systematic experimental comparison of known efficient implementations of Sorted Table Search procedures, with different Data Layouts, and their Learned counterparts developed here. We characterize the scenarios in which the latter can be profitably used with respect to the former, accounting …

research product

A Pipeline for the Implementation of Immersive Experience in Cultural Heritage Sites in Sicily

Modern digital technologies potentially allow users to explore Cultural Heritage sites in immersive virtual environments. This is surely an advantage for users, who can better experience and understand a specific site, even before a real visit. This approach gained increasing attention during the extreme conditions of the recent COVID-19 pandemic. In this work, we present the processes that led to the implementation of an immersive app for different kinds of low- and high-cost devices, carried out in the context of the 3dLab-Sicilia project. 3dLab-Sicilia’s main objective is to sponsor the creation, development, and validation of a sustainable infrastructure that int…

research product