Search results for "Informatica"
showing 10 items of 978 documents
Characteristic Sturmian words are extremal for the Critical Factorization Theorem
2012
We prove that characteristic Sturmian words are extremal for the Critical Factorization Theorem (CFT) in the following sense. If p_x(n) denotes the local period of an infinite word x at point n, we prove that x is a characteristic Sturmian word if and only if p_x(n) is smaller than or equal to n + 1 for all n ≥ 1 and it is equal to n + 1 for infinitely many integers n. This result is extremal with respect to the CFT since a consequence of the CFT is that, for any infinite recurrent word x, either the function p_x is bounded, and in such a case x is periodic, or p_x(n) ≥ n + 1 for infinitely many integers n. As a byproduct of the techniques used in the paper we extend a r…
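The notion of local period used in this abstract can be illustrated on finite words. The following is a minimal sketch, not the paper's construction: the function names `local_period` and `global_period` are hypothetical, the cut point n is assumed to lie between `word[:n]` and `word[n:]` (the paper's indexing may differ), and the abstract itself concerns infinite words.

```python
def local_period(word, n):
    """Smallest p >= 1 such that word[i] == word[i + p] for every i in
    [n - p, n) for which both positions fall inside the finite word.

    Assumption: n is a cut point between word[:n] and word[n:].
    """
    if not 0 < n < len(word):
        raise ValueError("n must be an internal cut point")
    for p in range(1, len(word) + 1):
        lo = max(0, n - p)            # repetition may overflow on the left
        hi = min(n, len(word) - p)    # ... or on the right
        if all(word[i] == word[i + p] for i in range(lo, hi)):
            return p


def global_period(word):
    """Smallest p >= 1 such that word[i] == word[i + p] wherever defined."""
    for p in range(1, len(word) + 1):
        if all(word[i] == word[i + p] for i in range(len(word) - p)):
            return p


word = "abaab"
periods = [local_period(word, n) for n in range(1, len(word))]
# For finite words, the Critical Factorization Theorem states that the
# largest local period equals the global period.
print(periods, global_period(word))
```

On the word "abaab" the largest local period coincides with the global period, which is the finite-word statement of the CFT.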
Classes of Colors and Timbres: A Clustering Approach
2022
Similarities between different sensory dimensions can be addressed considering common “movements” as causes, and emotional responses as effects. An imaginary movement toward the “dark” produces “dark sounds” and “dark colors,” or, toward the “bright,” “brighter colors” and “brighter sounds.” Following this line of research, we draw upon the confluence of mathematics and cognition, extending to colors and timbres the gestural similarity conjecture, a development of the mathematical theory of musical gestures. Visual “gestures” are seen here as paths in the space of colors, compared with paths in the space of orchestral timbres. We present an approach based on a clustering algorithm to evaluate…
Deep Learning Architectures for DNA Sequence Classification
2016
DNA sequence classification is a key task in a generic computational framework for biomedical data analysis, and in recent years several machine learning techniques have been adopted to accomplish this task successfully. However, the main difficulty behind the problem remains the feature selection process. Sequences do not have explicit features, and the commonly used representations have the main drawback of high dimensionality. Machine learning methods devoted to supervised classification tasks depend strongly on the feature extraction step, and in order to build a good representation it is necessary to recognize and measure meaningful details of the items to cla…
Criminal networks analysis in missing data scenarios through graph distances
2021
Data collected in criminal investigations may suffer from: (i) incompleteness, due to the covert nature of criminal organisations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyse nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data and to determine which network type is most affected by it. The networks are first pruned following two specific methods: …
Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics
2019
Background: Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g., Apache Hadoop and Spark) does not necessarily produce satisfactory results, in terms of both efficiency and effectiveness. We discuss how the development of distributed and Big Data management technologies has affected the analysis of large datasets of biological sequences. Moreover, we show how the choice of different parameter configurations and the careful engineering of the …
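For readers unfamiliar with the k-mer statistics named in the title of this entry, the basic quantity is the number of occurrences of each length-k substring of a sequence. The sketch below is a single-machine illustration only, not the distributed pipeline the paper evaluates; the function name `count_kmers` and the example sequence are hypothetical.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty range and an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


kmer_counts = count_kmers("ACGTAC", 2)
print(kmer_counts.most_common())
```

In a MapReduce setting the same counting would typically be split into a map step that emits k-mers and a reduce step that sums them, which is where the engineering choices discussed in the abstract come into play.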
Reverse-Safe Text Indexing
2021
We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matc…
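The definition of z-reverse-safety in this abstract can be made concrete with a brute-force toy. This is a sketch under stated assumptions, not the paper's algorithm: it assumes the stored answers are occurrence counts of all patterns of length at most d, that candidate datasets are texts of the same length over a known alphabet, and the names `answers`, `consistent_texts`, and the parameter `d` are hypothetical.

```python
from collections import Counter
from itertools import product


def answers(text, d):
    """Occurrence counts of every pattern of length 1..d found in text."""
    counts = Counter()
    for k in range(1, d + 1):
        for i in range(len(text) - k + 1):
            counts[text[i:i + k]] += 1
    return counts


def consistent_texts(text, d, alphabet):
    """All same-length texts whose answers match those of `text`.

    Exhaustive enumeration: exponential in len(text), for illustration only.
    """
    target = answers(text, d)
    candidates = ("".join(c) for c in product(alphabet, repeat=len(text)))
    return [t for t in candidates if answers(t, d) == target]


# The number of consistent texts is the largest z for which this toy
# answer set would be z-reverse-safe under the assumptions above.
print(len(consistent_texts("aab", 1, "ab")))
print(len(consistent_texts("aab", 2, "ab")))
```

Storing answers for longer patterns makes the structure more useful but leaves fewer consistent texts, which is the trade-off the abstract describes.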
PDB: A pictorial database oriented to data analysis
1993
The paper describes a new pictorial database oriented to image analysis, implemented inside the MIDAS data analysis system. Pictorial databases need expressive data structures in order to represent a wide class of information from the numerical to the visual. The model of the database is relational; however, a full normalization is not achievable, owing to the complexity of the visual information. The paper reports the general design and notes on the software implementation. Preliminary experiments show the performance of the pictorial database. Copyright © 1993 John Wiley & Sons, Ltd
Distributed Image Databases: Hybrid Similarity Functions
1998
FABC: Retinal Vessel Segmentation Using AdaBoost
2010
This paper presents a method for automated vessel segmentation in retinal images. For each pixel in the field of view of the image, a 41-D feature vector is constructed, encoding information on the local intensity structure, spatial properties, and geometry at multiple scales. An AdaBoost classifier is trained on 789 914 gold standard examples of vessel and nonvessel pixels, then used for classifying previously unseen images. The algorithm was tested on the public digital retinal images for vessel extraction (DRIVE) set, frequently used in the literature and consisting of 40 manually labeled images with gold standard. Results were compared experimentally with those of eight algorithms as we…
An automated image analysis methodology for classifying megakaryocytes in chronic myeloproliferative disorders
2008
This work describes an automatic method for discrimination in microphotographs between normal and pathological human megakaryocytes and between two kinds of disorders of these cells. A segmentation procedure has been developed, mainly based on mathematical morphology and wavelet transform, to isolate the cells. The features of each megakaryocyte (e.g. area, perimeter and tortuosity of the cell and its nucleus, and shape complexity via elliptic Fourier transform) are used by a regression tree procedure applied twice: the first time to find the set of normal megakaryocytes and the second to distinguish between the pathologies. The output of our classifier has been compared to the interpretati…