Search results for "Informatica"
showing 10 items of 978 documents
A NEW COMPLEXITY FUNCTION FOR WORDS BASED ON PERIODICITY
2013
Motivated by the extension of the critical factorization theorem to infinite words, we study the (local) periodicity function, i.e. the function that, for any position in a word, gives the size of the shortest square centered in that position. We prove that this function characterizes any binary word up to exchange of letters. We then introduce a new complexity function for words (the periodicity complexity) that, for any position in the word, gives the average value of the periodicity function up to that position. The new complexity function is independent from the other commonly used complexity measures as, for instance, the factor complexity. Indeed, whereas any infinite word with bound…
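The local periodicity function described in this abstract can be illustrated on a finite word. The sketch below is only an illustration under stated assumptions, not the authors' definition: a "position" is taken to be a cut between two letters, only squares lying entirely inside the word are considered, and the returned value is the length of the half u of the square uu. The function name `shortest_centered_square` is hypothetical.

```python
def shortest_centered_square(word, cut):
    """Return the smallest n >= 1 such that word[cut-n:cut] == word[cut:cut+n].

    `cut` is the number of letters to the left of the centre of the square.
    Assumption: only squares fully contained in the finite word count;
    None is returned when no such square exists.
    """
    max_half = min(cut, len(word) - cut)
    for n in range(1, max_half + 1):
        if word[cut - n:cut] == word[cut:cut + n]:
            return n
    return None


if __name__ == "__main__":
    w = "abaab"
    # Print the value at every internal cut of the word.
    for cut in range(1, len(w)):
        print(cut, shortest_centered_square(w, cut))
```

For an infinite word the search on the right-hand side is unbounded, so a value of None here only says that the finite prefix is too short to contain a centered square.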
Burrows-Wheeler Transform on Purely Morphic Words
2022
The study of the compressibility of repetitive sequences is an issue that is attracting great interest. We consider purely morphic words, which are highly repetitive sequences generated by iterating a morphism φ that admits a fixed point (denoted by φ^∞(a)) starting from a given character a belonging to the finite alphabet A, i.e. φ^∞(a)=lim_{i→∞}φ^i(a). Such morphisms are called prolongable on a. Here we focus on the compressibility via the Burrows-Wheeler Transform (BWT) of infinite families of finite sequences generated by morphisms. In particular, denoting by r(w) the number of equal-letter runs of a word w, we provide new upper bounds on r(bwt(φ^i(a))), i.e. the number of equal-le…
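The quantity r(bwt(φ^i(a))) from this abstract can be computed directly for small i. The following is a minimal sketch under assumptions that are not taken from the paper: the BWT is computed naively as the last column of the sorted rotations (no end-of-string marker), and the Fibonacci morphism a→ab, b→a is used purely as an example of a morphism prolongable on a. The helper names are hypothetical.

```python
from itertools import groupby


def iterate_morphism(morphism, letter, i):
    """Return phi^i(letter), where `morphism` maps each letter to its image."""
    word = letter
    for _ in range(i):
        word = "".join(morphism[c] for c in word)
    return word


def bwt(word):
    """Naive BWT: last letters of the lexicographically sorted rotations.

    Assumption: no sentinel is appended. Building every rotation costs
    quadratic space, so this is for small illustrative inputs only.
    """
    rotations = sorted(word[k:] + word[:k] for k in range(len(word)))
    return "".join(rot[-1] for rot in rotations)


def runs(word):
    """Number of maximal equal-letter runs, r(word)."""
    return sum(1 for _ in groupby(word))


if __name__ == "__main__":
    fibonacci = {"a": "ab", "b": "a"}  # example morphism (assumed)
    for i in range(1, 8):
        w = iterate_morphism(fibonacci, "a", i)
        print(i, len(w), runs(bwt(w)))
```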
On Balancing of a Direct Product
2009
A direct product of two sequences is a naturally defined sequence on the alphabet of pairs of symbols. Taking inspiration from [Pavel Salimov. On uniform recurrence of a direct product. In AutoMathA, 2009], where the author investigates the case of uniformly recurrent words, we study when the product of two balanced sequences on a binary alphabet is also balanced.
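The two notions in this abstract can be checked by brute force on finite words. This sketch rests on assumptions not stated in the abstract: words are finite, and a word is called balanced when, for every letter and every pair of factors of equal length, the numbers of occurrences of that letter differ by at most 1. The function names are hypothetical.

```python
def direct_product(u, v):
    """Pair up the letters of u and v position by position.

    zip stops at the shorter word, so the product has length min(len(u), len(v)).
    """
    return list(zip(u, v))


def is_balanced(word):
    """Brute-force balance test on a finite word (a string or a list of letters).

    Assumed definition: for each letter and each factor length, the letter
    counts over all factors of that length differ by at most 1.
    """
    letters = set(word)
    n = len(word)
    for length in range(1, n + 1):
        for letter in letters:
            counts = [word[i:i + length].count(letter)
                      for i in range(n - length + 1)]
            if max(counts) - min(counts) > 1:
                return False
    return True


if __name__ == "__main__":
    u, v = "ababab", "aabaab"
    print(is_balanced(u), is_balanced(v), is_balanced(direct_product(u, v)))
```

The check is cubic in the word length, which is acceptable for experimenting with short prefixes but not for long sequences.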
A Collaborative Filtering Approach for Drug Repurposing
2022
A recommendation system is proposed based on the construction of Knowledge Graphs, where physical interactions between proteins and associations between drugs and targets are taken into account. The system suggests new targets for a given drug depending on how proteins are linked to each other in the graph. The framework adopted for the implementation of the proposed approach is Apache Spark, which is used to load, manage and manipulate data by means of appropriate Resilient Distributed Datasets (RDD). Moreover, the Alternating Least Squares (ALS) machine learning algorithm, a Matrix Factorization algorithm for distributed and parallel computing, is applied. Preliminary obtained results seem to…
FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy
2021
Background: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. For the same reasons of abundance in data production, the use of Big Data technologies is seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop. Indeed, their deployment there is not exactly immediate. Such a State of the Art is problematic. Results: We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy …
Proposed use of a conversational agent for patient empowerment
2021
Empowerment is a process through which people acquire the necessary knowledge and self-awareness to understand their conditions and treatment options, make informed choices and self-manage their health conditions in daily life, in collaboration with medical professionals. Conversational Agents in healthcare could play an important role in the process of empowering a person but, so far, they have seldom been used for this purpose. This paper presents the basic principles and preliminary implementation of a conversational health agent for patient empowerment. It dialogues with the user in a "natural" way, collects health data from heterogeneous sources and provides the user wit…
Early users of Pokémon Go - a pilot report on personality correlates
2016
Pokémon Go is a free gaming application for smartphones, based on augmented reality and geolocation technologies, launched on the market at the end of July 2016 by the American company Niantic (Wilson, 2016).
A big data approach for sequences indexing on the cloud via burrows wheeler transform
2020
Indexing sequence data is important in the context of Precision Medicine, where large amounts of "omics" data have to be collected and analyzed daily in order to categorize patients and identify the most effective therapies. Here we propose an algorithm for the computation of the Burrows-Wheeler transform relying on Big Data technologies, i.e., Apache Spark and Hadoop. Our approach is the first that distributes the index computation and not only the input dataset, making it possible to benefit fully from the available cloud resources. Copyright © 2020 for this paper by its authors.
MicroRNA Interaction Networks
2021
Giorgio Bertolazzi's thesis focuses on the development of new algorithms for predicting miRNA-mRNA binding. In particular, a machine-learning algorithm is proposed to upgrade the ComiR web tool; the original version of ComiR considered only the miRNA binding sites located in the 3'UTR region of the messenger RNA. The new version of ComiR also includes the coding region of the messenger RNA in the search for binding sites. Bertolazzi’s thesis focuses on developing and applying computational methods to predict microRNA binding sites located on messenger RNA molecules. MicroRNAs (miRNAs) regulate gene expression by binding target messenger RNA molecules (mRNAs). Therefo…
Mapreduce in computational biology - A synopsis
2017
In the past 20 years, the Life Sciences have witnessed a paradigm shift in the way research is performed. Indeed, the computational part of biological and clinical studies has become central or is becoming so. Correspondingly, the amount of data that one needs to process, compare and analyze, has experienced an exponential growth. As a consequence, High Performance Computing (HPC, for short) is being used intensively, in particular in terms of multi-core architectures. However, recently and thanks to the advances in the processing of other scientific and commercial data, Distributed Computing is also being considered for Bioinformatics applications. In particular, the MapReduce paradigm, to…