0000000000037355

AUTHOR

Luca Pinello

showing 26 related works from this author

Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM

2019

Single-cell transcriptomic assays have enabled the de novo reconstruction of lineage differentiation trajectories, along with the characterization of cellular heterogeneity and state transitions. Several methods have been developed for reconstructing developmental trajectories from single-cell transcriptomic data, but efforts on analyzing single-cell epigenomic data and on trajectory visualization remain limited. Here we present STREAM, an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. We have tested STREAM on several synthetic and real datasets generated with different single-cell techno…

0301 basic medicine; Epigenomics; Multifactor Dimensionality Reduction; Computer science; General Physics and Astronomy; 02 engineering and technology; Omics data; Myoblasts; Mice; Single-cell analysis; GATA1 Transcription Factor; Myeloid Cells; Lymphocytes; lcsh:Science; Data processing; Multidisciplinary; Q; Gene Expression Regulation Developmental; RNA sequencing; Cell Differentiation; Genomics; 021001 nanoscience & nanotechnology; Data processing; DNA-Binding Proteins; Interferon Regulatory Factors; Single-Cell Analysis; 0210 nano-technology; Algorithms; Omics technologies; Signal Transduction; Lineage differentiation; Science; Computational biology; General Biochemistry Genetics and Molecular Biology; Article; 03 medical and health sciences; Erythroid Cells; Animals; Cell Lineage; General Chemistry; developmental trajectories visualization; Hematopoietic Stem Cells; Pipeline (software); Visualization; 030104 developmental biology; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; Cellular heterogeneity; Single cell analysi; lcsh:Q; Gene expression; Transcriptome; Transcription Factors; Nature Communications

A one class KNN for signal identification: a biological case study

2009

The paper describes an application of a one-class KNN to identify different signal patterns embedded in a noise-structured background. The problem becomes harder whenever only one pattern is well represented in the signal; in such cases, one-class classifier techniques are more suitable. The classification phase is applied after a preprocessing phase based on a multi-layer model (MLM) that provides a preliminary signal segmentation in an interval feature space. The one-class KNN has been tested on synthetic and real (Saccharomyces cerevisiae) microarray data on the specific problem of identifying DNA nucleosome and linker regions. Results have shown a good recognition rate in both cases.

Computer science; business.industry; Feature vector; Pattern recognition; multi layer method; one class classifier; Preprocessor; Segmentation; nucleosome positioning; Artificial intelligence; K nearest neighbour; business; Classifier (UML); Multi layer
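
As a rough illustration of the one-class KNN idea summarized in this abstract, the sketch below scores a point by its mean distance to the k nearest examples of the single well-represented class and accepts it when that score falls under a threshold. It is not the authors' implementation: the function names, the leave-one-out threshold rule, the 0.95 quantile and the toy 2-D points are all assumptions made for the example, and the MLM preprocessing step is not modelled.

```python
import math

def knn_score(x, train, k):
    """Mean Euclidean distance from x to its k nearest training points."""
    dists = sorted(math.dist(x, t) for t in train)
    return sum(dists[:k]) / k

def fit_threshold(train, k, quantile=0.95):
    """Choose an acceptance threshold from leave-one-out scores of the target class.

    The quantile rule is an assumption for this sketch, not taken from the paper.
    """
    scores = sorted(
        knn_score(x, train[:i] + train[i + 1:], k) for i, x in enumerate(train)
    )
    idx = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[idx]

def is_target(x, train, k, threshold):
    """Accept x as a member of the target class if it lies close to the training data."""
    return knn_score(x, train, k) <= threshold

# Toy target class: five points clustered near the origin (hypothetical data).
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
k = 2
threshold = fit_threshold(train, k)

inside = is_target((0.05, 0.04), train, k, threshold)   # close to the cluster
outside = is_target((5.0, 5.0), train, k, threshold)    # far from the cluster
```

Only examples of one class are needed to fit the threshold, which is what makes this kind of classifier usable when the other patterns are poorly represented.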

A MULTI-LAYER MODEL TO STUDY GENOME-SCALE POSITIONS OF NUCLEOSOMES

2007

The positioning of nucleosomes along chromatin has been implicated in the regulation of gene expression in eukaryotic cells, because packaging DNA into nucleosomes affects sequence accessibility. In this paper we propose a new model (called MLM) for the identification of nucleosomes and linker regions across DNA, consisting of a thresholding technique based on cut-set conditions. For this purpose we have defined a method to generate synthetic microarray data, closely inspired by the approach used by Yuan et al. Results have shown a good recognition rate on synthetic data; moreover, the MLM shows good agreement with the recently published method based on Hidden Markov Model …

Settore INF/01 - Informatica; Computer science; Microarray analysis techniques; Settore BIO/10 - Biochimica; Genome scale; Nucleosome; Computational biology; Multi layer; Multi Layer Method Nucleosome Positioning; Modelling and Simulation in Science

Distance Functions, Clustering Algorithms and Microarray Data Analysis

2010

Distance functions are a fundamental ingredient of classification and clustering procedures, and this holds true also in the particular case of microarray data. In the general data mining and classification literature, functions such as Euclidean distance or Pearson correlation have gained their status of de facto standards thanks to a considerable amount of experimental validation. For microarray data, the issue of which distance function works best has been investigated, but no final conclusion has been reached. The aim of this extended abstract is to shed further light on that issue. Indeed, we present an experimental study, involving several distances, assessing (a) their intrinsic sepa…

Clustering high-dimensional data; Fuzzy clustering; Settore INF/01 - Informatica; business.industry; Correlation clustering; Machine learning; computer.software_genre; Pearson product-moment correlation coefficient; Ranking (information retrieval); Euclidean distance; symbols.namesake; Clustering distance measures; symbols; Artificial intelligence; Data mining; business; Cluster analysis; computer; Mathematics; De facto standard
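
To make concrete why the choice of distance matters for expression profiles, the sketch below compares the two functions named in this abstract on a toy pair of profiles. The function names and the data are assumptions made for the example; the study itself evaluates several distances, and its protocol is not reproduced here.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """1 - Pearson correlation: about 0 for profiles with the same shape, about 2 for opposite shapes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("Pearson correlation is undefined for a constant profile")
    return 1.0 - cov / (sx * sy)

# Hypothetical expression profiles: same trend, ten-fold difference in scale.
g1 = [1.0, 2.0, 3.0, 4.0]
g2 = [10.0, 20.0, 30.0, 40.0]

d_euclid = euclidean(g1, g2)          # large: the magnitudes differ
d_pearson = pearson_distance(g1, g2)  # about 0: the shapes coincide
```

The two profiles are far apart under Euclidean distance but essentially identical under the Pearson-based distance, so a clustering algorithm would group them differently depending on which function it is given.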

A multi-layer method to study genome-scale positions of nucleosomes

2009

The basic unit of eukaryotic chromatin is the nucleosome, consisting of about 150 bp of DNA wrapped around a protein core made of histone proteins. Nucleosome position is modulated in vivo to regulate fundamental nuclear processes. To measure nucleosome positions on a genomic scale, both theoretical and experimental approaches have recently been reported. We have developed a new method, the Multi-Layer Model (MLM), for the analysis of nucleosome position data obtained with a microarray-based approach. The MLM is a feature extraction method in which the input data is processed by a classifier to distinguish between several kinds of patterns. We applied our method to simulated-synthetic and…

Feature extraction; Nucleosome positioning; Genomics; Saccharomyces cerevisiae; Computational biology; Hidden Markov Model; chemistry.chemical_compound; Settore BIO/10 - Biochimica; Nucleosome positioning Hidden Markov Model Classification Multi-layer method; Genetics; Humans; Nucleosome; Multi-layer method; Hidden Markov model; Base Pairing; Multi layer; Oligonucleotide Array Sequence Analysis; Genetics; Base Sequence; Settore INF/01 - Informatica; biology; Genome Human; Classification; Markov Chains; Nucleosomes; Chromatin; Histone; chemistry; biology.protein; DNA; Genomics

STREAM: Single-cell Trajectories Reconstruction, Exploration And Mapping of omics data

2018

Single-cell transcriptomic assays have enabled the de novo reconstruction of lineage differentiation trajectories, along with the characterization of cellular heterogeneity and state transitions. Several methods have been developed for reconstructing developmental trajectories from single-cell transcriptomic data, but efforts on analyzing single-cell epigenomic data and on trajectory visualization remain limited. Here we present STREAM, an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data.

Omics data; Cellular heterogeneity; Lineage differentiation; Computer science; Genomics; Computational biology; Pipeline (software); Visualization

A one class classifier for Signal identification: a biological case study

2008

The paper describes an application of a one-class KNN to identify different signal patterns embedded in a noise-structured background. The problem becomes harder whenever only one pattern is well represented in the signal; in such cases, one-class classifier techniques are more suitable. The classification phase is applied after a preprocessing phase based on a Multi Layer Model (MLM) that provides a preliminary signal segmentation in an interval feature space. The one-class KNN has been tested on synthetic data that simulate microarray data for the identification of nucleosomes and linker regions across DNA. Results have shown a good recognition rate on synthetic data for nucleosome and lin…

business.industry; Computer science; Feature vector; One-class classification; Pattern recognition; Segmentation; Artificial intelligence; business; Multi Layer Method One Class classification Bioinformatics Nucleosome Positioning; Classifier (UML); Synthetic data

A New Dissimilarity Measure for Clustering Seismic Signals

2011

Hypocenter and focal mechanism of an earthquake can be determined by the analysis of signals, named waveforms, related to the wave field produced and recorded by a seismic network. Assuming that waveform similarity implies the similarity of focal parameters, the analysis of those signals characterized by very similar shapes can be used to give important details about the physical phenomena which have generated an earthquake. Recent works have shown the effectiveness of cross-correlation and/or cross-spectral dissimilarities to identify clusters of seismic events. In this work we propose a new dissimilarity measure between seismic signals whose reliability has been tested on real seismic dat…

Focal mechanism; Similarity (geometry); Cross-correlation; Hypocenter; Settore INF/01 - Informatica; Computer science; business.industry; Homogeneity (statistics); Pattern recognition; computer.software_genre; Measure (mathematics); Physics::Geophysics; Settore GEO/11 - Geofisica Applicata; Waveform; Artificial intelligence; Data mining; business; Cluster analysis; computer; Dissimilarity measure Clustering Seismic Signals

The Three Steps of Clustering In The Post-Genomic Era

2013

This chapter describes the basic algorithmic components that are involved in clustering, with particular attention to classification of microarray data.

Clustering high-dimensional data; Settore INF/01 - Informatica; business.industry; Correlation clustering; Pattern recognition; computer.software_genre; Biclustering; CURE data clustering algorithm; Clustering Classification Biological Data Mining; Consensus clustering; Artificial intelligence; Data mining; business; Cluster analysis; computer; Mathematics

A methodology to assess the intrinsic discriminative ability of a distance function and its interplay with clustering algorithms for microarray data …

2013

Background: Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. Following Handl et al., it can be summarized as a three step process: (1) choice of a distance function; (2) choice of a clustering algorithm; (3) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention, if inferences made from cluster analysis have to be of some relevance to biomedical research. Results: A procedure is proposed for the assessment of the discriminative ability of a distance functi…

Computer science; computer.software_genre; Biochemistry; symbols.namesake; Discriminative model; Structural Biology; Cluster Analysis; Relevance (information retrieval); Cluster analysis; Molecular Biology; Oligonucleotide Array Sequence Analysis; Clustering discriminative ability of a distance function external validation indices; Settore INF/01 - Informatica; Research; Applied Mathematics; Mutual information; Pearson product-moment correlation coefficient; Computer Science Applications; Hierarchical clustering; Euclidean distance; Range (mathematics); Metric (mathematics); symbols; Data mining; Transcriptome; computer; Algorithms; BMC Bioinformatics

A motif-independent metric for DNA sequence specificity

2011

Background: Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity. Results: We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM measure can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We…

Biology; lcsh:Computer applications to medicine. Medical informatics; DNA-binding protein; Genome; Biochemistry; DNA sequencing; Cell Line; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; Structural Biology; Humans; Transcription factor; Molecular Biology; lcsh:QH301-705.5; Sequence Specificity Epigenomics Bioinformatics; 030304 developmental biology; Epigenomics; Genetics; 0303 health sciences; Base Sequence; Settore INF/01 - Informatica; Genome Human; Applied Mathematics; Methodology Article; DNA; Computer Science Applications; DNA-Binding Proteins; chemistry; lcsh:Biology (General); lcsh:R858-859.7; Human genome; DNA microarray; 030217 neurology & neurosurgery; DNA; Algorithms; Software; Genome-Wide Association Study; Protein Binding; Transcription Factors; BMC Bioinformatics

Interval Length Analysis in Multi Layer Model

2009

In this paper we present a hypothesis test of randomness based on the probability density function of the symmetrized Kullback-Leibler distance estimated, via a Monte Carlo simulation, from the distributions of the interval lengths detected using the Multi-Layer Model (MLM). The MLM is based on the generation of several sub-samples of an input signal; in particular, a set of optimal cut-set thresholds is applied to the data to detect signal properties. In this sense the MLM is a general pattern detection method and can be considered a preprocessing tool for pattern discovery. At present the test has been evaluated on simulated signals which respect a particular tiled microarray approach …

Hypothesis test Multi layer method Bioinformatics; Set (abstract data type); Signal-to-noise ratio; Theoretical computer science; Settore INF/01 - Informatica; Computer science; Monte Carlo method; Probability density function; Interval (mathematics); Signal; Algorithm; Randomness; Statistical hypothesis testing
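
The test outlined in this abstract combines a symmetrized Kullback-Leibler distance between interval-length distributions with a Monte Carlo estimate of its distribution under randomness. The sketch below shows one way such a test could be assembled; it is not the paper's procedure. The unnormalized form KL(p||q) + KL(q||p), the pseudocount smoothing, the uniform null model and every name in the code are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def length_distribution(lengths, max_len, pseudo=1.0):
    """Smoothed relative frequencies of interval lengths 1..max_len.

    Lengths above max_len are clipped; the pseudocount keeps every bin positive.
    """
    counts = Counter(min(l, max_len) for l in lengths)
    total = len(lengths) + pseudo * max_len
    return [(counts.get(v, 0) + pseudo) / total for v in range(1, max_len + 1)]

def sym_kl(p, q):
    """KL(p||q) + KL(q||p) for strictly positive discrete distributions."""
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))

def monte_carlo_p_value(observed, reference, null_sampler, max_len, n_sim=200, seed=0):
    """Empirical p-value: share of simulated null samples at least as far from the reference."""
    rng = random.Random(seed)
    ref = length_distribution(reference, max_len)
    d_obs = sym_kl(length_distribution(observed, max_len), ref)
    exceed = 0
    for _ in range(n_sim):
        simulated = null_sampler(rng, len(observed))
        if sym_kl(length_distribution(simulated, max_len), ref) >= d_obs:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

# Hypothetical null model: interval lengths uniform on 1..10.
def uniform_lengths(rng, n):
    return [rng.randint(1, 10) for _ in range(n)]

reference = uniform_lengths(random.Random(1), 5000)
structured = [5] * 100  # every interval has the same length: clearly non-random
p_value = monte_carlo_p_value(structured, reference, uniform_lengths, max_len=10)
```

A small p-value indicates that the observed interval lengths are further from the reference than random signals typically are under the assumed null model.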

A New Feature Selection Methodology for K-mers Representation of DNA Sequences

2015

Decomposing a DNA sequence into k-mers and counting their frequencies defines a mapping of the sequence into a numerical space, by means of a numerical feature vector of fixed length. This simple process makes it possible to compare sequences in an alignment-free way, using common similarity and distance functions on the numerical codomain of the mapping. The most commonly used decomposition uses all the substrings of a fixed length k, making the dimension of the codomain exponential in k. This can affect the time complexity of the similarity computation and, in general, of the machine learning algorithm used for sequence analysis. Moreover, the presence of possible noisy features can also affect the…

k-mers DNA sequence similarity feature selection DNA sequence classification; Settore INF/01 - Informatica; Computer science; Sequence analysis; business.industry; Feature vector; Pattern recognition; Feature selection; DNA sequencing; Substring; Exponential function; Artificial intelligence; business; Algorithm; Time complexity
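
The k-mer mapping described in this abstract can be sketched in a few lines. The code below builds the full frequency vector over all 4^k k-mers, which is the exponential-dimension representation the paper starts from; the feature selection step it proposes is not shown. The function name, the normalization to relative frequencies and the example sequence are assumptions made for the sketch.

```python
from itertools import product

def kmer_vector(seq, k, alphabet="ACGT"):
    """Map a sequence to relative frequencies of all len(alphabet)**k k-mers.

    k-mers are indexed in lexicographic order of the alphabet; windows containing
    symbols outside the alphabet (for example N) are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(index)

v = kmer_vector("ACGTACGT", 2)   # 16 features for k = 2
dims = [len(kmer_vector("ACGT", k)) for k in (1, 2, 3, 4)]   # 4, 16, 64, 256
```

Because every sequence maps to a vector of the same length, any standard distance can then be applied without alignment, at the price of a feature count that grows as 4^k.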

Erratum to: A New Feature Selection Methodology for K-mers Representation of DNA Sequences

2017

Computer science; business.industry; Representation (systemics); Pattern recognition; Feature selection; Artificial intelligence; business; DNA sequencing

The Three Steps of Clustering in the Post-Genomic Era: A Synopsis

2011

Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. Following Handl et al., it can be summarized as a three step process: (a) choice of a distance function; (b) choice of a clustering algorithm; (c) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention, if inferences made from cluster analysis have to be of some relevance to biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make cluster analysis of genomic data particul…

cluster validation indices; Settore INF/01 - Informatica; Process (engineering); Computer science; business.industry; Genomic data; distance function; Machine learning; computer.software_genre; Object (computer science); Clustering; Cluster algorithm; Predictive power; Relevance (information retrieval); Artificial intelligence; High dimensionality; business; Cluster analysis; computer

Applications of alignment-free methods in epigenomics

2013

Epigenetic mechanisms play an important role in the regulation of cell type-specific gene activities, yet how epigenetic patterns are established and maintained remains poorly understood. Recent studies have supported a role of DNA sequences in recruitment of epigenetic regulators. Alignment-free methods have been applied to identify distinct sequence features that are associated with epigenetic patterns and to predict epigenomic profiles. Here, we review recent advances in such applications, including the methods to map DNA sequence to feature space, sequence comparison and prediction models. Computational studies using these methods have provided important insights into the epigenetic reg…

Epigenomics; Support Vector Machine; DNA sequence; Sequence alignment; Computational biology; Biology; DNA sequencing; Epigenesis Genetic; Artificial Intelligence; Sequence comparison; Humans; Nucleosome; Epigenetics; Molecular Biology; Gene; Epigenomics; Sequence (medicine); Genetics; Models Genetic; Settore INF/01 - Informatica; nucleosome; Chromosome Mapping; Computational Biology; Sequence Analysis DNA; machine learning; Papers; Sequence Alignment; epigenetic; alignment-free method; Information Systems; Briefings in Bioinformatics

A Fuzzy One Class Classifier for Multi Layer Model

2009

The paper describes an application of a fuzzy one-class classifier (FOC) for the identification of different signal patterns embedded in a noise-structured background. The classification phase is applied after a preprocessing phase based on a Multi Layer Model (MLM) that provides a preliminary signal segmentation in an interval feature space. The FOC has been tested on synthetic and real microarray data on the specific problem of identifying DNA nucleosome and linker regions. Results have shown a good recognition rate in both cases.

Settore INF/01 - Informatica; Computer science; business.industry; Feature vector; Pattern recognition; Hidden Markov model; computer.software_genre; Fuzzy logic; ComputingMethodologies_PATTERNRECOGNITION; Multi Layer Method Nucleosome Positioning Bioinformatics; Preprocessor; Segmentation; Data mining; Artificial intelligence; business; computer; Classifier (UML); Multi layer

Genome-wide characterization of chromatin binding and nucleosome spacing activity of the nucleosome remodelling ATPase ISWI

2011

The evolutionarily conserved ATP-dependent nucleosome remodelling factor ISWI can space nucleosomes affecting a variety of nuclear processes. In Drosophila, loss of ISWI leads to global transcriptional defects and to dramatic alterations in higher-order chromatin structure, especially on the male X chromosome. In order to understand if chromatin condensation and gene expression defects, observed in ISWI mutants, are directly correlated with ISWI nucleosome spacing activity, we conducted a genome-wide survey of ISWI binding and nucleosome positioning in wild-type and ISWI mutant chromatin. Our analysis revealed that ISWI binds both genic and intergenic regions. Remarkably, we found that ISWI…

Genetics; Regulation of gene expression; General Immunology and Microbiology; General Neuroscience; Chromatin binding; Biology; DNA-binding protein; General Biochemistry Genetics and Molecular Biology; Chromatin; Prophase; Nucleosome; Molecular Biology; Transcription factor; Chromatin immunoprecipitation; The EMBO Journal

Assessment of computational methods for the analysis of single-cell ATAC-seq data

2019

Background: Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), leads to inherent data sparsity (1–10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10–45% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level. Results: We present a benchmarking framework that …

Epigenomics; lcsh:QH426-470; Test data generation; Computer science; Cell; ATAC-seq; Computational biology; Biology; Clustering; Transcriptome; Mice; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; medicine; Animals; Humans; Profiling (information science); scATAC-seq; natural sciences; Epigenetics; Feature matrix; Cluster analysis; lcsh:QH301-705.5; Gene; Transposase; Visualization; 030304 developmental biology; Sparse matrix; 0303 health sciences; Featurization; Dimensionality reduction; Research; Computational Biology; Sequence Analysis DNA; Dimensionality reduction; Chromatin; Benchmarking; lcsh:Genetics; medicine.anatomical_structure; lcsh:Biology (General); chemistry; Regulatory genomics; Single-Cell Analysis; Peak calling; 030217 neurology & neurosurgery; DNA

A new Multi-Layers Method to Analyze Gene Expression

2007

In the paper a new Multi-Layers approach (called Multi-Layers Model, MLM) for the analysis of stochastic signals and its application to the analysis of gene expression data is presented. It consists of the generation of sub-samples from the input signal by applying a threshold technique based on cut-set optimal conditions. The MLM has been applied to synthetic and real microarray data for the identification of particular regions across DNA called nucleosomes and linkers. Nucleosomes are the fundamental repeating subunits of all eukaryotic chromatin, and their positioning provides useful information regarding the regulation of gene expression in eukaryotic cells. Results have shown a good rec…

Regulation of gene expression; biology; Settore INF/01 - Informatica; Computer science; Microarray analysis techniques; Saccharomyces cerevisiae; Chromosome; Computational biology; biology.organism_classification; Bioinformatics; Synthetic data; Bioinformatics Nucleosome positioning Multi layer methods; Chromatin; Identification (information); chemistry.chemical_compound; chemistry; Settore BIO/10 - Biochimica; Gene expression; Nucleosome; Hidden Markov model; DNA
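
The thresholding step mentioned in this abstract can be pictured with a small sketch that cuts a signal at several levels and records, for each level, the intervals where the signal stays at or above it. This is only an illustration under stated assumptions: "cut-set" is read here as the set of positions where the signal reaches a threshold, the thresholds are equally spaced rather than chosen by the paper's optimality conditions (which the abstract does not spell out), and all names and data are hypothetical.

```python
def intervals_above(signal, threshold):
    """Return (start, end) index pairs, end exclusive, of maximal runs with signal >= threshold."""
    runs, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(signal)))
    return runs

def multi_layer(signal, n_layers):
    """Cut the signal at n_layers equally spaced thresholds (an assumed spacing).

    Returns a list of (threshold, intervals) pairs, from the lowest layer to the highest.
    """
    lo, hi = min(signal), max(signal)
    step = (hi - lo) / (n_layers + 1)
    layers = []
    for j in range(1, n_layers + 1):
        t = lo + step * j
        layers.append((t, intervals_above(signal, t)))
    return layers

# Hypothetical signal with two bumps of different height.
signal = [0, 1, 3, 1, 0, 2, 4, 2, 0]
layers = multi_layer(signal, 3)
```

The interval lengths collected at each layer are the kind of feature a downstream classifier could use to separate broad patterns from narrow ones.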

MOESM3 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

2019

Additional file 3: Review history.

Omic-based strategies reveal novel links between primary metabolism and antibiotic production

2008

Settore BIO/19 - Microbiologia Generale; Proteome Transcriptome Actinomycetes

Multi layer analysis.

2011

Settore INF/01 - Informatica; multi layer analysis

Genome-wide characterization of chromatin binding and nucleosome spacing activity of the nucleosome remodelling ATPase ISWI.

2010

The evolutionarily conserved ATP-dependent nucleosome remodelling factor ISWI can space nucleosomes affecting a variety of nuclear processes. In Drosophila, loss of ISWI leads to global transcriptional defects and to dramatic alterations in higher-order chromatin structure, especially on the male X chromosome. In order to understand if chromatin condensation and gene expression defects, observed in ISWI mutants, are directly correlated with ISWI nucleosome spacing activity, we conducted a genome-wide survey of ISWI binding and nucleosome positioning in wild-type and ISWI mutant chromatin. Our analysis revealed that ISWI binds both genic and intergenic regions. Remarkably, we found that ISWI…

Adenosine Triphosphatases; Male; Chromatin Immunoprecipitation; X Chromosome; D. melanogaster; Settore INF/01 - Informatica; chromatin remodelling; Genomics; Chromatin Assembly and Disassembly; Article; Nucleosomes; DNA-Binding Proteins; ISWI; nucleosome spacing; Gene Expression Regulation; Settore BIO/10 - Biochimica; Animals; Drosophila Proteins; Drosophila; Promoter Regions Genetic; Crosses Genetic; Protein Binding; Transcription Factors; The EMBO journal

MOESM2 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

2019

Additional file 2: Code to reproduce the analyses.

MOESM1 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

2019

Additional file 1: Figures S1–S24, Tables S1-S21, Supplementary Notes, and Supplementary figure legends
