AUTHOR

Tommaso Andreani

MOESM3 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

Additional file 3: Review history.

research product

Computational identification of cell-specific variable regions in ChIP-seq data.

ABSTRACT Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is used to identify genome-wide DNA regions bound by proteins. Several sources of variation can affect the reproducibility of a particular ChIP-seq assay, which can lead to a misinterpretation of where the protein under investigation binds to the genome in a particular cell type. Given one ChIP-seq experiment with replicates, binding sites not observed in all the replicates will usually be interpreted as noise and discarded. However, the recent discovery of high-occupancy target (HOT) regions suggests that there are regions where binding of multiple transcription factors can be identified. To investigate these regions,…

research product

Interpretable machine learning models for single-cell ChIP-seq imputation

Abstract Motivation Single-cell ChIP-seq (scChIP-seq) analysis is challenging due to data sparsity. High degree of data sparsity in biological high-throughput single-cell data is generally handled with imputation methods that complete the data, but specific methods for scChIP-seq are lacking. We present SIMPA, a scChIP-seq data imputation method leveraging predictive information within bulk data from ENCODE to impute missing protein-DNA interacting regions of target histone marks or transcription factors. Results Imputations using machine learning models trained for each single cell, each target, and each genomic region accurately preserve cell type clustering and improve pathway-related gene i…

research product

Single-cell ChIP-seq imputation with SIMPA by leveraging bulk ENCODE data

Abstract Single-cell ChIP-seq analysis is challenging due to data sparsity. We present SIMPA (https://github.com/salbrec/SIMPA), a single-cell ChIP-seq data imputation method leveraging predictive information within bulk ENCODE data to impute missing protein-DNA interacting regions of target histone marks or transcription factors. Machine learning models trained for each single cell, each target, and each genomic region enable drastic improvement in cell types clustering and genes identification.

research product

Expression and subcellular localization of USH1C/harmonin in the human retina provide insights into pathomechanisms and therapy

Abstract Usher syndrome (USH) is the most common form of hereditary deafness-blindness in humans. USH is a complex genetic disorder, assigned to three clinical subtypes differing in onset, course, and severity, with USH1 being the most severe. Rodent USH1 models do not reflect the ocular phenotype observed in human patients to date; hence, little is known about the pathophysiology of USH1 in the human eye. One of the USH1 genes, USH1C, exhibits extensive alternative splicing and encodes numerous harmonin protein isoforms that function as scaffolds for organizing the USH interactome. RNA-seq analysis of human retinas uncovered harmonin_a1 as the most abundant transcript of USH1C. Bulk RNA-seq…

research product

RNA Sequencing of Human Peripheral Blood Cells Indicates Upregulation of Immune-Related Genes in Huntington's Disease

Huntington's disease (HD) is an autosomal dominantly inherited neurodegenerative disorder caused by a trinucleotide repeat expansion in the Huntingtin gene. As disease-modifying therapies for HD are being developed, peripheral blood cells may be used to indicate disease progression and to monitor treatment response. In order to investigate whether gene expression changes can be found in the blood of individuals with HD that distinguish them from healthy controls, we performed transcriptome analysis by next-generation sequencing (RNA-seq). We detected a gene expression signature consistent with dysregulation of immune-related functions and inflammatory response in peripheral blood from HD ca…

research product

cis-regulatory variation modulates susceptibility to enteric infection in the Drosophila genetic reference panel

Abstract Background Resistance to enteric pathogens is a complex trait at the crossroads of multiple biological processes. We have previously shown in the Drosophila Genetic Reference Panel (DGRP) that resistance to infection is highly heritable, but our understanding of how the effects of genetic variants affect different molecular mechanisms to determine gut immunocompetence is still limited. Results To address this, we perform a systems genetics analysis of the gut transcriptomes from 38 DGRP lines that were orally infected with Pseudomonas entomophila. We identify a large number of condition-specific, expression quantitative trait loci (local-eQTLs) with infection-specific ones located …

research product

Assessment of computational methods for the analysis of single-cell ATAC-seq data

Abstract Background Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), lead to inherent data sparsity (1–10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10–45% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level. Results We present a benchmarking framework that …

research product

MOESM2 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

Additional file 2: Code to reproduce the analyses.

research product

MOESM1 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

Additional file 1: Figures S1–S24, Tables S1–S21, Supplementary Notes, and Supplementary figure legends.

research product