0000000000037353

AUTHOR

Miguel Andrade-Navarro

MOESM3 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

Additional file 3: Review history.

research product

MOESM5 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 5: Figure S2. 7C model parameters and optimal cut-offs for binary prediction. (A) Parameter values of the logistic regression model in 7C for different features (columns), separated for different models (rows). The average of the model parameters across 10-fold cross-validation training is shown, with error bars indicating the standard deviations. While the first six rows represent the models with the indicated TF ChIP-seq data and the genomic features, “Avg. all TF” is the average across all 124 TFs analyzed and “Avg. best 10 TF” is the average across the ten best-performing TF models. (B) Prediction performance as F1 score (y-axis) for different cutoffs on the prediction proba…

research product

Additional file 1 of MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

Literature-curated marker genes. This file includes marker genes collected from the literature. (104KB PDF)

research product

Additional file 3: of Evolutionary stability of topologically associating domains is associated with conserved gene regulation

Figure S3. Distance between rearrangement breakpoints and random controls to closest TAD boundary. For each species (y-axis) and fill size threshold (vertical panels), the distances from all identified rearrangement breakpoints to their closest TAD boundary (x-axis) are compared between actual rearrangements (blue) and 100 times randomized background controls (gray). The left panel shows distances to the closest hESC TAD boundary and the right panel distances to the closest GM12878 contact domain boundary. P-values according to Wilcoxon's rank-sum test. (PDF 14 kb)

research product

MOESM2 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 2: Figure S2. Distribution of the distances from the TSS of the genes to their closest TAD borders depending on the gene association with disease. The TAD border is represented with a vertical black line. Blue and salmon represent genes associated and not associated with disease, respectively. If the TSS is within a TAD a negative distance is calculated, otherwise the distance is positive. a. HK genes. b. non-HK genes. Insets: The densities for the same data are shown. Genes not associated with disease have a higher preference for TAD borders, but this is only significant for non-HK genes (p-value = 9 × 10^−11, Wilcoxon rank test).

research product

MOESM9 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 9: Figure S8. Distribution of TAD lengths depending on the number of TSSs they contain. A horizontal black line indicates the median for each TAD category.

research product

MOESM5 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 5: Figure S4. Fraction of genes for HK and non-HK genes associated with disease (ordinates) depending on the number of genes contained within the TADs (n; abscissas); the numbers have been aggregated for n ≥ 6. The lower the number of genes inside the TAD, the higher the fraction of genes associated with disease: a. HK genes; a p-value = 3.6 × 10^−5 from a Chi-square test, comparing the number of genes associated and non-associated with disease for the six TAD categories, was obtained. The green dotted line represents the genome-wide fraction of HK genes associated with disease (0.309). b. non-HK genes; a p-value = 1.2 × 10^−43 from a Chi-square test has been obtained. The gree…

research product

Additional file 2: of Evolutionary stability of topologically associating domains is associated with conserved gene regulation

Figure S2. Distribution of evolutionary rearrangement breakpoints between human and 12 vertebrate genomes around domains. Relative breakpoint numbers from human and different species (horizontal panels) around hESC TADs (left), GM12878 contact domains (center), and GRBs (right). Blue color scale represents breakpoints from different fill-size thresholds. Dotted lines in gray show simulated background controls of randomly placed breakpoints. (PDF 42 kb)

research product

MOESM8 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 8: Figure S7. Mean ratios of the number of enhancers per gene within the TADs versus the number of genes within the TAD associated with disease (0 ≤ k ≤ n), where n is the total number of genes within the TAD. The value of n, which determines the TAD category, is represented for TADs with n = 1, 2, 3, and 4 genes (red, blue, green and purple lines, respectively). TADs with fewer TSSs have higher ratios of enhancers to TSSs. Moreover, for each TAD category, the higher the number of genes associated with disease, the higher the average number of enhancers per gene.

research product

Additional file 5 of MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

Primer sequences. This file includes the list of all primer sequences used for PCR. (55.7KB PDF)

research product

Additional file 2 of MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

Plots of Precision/Recall comparing our method to t-test. This file includes plots of Precision/Recall comparing MGFM to t-test. (462KB PDF)

research product

MOESM3 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 3: Figure S3. Number of TADs depending on the number of genes within the TADs. The counts are displayed behind each bar. Many TADs contain few genes and from a total of 9274 TADs, 2017 TADs (21.7%) have no gene within them.

research product

MOESM1 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 1: Figure S1. Distribution of the distances from the TSS of genes to their closest TAD borders. The TAD borders are represented with a vertical black line. Blue and salmon represent HK and non-HK genes, respectively. If the TSS is within a TAD a negative distance is calculated, otherwise the distance is positive. Each bin represents 500 nt. Inset: the density for the same data is shown. The preference of HKs toward the TAD borders is significant (p-value = 3 × 10^−4, Wilcoxon rank test).

research product

MOESM6 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 6: Figure S3. High resolution Hi-C map with 7C loop predictions. The red color intensity shows Hi-C interaction frequencies at an example locus of chromosome 1. The blue squares indicate 7C loop predictions using a Rad21 ChIP-seq experiment. The figure was created using the Juicebox tool by loading the combined Hi-C data set in GM12878 from [13] with mapping quality MAPQ ≥30 at a resolution of 5 kb.

research product

Assessing the reliability of gene expression measurements in very-low-numbers of human monocyte-derived macrophages

Abstract Tumor-derived primary cells are essential for in vitro and in vivo studies of tumor biology. The scarcity of this cellular material limits the feasibility of experiments or analyses and hence hinders basic and clinical research progress. We set out to determine the minimum number of cells that can be analyzed with standard laboratory equipment and that leads to reliable results, unbiased by cell number. A proof-of-principle study was conducted with primary human monocyte-derived macrophages, seeded in decreasing number and constant cell density. Gene expression of cells stimulated to acquire opposite inflammatory states was analyzed by quantitative PCR. Statistical analysis indicat…

research product

Additional file 4 of MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

Description of the predicted marker genes. (126KB PDF)

research product

MOESM7 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 7: Figure S6. Distribution of the ratios of the number of enhancers to genes depending on the number of genes within a TAD. Mean and median values of each boxplot are shown by white diamonds and black horizontal lines, respectively.

research product

Additional file 1: of Evolutionary stability of topologically associating domains is associated with conserved gene regulation

Figure S1. Breakpoint identification accuracy as compared to gene synteny. Considered are adjacent pairs of human genes with one-to-one orthologs and intergenic distance below a size threshold. (A) Positive predictive value as the fraction of non-syntenic gene pairs with breakpoint from all considered gene pairs (syntenic and non-syntenic) with breakpoint. (B) False positive rate as the percent of syntenic gene pairs with breakpoint from the sum of syntenic pairs with breakpoint and non-syntenic gene pairs without breakpoint. (PDF 21 kb)

research product

MOESM4 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 4: Figure S1. Hi-C and ChIA-PET interactions and their overlap with CTCF motif pairs. (A) Number of genome-wide CTCF motifs by motif hit significance cutoff. (B) Number of CTCF motif pairs within 1 Mb distance by motif hit significance. (C) Percent of CTCF motif pairs that overlap with experimentally measured Hi-C and ChIA-PET loops by the motif hit significance. (D) Upset plot of true loop data sets (rows) and their size (horizontal bars) with their intersections (columns, and vertical bars) based on the number of overlapping CTCF motif pairs. (E) Distribution of interaction span (distance between anchors) of Hi-C loops and ChIA-PET loops in GM12878 that are used as gold st…

research product

Additional file 3 of MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

Gel electrophoresis images. This file includes the gel electrophoresis images (Figures S1–S11). (981KB PDF)

research product

MOESM7 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 7: Figure S4. (A) Prediction performance (auPRC) of 7C when trained and evaluated on different datasets of experimentally measured loops as gold standard. Rao_GM12878 refers to Hi-C loops from [13]; Tang2015_GM12878_CTCF and Tang2015_GM12878_RNAPII refer to ChIA-PET loops using CTCF or Polymerase II as the target [16]. In Union, all datasets were taken together, and in Intersection, only those CTCF motif pairs that were measured in all datasets were considered positive. (B) Prediction performance (auPRC) of 7C compared to a logistic regression model that uses only the total coverage signal within +/− 500 bp around the motif center at both loop anchor sites separately. In both…

research product

MOESM6 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 6: Figure S5. Distribution of the number of enhancers within TADs versus the number of genes contained within the TADs. Mean and median values of each boxplot are shown by white diamonds and black horizontal lines, respectively. The more genes within a TAD, the larger the number of enhancers.

research product

MOESM1 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 1: Table S1. Metadata of ChIP-seq experiments from ENCODE in human GM12878 cells with accession ID and download link.

research product

Additional file 4: of Evolutionary stability of topologically associating domains is associated with conserved gene regulation

Table S1. Matching tissues and samples with CAGE expression data in human and mouse. (TSV 2 kb)

research product

MOESM12 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 12: Table S3. Distance of each TSS to the closest TAD border. The distance (negative) has been calculated for each TAD where the TSS is contained. If the TSS is within no TAD the closest distance (positive) to a TAD border has been calculated. Each entry of the table displays the following information by columns: geneId, gene strand, gene locus, TSS of gene, distance to the TAD border, and TAD.
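
The sign convention described in this caption can be illustrated with a minimal Python sketch. It assumes TADs on a single chromosome given as (start, end) coordinate pairs; the function name `signed_distance_to_tad_border` and the input layout are hypothetical and are not taken from the original analysis. Where a TSS lies inside several TADs, the sketch reports only the closest border, whereas the table may list one distance per containing TAD.

```python
from typing import List, Tuple


def signed_distance_to_tad_border(tss: int, tads: List[Tuple[int, int]]) -> int:
    """Signed distance from a TSS to its closest TAD border.

    Negative when the TSS lies inside a TAD, positive when it lies in no TAD
    (hypothetical helper; coordinates are assumed to be on one chromosome).
    """
    if not tads:
        raise ValueError("at least one TAD is required")

    # TADs whose interval contains the TSS (borders included).
    containing = [(start, end) for start, end in tads if start <= tss <= end]

    if containing:
        # Inside a TAD: distance to the nearer of its two borders, negated.
        return -min(min(tss - start, end - tss) for start, end in containing)

    # Outside all TADs: distance to the nearest border of any TAD.
    return min(min(abs(tss - start), abs(tss - end)) for start, end in tads)


# Example with two illustrative TADs (made-up coordinates).
example_tads = [(100, 200), (300, 400)]
inside = signed_distance_to_tad_border(120, example_tads)
outside = signed_distance_to_tad_border(250, example_tads)
```

With these made-up coordinates, a TSS at position 120 sits 20 nt inside the first TAD and gets a negative value, while a TSS at 250 lies between the two TADs and gets a positive value.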

research product

Additional file 5: of Evolutionary stability of topologically associating domains is associated with conserved gene regulation

Table S2. Ortholog genes in human and mouse with gene expression correlation across tissues. (TSV 1036 kb)

research product

MOESM2 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

Additional file 2: Code to reproduce the analyses.

research product

Additional file 2: of Automated selection of homologs to track the evolutionary history of proteins

Figure S1. Number of orthology pairwise relationships calculated with OrthoMCL, ProteinPathTracker and Reciprocal Best Hit Blast (RBHB) in 15 species, using the proteomes provided by OrthoMCL in the default species from the default path in ProteinPathTracker, and taking E. coli proteins as reference. a) All OrthoMCL pairs. b) Only the best 25% scored OrthoMCL pairs. (PNG 388 kb)

research product

MOESM3 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 3: Table S3. Accession numbers and download URLs for data sets used in data type comparisons.

research product

MOESM11 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 11: Table S2. The 3650 different protein coding HKs.

research product

MOESM4 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 4: Table S4. TADs that contain only one gene.

research product

Additional file 1: of Automated selection of homologs to track the evolutionary history of proteins

List of complete reference proteomes used in the web tool, organised by evolutionary path. (XLSX 13 kb)

research product

MOESM1 of Proteome-wide comparison between the amino acid composition of domains and linkers

Additional file 1. List of proteomes used for the analyses. Each proteome is described by the name of the species, abbreviation as used in the manuscript, UniProt organism ID, number of proteins, and percentage of amino acids from domains/linkers against the total amino acid composition of the proteome.

research product

MOESM1 of Assessment of computational methods for the analysis of single-cell ATAC-seq data

Additional file 1: Figures S1–S24, Tables S1–S21, Supplementary Notes, and Supplementary figure legends.

research product

MOESM10 of The distributions of protein coding genes within chromatin domains in relation to human disease

Additional file 10: Table S1. The 18,141 different protein coding genes. Each row has the following information in the columns: geneid, gene locus, transcription starting site (TSS), and CTD gene association or not with disease.

research product

MOESM2 of 7C: Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs

Additional file 2: Table S2. Metadata of ChIP-seq experiments from ENCODE human HeLa cells with accession ID and download link.

research product