0000000001315256
AUTHOR
Joaquín Dopazo
A parallel and sensitive software tool for methylation analysis on multicore platforms.
Abstract Motivation: DNA methylation analysis suffers from very long processing times, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently with either the size of the dataset or the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. Results: We present a new software tool, called HPG-Methyl, which efficiently maps bis…
Assessment of Targeted Next-Generation Sequencing as a Tool for the Diagnosis of Charcot-Marie-Tooth Disease and Hereditary Motor Neuropathy
Charcot-Marie-Tooth disease is characterized by broad genetic heterogeneity with >50 known disease-associated genes. Mutations in some of these genes can cause a pure motor form of hereditary motor neuropathy, the genetics of which are poorly characterized. We designed a panel comprising 56 genes associated with Charcot-Marie-Tooth disease/hereditary motor neuropathy. We validated this diagnostic tool by first testing 11 patients with pathological mutations. A cohort of 33 affected subjects was selected for this study. The DNAJB2 c.352+1G>A mutation was detected in two cases; novel changes and/or variants with low frequency (…
PyCellBase, an efficient python package for easy retrieval of biological data from heterogeneous sources.
Background Biological databases and repositories have been increasing in diversity and complexity over the years. This rapid expansion of current and new sources of biological knowledge raises serious problems of data accessibility and integration. To handle the growing necessity of unification, CellBase was created as an integrative solution. CellBase provides a centralized NoSQL database containing biological information from different and heterogeneous sources. Access to this information is done through a RESTful web service API, which provides an efficient interface to the data. Results In this work we present PyCellBase, a Python package that provides programmatic access to the rich RESTfu…
Split decomposition: a technique to analyze viral evolution
A clustering technique allowing a restricted amount of overlapping and based on an abstract theory of coherent decompositions of finite metrics is used to analyze the evolution of foot-and-mouth disease viruses. The emerging picture is compatible with the existence of viral populations with a quasispecies structure and illustrates various forms of evolution of this virus family. In addition, it allows the correlation of these forms with geographic occurrence.
Dysfunctional mitochondrial fission impairs cell reprogramming
We have recently shown that mitochondrial fission is induced early in reprogramming in a Drp1-dependent manner; however, the identity of the factors controlling Drp1 recruitment to mitochondria was unexplored. To investigate this, we used a panel of RNAi targeting factors involved in the regulation of mitochondrial dynamics and we observed that MiD51, Gdap1 and, to a lesser extent, Mff were found to play key roles in this process. Cells derived from Gdap1-null mice were used to further explore the role of this factor in cell reprogramming. Microarray data revealed a prominent down-regulation of cell cycle pathways in Gdap1-null cells early in reprogramming and cell cycle profiling uncovered…
VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy
The possibility of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes that can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use. Here we present VISMapper, a vector integration site analysis web server, to analyze next-generation sequencing data for retroviral vector integration sites. VISMapper can be found at: http://vismapper.babelomics.org . Because it uses novel mapping algorithms VISMapper is remarkably faster t…
Sequences of isopenicillin N synthetase genes suggest horizontal gene transfer from prokaryotes to eukaryotes
Evolutionary distances between bacterial and fungal isopenicillin N synthetase (IPNS) genes have been compared to distances between the corresponding 5S rRNA genes. The presence of sequences homologous to the IPNS gene has been examined in DNAs from representative prokaryotic organisms and Ascomycotina. The results of both analyses strongly support two different events of horizontal transfer of the IPNS gene from bacteria to filamentous fungi. This is the first example of such a type of transfer from prokaryotes to eukaryotes.
Understanding disease mechanisms with models of signaling pathway activities
Background Understanding the aspects of cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and is the basis for the future implementation of precision medicine. Results Here we propose a simple probabilistic model that separates signaling pathways into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and then transforms gene expression measurements into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. Thus, circuit activation s…
Mutations in the MORC2 gene cause axonal Charcot–Marie–Tooth disease
Charcot-Marie-Tooth disease (CMT) is a complex disorder with wide genetic heterogeneity. Here we present a new axonal Charcot-Marie-Tooth disease form, associated with the gene microrchidia family CW-type zinc finger 2 (MORC2). Whole-exome sequencing in a family with autosomal dominant segregation identified the novel MORC2 p.R190W change in four patients. Further mutational screening in our axonal Charcot-Marie-Tooth disease clinical series detected two additional sporadic cases, one patient who also carried the same MORC2 p.R190W mutation and another patient that harboured a MORC2 p.S25L mutation. Genetic and in silico studies strongly supported the pathogenicity of these sequence variant…
Defining the genomic signature of totipotency and pluripotency during early human development.
The genetic mechanisms governing human pre-implantation embryo development and the in vitro counterparts, human embryonic stem cells (hESCs), still remain incomplete. Previous global genome studies demonstrated that totipotent blastomeres from day-3 human embryos and pluripotent inner cell masses (ICMs) from blastocysts, display unique and differing transcriptomes. Nevertheless, comparative gene expression analysis has revealed that no significant differences exist between hESCs derived from blastomeres versus those obtained from ICMs, suggesting that pluripotent hESCs involve a new developmental progression. To understand early human stages evolution, we developed an undifferentiation netw…
Comparison of vaccine strains and the virus causing the 1986 foot-and-mouth disease outbreak in Spain: epizootiological analysis
RNAs of the most recent foot-and-mouth disease virus isolated in Spain (A5Sp86) during the 1986 outbreak, and of the three vaccine strains in use at that time in that country, have been compared. Although these viruses are serologically indistinguishable, differences have been found among them by T1 fingerprinting. This genetic heterogeneity affects the immunogenic VP1 gene, with amino acid changes located at the carboxyterminal end of the molecule. VP1-coding sequences obtained have been compared with those previously reported for European A5 FMDVs and it has been possible to trace their phylogenetic origin. The most parsimonious evolutionary tree obtained shows that the viruses analyzed a…
Deciphering genomic heterogeneity and the internal composition of tumour activities through a hierarchical factorisation model
Genomic heterogeneity constitutes one of the most distinctive features of cancer diseases, limiting the efficacy and availability of medical treatments. Tumorigenesis emerges as a strongly stochastic process, producing a variable landscape of genomic configurations. In this context, matrix factorisation techniques represent a suitable approach for modelling such complex patterns of variability. In this work, we present a hierarchical factorisation model conceived from a systems biology point of view. The model integrates the topology of molecular pathways, allowing to simultaneously factorise genes and pathways activity matrices. The protocol was evaluated by using simulations, showing a hi…
Characterization of the Proteomic and Genomic Profiles of Chronic Lymphocytic Leukemia Patients with Distinct Clinical Prognosis According to the Mutational Status of the IgVH and BCL6 and Expression Level of CD38 and ZAP70.
Abstract Introduction: In recent years several molecular prognostic factors have been identified in Chronic Lymphocytic Leukemia B (B-CLL). These include mutations in the variable region of the immunoglobulin genes (IgVH), somatic mutations in the BCL6 gene (Leukemia (2004) 18, 743–746) and the expression level of CD38 and ZAP70. However its biological significance is not clear. In order to identify novel molecular markers with prognostic and therapeutic value we have analyzed the proteomic and genomic profile of 40 B-CLL patients (Binet stage A). Material and methods: 100 μg of total PBMC proteins were used for IEF followed by 2D electrophoresis. Image analysis of scanned gels was used to …
Gene encoding capsid protein VP1 of foot-and-mouth disease virus: a quasispecies model of molecular evolution
A phylogenetic tree relating the VP1 gene of 15 isolates of foot-and-mouth disease virus (FMDV) of serotypes A, C, and O has been constructed. The most parsimonious tree shows that FMDV subtypes and isolates within subtypes constitute sets of related, nonidentical genomes, in agreement with a quasispecies mode of evolution of this virus. The average number of nucleotide replacements per site for all possible pairs of VP1 coding segments is higher among representatives of serotype A than serotype C or O. In comparing amino acid sequences, the values of dispersion index (variance/mean value) are greater than 1, with the highest values scored when all sequences are considered. This indicates a…
MamaPred: A new and innovative approach to determine recurrence risk in HR+/HER2- early-stage breast cancer using HTG EdgeSeq technology.
Background: Genomic platforms, such as Mammaprint (Agendia) (MP) and OncoType (Genomic Health) (OT), have been validated to determine the risk of relapse in therapeutic decision-making in early-stage hormone receptor positive (HR+), epidermal growth factor receptor 2 (HER2) negative breast cancer (BC). Discordances in risk allocation between these platforms affect up to 30% of patients. This study aims to develop the MamaPred test to improve the diagnostic performance of recurrence risk in HR+/HER2- early-stage BC. Methods: A total of 606 HR+/HER2- early-stage BC previously tested with OT [n = 287; Low Risk (LR) = 165, Intermediate Risk (IR) = 103 and High Risk (HR) = 19] and MP (n = 3…
Multiple sequence editing by spreadsheet.
Spreadsheets have several functions and facilities that make them good candidates to be used as multiple sequence editors. They can be easily programmed (even by non-programmers) with macros that allow them to fit the needs of the user, free of the restrictions that programs written by other people have. Here I present a sheet containing a set of macros written for Lotus 1-2-3
Fixation of mutations at the VP1 gene of foot-and-mouth disease virus. Can quasispecies define a transient molecular clock?
The number of nucleotide (nt) substitutions found in the VP1 gene (encoding viral capsid protein) between any two of 16 closely related isolates of foot-and-mouth disease virus (FMDV) has been quantified as a function of the time interval between isolations [Villaverde et al., J. Mol. Biol. 204 (1988) 771-776]. One of them (isolate C-S12) includes some replacements found in isolates that preceded it and other replacements found in later isolates. The study has revealed alternating periods of rapid evolution and of relative genetic stability of VP1. During a defined period of acute disease, the rate of fixation of replacements at the VP1 coding segment was 6 × 10⁻³ substitutions per nt per year…
Reducing the effect of the data order in algorithms for constructing phylogenetic trees.
Phylogeny of viroids, viroidlike satellite RNAs, and the viroidlike domain of hepatitis delta virus RNA.
We report a phylogenetic study of viroids, some plant satellite RNAs, and the viroidlike domain of human hepatitis delta virus RNA. Our results support a monophyletic origin of these RNAs and are consistent with the hypothesis that they may be "living fossils" of a precellular RNA world. Moreover, the viroidlike domain of human hepatitis delta virus RNA appears closely related to the viroidlike satellite RNAs of plants, with which it shares some structural and functional properties. On the basis of our phylogenetic analysis, we propose a taxonomic classification of these RNAs.
A method for determining the position and size of optimal sequence regions for phylogenetic analysis.
The availability of fast and accurate sequencing procedures along with the use of PCR has led to a proliferation of studies of variability at the molecular level in populations. Nevertheless, it is often impractical to examine long genomic stretches and a large number of individuals at the same time. In order to optimize this kind of study, we suggest a heuristic procedure for detection of the shortest region whose informational content can be considered sufficient for significant phylogenetic reconstruction. The method is based on the comparison of the pairwise genetic distances obtained from a set of sequences of reference to those obtained for different windows of variable size and posit…
Controlled Ovarian Stimulation Induces a Functional Genomic Delay of the Endometrium with Potential Clinical Implications
Context: Controlled ovarian stimulation induces morphological, biochemical, and functional genomic modifications of the human endometrium during the window of implantation. Objective: Our objective was to compare the gene expression profile of the human endometrium in natural vs. controlled ovarian stimulation cycles throughout the early-mid secretory transition using microarray technology. Method: Microarray data from 49 endometrial biopsies obtained from LH+1 to LH+9 (n = 25) in natural cycles and from human chorionic gonadotropin (hCG) +1 to hCG+9 in controlled ovarian stimulation cycles (n = 24) were analyzed using different methods, such as clustering, profiling of biological processes…
HPG pore: an efficient and scalable framework for nanopore sequencing data.
The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a USB-connected, portable device that allows real-time DNA analysis. In addition, other new instruments are expected to be released soon, which promise to outperform the current short-read technologies in terms of throughput. Despite the flood of data expected from this technology, the data analysis solutions currently available are only designed to manage small projects and are not scalable. Here we present HPG Pore, a toolkit for …
IL1β Induces Mesenchymal Stem Cells Migration and Leucocyte Chemotaxis Through NF-κB
Mesenchymal stem cells are often transplanted into inflammatory environments where they are able to survive and modulate host immune responses through a poorly understood mechanism. In this paper we analyzed the responses of MSC to IL-1β, a representative inflammatory mediator. Microarray analysis of MSC treated with IL-1β revealed that this cytokine activated a set of genes related to biological processes such as cell survival, cell migration, cell adhesion, chemokine production, induction of angiogenesis and modulation of the immune response. Further, more detailed analysis by real-time PCR and functional assays revealed that IL-1β mainly increased the production of chemokines such as CC…
Genetic Variability and Antigenic Diversity of Foot-and-Mouth Disease Virus
Foot-and-mouth disease (FMD) is an acute systemic disease of cloven-hooved animals, including cattle, swine, sheep, and goats. Despite mortality rates being generally below 5%, FMD severely decreases livestock productivity and trade. It is considered the most economically important disease of farm animals. Nearly two thousand million doses of vaccine are used annually to try to control FMD, which, nevertheless, is enzootic in most South American and African countries, parts of Asia, the Middle East, and the south of Europe. The causative agent, foot-and-mouth disease virus (FMDV), is an aphthovirus of the family Picornaviridae, a historically important virus as it was the first recognized vir…
FM19G11, a New Hypoxia-inducible Factor (HIF) Modulator, Affects Stem Cell Differentiation Status
The biology of the alpha subunits of hypoxia-inducible factors (HIF alpha) has expanded from their role in angiogenesis to their current position in the self-renewal and differentiation of stem cells. The results reported in this article show the discovery of FM19G11, a novel chemical entity that inhibits HIF alpha proteins that repress target genes of the two alpha subunits, in various tumor cell lines as well as in adult and embryonic stem cell models from rodents and humans, respectively. FM19G11 inhibits at nanomolar range the transcriptional and protein expression of Oct4, Sox2, Nanog, and Tgf-alpha undifferentiating factors, in adult rat and human embryonic stem cells, FM19G11 activit…
Acceleration of short and long DNA read mapping without loss of accuracy using suffix array
HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20× for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies.
The transcriptomics of an experimentally evolved plant-virus interaction
Models of plant-virus interaction assume that the ability of a virus to infect a host genotype depends on the matching between virulence and resistance genes. Recently, we evolved tobacco etch potyvirus (TEV) lineages on different ecotypes of Arabidopsis thaliana, and found that some ecotypes selected for specialist viruses whereas others selected for generalists. Here we sought to evaluate the transcriptomic basis of such relationships. We have characterized the transcriptomic responses of five ecotypes infected with the ancestral and evolved viruses. Genes and functional categories differentially expressed by plants infected with local TEV isolates were identified, showing heterogene…
Monte Carlo simulation in phylogenies: an application to test the constancy of evolutionary rates.
Monte Carlo simulation has commonly been used in phylogenetic studies to test different tree-reconstruction methods, and consequently, its application for testing evolutionary models can be considered as a natural extension of this usage. Repetitive simulation of a given evolutionary process, under the restrictions imposed by the model to be tested, along a determinate tree topology allows the estimation of probability distributions for the desired parameters. Next, the phylogenetic tree can be reconstructed again without the constraints of the model, and the parameter of interest, derived from this tree, can be compared to the corresponding probability distribution derived from the restricted…
Serum metabolomic profiling facilitates the non-invasive identification of metabolic biomarkers associated with the onset and progression of non-small cell lung cancer
Lung cancer (LC) is responsible for most cancer deaths. One of the main factors contributing to the lethality of this disease is the fact that a large proportion of patients are diagnosed at advanced stages when a clinical intervention is unlikely to succeed. In this study, we evaluated the potential of metabolomics by ¹H-NMR to facilitate the identification of accurate and reliable biomarkers to support the early diagnosis and prognosis of non-small cell lung cancer (NSCLC). We found that the metabolic profile of NSCLC patients, compared with healthy individuals, is characterized by statistically significant changes in the concentration of 18 metabolites representing different amino acids…
The EGR2 gene is involved in axonal Charcot-Marie-Tooth disease
Background and purpose A three-generation family affected by axonal Charcot−Marie−Tooth disease (CMT) was investigated with the aim of discovering genetic defects and to further characterize the phenotype. Methods The clinical, nerve conduction studies and muscle magnetic resonance images of the patients were reviewed. A whole exome sequencing was performed and the changes were investigated by genetic studies, in silico analysis and luciferase reporter assays. Results A novel c.1226G>A change (p.R409Q) in the EGR2 gene was identified. Patients presented with a typical, late-onset axonal CMT phenotype with variable severity that was confirmed in the ancillary tests. The in silico studies sho…
Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.
Abstract Motivation Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust since common artifact signals are expected to be shared between independent samples over misassembled regions …
Functional Genomics of 5-to 8-Cell Stage Human Embryos by Blastomere Single-Cell cDNA Analysis
Blastomere fate and embryonic genome activation (EGA) during human embryonic development are unsolved areas of high scientific and clinical interest. Forty-nine blastomeres from 5- to 8-cell human embryos have been investigated following an efficient single-cell cDNA amplification protocol to provide a template for high-density microarray analysis. The previously described markers, characteristic of Inner Cell Mass (ICM) (n = 120), stemness (n = 190) and Trophectoderm (TE) (n = 45), were analyzed, and a housekeeping pattern of 46 genes was established. All the human blastomeres from the 5- to 8-cell stage embryo displayed a common gene expression pattern corresponding to ICM markers (e.g., …
Pathway network inference from gene expression data
Background: The development of high-throughput omics technologies enabled genome-wide measurements of the activity of cellular elements and provides the analytical resources for the progress of the Systems Biology discipline. Analysis and interpretation of gene expression data has evolved from the gene to the pathway and interaction level, i.e. from the detection of differentially expressed genes, to the establishment of gene interaction networks and the identification of enriched functional categories. Still, the understanding of biological systems requires a further level of analysis that addresses the characterization of the interaction between functional modules. Results: We presen…
Pazopanib for treatment of typical solitary fibrous tumours: a multicentre, single-arm, phase 2 trial
[Background] Solitary fibrous tumour is an ultra-rare sarcoma, which encompasses different clinicopathological subgroups. The dedifferentiated subgroup shows an aggressive course with resistance to pazopanib, whereas in the malignant subgroup, pazopanib shows higher activity than in previous studies with chemotherapy. We designed a trial to test pazopanib activity in two different cohorts of solitary fibrous tumour: the malignant-dedifferentiated cohort, which was previously published, and the typical cohort, which is presented here.
A new parallel pipeline for DNA methylation analysis of long reads datasets
Background DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that the new sequencers and the ones expected for an upcoming future yield sequences of increasing length, making these software tools inefficient and obsolete. Results In this paper, we propose a new software based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while…
Quantitative characterization of antigens using monoclonal antibody reactivities
A multipurpose program that empirically relates antigenic reactivities with monoclonal antibodies (MAbs) to genetic distances is presented. The program uses a set of known genetic pairwise distances to weigh each MAb depending on its capacity to define groups of taxonomically related antigens. This allows highly accurate identification and classification of unknown antigens. Also, the weights obtained constitute a quantitative measure of epitope conservation and can be used for improved vaccine design. © 1993 Oxford University Press.
Analysis of chronic lymphotic leukemia transcriptomic profile: differences between molecular subgroups
B cell chronic lymphocytic leukemia (CLL) is a lymphoproliferative disorder with a variable clinical course. Patients with unmutated IgV(H) gene show a shorter progression-free and overall survival than patients with immunoglobulin heavy chain variable regions (IgV(H)) gene mutated. In addition, BCL6 mutations identify a subgroup of patients with high risk of progression. Gene expression was analysed in 36 early-stage patients using high-density microarrays. Around 150 genes differentially expressed were found according to IgV(H) mutations, whereas no difference was found according to BCL6 mutations. Functional profiling methods allowed us to distinguish KEGG and gene ontology terms showing…
Additional file 1 of A new parallel pipeline for DNA methylation analysis of long reads datasets
Text document containing an example of the command launched to execute each of the tools. (TXT 2 kb)